As mentioned in this question, I am moving our team towards objects (as opposed to just throwing DataTables and variables around everywhere). I have picked a suitable spot for the project that contains the object definitions, but I am not sure how I should go about populating some of the objects. In particular, what do I do with classes as properties?
Public Class Class1
    Public Property SomeProperty As String
    Public Property SomeOtherProperty As String
    Public Property SomeKey As Integer
    Public Property Foo As Class2
End Class
In this basic example, I can easily populate the first three properties from a query, but Foo has its own set of properties. Right now, to create a List(Of Class1), I generate the data for all of the Class1 objects, and then use SomeKey to go back and, one object at a time, populate all of the Foo properties. Certainly this works, but in my case it is rather slow, and it seems that perhaps I am simply doing it wrong.
Am I handling that incorrectly? If so, what should I be doing? The only other thing I can come up with is to create a query (or stored procedure) that returns all of the appropriate data for all of the objects. In my test case that amounts to a list that contains 154 Class1 objects, with each of them having four properties like Foo, and some of those having classes as properties.
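Your current approach causes what is called the N+1 selects problem (see https://stackoverflow.com/questions/97197/what-is-the-n1-selects-issue): one query loads the list of Class1 objects, and then one more query runs for each object to load its Foo.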
One way to solve this is to retrieve all required data in a single query by using select distinct and left join. This is called eager fetching. In most cases, it is the preferred solution. If you have a deep hierarchy, you may want to create separate finder methods in your repository class: for example, findStudent() populates only Student, while findStudentForReview() also returns Student.subjects.assignments.
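As a rough illustration of eager fetching, the sketch below maps the rows of one joined result set to Class1 objects and their Foo in a single pass. It is only a sketch under assumptions: the DataTable stands in for the result of a real query, and the table names, the column names, the FooSomeProperty alias, and the GetJoinedRows and MapRows helpers are all hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Public Class Class2
    Public Property SomeProperty As String
End Class

Public Class Class1
    Public Property SomeKey As Integer
    Public Property SomeProperty As String
    Public Property Foo As Class2
End Class

Module EagerFetchSketch
    ' Stand-in for the result of ONE joined query, for example:
    '   SELECT c1.SomeKey, c1.SomeProperty, c2.SomeProperty AS FooSomeProperty
    '   FROM Class1Table c1 LEFT JOIN Class2Table c2 ON c2.SomeKey = c1.SomeKey
    ' (table and column names are assumptions, not taken from the question)
    Function GetJoinedRows() As DataTable
        Dim table As New DataTable()
        table.Columns.Add("SomeKey", GetType(Integer))
        table.Columns.Add("SomeProperty", GetType(String))
        table.Columns.Add("FooSomeProperty", GetType(String))
        table.Rows.Add(1, "first", "foo for first")
        table.Rows.Add(2, "second", DBNull.Value) ' no matching Class2 row
        Return table
    End Function

    ' Builds every Class1 together with its Foo from the joined rows.
    Function MapRows(rows As DataTable) As List(Of Class1)
        Dim result As New List(Of Class1)
        For Each row As DataRow In rows.Rows
            Dim item As New Class1 With {
                .SomeKey = CInt(row("SomeKey")),
                .SomeProperty = CStr(row("SomeProperty"))
            }
            ' A LEFT JOIN yields NULL columns when there is no related row.
            If Not row.IsNull("FooSomeProperty") Then
                item.Foo = New Class2 With {.SomeProperty = CStr(row("FooSomeProperty"))}
            End If
            result.Add(item)
        Next
        Return result
    End Function

    Sub Main()
        For Each item In MapRows(GetJoinedRows())
            Dim fooText = If(item.Foo Is Nothing, "(none)", item.Foo.SomeProperty)
            Console.WriteLine(item.SomeProperty & " -> " & fooText)
        Next
    End Sub
End Module
```

The database is hit once regardless of how many Class1 objects come back. If Foo were a collection rather than a single object, the joined rows would repeat the Class1 columns and the mapper would need to group rows by SomeKey.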
Another way to solve this is to create a getter for the Foo property that retrieves the data for Class2 if it hasn't been retrieved before. This is called lazy fetching: the query for Class2 is executed only when you need the Foo value.
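A lazy getter along these lines might look like the following minimal sketch. The loadFoo callback and the LoadFooFromDatabase function are hypothetical stand-ins for whatever code actually runs the Class2 query, and the counter exists only to show how often that code is called.

```vbnet
Imports System

Public Class Class2
    Public Property SomeProperty As String
End Class

Public Class Class1
    Private _foo As Class2
    Private _fooLoaded As Boolean
    Private ReadOnly _loadFoo As Func(Of Integer, Class2)

    ' loadFoo is a hypothetical callback that runs the Class2 query for a key.
    Public Sub New(key As Integer, loadFoo As Func(Of Integer, Class2))
        SomeKey = key
        _loadFoo = loadFoo
    End Sub

    Public Property SomeKey As Integer

    ' The Class2 query runs on first access only; later reads reuse the result.
    Public ReadOnly Property Foo As Class2
        Get
            If Not _fooLoaded Then
                _foo = _loadFoo(SomeKey)
                _fooLoaded = True
            End If
            Return _foo
        End Get
    End Property
End Class

Module LazyFetchSketch
    Private queryCount As Integer = 0

    ' Stand-in for a real database call; counts how often it is invoked.
    Function LoadFooFromDatabase(key As Integer) As Class2
        queryCount += 1
        Return New Class2 With {.SomeProperty = "foo for key " & key.ToString()}
    End Function

    Sub Main()
        Dim item As New Class1(42, AddressOf LoadFooFromDatabase)
        Console.WriteLine(queryCount)            ' nothing has been loaded yet
        Console.WriteLine(item.Foo.SomeProperty) ' first access triggers the load
        Console.WriteLine(item.Foo.SomeProperty) ' second access reuses the object
        Console.WriteLine(queryCount)            ' the loader ran once
    End Sub
End Module
```

Keep in mind that lazy fetching only helps when most Foo values are never read. Looping over a whole list and touching Foo on every item brings the N+1 pattern back. This sketch is also not thread-safe; the framework's Lazy(Of T) type is one possible alternative if that matters.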
Or, you could use an ORM framework for your platform. This requires some extra investment up front. The advantage is that the ORM framework will handle data population and provide you with ready-to-use objects.
There is no “best practice” for how to “populate objects”, just as there is no best practice for how to create or populate a string or an integer somewhere in your program. Objects are instances of classes, which can be seen as the equivalent of user-defined types. You create and use some of them within a small block scope, some within a function scope, and some within a class, module, or global scope, for as long as you need the individual object instance to solve a certain task. Which task that is, and which lifetime is appropriate for an object, depends entirely on the class you created, its responsibilities, and the usage scenario.
To work with entities, you really should be using either Linq-to-SQL or Entity Framework. There are other ORMs, but few are really worth it; they have nothing real to offer over EF6. To do this, you first create a model (or multiple models) of your data sources, and then use that model to query the data source through Linq queries. The ORM acts as an abstraction layer between the data provider and your language of choice. For example, in Linq to SQL, your example object would be queried like so:
' First you instantiate your context. This is the virtual connection to the database. You
' are actually connecting to your data model here; the connection to the database is only
' made when a query actually executes.
Using db As New MyDataModel
    ' Then you define your query. Linq queries use deferred execution: they
    ' are only executed when they are actually enumerated. The type returned is an
    ' IQueryable(Of Class1), which is also an IEnumerable(Of Class1).
    Dim c1s = From c1 In db.Class1s
              Where c1.SomeProperty = someValue
              Select c1
    ' The objects retrieved maintain their relational model as well, so accessing
    ' the Foo property is as simple as using the navigation property provided by
    ' the ORM. All of the relationships are taken care of for you in the background.
    ' Note: by default Linq to SQL loads Foo on first access (deferred loading), so this
    ' loop can issue one query per object unless eager loading is configured, for
    ' example with DataLoadOptions.LoadWith.
    For Each c1 In c1s
        Console.WriteLine(c1.Foo.SomeProperty)
    Next
    ' Navigation properties are essentially joins, links implemented by the relationships
    ' defined in your object relational model. The following example would be roughly this SQL:
    ' SELECT c2.* FROM Class1s c1 INNER JOIN Class2s c2 ON c1.SomeKey = c2.SomeForeignKey
    Dim c2s = From c1 In c1s Select c1.Foo
End Using
In short, you would NEVER populate your objects the way you describe, except in cases where you need a flattened output from a query, and even then you would use anonymous types.
' You can also select specific elements from a relationship through anonymous types.
' Here we just define the output type as we go. This is the equivalent of selecting
' individual fields in a SQL select.
Using db As New MyDataModel
    Dim anonTypeObj = (From c1 In db.Class1s
                       Select New With {.c1SomeProperty = c1.SomeProperty,
                                        .c2SomeOtherProperty = c1.Foo.SomeOtherProperty}).First
    Console.WriteLine(anonTypeObj.c1SomeProperty & " : " & anonTypeObj.c2SomeOtherProperty)
End Using