Say I have a basic example of a data tree to save in the database:
```cs
class TreeModel {
    int id;
    virtual Collection ParentNodes;
}
class ParentNode {
    int id;
    virtual Collection TreeModel;
    virtual Collection ChildNode;
}
class ChildNode {
    int id;
    virtual ParentNode Parent;
}
```
When creating this tree the sequence is like this, spread over multiple forms:
```cs
// Add Model button
tree = new TreeModel();
dbContext.Models.Add(tree); // changetracking in logs
// Add Node button:
tree.ParentNodes.Add(new ParentNode());
new ParentForm(tree.ParentNodes.First());
// In ParentForm:
ParentNode.Add(new ChildNode());
```
This works as expected, but the navigation properties I have not called Add() on remain null. It only resolves them when SaveChanges is called on the context. It then also sets id.
I would expect that, since the TreeModel is tracked, it would detect changes to the nav properties and resolve them accordingly.
This is a problem when I want to call up a Form(ParentNode) and want to display some properties from TreeModel (eg: name) from within that form, or duplicate a ChildNode from another ParentNode during creation wizards.
I cannot traverse the tree because TreeModel is never set in ParentNode.
All in all, it is a minefield of avoiding nulls that feels just wrong.
A workaround could be to manually set these properties, but that feels like going against what EF is for.
Another way would be to pass the dbContext around everywhere in the app and call SaveChanges on each step, but that is also recommended against in the docs.
So, I must be doing something wrong here. What is the correct creation workflow for entities?
The first step in avoiding null issues with navigation properties is to ensure that collection navigation properties are initialized. For instance:
```cs
public class TreeModel
{
    public int id { get; set; }
    public virtual ICollection<ParentNode> ParentNodes { get; } = [];
}
```
This ensures that ParentNodes is ready to use when creating a new tree. Note that there is no accessible setter. When loading an entity, whether we eager load the ParentNodes or not, we never want to re-initialize the navigation property. For instance, a common mistake in EF when editing a tree and wanting to replace the collection with a modified set would be:
```cs
tree.ParentNodes = updated.ParentNodes;
```
… or something similar. This breaks change tracking for the collection and will lead to exceptions or duplicate/incorrect data.
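As a rough illustration of the alternative, the existing collection instance can be mutated in place instead of being reassigned. This is only a sketch under stated assumptions: `SyncParentNodes` is a hypothetical helper (not an EF API), nodes are matched by an already-assigned `Id`, and the entity classes are trimmed to the minimum needed to compile.

```cs
using System;
using System.Collections.Generic;
using System.Linq;

public class ParentNode
{
    public int Id { get; set; }
}

public class TreeModel
{
    public int Id { get; set; }
    // No setter: the collection instance is never replaced.
    public virtual ICollection<ParentNode> ParentNodes { get; } = new List<ParentNode>();
}

public static class TreeModelExtensions
{
    // Hypothetical helper: adds and removes items on the existing collection
    // rather than swapping the collection for a new one.
    public static void SyncParentNodes(this TreeModel tree, IEnumerable<ParentNode> updated)
    {
        var updatedList = updated.ToList();
        var updatedIds = updatedList.Select(n => n.Id).ToHashSet();

        // Remove nodes that are no longer in the updated set.
        foreach (var stale in tree.ParentNodes.Where(n => !updatedIds.Contains(n.Id)).ToList())
        {
            tree.ParentNodes.Remove(stale);
        }

        // Add nodes whose Id is not in the collection yet.
        var existingIds = tree.ParentNodes.Select(n => n.Id).ToHashSet();
        foreach (var added in updatedList.Where(n => !existingIds.Contains(n.Id)))
        {
            tree.ParentNodes.Add(added);
        }
    }
}
```

With something like this, `tree.SyncParentNodes(updated.ParentNodes)` would take the place of the assignment. Matching by `Id` only makes sense for rows that already have keys; brand-new nodes that all share the default `Id` of 0 would need separate handling.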
When inserting data, new entities that were only added to a navigation collection are not picked up by the change tracker until it next detects changes, which by default happens when you save, and related entities will not get their PKs/FKs set until saved when EF is configured, explicitly or by convention, to use identity PK columns. When updating rows, it is important to eager load navigation properties with .Include() if you intend to modify them.
An important detail when working with EF is distinguishing between a possessive relationship and an associative one. Possessive relationships are ones where inserting one entity also means inserting rows for its related entities. For example, if I have an Order and OrderLine, when I create a new Order that will have an OrderLine under it, I also create the new OrderLine and add it to Order.OrderLines. EF will insert the two rows, generating the PKs and setting the FK.
```cs
var order = new Order();
foreach (var newOrderLine in newOrder.OrderLines)
{
    var orderLine = new OrderLine
    {
        ProductId = newOrderLine.ProductId,
        Quantity = newOrderLine.Quantity
        //...
    };
    order.OrderLines.Add(orderLine);
}
```
Associative relationships are situations where, when creating or updating an entity, you want to add an association to another existing row. In the above example, when I create the new OrderLine(s) they are associated with a Product. I specified the ProductId FK based on the product the front-end selected, but if I have a Product navigation property and want to reference it, for example to populate a view model for a summary page, EF does not automatically load details into the Product navigation property, even after saving, unless that product happened to have been fetched earlier and is tracked by the DbContext. Even if we have the details of the product, we don’t want to do something like:
```cs
var orderLine = new OrderLine
{
    Product = new Product
    {
        ProductId = newOrderLine.Product.ProductId,
        Name = newOrderLine.Product.Name,
    },
    Quantity = newOrderLine.Quantity
    //...
};
```
The reason is that adding a new Product, even though we specify a ProductId, will be treated as an attempt to insert a new product into the DB. Instead, with associative relationships you need to ensure you are associating tracked entities by setting the navigation references:
```cs
var order = new Order();
var productIds = newOrder.OrderLines.Select(ol => ol.ProductId).ToList();
var products = await _context.Products
    .Where(p => productIds.Contains(p.ProductId))
    .ToListAsync();
foreach (var newOrderLine in newOrder.OrderLines)
{
    var orderLine = new OrderLine
    {
        Product = products.First(p => p.ProductId == newOrderLine.ProductId),
        Quantity = newOrderLine.Quantity
        //...
    };
    order.OrderLines.Add(orderLine);
}
```
In this case we get the list of product IDs that we need to associate and fetch them in a single call. When we go to create each OrderLine, we associate the tracked product reference. This ensures that EF associates the new order line rows with the existing Products. Alternatively, if you have the data to populate an associated entity and don’t want to fetch it from the DB, you can create a new instance and Attach() it to the DbContext. However, before doing this you should still check that the DbContext isn’t already tracking that row, or the Attach() call will fail.
```cs
// .Local checks local tracking cache, does not hit database.
var existingProduct = _context.Products.Local.FirstOrDefault(p => p.ProductId == newOrderLine.Product.ProductId);
if (existingProduct == null)
{
    existingProduct = new Product
    {
        ProductId = newOrderLine.Product.ProductId,
        Name = newOrderLine.Product.Name,
        // ...
    };
    _context.Attach(existingProduct);
}
orderLine.Product = existingProduct;
// ...
```
Honestly, going through the work of attaching entities is rarely worth it compared with a simple call to fetch rows by ID, which is fast and also serves as a guard to ensure only existing, valid data is passed.
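To make the guard aspect concrete, here is a minimal sketch of checking that every requested ID actually came back from the fetch before building any order lines. `ProductGuard.RequireAll` is a hypothetical helper invented for illustration, and `Product` is reduced to the two properties used above.

```cs
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int ProductId { get; set; }
    public string Name { get; set; } = "";
}

public static class ProductGuard
{
    // Hypothetical helper: indexes the fetched products by ID and throws
    // if any requested ID has no matching row.
    public static Dictionary<int, Product> RequireAll(
        IEnumerable<int> requestedIds,
        IEnumerable<Product> fetched)
    {
        var byId = fetched.ToDictionary(p => p.ProductId);
        var missing = requestedIds
            .Distinct()
            .Where(id => !byId.ContainsKey(id))
            .ToList();

        if (missing.Count > 0)
        {
            throw new InvalidOperationException(
                "Unknown product IDs: " + string.Join(", ", missing));
        }

        return byId;
    }
}
```

In the fetch-by-ID example, this could be used as `var productsById = ProductGuard.RequireAll(productIds, products);`, with each order line then reading `productsById[newOrderLine.ProductId]`, so an invalid ID fails with a clear message instead of at the `First()` call.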