We have a data layer that wraps Linq To SQL. In this datalayer we have this method (simplified)
int InsertReport(Report report)
{
db.Reports.InsertOnSubmit(report);
db.SubmitChanges();
return report.ID;
}
On submit changes, the report ID is updated with the value in the database which we then return.
From the calling side it looks like this (simplified)
var report = new Report();
DataLayer.InsertReport(report);
// Do something with report.ID
Looking at the code, ID has been set inside the InsertReport function as a kind of side effect, and then we are ignoring the return value.
My question is, should I rely on the side effect and do something like this instead.
void InsertReport(Report report)
{
db.Reports.InsertOnSubmit(report);
db.SubmitChanges();
}
or should we prevent it
int InsertReport(Report report)
{
var newReport = report.Clone();
db.Reports.InsertOnSubmit(newReport);
db.SubmitChanges();
return newReport.ID;
}
maybe even
Report InsertReport(Report report)
{
var newReport = report.Clone();
db.Reports.InsertOnSubmit(newReport);
db.SubmitChanges();
return newReport;
}
This question was raised when we created a unit test and found that its not really clear that the report parameters ID property gets updated and that to mock the side effect behavior felt wrong, a code smell if you will.
1
Yes, it’s OK, and fairly common. It can be non-obvious though, as you’ve discovered.
In general, I tend to have persistence-type methods return the updated instance of the object. That is:
Report InsertReport(Report report)
{
db.Reports.InsertOnSubmit(report);
db.SubmitChanges();
return report;
}
Yes, you’re returning the same object as you passed in, but it makes the API clearer. There’s no need for the clone – if anything that will cause confusion if, as in the your original code, the caller continues to use the object they passed in.
Another option is to use a DTO
Report InsertReport(ReportDTO dto)
{
var newReport = Report.Create(dto);
db.Reports.InsertOnSubmit(newReport);
db.SubmitChanges();
return newReport;
}
That way the API is very obvious, and the caller can’t accidentally try and use the passed in/modified object. Depending on what your code is doing, it can be a bit of a pain though.
2
IMO this is a rare instance where the side effect of change is desirable – because your Report entity HAS an ID it could already be assumed to have the concerns of a DTO, and arguably there is the ORM obligation to ensure that the in-memory Report entity is kept in synch with the database representation object.
1
Usually programmers expect that only an object’s instance methods can change its state. In other words, I’m not surprised if report.insert()
changes the report’s id, and it’s easy to check. It’s not easy to wonder for every method in the whole application if it changes the report’s id or not.
I would also argue that perhaps ID
shouldn’t even belong to Report
at all. Since it doesn’t contain a valid id for so long, you really have two different objects before and after insertion, with different behavior. The “before” object can be inserted, but it can’t be retrieved, updated, or deleted. The “after” object is the exact opposite. One has an id and the other doesn’t. The way they are displayed may be different. The lists they appear in may be different. Associated user permissions may be different. They are both “reports” in the English sense of the word, but they are very different.
On the other hand, your code may be simple enough that one object will suffice, but it’s something to consider if your code is peppered with if (validId) {...} else {...}
.
The problem is that without documentation, it’s not clear at all what the method is doing and especially why is it returning an integer.
The easiest solution would be to use a different name for your method. Something like:
int GenerateIdAndInsert(Report report)
Still, this is not unambiguous enough: if, like in C#, the instance of the report
object is passed by reference, it would be difficult to know if the original report
object was modified, or if the method cloned it and modified only the clone. If you choose to modify the original object, it would be better to name the method:
void ChangeIdAndInsert(Report report)
A more complicated (and maybe less optimal) solution is to heavily refactor the code. What about:
using (var transaction = new TransactionScope())
{
var id = this.Data.GenerateReportId(); // We need to find an available ID...
this.Data.AddReportWithId(id, report); // ... and use this ID to insert a report.
transaction.Complete();
}
No it is not OK! It is suitable for a procedure to modify a parameter only in procedural languages, where there is no other way; in OOP languages call the changing method on the object, in this case on the report (something like report.generateNewId()).
In this case your method does 2 things, hence breaks SRP: it inserts a record in the db and it generates a new id. The caller is not able to know that your method also generates a new id since it is simply called insertRecord().
4