I am seeking the right model for storing and retrieving data, when working with any specific class, while keeping in mind the bigger picture.
Details:
I have some SQL code in (one) of my classes and I would like to use Separation of Concerns (SoC) to remove dealing with direct-database-specific code in that class.
Simplest thing will be to create DBCodeForOneSpecificClass
class, but thinking globally, I may end up with too many classes – I think creating a DB handling class for each Business logic is wasteful.
So I need to find a way to consolidate those DB-access functions properly.
I recall there was an advice in a book on refactoring SQL specifically and it was “pull all SQL code in your entire codebase and consolidate it in a single space, away from all other non-direct-SQL-dealing classes”. I do not recall specifics. I presume all SQL-code was to go into its own separate folder, where perhaps there was some kind of strategy or philosophy to keep a happy medium
between the extremes of
- create a single separate db-handling class for all classes in your entire codebase
- create a single separate db-handling class for each
BusinessLogic
class
My question is — can you help fill some specifics into the space between the extremes — what are some heuristics / models / ways I could use to make a decision on how to structure my DB-handling classes, so that I pretty much immediately know what to do, when handling a question of “Where do I place my db-handling code I currently find myself writing in this BusinessLogicClass
?”
Because right now I seem to solve this question on a case-by-case basis, either having SQL right inside the class, or in a separate class made just for one class, or grouping SQL by an aspect. In the end, ending up with various models mixed up, and that is not good. I am thinking that maybe may need help defining what an aspect is, or identifying clearer boundaries of what my domain objects are, which perhaps means identifying larger moving parts of the codebase, and then structuring my DB-interaction classes around those parts.
My satisfaction with the answers will depend on whether I am able to get closer to answering question of “Where do I put my d–n SQL while writing this specific new feature in this specif class…?”, while supporting SoC principle, and without being wasteful (too specific) or creating a big ball of mud (too general).
1
In general, there are several ways to handle CRUD-operations for db-persisted entities.
Active Record vs Data Mapper
An Active Record is the object-representation of a database row that contains domain logic and persistence logic. It’s the simplest form – and to avoid duplicated, boilerplate code you can use abstract base classes, traits/mixins or modular composition to define basic db-interactions in one place and have every active record use the methods provided. This is the wikipedia-entry on the active record pattern.
This pattern comes in handy because it’s quick and straightforward to implement. But it has some drawbacks. As you noted, it’s note separating concerns as well as alternatives and therefore is somewhat harder to reason about. When dealing only with basic objects for table-rows, containing only scalar values, and when done efficiently – the increase in conceptual complexity can be negligible.
This is one huge drawback, though – most interesting business-logic involves composite objects, for example sales documents with a header, lines and relational constraints (foreign keys). Unless you employ database-technology that can simply map both non-scalar values and relations, that makes relying on the active record-pattern for persistence-logic problematic… and even with such technology it has potential to be cumbersome.
If you are working with a relational database constrained to scalar values (for relevant intents and purposes) in the persistence of your domain objects, you need *some form of ORM – Object-Relational-Mapping. This can be achieved with the easy-to-use, easy-to-implement active record-pattern. In the PHP-world, Laravel‘s Eloquent ORM-tool does just that.
The main contender is the Data Mapper-pattern (Wiki), where mapping is handled by separate classes, like Repositories that can return individual objects, collections, or even composite objects – through repositories using other repositories, or by specifying operations, including joins on the basic stored data in the composition of objects in a single repository. This pattern is used by the doctrine ORM-tool (PHP, again).
The key to making this idea work elegantly and with minimal repeated effort is to maintain meta-data about the object and its relation to columns and rows in database tables explicit – including keys, field types and mapping to object properties. It’s possible to automate some parts or even the entirety of the production of code for such meta-data containing entities in database procedures or a specific application object, for example by using SHOW CREATE TABLE
results – and/or reflection on classes.
But it’s also possible to build your own, domain-specific, lightweight version of such an ORM, too. I have implement the data-mapper pattern in one of my own projects. Using SHOW CREATE TABLE
, results, I can quickly construct meta-data classes for every domain-class. Using abstract base classes or traits (in fact, I wrote two versions), I can quickly implement a repository for every domain-class that uses the meta-data class, implements persistence, relational-access and the identity map pattern in virtue of keeping a cache of fetched db-rows and being a single instance used in the construction of all classes needing db-access for the respective domain-object. In turn, all repositories are constructed with a basic wrapper on the PDO-Class, that allows for easy logging and implementation of the observer pattern (for example to keep dependent foreign entities current).
Though the initial effort for setup and configuration is certainly higher than with active records, the increased separation of concerns and flexibility is mostly worth it when your project exceeds a certain level of complexity.
A related concern – the Unit of Work:
In many cases, some interactions will affect multiple entities, including domain-entities with regard to their persisted data. Often, such access has to have transactional integrity – meaning that while a modification-cascade is unfinished, modifying operations on the involved entities must not be permitted. A way to handle this – and an important tool in any case, is the Unit of Work-pattern. On a basic level, the Unit of Work provides a way to keep track of which entities are ‘clean’ or ‘dirty’ and which are scheduled for insert, update, delete, modify.
Additional sources:
- MSDN – Peristence Patterns
- Sitepoint – Implementing the Unit of Work pattern
- Catalog of Patterns of Enterprise Application Architecture
- doctrine
- Eloquent
You could use the Template Method Pattern. Here is how.
Have an abstract class that will be the parent and define the way things are being loaded from the database.
Loading method will be an abstract method returning a generic value.
Connection string will be an abstract property.
Once you extend other classes from this parent class you can then define their own connection string as needed and implement the method for loading by returning their own specific value through the use of generics.
If there is a similarity in all of the subclasses on how things are being loaded from the database just push it up to the parent class.
If there are no similarities then separating them was a good idea in the first place because you are separating concerns and not merging them.
This also allows you to keep your codebase organized when other developers contributing since they would need to extend the parent class which has the loading template to be able to write their own loading implementation.
It also decouples your repository from the rest of the codebase because the rest of your codebase doesn’t have a clue as to which specific subclass is implementing the parent class that knows how to load stuff.
1