I’ve worked in some projects where most of the business logic was implemented on the database (mostly through stored procedures). On the other side, I’ve heard from some fellow programmers that this is a bad practice (“Databases are there to store data. Applications are there to do the rest”).
Which of these approaches is the generally better?
The pros of implementing business logic in the DB I can think of are:
- Centralization of business logic;
- Independency of application type, programming language, OS, etc;
- Databases are less prone to technology migration or big refactorings (AFAIK);
- No rework on application technology migration (e.g.: .NET to Java, Perl to Python, etc).
The cons:
- SQL is less productive and more complex for business logic programming, due to the lack of libraries and language constructs the most application-oriented languages offer;
- More difficult (if possible at all) code reuse through libraries;
- Less productive IDEs.
Note: The databases I’m talking about are relational, popular databases like SQL Server, Oracle, MySql etc.
Thanks!
9
Business logic doesn’t go into the database
If we’re talking about multi-tier applications, it seems pretty clear that business logic, the kind of intelligence that runs a particular enterprise, belongs in the Business Logic Layer, not in the Data Access Layer.
Databases do a few things really well:
- They store and retrieve data
- They establish and enforce relationships between different data entities
- They provide the means to query the data for answers
- They provide performance optimizations.
- They provide access control
Now, of course, you can codify all sorts of things in a database that pertain to your business concerns, things like tax rates, discounts, operation codes, categories and so forth. But the business action that is taken on that data is not generally coded into the database, for all sorts of reasons already mentioned by others, although an action can be chosen in the database and executed elsewhere.
And of course, there may be things that are performed in a database for performance and other reasons:
- Closing out an accounting period
- Number crunching
- Nightly batch processes
- Fail-over
Naturally, nothing is engraved in stone. Stored Procedures are suitable for a wide array of tasks simply because they live on the database server and have certain strengths and advantages.
Stored Procedures Everywhere?
There’s a certain allure to coding all of your data storage, management and retrieval tasks in stored procedures, and simply consuming the resulting data services. You certainly would benefit from the maximum possible performance and security optimizations that the database server could provide, and that’s no small thing.
But what do you risk?
- Vendor lock-in
- The need for developers with special skill sets
- Spartan programming tools, overall
- Extremely tight software coupling
- No separation of concerns
And of course, if you need a web service (which is probably where this is all heading, anyway), you’re still going to have to build that.
So what is typical practice?
I would say that a typical, modern approach is to use an Object-Relational Mapper (such as Entity Framework) to create classes that model your tables. You can then speak to your database through a repository that returns collections of objects, a situation that is very familiar to any competent software developer. The ORM dynamically generates SQL corresponding to your data model and the information requested, which the database server then processes to return query results.
How well does this work? Very well, and much more rapidly than writing stored procedures and views. This generally covers about 80% of your data access requirements, mostly CRUD. What covers the other 20%? You guessed it: stored procedures, which all of the major ORMs support directly.
Can you write a code generator that does the same thing as an ORM, but with stored procedures? Sure you can. But ORMs are generally vendor-independent, well-understood by everyone, and better supported.
19
I am a strong believer in keeping business logic out of the database as much as possible. However, as my company’s performance developer, I appreciate that sometimes it’s necessary to achieve good performance. But I think it is necessary far less often than people claim.
I dispute your pros and cons.
You claim that it centralizes your business logic. On the contrary, I think it decentralizes it. In a product that I currently work on, we use stored procedure for a lot of our business logic. Many of our performance issues come from calling functions repeatedly. For instance
select <whatever>
from group g
where fn_invoker_has_access_to_group(g.group_id)
Problem with this approach is that it generally (there may be cases where this is false) forces the database to run your function N times, once per row. Sometimes that function is expensive. Some databases support function indexes. But you can’t index every possible function against every possible input. Or can you?
A common solution to the above problem is to extract the logic from the function and merge it into the query. Now you have broken encapsulation and duplicated logic.
Another issue I see is calling stored procedures in a loop because there is no way to join or intersect stored proc result sets.
declare some_cursor
while some_cursor has rows
exec some_other_proc
end
If you pull the code from the nested proc out, then you again decentralize. Therefore, you are forced to choose between encapsulation and performance.
In general, I find that databases are bad at:
- Computation
- Iteration (they are optimized for set operations)
- Load balancing
- Parsing
Databases are good at:
- Locking and unlocking
- Maintaining data and their relationships
- Ensuring integrity
By taking expensive operations like loops and string parsing and keeping them in your app tier, you can horizontally scale your application to get better performance. Adding multiple app servers behind a load balancer is usually far cheaper than setting up database replication.
You are correct, however, that it decouples your business logic from your application’s programming language, but I don’t see why that is an advantage. If you have a Java app, then you have a Java app. Converting a bunch of Java code into stored procedures doesn’t change the fact that you have a Java app.
My preference is to keep database code focused on persistence. How do you create a new widget? You must insert into 3 tables and they must be in a transaction. That belongs in a stored procedure.
Defining what can be done to a widget and the business rules for finding widgets belongs in your application.
5
I have worked in 2 different companies that had different vision on the subject.
My personal suggestion would be to use Stored Procedures when execution time is important (performance). Since Stored Procedure are compiled, if you have a complex logic to query the data, it’s better to keep that on the database itself. Also, it will only send the final data to your program at the end.
Otherwise, I think the logic of a program should always be in the software itself. Why? Because a program needs to be testable and I don’t think there is an easy way to unit test stored procedure. Don’t forget, a program that is not tested is a bad program.
So use Stored Procedure with caution, when it’s needed.
10
There’s a middle ground that you need to find. I’ve seen scary projects where the programmers use the database as nothing more than an overpriced key/value store. I’ve seen others where the programmers fail to use foreign keys & indices. On the other end of the spectrum, I’ve seen projects where most if not all of the business logic is implemented in database code.
As you’ve noted, T-SQL (or its equivalent in other popular RDBMSs) is not exactly the best place to be coding complex business logic.
I try to build a reasonably decent data model, use features of the database to protect my assumptions about that model (i.e., FKs and constraints), and use database code sparingly. The database code is useful when you need something (i.e., a sum) that the database is very good at doing and can spare you from moving a zillion records over the wire when you don’t need them.
6
If your business logic involves set operations, most likely a good place for it is in the database because database systems are really good at performing set operations.
http://en.wikipedia.org/wiki/Set_operations_(SQL)
If the business logic involves some sort of calculation it probably belongs outside of the database/store procedure since databases are not really designed for looping and calculating.
Although these are not hard and fast rules, its a good starting point.
0
There is no one right answer to this. It depends on what you use the database for. In an enterprise application, you need the logic in the database through foreign keys, constraints, triggers, etc. because it is the only place where all possible applications share code. Further, putting the required logic in code generally means the database is inconsistent and the data is of poor quality. That may seem trivial to an application devloper who is only concerend with how the GUI works, but I assure you that the people trying to use the data in compliance reports find it very annoying and costly when they get billion dollar fines for having data that didn’t follow the rules correctly.
In a non-regulatory environment when you don’t care as much about the whole set of records and only one or two applications hit the database, maybe you can get away with keeping it all in the application.
After a few years, the question is still important…
Simple rule-of-thumb for me: if it’s a logical constraint or an ubiquitous expression (single statement), place it in the database (yes, foreign keys and check constraints are business logic, too!). If it’s procedural, by containing loops and conditional branches (and really can’t be changed into an expression), put it in code.
Avoid trash dump DBs
Attempts to place really all business logic in application code will likely degenerate the (relational) database into a trash dump, where relational design is mostly to completely omitted, where data can have any inconsistent state, and normalization is missing (often mainly XML, JSON, CSV etc. trashbin columns).
This kind of application-only logic is probably one of the main reasons for the rise of NoSQL – of course with the downside that the application has to take care of all the logic itself, what has been built into relational DB for decades. However, NoSQL databases are more suitable for this kind of data handling, for example, data documents maintain an implicit “relational integrity” within themselves. For relational DBs, it’s simply abuse, causing ever more trouble.
Expressions (set-based) instead of procedural code
In best case, every data query or operation should be coded as an expression, rather than procedural code. A great support for this is when programming languages support expressions, such as LINQ in the .NET world (unfortunately, only queries currently, no manipulation). On the relational DB side, it has been taught for a long time, to prefer SQL statement expressions over procedural cursor loops. So the DB can optimize, do the operation parallel, or whatever may be useful.
Utilize DB data integrity mechanisms
When it comes to RDBMS with Foreign Key and Check constraints, calculated columns, possibly triggers and views, this is the place to store the basic business logic in the database. Proper normalization helps to maintain data integrity, to ensure a unique and distinct instance of the data. Even if you have to duplicate it in code and DB, these basic mechanisms of data integrity should not be omitted!
Stored Procedures?
Stored Procedures are rarely necessary nowadays, since databases keep compiled execution plans for SQL and reuse them when the same query comes again, only with different paramters. So the precompile argument for SPs is no longer valid. One can store or auto-generate SQL queries in the application or ORM, which will find precompiled query plans most of the time. SQL is an expression language, as long as you don’t explicitly use procedural elements. So, in best case, you use code expressions which can be translated into SQL.
While application side, including ORM generated, SQL, is no longer inside the database, unlike Stored Procedures, I still count it as database code. Because it still requires SQL and database knowledge (except the most simple CRUD), and, if applied properly, works greatly different than the procedural code usually created with programming languages like C# or Java.
1
It realy depends on the business, its culture and legacy. Technical considerations aside (these have been covered from both sides), the answers given tells you that it comes down to where people are coming from. In some organizations, data is king and the DBA is a powerful figure. This is your typical centralized environment, a data center with a bunch of terminals attached to it. The preferrence in this type of environment is clear. The desktop may change radically many times before anything changes in the data center and there will be little in between.
The other end of the spectrum is the pure 3-tier architecture. Or maybe multi-tier in a web oriented business. Your will likely hear a different story here. The DBA, if there is any, will be just a sidekick that performs some administrative tasks.
A modern times application developer will have more affinity with the second model. If you grew up with a large client-server system you would likely be in the other camp.
There are often so many non-technical environment related factors involved here, there is no general answer to this question.
The term business logic is open to interpretation. When building systems, we want to ensure the integrity of the database and its contents. As a first step there should be different user access grants in place. As a very simple example, let us consider an ATM application.
To get the account balance, doing a select on an appropriate view should fine. But to transfer funds, you would want the transaction to be encapsulated by a stored procedure. The business logic should not be permitted to directly update the tables for the credit and debit amounts.
In this example, the business logic could check the balance before requesting the transfer or simply invoke the stored proc for the transfer and report the failure. IMHO, the business logic, in this example, should pre-emptively check that sufficient funds are available and that the target account exists and only then invoke the transfer funds. If another debit happens between the initial steps and the stored proc invocation, only then would an error be returned.
1
Usually, what I tried to do was to construct … “a constellation of objects.” Which I would refer to as “‘Things’ with a capital-T.”
A “Thing™” corresponded to something that really existed in the business or at least in the application, and among its many other magic tricks it knew how to persist itself in the database. Sometimes it used an ORM layer for that; at other times it used hand-rolled queries.
If we needed “a collection of Things,” well, “that was just another Thing,” which served as a container.
The underlying database structure still closely followed that of the Things, so that we could exploit the capabilities of the DBMS as much as possible, but only the Things ever used the database directly … at least as a “backing store” but sometimes in other, active ways. The rest of the application dealt only with Things, and never knew (nor cared …) how they actually worked.
1