Note :
- I know we have
Django ORM
already that keeps things database independent and converts to the database specificSQL
queries. - Once things starts getting complicated it is preferred to write
raw SQL
queries for better efficiency. - When you write
raw sql
queries your code gets trapped with the database you are using. - I also understand its important to use the full power of your database that can-not be achieved with the
django orm
alone.
My Question :
- Until I use any database specific feature, why should one be trapped with the database.
- For instance :
We have a query with multiple joins and we decided to write a raw sql
query. Now, that makes my websitepostgres
specific. Even when I
have not used any postgres specific feature.
I feel there should be some fake sql
language which can translate to any database’s sql query. Even Django’s ORM can be built over it. So, that if you go out of ORM but not database specific – you can still remain database independent.
I asked the same question to Jacob Kaplan Moss
(In person) :
- He advised me to stay with the database that I like and endure its whole power, to which I agree. But my point was not that we should be
database independent
. - My point is we should be database independent until we use a database specific feature.
Please explain, why should be there a fake sql
layer over the actual sql ?
================================
Update:
================================
My suggestion:
Solve the independence of the database on a fake sql
level and then build ORM over that fake sql. So, if I have to write a sql query – I will use fake sql
that would work on all databases yet raw SQL. That way, I will remain database independent unless I use a feature very specific to my database.
Therefore, a query like select * from table limit 10;
will work on both postgres
and ‘MS-SQL Server’.
Is that sensible ?
Note:
- I am not talking about
ANSI SQL
, I am suggesting afake sql
to make all databases work with 1 standard sql. [ Which may not cover db specific features ]
6
I think you’re looking for the SQL ANSI standard.
This standard is implemented in most RDBMS for the biggest part.
However, every RDBMS decided to do some things their way. (Simply because the standard didn’t have the feature or another reason. LIMIT
has an equivalent since SQL:2008 only for example.)
Listing these differences from the SQL standard would take a whole webpage to do it. Oh wait, it was done!
So to answer you, for this very reason, you can’t be independant from RDBMS without losing performance (i.e. building a layer). The differences in the SQL standard implementations are what brought people to build ORMs.
5
The YAGNI principle is important in this case.
When deciding on the topic, you should consider the advantages and disadvantages of being tied to a specific database.
It’s not like there are 1000s of different databases out there. For relational databases, popular choices are Postgres and Mysql. And their differences and distinct characteristics are well known, so it’s easy to compare them beforehand for the problem you are going to solve.
If you are in control of the production environment, study the differences between them and choose one.
If you are not in the control of the production environment, like if you are going to distribute the software to end users, and they will install them on their own platforms, then you should aim for the database independence.
Unless you really have a compelling reason to be database independent, it will be a hassle and bite you in the long term if you aim to do so. General advice for this kind of situations is don’t do things just for the sake of doing it.
2
Python database API use a common convention, that is pretty much the same across different databases (but not quite!). You can read the MySQLdb documentation here.
There is also a more feature-rich interface to mysql, called oursql. It has real parametrization (not just glorified string interpolation), server-side cursors, data streaming and so on.
1