As I understand fluent domain-specific languages, I can use method chaining to have a conversation with the code. For example, if the business requirement is to call the database in order to “Get all customers with InActive accounts”, I could use:
Customers().WithInActiveAccount()
Customers has millions of rows. Pulling all customers into memory is inefficient when I only need a subset of them (and may not even be possible given memory constraints). I suspect ORMs solve this problem by treating code as data, lazy-loading and building one complete query from the entire expression. So the final query may be:
SELECT * FROM CUSTOMERS WHERE InActive = true
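A minimal sketch of how that deferred query building might look, written in Python with the standard-library sqlite3 module. The `CustomerQuery` class, its method names, and the table columns (`Id`, `Name`, `BirthDate`) are assumptions made up for illustration, not any real ORM's API. The second filter, `born_after`, is included only to show that predicates accumulate into a single statement.

```python
import sqlite3
from datetime import date


class CustomerQuery:
    """Hypothetical fluent query: each call records a filter, nothing runs yet."""

    def __init__(self, conn, clauses=(), params=()):
        self._conn = conn
        self._clauses = tuple(clauses)
        self._params = tuple(params)

    def _where(self, clause, *params):
        # Return a new query object so a partially built chain can be reused.
        return CustomerQuery(
            self._conn, self._clauses + (clause,), self._params + params
        )

    def with_inactive_account(self):
        return self._where("InActive = ?", 1)

    def born_after(self, day):
        # Dates are stored as ISO strings here, so string comparison orders them.
        return self._where("BirthDate > ?", day.isoformat())

    def to_sql(self):
        sql = "SELECT Id, Name FROM Customers"
        if self._clauses:
            sql += " WHERE " + " AND ".join(self._clauses)
        return sql, self._params

    def __iter__(self):
        # The database is only touched here, with the complete statement.
        sql, params = self.to_sql()
        return iter(self._conn.execute(sql, params).fetchall())


# Tiny in-memory table standing in for the real Customers table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers "
    "(Id INTEGER PRIMARY KEY, Name TEXT, InActive INTEGER, BirthDate TEXT)"
)
conn.executemany(
    "INSERT INTO Customers (Name, InActive, BirthDate) VALUES (?, ?, ?)",
    [("Ann", 1, "1995-03-02"), ("Bob", 1, "1980-07-15"), ("Cy", 0, "1999-01-01")],
)

query = CustomerQuery(conn).with_inactive_account().born_after(date(1990, 10, 1))
sql, params = query.to_sql()  # inspect the statement before anything executes
rows = list(query)            # only now is the query sent to the database
```

The filtering happens in the database, so only the matching rows ever reach application memory. Whether the generated statement is efficient on a heavily normalized schema is a separate matter that this sketch does not address.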
IME, when dealing with highly normalized tables, ORMs produce inefficient DB queries. Rolling yet another custom ORM to solve such an issue feels like a death march waiting to happen. Stored procedures written by a DB professional, on the other hand, are going to be efficient.
In this simple case I can just change Customers to an object:
Customers.WithInactiveAccount()
What if I need to do something more complex?
Customers.WithInactiveAccount().BornAfter(October 1, 1990)
How do I efficiently build up queries as I build more advanced expressions that potentially draw in other entities? This is a question I’m sure every ORM author asks right in the early stages of development. Do I have to limit myself to “dumb queries” to maintain performance? Is this a technique that exists?
These are the types of questions I find myself getting from developers like me who have experienced across-the-board performance problems with ORMs in the big data world.
So when dealing with these types of normalized databases, is a fluent DSL a practical option? (I’m assuming a fluent DSL for DB access requires an underlying ORM to function.)
First, let’s clarify terms a little…
The term DSL is enormously wide. SQL, HTML, LOGO, and Mathematica are all DSLs. You are talking about querying your data model according to its actual structure in a strongly typed manner.
Fluent means method chaining so your source looks more like English and less like a programming language, like so: Noun().Adjective().Verb().Adverb(). This is not the only or even the best way to form queries.
Big data usually refers to data that cannot be efficiently stored and queried using an RDBMS. This means big data and “normalized” are mostly mutually exclusive.
Now, regarding your question. First of all, I’m answering based on several years of experience using C#, F#, some C++, and some Java, with NHibernate, MS-SQL, PostgreSQL, some MongoDB, and some Hadoop, mostly on pretty big data sets.
- “Fluent” is a bad idea. It’s usually harder to write, and tends to be misleading to the reader. It’s also a lot less “discoverable”: you need to learn an entire vocabulary to use and understand a given “fluent” API.
- Using an ORM (NHibernate, Hibernate, Entity Framework) is better than manipulating data by yourself. This is not always true, and you should always test, optimize, and understand what your ORM is doing and why. This involves a pretty significant learning curve: you need to understand your ORM, how to create a correct mapping, and how to control the way queries are generated. On the other hand, if you know what you are doing, about 98% of the time using an ORM is the fastest way to create the best and most performant solution with the least effort. The other ~2% of the time you end up going to the DBA, you write a stored procedure or some SQL, and you use it from within the ORM.
- You should have a proper DAL (data access layer) handling data manipulation. Using an ORM doesn’t remove the need to build a DAL.
- Writing queries and manipulating data in your programming language, in a strongly typed way, is a great idea. It’s fast, verified by the compiler, and very convenient. C# has a special feature called LINQ that enables querying various data sources, including C#’s collections, XML, RDBMSes, OData sources, many other structured and unstructured sources, real big-data stores (MongoDB, Cassandra(?), Hadoop), and ORMs such as NHibernate and Entity Framework. Hibernate/NHibernate also have a “fluent” query language called Criteria, and a special non-strongly-typed (string-based) language called HQL. NHibernate’s LINQ provider also has some limitations. Usually the strongly typed options are preferable, but it’s still very important to understand them thoroughly.
- You seem to not “believe” in ORMs. I think this comes from not being familiar with them and lacking experience working with them. I assure you all of the questions you are asking have been considered and addressed by some of the best developers in the industry.
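The points above about keeping a DAL and dropping to hand-written SQL for the rare hot path can be sketched roughly as follows. This is Python with the standard-library sqlite3 module rather than C# with NHibernate, purely so the example runs on its own. The `CustomerRepository` class, the `Orders` table, and all column names are hypothetical, and the SQL constant merely stands in for a stored procedure a DBA might supply (SQLite has no stored procedures).

```python
import sqlite3


class CustomerRepository:
    """Hypothetical data access layer: callers never see how a query is produced."""

    # Hand-tuned SQL standing in for a DBA-written stored procedure.
    _INACTIVE_WITH_OPEN_ORDERS = """
        SELECT c.Id, c.Name, COUNT(o.Id) AS OpenOrders
        FROM Customers AS c
        JOIN Orders AS o ON o.CustomerId = c.Id
        WHERE c.InActive = 1 AND o.Status = 'open'
        GROUP BY c.Id, c.Name
        ORDER BY c.Id
    """

    def __init__(self, conn):
        self._conn = conn

    def inactive(self):
        # Simple case: the kind of query a generated statement handles fine.
        return self._conn.execute(
            "SELECT Id, Name FROM Customers WHERE InActive = 1 ORDER BY Id"
        ).fetchall()

    def inactive_with_open_orders(self):
        # Rare hot path: one hand-written statement, hidden behind the same API.
        return self._conn.execute(self._INACTIVE_WITH_OPEN_ORDERS).fetchall()


# Tiny in-memory schema for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT, InActive INTEGER)")
conn.execute("CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO Customers (Id, Name, InActive) VALUES (?, ?, ?)",
    [(1, "Ann", 1), (2, "Bob", 1), (3, "Cy", 0)],
)
conn.executemany(
    "INSERT INTO Orders (CustomerId, Status) VALUES (?, ?)",
    [(1, "open"), (1, "open"), (1, "closed"), (2, "closed"), (3, "open")],
)

repo = CustomerRepository(conn)
simple = repo.inactive()
tuned = repo.inactive_with_open_orders()
```

The design choice being illustrated is that application code calls the repository method either way, so swapping a generated query for hand-written SQL later does not ripple through the callers.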