Coming back to SQL Server after several years away, I find the Visual Studio Database projects (and now SSDT) very exciting, as they are aiming to offer a way to develop databases declaratively, rather than having to mess around with change scripts.
However, the elephant in the room is the ‘data motions’ (I’m not too sure whether that’s an official term): it’s not too hard to automatically script a simple change in one table, but if you’re normalising/denormalising, things get hairy very quickly.
Gert Drapers said as much in this TechEd video, and that they’re trying to work out a way of working with these complex data migrations too.
The question: This got me thinking. How could you do this? Is there a way – even in theory – to define these data migrations declaratively? Or even a non-declarative way which could be combined elegantly with the declarative bits? Does anything like this already exist in other frameworks (I’ve vaguely heard something about Ruby/Rails?)
(I wondered whether to post this on dba, but I’m not sure whether they like ‘good subjective questions (if this is one), feel free to migrate if necessary)
6
After searching a little bit, I found this patent reference from 2002 for a “Declarative data transformation engine”:
http://www.google.com/patents/US20070220022
(Disclaimer: that kind of patents are the reason why I think the american software patent system is totally flawed and needs to be changed).
However, I don’t know if someone made a product based on this patent for the usage you described above. In theory, that should be possible, by performing those steps:
- create a new database schema as a copy of your existing one (version 1)
- apply schema transformations to transform it into “version 2”
- apply data migration from database version 1 into database version 2
- replace database version 1 by version 2
If this is a good idea to solve your “data movement” problem this way, depends a lot on your circumstances and your database system, since it involves making a whole copy instead of applying the migration in-place. Where it works, you can solve step 3 by any “Extract-Transform-Load” tool available on the market. AFAIK, tools like Microsoft SSIS or Oracle ODI seem to offer declarative definitions of transformation rules as well as non-declarative.
0
If you’re using Entity Framework Code first to manage your database schema, there is a migrations api (similar to Ruby on Rails’ migrations). Although you are using C#, you essentially “declare” the migrations in code. And the framework handles the mechanics for you. In many cases, just the act of creating a new migration snapshot is enough for the system to automatically discover and make the migration work for you. Here are some references that discuss how it works
Visual Studio Magazine Article
ADO.NET Team Blog
Well, things don’t look any better 10 years after the question was asked.
The obstacle to this kind of automatic transformation between table schemas, is that tables of data consist of many properties that aren’t currently described (let alone enforced) as part of the schema, and in some cases the change in schema may in fact force these properties to change (requiring high-level decisions by the programmer about how the data should properly change).
So even simple changes would therefore require the properties of the data to be described in considerable further detail, and some (perhaps all but the simplest) “motions” might require the programmer to explicitly specify their own decisions. The additional complexity of implementing and learning this technology could well be worse than the time taken to just write ad-hoc change scripts (which state exactly how the data should transform, but does not attempt to describe why that transformation method is appropriate).