I am working with an existing application built on Rails 3.1/MySql with much of the work taking place in a JavaScript interface, although the actual platforms are not tremendously relevant here, except in that they give context.
The application is powerful, handles a reasonable amount of data and works well. As the number of customers using it and the complexity of the projects they create increases, however, we are starting to run into a few performance problems. As far as I can tell, the source of these problems is that the data represents a tree and it is very hard for ActiveRecord to deterministically know what data it should be retrieving.
My model has many relationships like this:
Project
has_many Nodes
has_many GlobalConditions
Node
has_one Parent
has_many Nodes
has_many WeightingFactors through NodeFactors
has_many Tags through NodeTags
GlobalCondition
has_many Nodes ( referenced by Id, rather than replicating tree )
WeightingFactor
has_many Nodes through NodeFactors
Tag
has_many Nodes through NodeTags
The whole system has something in the region of thirty types which optionally hang off one or many nodes in the tree.
My question is: What can I do to retrieve and construct this data faster?
Having worked a lot with .Net, if I was in a similar situation there, I would look at building up a Stored Procedure to pull everything out of the database in one go but I would prefer to keep my logic in the application and from what I can tell it would be hard to take the queried data and build ActiveRecord objects from it without losing their integrity, which would cause more problems than it solves.
It has also occurred to me that I could bunch the data up and send some of it across asynchronously, which would not improve performance but would improve the user perception of performance. However if sections of the data appeared after page load that could also be quite confusing.
I am wondering whether it would be a useful strategy to make everything aware of it’s parent project, so that one could pull all the records for that project and then build up the relationships later, but given the ubiquity of complex trees in day to day programming life I wouldn’t be surprised if there were some better design patterns or standard approaches to this type of situation that I am not well versed in.
7
whether it would be a useful strategy to make everything aware of it’s parent project
That is a standard strategy. It is called “denormalization”. Assumed
- your “project elements” never change the project they belong to
- your project’s primary ID key will never change
- the additional memory needed for the IDs is no problem
- it actually solves your performance problems
you should apply it.
Another approach is trying to add some views to your database, encapsulating some complex JOIN operations (I assume your database is properly indexed so the JOIN itself won’t slow down things more than necessary). I never used Rails by mayself, but AFAIK ActiveRecord will work on views. So for example by joining 2 tables of a parent-child relationship together, instead of having a first SELECT SQL to determine a set of parent objects, and a list of SELECT SQLs, each one for the set of child objects of a specific parent ID, you pull all your data from the joined view with just one select SQL. Of course, this will work best when you need the data in joined form and don’t want to write changes back to your DB system.
1