This is the type of question that one would probably get in an exam, but now it happens to be in the real world. 🙂 I would appreciate some advice regarding the data structure/algorithm that can be used to solve the following scenario.
A bit of background:
I have a controlling program that reads C# source code from a database, creates C# source files and then compiles the files into a dll. It then uses this dll to run calculations on other data obtained from the database, and writes the results of the calculations back to the database.
Let us call the controlling program, the Engine, and the source files, the Templates. The Engine compiles the Templates into a partial TemplateCalculations class so that each Template corresponds to a method of that class. The Templates contain code that will modify the Elements. The Elements are the objects on which the calculations will be performed.
But now, the actual problem:
Currently, the Engine uses a list in the database to determine the order in which the Templates should be executed. The Engine creates a Controller class which manages the calls to the methods in the correct order. The Templates contain many SetValue(elementName, elementValue) and GetValue(elementName) calls. Obviously, a Template shouldn't try to get an element's value before that value has been set.
Unfortunately, this list in the database that contains the template sequence is sometimes messed up by an ignorant person, and then the calculations don’t go through correctly.
The SetValue and GetValue calls for a particular element might or might not be in the same Template file (or method of TemplateCalculations once it is compiled).
Assume we have something like the following:
Template A: Engine.SetValue("Element2", Engine.GetValue("Element1"))
Template B: Engine.SetValue("Element1", 1)
then the present solution would have the list in the database specify that Template B must execute before Template A.
Also, the underlying data structure for the elements and values during the execution of the calculations is a Dictionary.
I have the source files available in the Engine, as well as the dll that actually does the execution of the calculation. That is, I first start up the Engine, and then the Engine controls the calculations. There can be more than one SetValue for a particular element.
So, the question is, how would the Engine be able to figure out the order of the templates so that the list isn’t necessary anymore? That is, how can I determine an order for the template files, so that an element’s GetValue is never called before a SetValue for it?
A simpler solution would be to first determine that the order is incorrect, and to pinpoint which element (on which line of code) is causing the trouble. The more advanced solution would be to do away with the list entirely and let the Engine determine the optimal ordering.
I vaguely think that some kind of implementation of a tree might do the trick, but it’s very vague. 🙂
Let me know if you need more explanation. Any suggestions would be welcome.
If you can have multiple assignments (SetValue calls) to one Element, then it is just impossible to automatically generate the correct order of executing the Templates, for the simple reason that there are multiple possible orders that are equally valid.
Assuming you have
Template A: Engine.SetValue("Element2", Engine.GetValue("Element1"))
Template B: Engine.SetValue("Element1", 1)
Template C: Engine.SetValue("Element1", 2)
Without repeating a template, there are 4 equally valid orderings and without knowing the intent of the calculations, it is impossible to tell which one is the correct order. If templates can be invoked repeatedly, the number of possibilities only increases.
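To make the count concrete, here is a minimal sketch that enumerates every ordering of the three example templates and keeps those in which no element is read before it is assigned. The Template record, its Reads/Writes arrays, and the OrderingDemo class are hypothetical names introduced only for this illustration; they are not part of the Engine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // Hypothetical summary of a template: which elements it reads and writes.
    private record Template(string Name, string[] Reads, string[] Writes);

    // Generates all permutations of the given items.
    private static IEnumerable<List<T>> Permutations<T>(List<T> items)
    {
        if (items.Count <= 1)
        {
            yield return new List<T>(items);
            yield break;
        }
        for (int i = 0; i < items.Count; i++)
        {
            var rest = new List<T>(items);
            rest.RemoveAt(i);
            foreach (var tail in Permutations(rest))
            {
                tail.Insert(0, items[i]);
                yield return tail;
            }
        }
    }

    // An order is valid if every element read has been written by an earlier template.
    private static bool IsValid(IEnumerable<Template> order)
    {
        var assigned = new HashSet<string>();
        foreach (var t in order)
        {
            if (t.Reads.Any(r => !assigned.Contains(r))) return false;
            assigned.UnionWith(t.Writes);
        }
        return true;
    }

    // Returns the valid orderings of the example templates A, B and C.
    public static List<string> ValidOrders()
    {
        var templates = new List<Template>
        {
            new Template("A", new[] { "Element1" }, new[] { "Element2" }),
            new Template("B", Array.Empty<string>(), new[] { "Element1" }),
            new Template("C", Array.Empty<string>(), new[] { "Element1" }),
        };
        return Permutations(templates)
            .Where(IsValid)
            .Select(order => string.Join(", ", order.Select(t => t.Name)))
            .ToList();
    }
}
```

Calling OrderingDemo.ValidOrders() yields the four orderings B, A, C; B, C, A; C, A, B; and C, B, A. Each leaves a different final value in Element1 or Element2, which is why the Engine cannot pick one without knowing the intent.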
If you are given an order to execute the templates in, then it is possible to determine if that order is valid (as in, all Elements were assigned before being used). This is the kind of analysis that compilers do to determine if a variable has been initialized before its first use.
If execution time doesn’t really matter (for example, if the calculations are executed automatically at night and you won’t see the results until the next morning anyway) and a partial run can’t harm anything, the easiest solution is to verify in GetValue that the Element has been entered into the dictionary with a value and to throw an exception that aborts all calculations if it isn’t.
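A minimal sketch of that runtime check follows. ElementStore is a hypothetical stand-in for whatever class in the Engine owns the Dictionary; the method names SetValue and GetValue come from the question, but their real signatures are an assumption here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Engine's element storage.
public class ElementStore
{
    private readonly Dictionary<string, object> _values = new Dictionary<string, object>();

    // Assigns (or overwrites) the value of an element.
    public void SetValue(string elementName, object elementValue)
    {
        _values[elementName] = elementValue;
    }

    // Returns the element's value, or throws if it was never assigned.
    public object GetValue(string elementName)
    {
        if (!_values.TryGetValue(elementName, out var value))
        {
            throw new InvalidOperationException(
                $"Element '{elementName}' was read before any SetValue call assigned it.");
        }
        return value;
    }
}
```

The exception message names the offending element, so a failed run at least tells you which Template sequence entry to look at. Whether the Engine should also roll back results already written is a separate decision that this sketch does not address.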
If you need to perform the verification up-front, then there are essentially two options:
- You parse the source code and you walk the parse tree in the order that the Templates will be executed. For each SetValue call you record which Element is being assigned and for each GetValue call you verify that the referenced Element has been assigned. This can be achieved by simply keeping a lookup table of assigned/existing Elements.
- You implement a ‘dry-run’ mode, in which you run the Templates, but prevent all externally visible changes from being propagated/committed. If all such changes are made through calls to the Engine anyway, this might be the easiest solution.
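The lookup-table idea from the first option can be sketched as follows. This assumes the SetValue/GetValue calls have already been extracted from each Template (for example by a C# parser) and listed in evaluation order; the extraction itself is not shown. ElementAccess, TemplateAccesses and OrderValidator are hypothetical names, and the sketch ignores branches and loops, so it treats every call as if it always executes.

```csharp
using System;
using System.Collections.Generic;

public enum AccessKind { Get, Set }

// One SetValue/GetValue call found in a template, with its source line.
public record ElementAccess(AccessKind Kind, string Element, int Line);

// All calls of one template, listed in the order they would be evaluated.
public record TemplateAccesses(string Template, IReadOnlyList<ElementAccess> Accesses);

public static class OrderValidator
{
    // Walks the templates in execution order and reports every GetValue
    // that occurs before any SetValue for the same element.
    public static List<string> FindUnassignedReads(IEnumerable<TemplateAccesses> executionOrder)
    {
        var assigned = new HashSet<string>();   // lookup table of assigned elements
        var problems = new List<string>();
        foreach (var template in executionOrder)
        {
            foreach (var access in template.Accesses)
            {
                if (access.Kind == AccessKind.Set)
                {
                    assigned.Add(access.Element);
                }
                else if (!assigned.Contains(access.Element))
                {
                    problems.Add(
                        $"{template.Template}, line {access.Line}: " +
                        $"GetValue(\"{access.Element}\") before any SetValue");
                }
            }
        }
        return problems;
    }
}
```

For a nested call such as Engine.SetValue("Element2", Engine.GetValue("Element1")), the GetValue argument is evaluated first, so the Get access should precede the Set access in the list. With Template A and Template B from the question, the order A, B reports one problem for Element1, while the order B, A reports none.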