I have a database with around 15 tables. I am currently working on a report generation page which has 16 search criteria, and the values are distributed among multiple tables. The user can select any number of search criteria.

The problem is that if the user enters 2-3 values, it is better to filter first by a certain value and then apply the next level of filtering to the remaining values, and this order of filtering always depends on the user input. That means I have to check all the possible ways of input permutation and give an execution path for each, which will be horrible. Is there any way I can reduce the number of execution paths? I am not able to come up with an efficient algorithm.
Suppose I have 3 parameters: `name`, `skill`, `programExp`.

If `name` is given, then I should filter based on `name` first. If `skill` is given, then I should filter using `skill`, `programExp` and then `name`.
I’m going to go on the assumption that the query results are the results of a `JOIN` filtered using a `WHERE` clause.
> I have to check all the possible ways of input permutation and give an execution path for each, which will be horrible.
That will be horrible. It sounds like you’re trying to do something any decent database already does for you.
Good databases have a query optimizer whose job is to examine incoming queries and come up with an execution plan for carrying them out. Plans are constructed to consume the fewest resources (CPU, memory, access to storage) while answering queries quickly and correctly. The major factors that go into these decisions are the tables used, `JOIN` conditions, `WHERE` clauses, available indexes and table statistics. For example, if the optimizer sees that one of your `WHERE` clauses can cut the result set of a `JOIN` from a billion rows to, say, a thousand, that condition will be applied earlier rather than later so fewer intermediate results have to be handled.
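To make that concrete, here is a sketch; the `employee` and `employee_skill` tables and their columns are invented for illustration, not taken from the question:

```sql
SELECT e.name, s.skill, e.program_exp
FROM   employee e
JOIN   employee_skill s ON s.employee_id = e.id
WHERE  s.skill = 'Haskell'      -- highly selective: matches few rows
AND    e.program_exp >= 2;      -- barely selective: matches most rows
-- Given an index on employee_skill.skill and up-to-date statistics,
-- the optimizer will typically apply the skill condition first and
-- only then check program_exp against the small surviving set.
```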
What this means for you is that you don’t have to worry about the order in which you apply the conditions. Stack them up in a `WHERE` clause and the optimizer will take care of the rest. Once you’ve made it work and work right, you can worry about making it work fast if you’ve determined that your queries are a bottleneck. The cure will almost universally not be trying to outsmart the optimizer or handling the data yourself; it will be tuning your database so the optimizer has what it needs to do a better job. Databases with optimizers often have an `EXPLAIN` or `SHOW PLAN` feature that will let you see what the optimizer comes up with for a given query, and the parts that cause performance problems will stick out like a sore thumb.
Bottom line: build a query based on the conditions in force and let the database do its job.
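One way to do that in a single static statement, sketched against the same invented schema, is to write each optional condition so it collapses to true when its parameter was not supplied (the `:param` placeholder syntax varies by driver, and some optimizers plan this pattern poorly, so check the plan):

```sql
SELECT e.name, s.skill, e.program_exp
FROM   employee e
JOIN   employee_skill s ON s.employee_id = e.id
WHERE (:name       IS NULL OR e.name        = :name)
AND   (:skill      IS NULL OR s.skill       = :skill)
AND   (:programExp IS NULL OR e.program_exp >= :programExp);
-- Each unsupplied parameter is bound as NULL, turning its
-- condition into a no-op; the optimizer handles the rest.
```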
You should not be coding an “SQL execution path/plan decision tree”.
You need to optimize your query (singular) to perform as well as it can with no criteria at all. Then add a dynamic `WHERE` clause encompassing all the selected search criteria.
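A minimal sketch of that approach (the table, column, and parameter names are again invented; real code should append bound parameters, never string-concatenated user input):

```sql
-- Base query: tune this (indexes, join strategy) so it performs
-- acceptably even when the user selects no criteria at all.
SELECT e.name, s.skill, e.program_exp
FROM   employee e
JOIN   employee_skill s ON s.employee_id = e.id
-- Dynamic part: the application appends one condition per criterion
-- the user actually filled in, e.g. when name and skill were given:
WHERE  e.name  = :name
AND    s.skill = :skill;
```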
As already suggested in a comment, a search index is a better solution than database queries. I can second Thorsten’s suggestion of Apache Solr. You could probably get a working index up in an hour.