Our company implemented a calendar system a few months back with recurring appointments, using iCal strings to store the recurring appointment criteria.
We are starting to run into performance problems because of these.
Our current thinking is something along these lines…
A user requests all appointments for a date range. Typically no more than 30 days.
We grab all non-recurring events from the database that fall in this date range.
Then we grab ALL events that contain an iCal string and run each one through a processor to determine if the event happens to repeat in our requested date range. For example, the user may have the event starting on 1/1/14, repeating every Tuesday and Wednesday ongoing forever. Another event, however, may begin on 1/1/14, repeating every Tuesday and Wednesday for one month and is now outside of our selected range. The first one would repeat to our date range and be included. The second one would not.
As far as I can tell, there really isn’t any good way to determine any kind of filtering so that I don’t have to pull every single one of these events from the database and test them to see if they match our date range.
As the number of recurring events grows, our database is being slogged down with requests that return more and more records, and our server needs to process each of these, most of which aren’t necessary, but we don’t know until we get them to the server and process them.
Has anybody faced this before, and is there a better algorithm that we can implement to limit the number of records returned?
4
In one of Tom Kyte’s books he described a company with this exact problem. They did what you did and the performance was terrible. His recommendation was to store each instance of a recurring appointment in the database up to some point in the future. This solved all their performance problems with appointment searches.
Yes, there may be hundreds of occurrences of some appointments. So what? You only need to store the start times, and that’s only a few kilobytes. And you don’t need to store them in the indefinite future. Two or three years is plenty. When is the last time you wanted to know what appointments you had on some date in 2017?
1
It seems to me that your problem is in your first sentence:
Our company implemented a calendar system a few months back with recurring appointments, using iCal strings to store the recurring appointment criteria.
Having to scan all records to determine date ranges is simply not scalable. In essence, you’re using your database as a flat file, so of course it is slow.
Instead of storing the raw iCal strings, interpret the strings before you put them in the database so that you can run queries like
select * from records
where startdate < "2014-01-30 00:00:00" and enddate > "2014-01-01 00:00:00";
If you really need the iCal string, you can leave that in the DB as well, but things like appointment start/end dates should be db columns so you can efficiently search on them.
4
I would like to offer you my solution for this by recognizing that:
- All appointments, single or re-occurring, have at least a date
- Under reasonable settings, re-occurring appointments often have a pattern
In my solution, you will have for each appointment a start date and an end date. In the case of single appointment, you can leave the end date the same as the start date. In the case of re-occurring appointment into the future, you can leave the end date as empty or any far into the future value.
Now for all appointments, you have an addition column called: type and pattern.
In this column, you can have an enum kind of values specifying the repetition pattern such as daily, weekly, monthly, bi-weekly. For appointments that are not re-occurring, the type can be one-time.
This also deals with the case where an appointment is actually part of a re-occurring one, but was re-scheduled to be out of pattern. You set the type of it into something such as re-schedule
can possibly have an additional reference column to refer to the original re-occurring appointment.
Now when you want to search for all appointments (of any types) to match your date range, you can use the start and end date together with the ‘type and pattern’ column.
A specific example for the case of search for appointments in the range of Oct 1st, 2014 to Dec 1st, 2014 is as follows:
select * from APP as A where
((A.sd >= $start AND A.sd <= $end) OR (A.ed <= $end AND A.ed >= $start) OR (A.sd <= $start AND A.ed >= $end) )
AND (A.type = 'Single' OR A.type = 'ReSchedule' OR inRange($start, $end, A.sd, A.type) )
Here you have to build your own function for inRange()
, the semantic is taking the $end
variable as an integer, minus the integer for A.sd
and divide by the pattern interval. Then you check if $end
minus the division’s mod is greater or equal to $start
InRange()
will slow down your query by looking at all raw matches. But this will be much better than your original design concept. You can either implement InRange()
by using PL/SQL or write an external method using the SQL query’s results to properly filter the raw matches.
The way I see it, the database should only be storing one kind of appointment. That appointment information should contain (amonst other things) start date, repetition pattern and end date.
The db shouldn’t have an appointment for each “recurring” one, just the initial one and its repetition pattern. The application on the other end will pull the appointments relevant to it and calculate the “recurring” ones based on the pattern. This means, of course, if a person wants all appointments that will occur in the next 30 days, they’ll pull all records with a start date less than now+30days and an end date greater than now.
If you really wanted the db to handle this, then perhaps the db could be made to do these calculations and return a temp table with these extra appointments.
Finally, when it comes time to create/update an appointment, the end date (if one is desired) should be populated with the calculated end date (explicitly set by the client end or implicitly calculated by the client end or database function, in the event the pattern specifies a number of occurances or some other pre-calculable criteria).