I’m struggling with how to improve my queries. Here’s the scenario.
I have to report on bus promptness. I have bus positions, in the table bus_dynamic_history. I have bus journeys, in table bus_journeys. I have bus stops, in bus_timetables.
Table ‘bus_dynamic_history’ has a lot of records of bus positions, recorded as sdo_geometry points, but no indication of whether they’re near a bus stop or not. The table ‘bus_journeys’ joins with ‘bus_dynamic_history’ on ‘operatorcode’ and ‘linename’. Table ‘bus_timetables’ joins with ‘bus_journeys’ on ‘unique_journey’ and ‘runningboard’, and has an sdo_geometry point as a location.
I want to know whether a bus was early or late at a certain stop. So I’m trying to compare each bus position with that of a known bus stop, by querying whether it’s within 5 meters of that stop or not.
Here is the query I’m trying to run:
SELECT b.uniquejourney journey, a.journeyid, a.vehicleid, a.operatorcode, a.linename, c.atcocode, trunc(a.capturetime) cdate,
b.runningboard board, b.starttime, b.endtime, c.atcocode stop, sdo_geom.sdo_distance(a.location, c.location) meters, min(to_char(a.capturetime, 'HH24MI')) ctime
FROM bus_dynamic_history a, bus_journeys b, bus_timetables c
WHERE a.operatorcode = b.operator AND a.linename = b.linename AND b.uniquejourney = c.uniquejourney AND b.runningboard = c.runningboard
AND a.journeyid = b.starttime --AND a.operatorcode = 'ADER' AND linename = '38' --AND b.unique_journey = '11cs8'
AND c.stopflag = 'T'
and sdo_within_distance( c.location, a.location, 'distance = 5' ) = 'TRUE'
AND a.lastupdated BETWEEN '19/05/2024 00:00' AND '22/05/2024 23:59'
GROUP BY b.uniquejourney, a.journeyid, a.vehicleid, a.operatorcode, a.linename, trunc(a.capturetime), b.runningboard, b.starttime, b.endtime, c.atcocode, sdo_geom.sdo_distance(a.location, c.location)
I have indexes on:
bus_dynamic_history(lastupdated)
bus_dynamic_history(operatorcode)
bus_dynamic_history(location)
bus_journeys(uniquejourney)
bus_journeys(starttime)
bus_timetables(uniquejourney)
bus_timetables(location)
Here is the oracle explain plan
PLAN_TABLE_OUTPUT
Plan hash value: 801895608
| Id | Operation | Name |
| 0 | SELECT STATEMENT | |
| 1 | HASH GROUP BY | |
| 2 | FILTER | |
| 3 | HASH JOIN | |
| 4 | HASH JOIN | |
| 5 | TABLE ACCESS FULL | BUS_JOURNEYS |
| 6 | TABLE ACCESS FULL | BUS_TIMETABLES |
| 7 | TABLE ACCESS BY INDEX ROWID BATCHED| BUS_DYNAMIC_HISTORY |
| 8 | INDEX RANGE SCAN | BDH_LASTUPD |
Table ‘bus_dynamic_history’ has 40 million rows, ‘bus_journey’ has about 4500 and ‘bus_timetables’ has about 28000.
This query runs in about 15 minutes, but I need it to be a lot quicker.
I don’t understand why it’s doing a full table access on bus_journeys and bus_timetables.
I was hoping it would do an index scan … can anyone advise me on what I need to do to make it run in a not-crap length of time?
I have tried various indexes and permutations, but it’s not running fast enough. It takes about 15 minutes as it is, but I want it to be faster, and I can’t understand why it’s doing full table scans
Nigel Dams is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.