I have some confusion about how composite indexes are working in oracle.
I have a big table with lot of columns including two main columns: ACTIONDATE (Date) and STATUS(char)
STATUS column can have only 3 values.
I have a composite index (ACTIONDATE,STATUS)- starting with ACTIONDATE.
My query is having a condition of type :
where ACTIONDATE>?A_DATE and STATUS=?A_STATUS
The documentation say that index will stop at first range scan condition (so here on the ACTIONDATE column) and the remaining columns in the index will be “useless”: but what I was thinking that still my STATUS column in the index will be very useless because when oracle is traversing index leaf node using range scan by the ACTIONDATE column, oracle will be able easily to filter rows not satisfying STATUS condition because the index has already that value and there won’t be a need to do a table lookup to check if we should filter or not this record. AM I correct here or what am I missing?
A second related question please about the exact signification of access predicate and filter predicate on a composite filter.
2
You are correct. If the optimizer uses the index on that range scan of actiondate
any predicate on status
will be applied at the index level, even though it cannot seek (binary tree search) on that value. That will reduce the block gets from the table. So it would be incorrect to say that any index column after an inequality column is “useless”.
You will get slightly better performance by switching your index column order to (STATUS,ACTIONDATE)
. Then it will seek on the STATUS
value and seek the starting position for your ACTIONDATE
, then range scan (pass through the leaf nodes) from there, which is fewer leaf blocks to scan than it has to with your current column order. The difference is not in table I/O, but index leaf block I/O, which is usually only a small portion of your total work, especially if those blocks are cached, which they often are. But either order will allow Oracle to satisfy both predicates before hitting the table, thus reducing table I/O, and that’s more important.
Keep in mind that depending on the date range you supply, index use may not be advisable/desirable. A large date range would be better off scanning the table. Usually date range scans on indexes use >
(more recent), not <
(older), and hence index use makes sense because you’re after only a tiny slice of the temporal pie. Typically <
will give you the bulk of a table where a full table scan is better. But if the combination of a particular status and an old date is rare due to business logic (an overdue invoice for example), then an index can be helpful even with <
. But I’d still put the STATUS
column first as long as you always plan to apply an equality filter on it.
1