I’m running the query below in my Oracle DB. A function-based index on TRUNC(create_date) exists on the ORDERS table, but the query does not use that index when I run it:
SELECT ITEM, MAX("Orders") AS OP_ORD, MAX("SHIP") AS SHIP
FROM ORDERS
WHERE trunc(create_date) in(
SELECT TRUNC(sysdate)-6-(select to_char(sysdate, 'd') from dual) FROM DUAL
union
SELECT TRUNC(sysdate)-5-(select to_char(sysdate, 'd') from dual) FROM DUAL
union
SELECT TRUNC(sysdate)-4-(select to_char(sysdate, 'd') from dual) FROM DUAL
union
SELECT TRUNC(sysdate)-3-(select to_char(sysdate, 'd') from dual) FROM DUAL
)
GROUP BY ITEM
But if I remove the last branch of the UNION from the subquery, as below, the index is used:
SELECT ITEM, MAX("Orders") AS OP_ORD, MAX("SHIP") AS SHIP
FROM ORDERS
WHERE trunc(create_date) in(
SELECT TRUNC(sysdate)-6-(select to_char(sysdate, 'd') from dual) FROM DUAL
union
SELECT TRUNC(sysdate)-5-(select to_char(sysdate, 'd') from dual) FROM DUAL
union
SELECT TRUNC(sysdate)-4-(select to_char(sysdate, 'd') from dual) FROM DUAL
)
GROUP BY ITEM
Any idea why this is not using the index?
Index use is not always the right choice. Oracle tries to be intelligent, using an index only when it estimates that doing so is faster than a full table scan. A date range is a classic situation where Oracle can easily flip between one access method and the other, depending on the range and on what its statistics say about the minimum and maximum date values.
You’d be better off:
- Replace the TRUNC(create_date) index with a normal index on create_date, then rewrite your date logic (using comparison operators <, >, BETWEEN, etc.) to avoid applying TRUNC to the column, as MT0 demonstrated. Oracle has a very hard time figuring out what to expect once you apply a function to a predicate column like that.
- If Oracle prefers a full table scan and you find that it’s slower than it should be, either update statistics (doesn’t always work) or hint the query:
SELECT /*+ INDEX(orders) */ ITEM, MAX("Orders") AS OP_ORD, MAX("SHIP") AS SHIP FROM ORDERS WHERE create_date . . .
- If the full scan is in fact faster, then let Oracle do the full scan and thank it for being so smart.
- For maximum speed when you have a very large table, range partition the table by create_date with an interval of 1 day, then drop your index. Those full scans will become full scans of only the partitions containing your dates. That’ll be much faster than using indexes for a query like this. You can then add some parallelism to the query to make those full partition scans screaming fast. Chances are you’ll rarely want to use an index on a date column again.
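To make the first and last suggestions concrete, here is a minimal sketch. The index name, the table name orders_by_day, the column types, the starting partition boundary, and the degree of parallelism are all assumptions for illustration; only the column names come from the question. Interval partitioning and parallel query also depend on the database edition and licensed options, so treat this as a starting point rather than a drop-in script.

```sql
-- Option 1: a plain index on the raw column (index name is hypothetical).
CREATE INDEX orders_create_date_ix ON orders (create_date);

-- Option 2: a copy of the table, interval-partitioned by day.
-- Column types are guesses; adjust them to match the real ORDERS table.
CREATE TABLE orders_by_day (
  item        VARCHAR2(100),
  "Orders"    NUMBER,
  "SHIP"      NUMBER,
  create_date DATE NOT NULL
)
PARTITION BY RANGE (create_date)
INTERVAL (NUMTODSINTERVAL(1, 'DAY'))
(
  -- One fixed starting partition is required; Oracle then creates
  -- a new daily partition automatically as rows arrive.
  PARTITION p_start VALUES LESS THAN (DATE '2020-01-01')
);

-- A range predicate on the bare column lets Oracle prune to the
-- matching daily partitions; the PARALLEL hint degree (4) is arbitrary.
SELECT /*+ PARALLEL(o, 4) */
       item,
       MAX("Orders") AS op_ord,
       MAX("SHIP")   AS ship
FROM   orders_by_day o
WHERE  create_date >= TRUNC(SYSDATE, 'IW') - INTERVAL '8' DAY
AND    create_date <  TRUNC(SYSDATE, 'IW') - INTERVAL '4' DAY
GROUP  BY item;
```

Either option keeps TRUNC away from the column itself, which is what allows the optimizer to reason about the date range directly.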
As a general rule when writing SQL, minimize your use of subqueries, especially nested ones, and avoid using TO_CHAR to do date arithmetic. That’s really meant for visual presentation only. Also avoid TRUNC unless it’s really necessary, and then only on the side opposite the column you are accessing.
Assuming that (based on your profile) your NLS_TERRITORY = 'India' and you are trying to select rows from Sunday two weeks ago through to either Tuesday or Wednesday of last week, you can filter the results without using TRUNC on the create_date column or using IN by filtering on a range instead:
SELECT ITEM_ID,
MAX("Orders") AS OP_ORD,
MAX("SHIP") AS SHIP
FROM ORDERS
WHERE create_date >= TRUNC(SYSDATE, 'IW') - INTERVAL '8' DAY
AND create_date < TRUNC(SYSDATE, 'IW') - INTERVAL '4' DAY
GROUP BY ITEM_ID
This could use an index on the underlying create_date column rather than the function-based TRUNC(create_date) index.
(Note: TRUNC(SYSDATE, 'IW') always gives midnight of the most recent Monday, the start of the ISO week, regardless of the user's settings. TO_CHAR(SYSDATE, 'd') gives different results depending on the session's NLS_TERRITORY setting, so two users can get different results from exactly the same query if their NLS_TERRITORY session settings differ.)
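The difference can be seen with a fixed date. The sketch below uses 2024-01-01 (a Monday) and two territories picked purely for illustration: one whose week starts on Sunday and one whose week starts on Monday.

```sql
-- Week starts on Sunday in this territory, so Monday is day 2.
ALTER SESSION SET NLS_TERRITORY = 'AMERICA';
SELECT TO_CHAR(DATE '2024-01-01', 'D') AS day_no FROM dual;  -- '2'

-- Week starts on Monday in this territory, so Monday is day 1.
ALTER SESSION SET NLS_TERRITORY = 'GERMANY';
SELECT TO_CHAR(DATE '2024-01-01', 'D') AS day_no FROM dual;  -- '1'

-- TRUNC(..., 'IW') ignores the territory: a Wednesday always
-- truncates to the Monday of the same ISO week.
SELECT TO_CHAR(TRUNC(DATE '2024-01-03', 'IW'), 'YYYY-MM-DD') AS iso_monday
FROM   dual;  -- '2024-01-01'
```

Because the original query subtracts the 'd' value from TRUNC(sysdate), the set of dates it selects shifts by a day between such sessions.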
You could equally write your query as:
SELECT ITEM_ID,
MAX("Orders") AS OP_ORD,
MAX("SHIP") AS SHIP
FROM ORDERS
WHERE TRUNC(create_date) IN (
TRUNC(SYSDATE, 'IW') - INTERVAL '8' DAY,
TRUNC(SYSDATE, 'IW') - INTERVAL '7' DAY,
TRUNC(SYSDATE, 'IW') - INTERVAL '6' DAY,
TRUNC(SYSDATE, 'IW') - INTERVAL '5' DAY
)
GROUP BY ITEM_ID
Both queries give the same output irrespective of the NLS_TERRITORY setting, and both produce different EXPLAIN PLANs from your query: see the fiddle (the EXPLAIN PLANs are shown at the end of the fiddle).
Any idea why this is not using the index?
The optimiser will look at the query and determine an execution plan that, based on the available statistics and other heuristics, it determines is the “best”. For three days of data it may determine that filtering based on the index and then retrieving the corresponding rows which contain the other column data is most efficient. However, for four days, it may determine that, due to the increased volume of data, it is less efficient to use an index and is actually more efficient to just perform a table scan and ignore the index.
As Justin Cave points out in his comment:
“That [decision by the optimiser] may or may not be accurate. Perhaps statistics are stale and you’ve added a lot of data since you last gathered stats.”
You may then:
- need to gather new statistics on your table which may change the plan that the optimiser chooses; or
- accept that the optimiser may be correct and using an index is less efficient than a table scan; or
- try using hints to force the index to be used.
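As a rough sketch of the first and third options, the statements below refresh the statistics, display the plan the optimiser picks, and then hint a specific index. The index name orders_create_date_ix is hypothetical, and the date range is borrowed from the range query earlier in this answer.

```sql
-- Refresh optimiser statistics for the table and its indexes.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'ORDERS',
    cascade => TRUE   -- also gather statistics on the table's indexes
  );
END;
/

-- Ask the optimiser for its plan without running the query.
EXPLAIN PLAN FOR
SELECT item, MAX("Orders") AS op_ord, MAX("SHIP") AS ship
FROM   orders
WHERE  create_date >= TRUNC(SYSDATE, 'IW') - INTERVAL '8' DAY
AND    create_date <  TRUNC(SYSDATE, 'IW') - INTERVAL '4' DAY
GROUP  BY item;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- If the plan still shows a full scan and you believe the index is
-- better, name it in a hint (index name is an assumption).
SELECT /*+ INDEX(o orders_create_date_ix) */
       item, MAX("Orders") AS op_ord, MAX("SHIP") AS ship
FROM   orders o
WHERE  create_date >= TRUNC(SYSDATE, 'IW') - INTERVAL '8' DAY
AND    create_date <  TRUNC(SYSDATE, 'IW') - INTERVAL '4' DAY
GROUP  BY item;
```

Comparing the elapsed time of the hinted and unhinted versions is a simple way to check whether the optimiser's choice was actually the slower one.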