We have the following class and corresponding DB table:
[Index(nameof(ItemId))]
public class DataSnapshot
{
[Key]
public long SeqNo { get; set; }
[Required]
public string ItemId { get; set; }
[Required]
public string Data { get; set; }
[Required]
public DateTimeOffset CreatedTimestamp { get; set; }
}
Whenever an item’s data is changed, a new row is inserted into the table. To retrieve the most recent data for all items, we currently execute the following query:
var latestDataSnapshots = context.DataSnapshots
.GroupBy(x => x.ItemId)
.Select(x => x.OrderByDescending(y => y.CreatedTimestamp).First())
.ToListAsync();
This query is translated into the following SQL:
SELECT "t0"."SeqNo",
"t0"."CreatedTimestamp",
"t0"."ItemId",
"t0"."Data"
FROM (SELECT "e"."ItemId"
FROM "DataSnapshots" AS "e"
GROUP BY "e"."ItemId") AS "t"
LEFT JOIN (SELECT "t1"."SeqNo",
"t1"."CreatedTimestamp",
"t1"."ItemId",
"t1"."Data"
FROM (SELECT "e0"."SeqNo",
"e0"."CreatedTimestamp",
"e0"."ItemId",
"e0"."Data"
ROW_NUMBER() OVER(PARTITION BY "e0"."ItemId" ORDER BY "e0"."CreatedTimestamp" DESC) AS "row"
FROM "DataSnapshots" AS "e0") AS "t1"
WHERE "t1"."row" <= 1) AS "t0"
ON "t"."ItemId" = "t0"."ItemId"
This query works as desired but, unfortunately (but not surprisingly), the query’s performance takes a hit as the number of rows in the table increases (sometimes the query even times out).
Can anyone suggest anything that we can do to make this query faster/more efficient? Any and all help appreciated.
We have complete end to end control of both the DB and C# code. We are using .NET 6.0 and latest Entity Framework 6.0.0 packages.