Speeding up initial ltree, GiST-indexed queries in Postgres
Setup
I’m experimenting with ltree and pattern searching. It’s pretty great, but I’m finding that first runs on searches are very slow compared to later runs. I don’t know if this is a general issue around loading and caching index pages, or something more specific to GiST indexes.
For background, I’m trying a few different ideas out in a Materialized View with the following column list:
+-------------------------------------------------+-----------+
| column_name                                      | data_type |
+-------------------------------------------------+-----------+
| edge_sequence                                    | uuid      |
| total_seconds                                    | bigint    |
| source_node_count                                | bigint    |
| node_count                                       | integer   |
| nodes_without_adjacent_duplicates_count          | integer   |
| distinct_states_count                            | integer   |
| has_duplicate_adjacent_events                    | boolean   |
| has_circular_links                               | boolean   |
| state_names                                      | citext[]  |
| state_names_ltree                                | ltree     |
| state_names_ltree_text                           | text      |
| state_names_without_duplicate_adjacent_elements  | citext[]  |
| distinct_state_names_ltree                       | ltree     |
| state_symbols                                    | text[]    |
| state_symbols_ltree                              | ltree     |
| state_symbols_cstring                            | text      |
+-------------------------------------------------+-----------+
Here’s the GiST index definition for the state_names_ltree field:
DROP INDEX IF EXISTS edge_cycle_state_names_ltree_gist;
CREATE INDEX edge_cycle_state_names_ltree_gist
ON analytics.edge_cycle
USING GIST (state_names_ltree
gist_ltree_ops(siglen=100) -- siglen defaults to 8, can be set as high as 2024, in 4-byte increments
)
WITH (fillfactor = 100) -- The data will be in a Materialized View, or TRUNCATE-and-rebuild table.
WHERE node_count <= 24; -- Index entries are too long for very long chains (hundreds+)
Query and Plans
This query uses the index to find 581 rows out of the 7,745,994 in the data set:
SELECT edge_sequence,state_names_ltree
FROM edge_cycle
WHERE state_names_ltree ~ 'WaitingAfterSterilizer.*.Decon.*.Sterilizer|VerifyingAfterSterilizer'::lquery
AND node_count <= 24; -- IMPORTANT, the index we want the planner to pick includes this condition.
Here’s the plan on a cold cache:
Index Scan using edge_cycle_state_names_ltree_gist on edge_cycle (cost=0.41..186.90 rows=159 width=180) (actual time=184.459..126255.450 rows=581 loops=1)
  Index Cond: (state_names_ltree ~ 'WaitingAfterSterilizer.*.Decon.*.Sterilizer|VerifyingAfterSterilizer'::lquery)
  Buffers: shared hit=21 read=215134
  I/O Timings: shared read=124327.468
Planning:
  Buffers: shared hit=217 read=14
  I/O Timings: shared read=10.586
Planning Time: 11.474 ms
Execution Time: 126255.839 ms
Here’s the plan from an early run on a cache that wasn’t completely cold: I had cancelled an earlier run of the search before it finished, so the cache was luke-warm:
Index Scan using edge_cycle_state_names_ltree_gist on edge_cycle (cost=0.41..186.90 rows=159 width=180) (actual time=1.608..62277.936 rows=581 loops=1)
  Index Cond: (state_names_ltree ~ 'WaitingAfterSterilizer.*.Decon.*.Sterilizer|VerifyingAfterSterilizer'::lquery)
  Buffers: shared hit=102691 read=112443
  I/O Timings: shared read=60442.362
Planning:
  Buffers: shared hit=1
Planning Time: 0.246 ms
Execution Time: 62278.214 ms
And now the plan from running the same query again, immediately after letting a slow, cache-loading run finish:
Index Scan using edge_cycle_state_names_ltree_gist on edge_cycle (cost=0.41..186.90 rows=159 width=180) (actual time=1.501..1505.078 rows=581 loops=1)
  Index Cond: (state_names_ltree ~ 'WaitingAfterSterilizer.*.Decon.*.Sterilizer|VerifyingAfterSterilizer'::lquery)
  Buffers: shared hit=215134
Planning:
  Buffers: shared hit=1
Planning Time: 0.232 ms
Execution Time: 1505.176 ms
At the extremes, the query times differ by roughly 84x (126,255 ms vs. 1,505 ms). The difference shows up in the read part of the buffers information:
Buffers: shared hit=21 read=215134
I/O Timings: shared read=124327.468
Buffers: shared hit=102691 read=112443
I/O Timings: shared read=60442.362
Buffers: shared hit=215134
So, it looks like the speed hit is coming from loading the index pages.
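As a back-of-the-envelope check on the scale of that, 215,134 read buffers at the default 8 kB block size works out to roughly 1.6 GB of index pulled in from disk on the cold run:

select pg_size_pretty(215134 * 8192::bigint); -- ≈ 1.6 GB read on the cold run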
Speeding Things Up
I’ve had a look around for settings that might speed up the initial search, but haven’t found anything promising yet. I’m not at all well versed in tuning the Postgres cache, on either the hardware or the software configuration side. It’s easy to find mentions of GiST indexes being slower than other index types, but in this case it’s plenty fast enough once the pages are loaded. I’ve been using a pg_trgm GiST index for years, and it is fantastic.
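One thing that might address the cold-start cost directly, sketched under the assumption that the pg_prewarm extension is available on the RDS instance (I haven’t verified that, or that it actually helps here): explicitly warm the index after each materialized view rebuild, so the first user-facing search doesn’t pay the full I/O price.

CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Read the GiST index into shared_buffers ahead of the first search.
-- Could be run right after each REFRESH MATERIALIZED VIEW / TRUNCATE-and-rebuild.
SELECT pg_prewarm('analytics.edge_cycle_state_names_ltree_gist');

The default mode loads pages into shared_buffers; the 'prefetch' and 'read' modes target the OS cache instead.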
Hardware, Version and Settings
We’re running Postgres 16.4 on an RDS db.m5.2xlarge instance, which has 8 vCPUs and 32 GB of RAM. Here are the cache-related settings I found:
select name,setting,unit from pg_settings where name like '%cache%' order by 1;
/*
+----------------------+---------+------+
| name                 | setting | unit |
+----------------------+---------+------+
| debug_discard_caches | 0       | NULL |
| effective_cache_size | 2015397 | 8kB  |
| plan_cache_mode      | auto    | NULL |
+----------------------+---------+------+
*/
effective_cache_size is the only setting here that looks relevant, and (as I understand it) it’s a planner estimate rather than an actual cache allocation. Unless I’m messing up my arithmetic, that’s a pretty large cache out of 32 GB?
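Checking the arithmetic on that setting (it’s reported in 8 kB units):

select pg_size_pretty(2015397 * 8192::bigint); -- ≈ 15 GB, roughly half of the instance's 32 GB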
I’ve tried changing FILLFACTOR to 100%
While writing this up, I considered the index FILLFACTOR. The default for B-trees is 90%; I don’t know what the default is for GiST indexes. This data will live in a Materialized View, or a wipe-and-replace table, meaning the index doesn’t have to contend with updates. A GiST index without an explicit override gets you NULL in reloptions:
SELECT pg_class.relname AS table_name,
pg_class.reloptions
FROM pg_class
JOIN pg_namespace
ON pg_namespace.oid = pg_class.relnamespace
WHERE pg_class.relname = 'edge_cycle_state_names_ltree_gist';
I ran an experiment and compared the size of the index built with the default fill against a 100% fill. It looks like GiST is also probably at a 90% FILLFACTOR, at least when using gist_ltree_ops:
select pg_size_pretty(pg_relation_size('edge_cycle_state_names_ltree_gist')); -- Default fill: 2,908 MB
select pg_size_pretty(pg_relation_size('edge_cycle_state_names_ltree_gist')); -- 100% fill: 2,601 MB
select 2601/2908::real; -- 0.894
Other Ideas I’ve Considered
Don’t Search Against a GiST Index
ltree requires a GiST index if you want indexed pattern searching. But here’s what I get when I bypass the index by leaving the node_count <= 24 condition out of the query:
Gather (cost=1000.00..1092064.13 rows=159 width=180) (actual time=7.350..2187.419 rows=613 loops=1)
  Workers Planned: 4
  Workers Launched: 4
  ->  Parallel Seq Scan on edge_cycle (cost=0.00..1091048.23 rows=40 width=180) (actual time=15.969..2180.520 rows=123 loops=5)
        Filter: (state_names_ltree ~ 'WaitingAfterSterilizer.*.Decon.*.Sterilizer|VerifyingAfterSterilizer'::lquery)
        Rows Removed by Filter: 1549076
Planning Time: 0.348 ms
Execution Time: 2187.555 ms
That’s pretty decent: slower than a warmed-up GiST query, faster than a slow initial GiST query, and more consistent for a naive user. (Where “naive” = “any normal person who is not going to get into the weeds.”)
Experimenting with Different siglen Values
The range is 4-2024 bytes, in 4-byte increments, with a default of 8. I set it to 100; I came up with that number by throwing a dart at a board. A longer siglen means larger index entries, which means more index pages, which means a slower initial load. Or so it seems. I can explore with a shorter siglen. Any real-world suggestions on how to tune siglen would save me a lot of trouble.
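For example, a rebuild with a shorter signature to compare index size and first-run behavior (siglen=40 here is just another dart throw, not a recommendation):

DROP INDEX IF EXISTS edge_cycle_state_names_ltree_gist;
CREATE INDEX edge_cycle_state_names_ltree_gist
ON analytics.edge_cycle
USING GIST (state_names_ltree
gist_ltree_ops(siglen=40) -- shorter signature than the current 100
)
WITH (fillfactor = 100)
WHERE node_count <= 24;

select pg_size_pretty(pg_relation_size('edge_cycle_state_names_ltree_gist')); -- compare against the 2,601 MB above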
Reducing the Node Size with Symbols
In theory, an ltree path can store up to 65,535 labels. In reality, it blows up when the full label path is somewhere under 12,000 bytes. Our raw paths are relatively short (I’m only including paths with lengths of 24 nodes or less, usually 7 or less), but the labels are relatively long. There are only around 60 distinct labels, so I can “symbolize” them as 2-char hex values. For example:
WaitingAfterSterilizer.Decon.Sterilizer
AO.01.04
The symbolized version is completely unreadable, but it makes the nodes smaller, meaning more nodes per page and less time loading pages. I’ve compared index-size savings recently, and the index size correlates with the path lengths, but not proportionally. I guess there’s some kind of overhead/price of admission no matter what the data is; after that, the data itself does make a difference. So, in my case, I can save 30-60% of the index size by converting the labels to shortened versions.
(I haven’t explored symbolization much for this loading problem yet, but smaller is smaller. I set up the symbolization system to simplify experimenting with NetworkX in Python, and to run some text-similarity experiments in Postgres. Levenshtein does surprisingly well on the symbolized paths, considering; trigrams work poorly, which is what you would expect.) But for ltree path searches, I’d need to write an interface to translate paths into symbolized paths and… ugh. Even for me, that seems over-the-top. I guess I could write a set-returning function that takes an ltree, parses it out, substitutes labels with their symbols, and runs the query. Pure SQL for this function should be possible. It could return all columns, since the planner seems quite smart about pruning out columns that aren’t referenced in the outer query.
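For what it’s worth, here’s a rough sketch of just the translation step, using a hypothetical state_symbol_map lookup table (the table and names are made up for illustration, and the lquery operators like .* and | would still need separate handling):

-- Hypothetical lookup table mapping readable labels to 2-char symbols.
CREATE TABLE IF NOT EXISTS analytics.state_symbol_map (
    state_name citext PRIMARY KEY,
    symbol     text NOT NULL
);

-- Translate a readable path into its symbolized form,
-- e.g. 'WaitingAfterSterilizer.Decon.Sterilizer' -> 'AO.01.04'.
SELECT text2ltree(string_agg(map.symbol, '.' ORDER BY labels.ordinality))
FROM unnest(string_to_array('WaitingAfterSterilizer.Decon.Sterilizer', '.'))
     WITH ORDINALITY AS labels(label, ordinality)
JOIN analytics.state_symbol_map AS map
  ON map.state_name = labels.label::citext; -- explicit cast so the comparison stays case-insensitive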
I’ve wandered around a bit here, but I’m unclear which of the various details and ideas included may be relevant. I’ll be grateful for any suggestions or warnings.