I’ve got a document database with about 6,000 records. I’ve successfully used vector database backends for RAG queries over these records. I’d now like to move from a vector-specific database to Google’s Spanner database.
My Spanner query looks like this:
SELECT a.id, a.title,
       COSINE_DISTANCE(a.embeddings,
         (SELECT embeddings.values
          FROM ML.PREDICT(MODEL EmbeddingsModel,
            (SELECT 'What is the meaning of life?' AS content)))) AS distance
FROM `documents` AS a
ORDER BY distance
This query typically takes over 30 seconds. How can I make it faster? Spanner’s query plan shows most of the time is spent in ML.PREDICT and in a step called “Merge Distributed Union”.
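Since the plan shows ML.PREDICT dominating, one restructuring I’ve considered (not yet benchmarked) is computing the query embedding once in its own statement and binding the result as a query parameter, so the remote model call is taken out of the similarity query entirely. A sketch, where @query_embedding is a parameter name I made up:

```sql
-- Step 1: call the model once to embed the question text.
SELECT embeddings.values AS query_embedding
FROM ML.PREDICT(MODEL EmbeddingsModel,
  (SELECT 'What is the meaning of life?' AS content));

-- Step 2: rank documents against the precomputed vector, bound
-- client-side as @query_embedding (ARRAY<FLOAT64>).
SELECT a.id, a.title,
       COSINE_DISTANCE(a.embeddings, @query_embedding) AS distance
FROM `documents` AS a
ORDER BY distance
LIMIT 10;
```

(The LIMIT is my own addition; without it Spanner has to sort and return all ~6,000 rows.) Is this the right direction, or is there a way to make the single-statement version fast?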
Other info:
Embedding model:
CREATE MODEL EmbeddingsModel
INPUT (content STRING(MAX))
OUTPUT (
  embeddings STRUCT<
    statistics STRUCT<truncated BOOL, token_count FLOAT64>,
    values ARRAY<FLOAT64>>
)
REMOTE OPTIONS (
  endpoint = '//aiplatform.googleapis.com/projects/<myproject-id>/locations/<mylocation>/publishers/google/models/textembedding-gecko@003'
);
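To narrow down where the 30 seconds goes, this is the standalone call I’d time in order to isolate the remote Vertex AI round trip from the distance scan over the table:

```sql
-- Just the model call, no table scan or sort.
SELECT embeddings.values
FROM ML.PREDICT(MODEL EmbeddingsModel,
  (SELECT 'What is the meaning of life?' AS content));
```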
Documents table:
CREATE TABLE documents (
  id STRING(36) DEFAULT (GENERATE_UUID()),
  text STRING(MAX),
  title STRING(MAX),
  embeddings ARRAY<FLOAT64>
) PRIMARY KEY (id);