I have a Redis cache using Redis Search, with an HNSW index on a 512-element vector of float32 values.
It is defined like this:
schema = (
    VectorField(
        "vector",
        "HNSW",
        {
            "TYPE": "FLOAT32",
            "DIM": 512,
            "DISTANCE_METRIC": "IP",
            "EF_RUNTIME": 400,
            "EPSILON": 0.4,
        },
        as_name="vector",
    ),
)
definition = IndexDefinition(prefix=[REDIS_PREFIX], index_type=IndexType.HASH)
res = client.ft(REDIS_INDEX_NAME).create_index(
    fields=schema, definition=definition
)
I can insert NumPy float32 vectors into this index by writing the result of vector.tobytes()
directly into the "vector" field of each hash. I can then accurately query those same vectors using a vector similarity search.
Although this works correctly, when I read these vectors back out of the cache using client.hget(key, "vector"),
the values I get have a variable number of bytes. Every vector is definitely 512 elements (2048 bytes) when I insert it, but some come back with a length that isn’t even a multiple of 4! At that point I can’t decode them back into a NumPy array.
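To make the expectation concrete, here is a minimal sketch of the round trip with Redis left out entirely. The helper names encode_vector and decode_vector are hypothetical and not part of the original code. The sketch assumes the value read back is a raw bytes object of exactly 512 * 4 = 2048 bytes, which is what vector.tobytes() produces for a 512-element float32 vector.

```python
import numpy as np

DIM = 512  # matches "DIM" in the index schema above


def encode_vector(vector: np.ndarray) -> bytes:
    """Serialize a vector to the raw float32 bytes written to the hash field."""
    return np.asarray(vector, dtype=np.float32).tobytes()


def decode_vector(raw: bytes) -> np.ndarray:
    """Decode raw bytes back into a float32 vector, validating the payload first."""
    expected = DIM * np.dtype(np.float32).itemsize  # 512 * 4 = 2048 bytes
    if not isinstance(raw, (bytes, bytearray)):
        # Assumption: the stored value must come back as unmodified bytes.
        raise TypeError(f"expected bytes, got {type(raw).__name__}")
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    return np.frombuffer(raw, dtype=np.float32)


# Round trip without Redis: the bytes written are the bytes that should come back.
original = np.random.default_rng(0).random(DIM, dtype=np.float32)
raw = encode_vector(original)
restored = decode_vector(raw)
```

With a check like this in place, any value returned by client.hget(key, "vector") that is not exactly 2048 bytes fails immediately with its actual length, rather than failing later inside np.frombuffer.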
I can’t tell if this is a bug, or if I’m doing something wrong. Either way, something clearly isn’t right.