I am using the Qdrant DB and client to embed a document as part of a PoC RAG pipeline that I am building.
When I build the vector collection with Manhattan distance, I get a much higher score than when I use Cosine distance (103.86 vs. 0.17 in the output below). However, the text chunk returned is the same. I don't understand why the scores differ so much while the returned result stays the same. I am still learning the ropes with RAG. Thanks in advance.
USER QUERY
What is DoS?
COSINE DISTANCE
response: [
ScoredPoint(id=0,
version=10,
score=0.17464592,
payload={
'chunk': "It also includes overhead bytes for operations,
administration, and maintenance (OAM) purposes.\nOptical Network Unit
(ONU)\nONU is a device used in Passive Optical Networks (PONs). It converts
optical signals transmitted via fiber optic cables into electrical signals that
can be used by end-user devices, such as computers and telephones. The ONU is
located at the end user's premises and serves as the interface between the optical
network and the user's local network."
},
vector=None, shard_key=None)
]
MANHATTAN DISTANCE
response: [
ScoredPoint(id=0,
version=10,
score=103.86209,
payload={
'chunk': "It also includes overhead bytes for operations, administration,
and maintenance (OAM) purposes.\nOptical Network Unit
(ONU)\nONU is a device used in Passive Optical Networks (PONs). It converts
optical signals transmitted via fiber optic cables into electrical signals that
can be used by end-user devices, such as computers and telephones. The ONU is
located at the end user's premises and serves as the interface between the optical
network and the user's local network."
},
vector=None, shard_key=None)
]
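
To make the comparison concrete, here is a minimal sketch of the two metrics as I understand them, written in plain Python without the Qdrant client. The vectors `query` and `doc` are made-up toy values, not my real embeddings, and the helper names `cosine_similarity` and `manhattan_distance` are my own, not Qdrant functions. As far as I can tell, cosine similarity is bounded between -1 and 1, while Manhattan distance is an unbounded sum of absolute differences, so the two scores may simply not be on the same scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; always within [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences; grows with dimension and magnitude."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Assumption: toy 3-dimensional vectors for illustration only.
query = [1.0, 2.0, 3.0]
doc = [2.0, 4.0, 7.0]

cos = cosine_similarity(query, doc)
man = manhattan_distance(query, doc)

print(f"cosine similarity:  {cos:.4f}")
print(f"manhattan distance: {man:.4f}")
```

If this is right, a larger cosine value would mean "more similar" while a larger Manhattan value would mean "further apart", which might explain why the raw numbers look so different even though the same chunk comes back.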