I have a use case in which my entire RAG system has to be hosted locally.
I will be running a local Weaviate server along with my own Hugging Face embeddings model. Weaviate’s documentation is a labyrinth that seems understandable only to people who have worked with Weaviate for a couple of years or to its developers.
Here is what I want to do:
- Create an embeddings model with the sentence-transformers library:
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim
# This embeddings model runs custom code, so SentenceTransformer must be given permission to trust the remote code.
embeddings = SentenceTransformer('Alibaba-NLP/gte-large-en-v1.5', trust_remote_code=True)
- Use the above model to embed the documents and ingest them into the local Weaviate server running on port 8080
- Use the same embeddings model to encode a query (“This is a test query”)
- Retrieve from Weaviate the documents that are most similar to the query
This is the most basic, most canonical use case for a vector database. However, much to my surprise, I cannot find Weaviate documentation that shows how to do it. I am stuck with Weaviate for the time being. Can you help me set up this use case?
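For reference, here is a rough sketch of what I am aiming for. It assumes the v4 Python client (the `weaviate-client` package), a server reachable on localhost with HTTP on 8080 and gRPC on the default 50051, and a collection with no built-in vectorizer so that I supply the vectors myself. The collection name `LocalDocs`, the `text` property, and the helper names `to_records` and `run_demo` are placeholders I made up. Newer client releases may prefer a different argument than `vectorizer_config`, so treat that part as an assumption rather than the confirmed way to do it.

```python
from typing import Iterable, List, Sequence

COLLECTION = "LocalDocs"  # hypothetical collection name
MODEL_NAME = "Alibaba-NLP/gte-large-en-v1.5"


def to_records(texts: Iterable[str], vectors: Iterable[Sequence[float]]) -> List[dict]:
    """Pair each text with its embedding, converted to a plain list of floats."""
    texts = list(texts)
    vectors = [[float(x) for x in v] for v in vectors]
    if len(texts) != len(vectors):
        raise ValueError("texts and vectors must have the same length")
    return [{"text": t, "vector": v} for t, v in zip(texts, vectors)]


def run_demo(docs: List[str], query: str = "This is a test query", top_k: int = 3):
    """Embed docs locally, ingest them into Weaviate, and return the top_k matches."""
    # Imported here so the helper above loads without the heavy dependencies.
    import weaviate
    from weaviate.classes.config import Configure, DataType, Property
    from weaviate.classes.query import MetadataQuery
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_NAME, trust_remote_code=True)
    records = to_records(docs, model.encode(docs))

    client = weaviate.connect_to_local(port=8080)
    try:
        # Start from a clean collection; this deletes any existing data in it.
        if client.collections.exists(COLLECTION):
            client.collections.delete(COLLECTION)
        collection = client.collections.create(
            name=COLLECTION,
            # Assumption: no server-side vectorizer, vectors are supplied by us.
            vectorizer_config=Configure.Vectorizer.none(),
            properties=[Property(name="text", data_type=DataType.TEXT)],
        )

        # Ingest each document together with its precomputed vector.
        with collection.batch.dynamic() as batch:
            for rec in records:
                batch.add_object(
                    properties={"text": rec["text"]},
                    vector=rec["vector"],
                )

        # Encode the query with the same model and search by vector.
        query_vector = [float(x) for x in model.encode(query)]
        response = collection.query.near_vector(
            near_vector=query_vector,
            limit=top_k,
            return_metadata=MetadataQuery(distance=True),
        )
        return [(o.properties["text"], o.metadata.distance) for o in response.objects]
    finally:
        client.close()
```

Calling `run_demo(["first document", "second document"])` should return a list of `(text, distance)` pairs, assuming the server is up and the model has been downloaded. I have not been able to confirm from the documentation that this is the intended way to bring your own vectors.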