My implementation using the AutoModel and AutoTokenizer classes is fairly simple:
`from transformers import AutoModel, AutoTokenizer
import numpy as np
from rank_bm25 import BM25Okapi
from sklearn.neighbors import NearestNeighbors

class EmbeddingModels:
    def bert(self, model_name, text):
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModel.from_pretrained(model_name)
        inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
        outputs = model(**inputs)
        embeddings = outputs.last_hidden_state.mean(dim=1).detach().numpy()
        return embeddings

    def create_chunks(self, text, chunk_size):
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]`
But I can’t get this warning to go away:
A parameter name that contains 'beta' will be renamed internally to 'bias'. Please use a different name to suppress this warning. A parameter name that contains 'gamma' will be renamed internally to 'weight'. Please use a different name to suppress this warning.
There is no reference to the word beta or gamma anywhere in my repo.
I have tried updating the package and suppressing the warnings with `import warnings`, but the warning still appears.
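One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the message is emitted through the library's logger instead of `warnings.warn`, a filter from the `warnings` module would not touch it. The sketch below shows that difference using only the standard library. The logger name `demo_library` and the message text are hypothetical stand-ins, and the final `set_verbosity_error()` call is guarded so the snippet still runs when `transformers` is not installed.

```python
import io
import logging
import warnings

# Hypothetical stand-in for a library logger (not the real transformers logger).
logger = logging.getLogger("demo_library")
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False

# A warnings filter does not affect messages sent through the logging module.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    logger.warning("renamed internally")
seen_with_filter = "renamed internally" in stream.getvalue()

# Raising the logger's level is what actually silences the message.
logger.setLevel(logging.ERROR)
stream.seek(0)
stream.truncate(0)
logger.warning("renamed internally")
seen_after_level = "renamed internally" in stream.getvalue()

# Assumed equivalent for transformers: lower its log verbosity to errors only.
try:
    from transformers import logging as hf_logging
    hf_logging.set_verbosity_error()
except ImportError:
    pass  # transformers not installed; the stdlib demonstration above still holds
```

Note that lowering the verbosity only hides the message. If the assumption holds, it would also hide any other warning-level messages from the library, so it may be worth restoring the default level after the model has loaded.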