I am trying to use the BERTopic library with a custom text generation model through the transformers library, but fit_transform fails with the RuntimeError shown below. I tried specifying the device as 0 (GPU) in the pipeline, but I still get the error.
What is causing this error, and how can I fix it?
2024-08-30 10:44:11,684 - BERTopic - Dimensionality - Completed ✓
2024-08-30 10:44:11,688 - BERTopic - Cluster - Start clustering the reduced embeddings
/usr/local/lib/python3.10/dist-packages/joblib/externals/loky/backend/fork_exec.py:38: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
pid = os.fork()
2024-08-30 10:44:17,485 - BERTopic - Cluster - Completed ✓
2024-08-30 10:44:17,498 - BERTopic - Representation - Extracting topics from clusters using representation models.
0%| | 0/66 [00:08<?, ?it/s]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-11-5511f129a54a> in <cell line: 16>()
     14 )
     15
---> 16 topics, probs = topic_model.fit_transform(docs, embeddings)
13 frames
/usr/local/lib/python3.10/dist-packages/transformers/generation/logits_process.py in __call__(self, input_ids, scores)
    351     @add_start_docstrings(LOGITS_PROCESSOR_INPUTS_DOCSTRING)
    352     def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
--> 353         score = torch.gather(scores, 1, input_ids)
    354
    355         # if score < 0 then repetition penalty has to be multiplied to reduce the token probabilities
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA_gather)
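As far as I can tell, the failing line uses `input_ids` as the index into `scores`, and PyTorch requires both tensors to be on the same device for that. The sketch below is a CPU-only illustration of that call, not the code that BERTopic or transformers actually runs; `describe_devices` and `same_device` are hypothetical helpers, and the tensor shapes are made up for the example.

```python
import torch


def describe_devices(**tensors):
    """Map each tensor name to the device it lives on (hypothetical helper)."""
    return {name: str(t.device) for name, t in tensors.items()}


def same_device(*tensors):
    """Return True if every tensor lives on the same device."""
    return len({t.device for t in tensors}) <= 1


# Stand-ins for the two arguments of the failing call; shapes are illustrative.
scores = torch.randn(1, 8)                # logits over a toy vocabulary of 8 tokens
input_ids = torch.tensor([[1, 3, 3, 5]])  # token ids generated so far (int64)

print(describe_devices(scores=scores, input_ids=input_ids))

if same_device(scores, input_ids):
    # Same call as in the traceback; it only succeeds when the devices match.
    picked = torch.gather(scores, 1, input_ids)
    print(picked.shape)
else:
    print("Device mismatch: torch.gather would raise the RuntimeError above.")
```

In my case one of these two tensors ends up on cuda:0 while the other stays on the CPU, which is the situation the `else` branch describes.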
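My code is:
# AutoModelForCausalLM with model_file / model_type / gpu_layers / hf is the ctransformers class
from ctransformers import AutoModelForCausalLM
from transformers import AutoTokenizer, pipeline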
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/zephyr-7B-alpha-GGUF",
    model_file="zephyr-7b-alpha.Q4_K_M.gguf",
    model_type="mistral",
    gpu_layers=50,
    hf=True,
    # context_length=512,
    # max_new_tokens=512
)
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-alpha")
prompt = """<|system|>You are a helpful, respectful and honest assistant for labeling topics.</s>
<|user|>
I have a topic that contains the following documents:
[DOCUMENTS]
The topic is described by the following keywords: '[KEYWORDS]'."""
generator = pipeline(
    model=model, tokenizer=tokenizer,
    task='text-generation',
    max_new_tokens=50,
    repetition_penalty=1.1,
    device=0
)
from bertopic.representation import TextGeneration
zephyr = TextGeneration(generator, prompt=prompt, doc_length=10, tokenizer="char")
representation_model = {"Zephyr": zephyr}