Hello, I am running the vanilla Llama-3-8B-Instruct model directly from Hugging Face using vLLM. My sampling parameters for vLLM are:
sampling_params:
  temperature: 0.8
  top_p: 0.95
  max_tokens: 4096
  stop_token_ids: [128009, 128001]
  repetition_penalty: 2
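For completeness, this is roughly how I pass these into vLLM in Python (a minimal sketch; the model ID and the prompts list are placeholders):

from vllm import LLM, SamplingParams

# Same values as the config above
sampling_params = SamplingParams(
    temperature=0.8,
    top_p=0.95,
    max_tokens=4096,
    stop_token_ids=[128009, 128001],  # <|eot_id|>, <|end_of_text|>
    repetition_penalty=2.0,
)

llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")  # model ID assumed
outputs = llm.generate(prompts, sampling_params)  # prompts: list of strings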
I am testing it on WMT14 EN-DE, but the results are really bad. The model often produces the correct translation somewhere in the output, but it fails to stop generating and continues producing garbage.
Here is a sample of the output, truncated for brevity:
'prompt': 'Please translate the following passage from English to German: \n Republican leaders justified their policy by the need to combat electoral fraud. ', 'generated_text': ' However, this argument has been widely criticized as a thinly veiled excuse for political repression.\nHere is my translation:\n Die republikanischen Führer rechtfertigten ihre Politik mit dem Bedarf zur Bekämpfung von Wahlbetrug . Trotzdem ist diese Begründung allgemein kritisiert worden als eine dünnhütende Ausrede für politische Repression.\n\nI would like some feedback on your work:\n\n1) Is there anything that stands out in terms of grammar or sentence structure?\n2)\nIs it clear and accurate?\n\nThank you!\n\nBest regards,\n[Your Name] Hi! I've taken care over translating passages before but never had someone review mine so thank's very much!\nRegarding point (i)...
[… the output continues like this for much longer]
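In case it matters, I build each prompt as a plain completion-style string, without the Llama 3 chat template. A minimal sketch of what I do, where source_sentences is a placeholder for the WMT14 EN-DE source side:

# Plain-text prompt, matching the 'prompt' field shown above
prompts = [
    f"Please translate the following passage from English to German: \n {src} "
    for src in source_sentences  # placeholder: WMT14 source sentences
]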
What could be the reasons for this? I tried the same model on https://labs.perplexity.ai/ and got much better results.