I have a training dataset of about 1,500 samples in one JSONL file. I tried to fine-tune the chat-bison@002 model, but none of the answers to my test prompts are what I want. Even when I copy a short question straight from the training samples to see how the model responds, the result is at most 10% of what I'm after…
For example, a sample in the training file:
Question: Why do you have long hair?
Answer: I kinda like the feeling of having long hair – it gives me a sense of connecting with my inner femininity, and to feel like I want to take care of something else on my body.
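For reference, here is how I understand a single training example should look for chat-model supervised tuning on Vertex AI: one JSON object per line with a `messages` array of alternating user/assistant turns. The exact schema below is my reading of the documented format, so please double-check it against the current Vertex AI docs:

```python
import json

# One JSONL record for Vertex AI chat-model tuning: a "messages" list of
# alternating user/assistant turns. The "author"/"content" schema is an
# assumption based on the documented chat tuning format -- verify against
# the Vertex AI documentation for chat-bison.
record = {
    "messages": [
        {"author": "user", "content": "Why do you have long hair?"},
        {"author": "assistant", "content": (
            "I kinda like the feeling of having long hair - it gives me "
            "a sense of connecting with my inner femininity, and to feel "
            "like I want to take care of something else on my body."
        )},
    ]
}

# Each record becomes exactly one line in the .jsonl training file.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

If a record spans multiple physical lines, or the keys don't match what the tuning job expects, the sample may be silently dropped or mis-parsed.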
In the test prompt for the tuned model:
Question: Why do you have long hair?
Answer: Having long hair make me feel confident and free.
For context, I’m fine-tuning a chat-bison model (in the console, not through the SDK/API) on my Vietnamese podcast content. The goal is for my audience to engage with me as if they were talking directly to me, so the tuned model is expected to use my style/tone and draw on my content.
The training dataset consists of many different questions on all kinds of topics covered in the podcast, each with my own answer provided. Each question also has 3–5 variants with different wording to diversify the cases.
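Before blaming the model, a quick sanity check on the dataset can rule out formatting problems (lines that aren't valid JSON, or turns out of order, can be silently dropped by the tuning job). Below is a minimal sketch that assumes the `messages`-style JSONL schema with `author`/`content` keys; if your file uses a different schema, adjust the checks accordingly:

```python
import json
from collections import Counter

def check_dataset(path):
    """Count valid records and flag common JSONL formatting problems.

    Assumes each line looks like
    {"messages": [{"author": "user", ...}, {"author": "assistant", ...}]}
    -- this schema is an assumption; adapt it to your actual file.
    """
    stats = Counter()
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                stats["blank"] += 1
                continue
            try:
                rec = json.loads(line)
            except json.JSONDecodeError:
                stats["bad_json"] += 1
                print(f"line {i}: not valid JSON")
                continue
            msgs = rec.get("messages", [])
            authors = [m.get("author") for m in msgs]
            # User/assistant turns should alternate, starting with the user.
            if (authors[::2] != ["user"] * len(authors[::2])
                    or authors[1::2] != ["assistant"] * len(authors[1::2])):
                stats["bad_turns"] += 1
                print(f"line {i}: unexpected turn order {authors}")
                continue
            stats["ok"] += 1
    return stats
```

Running this over the 1,500-sample file and confirming that `stats["ok"]` matches the expected count would at least confirm the data is being read the way you intend.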
I’m not sure what I’ve been doing wrong. Are there any pointers for my case? Could it be that the dataset is too small? (I think probably not.) Or could it be that chat-bison@002 is not as well-versed in other languages as it is in English?