Tag Archive for large-language-model fine-tuning

Fine Tuning LLM for Item Bank

What is the best approach to fine-tuning LLMs like BERT and ChatGPT for use as an expansive item bank covering only middle and high school curriculum subjects? Not the specifics, just the approach. I am probably just going to use one shot and follow the directions. I am building a novel intelligent tutoring system, and this will be the expert domain for subject mastery that the bots pull from.
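
As a rough, hedged sketch of one such approach: curriculum items could be reshaped into instruction-style records for supervised fine-tuning. The schema, field names (subject, grade_band, question, answer), output file name, and example items below are illustrative assumptions, not a required format.

```python
# Hypothetical sketch: turning item-bank entries into instruction-tuning records.
# The field names and example items are illustrative assumptions only.
import json

items = [
    {
        "subject": "Algebra I",
        "grade_band": "high school",
        "question": "Solve 2x + 5 = 17 for x.",
        "answer": "x = 6",
    },
    {
        "subject": "Life Science",
        "grade_band": "middle school",
        "question": "What organelle carries out photosynthesis in plant cells?",
        "answer": "The chloroplast",
    },
]

def to_training_record(item):
    """Map one item-bank entry to an instruction/input/output record that a
    supervised fine-tuning script (e.g. TRL's SFTTrainer or a similar loop)
    could consume."""
    return {
        "instruction": (
            f"Write one {item['grade_band']} assessment item for "
            f"{item['subject']} and provide the correct answer."
        ),
        "input": "",
        "output": f"Question: {item['question']}\nAnswer: {item['answer']}",
    }

with open("item_bank_sft.jsonl", "w", encoding="utf-8") as f:
    for item in items:
        f.write(json.dumps(to_training_record(item)) + "\n")
```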

Fine-tuned Instruct model does not adhere to the prompt if it’s different from the prompt it was trained on

Hope this finds you well.
I’m fine-tuning an instruct model (Mistral 7B) with a 500-row dataset that has instruction, input, and explanation fields. During training, my prompt consisted of the instruction and the input. In my dataset, all instructions are the same. The output has the following format: step-by-step analysis, strategy summary, and conclusion. Once fine-tuned, the model does quite well at explaining the input, but the moment I ask it to do something else it does not adhere to the prompt, especially when my instruction is different but the input has the same format as before.
Example –
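
A minimal sketch of the kind of training prompt the post describes, using Mistral’s [INST] ... [/INST] chat markers; the instruction and input text here are hypothetical placeholders, not the asker’s actual dataset.

```python
# Hypothetical sketch of the described setup: one fixed instruction reused
# across all training rows, with the prompt built from instruction + input.
FIXED_INSTRUCTION = (
    "Explain the following input step by step, then give a strategy summary "
    "and a conclusion."
)

def build_prompt(instruction: str, model_input: str) -> str:
    """Compose a Mistral-style training prompt from instruction + input."""
    return f"[INST] {instruction}\n\n{model_input} [/INST]"

# If every row uses the same instruction, the model can learn to ignore the
# instruction text entirely and always reproduce the trained output format,
# regardless of what the instruction says at inference time.
example = build_prompt(FIXED_INSTRUCTION, "A sample input passage goes here.")
print(example)
```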

Technical Assistance Needed for Out of Memory Error in Fine-Tuning Llama3 Model

I’m encountering a persistent issue while fine-tuning a Llama3 model using a Hugging Face dataset in a Windows Subsystem for Linux (WSL) Ubuntu 22.04 environment on Windows 11. Despite having ample GPU memory available across two GPUs (detailed specs provided below), I repeatedly hit the error: “torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 224.00 MiB. GPU”.
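
As a hedged sketch (not a diagnosis of this specific setup), these are common memory-reduction knobs when fine-tuning a Llama-family model with Hugging Face transformers and peft; the model name, LoRA settings, and hyperparameters below are assumptions for illustration, and the checkpoint requires Hub access.

```python
import os
# May reduce fragmentation-related OOMs; set before CUDA is initialized.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Meta-Llama-3-8B"  # assumed checkpoint

# 4-bit quantization (QLoRA-style) keeps the frozen base weights small.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # shard layers across both GPUs instead of just one
)

# Train only small LoRA adapters rather than all model weights.
lora = LoraConfig(r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Small per-device batch size, gradient accumulation, and checkpointing trade
# compute for memory; these are typical knobs to turn when CUDA OOM appears.
args = TrainingArguments(
    output_dir="llama3-finetune",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,
    bf16=True,
    logging_steps=10,
)
```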