I’m currently learning how to fine-tune a Falcon LLM for my use case, using a free-tier Google Colab environment. I’ve looked up tutorials online, but I’m still overwhelmed by how to merge the adapters after training and then upload the result to a HF model repo.
I’m using SFTTrainer for training, and I’m assigning the PEFT-wrapped model to a new variable, peft_model; see below.
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Prepare the quantized base model for k-bit (QLoRA) training
model = prepare_model_for_kbit_training(model)

# LoRA hyperparameters (values I tried earlier are in the trailing comments)
lora_alpha = 16      # 16
lora_dropout = 0.05  # 0.1
lora_rank = 32       # 64

peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_rank,
    bias="none",
    task_type="CAUSAL_LM",
    # Falcon attention/MLP projection layers to apply LoRA to
    target_modules=[
        "query_key_value",
        "dense",
        "dense_h_to_4h",
        "dense_4h_to_h",
    ],
)

# Wrap the base model with the LoRA adapters
peft_model = get_peft_model(model, peft_config)
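
For reference, my SFTTrainer call looks roughly like the sketch below; the dataset, tokenizer, and hyperparameter values are placeholders standing in for my actual setup:

from transformers import TrainingArguments
from trl import SFTTrainer

training_args = TrainingArguments(
    output_dir="/content/falcon7binstruct_ecommercebot",  # where checkpoints land
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
)

trainer = SFTTrainer(
    model=peft_model,           # the PEFT-wrapped model from above
    train_dataset=dataset,      # placeholder: my instruction dataset
    tokenizer=tokenizer,        # placeholder: the Falcon tokenizer
    dataset_text_field="text",  # placeholder: column with the formatted prompts
    max_seq_length=512,
    args=training_args,
)

trainer.train()
trainer.model.save_pretrained("/content/falcon7binstruct_ecommercebot")  # saves only the adapter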
I can see that the trainer is only training the adapters, while the actual base model still lives in the model variable. Is there a clear set of steps for merging and uploading? The training checkpoints and results are in the /content/falcon7binstruct_ecommercebot directory, and the base model I started from is “tiiuae/falcon-7b”.
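
From the tutorials I’ve read, I think the merge-and-upload step would look something like the sketch below. Here the repo id "your-username/..." is a placeholder, and I’m assuming the adapter files (adapter_config.json / adapter_model.bin) ended up directly in the output directory rather than in a checkpoint-XXXX subfolder, but I’m not confident this is right:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Reload the base model in half precision; merging doesn't work on 4-bit weights
base_model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Load the trained adapter on top of the base model
# (assumption: the adapter files sit directly in this directory)
model = PeftModel.from_pretrained(base_model, "/content/falcon7binstruct_ecommercebot")

# Fold the LoRA weights into the base weights, leaving a plain transformers model
merged_model = model.merge_and_unload()

# Upload the merged weights and tokenizer to the Hub (after huggingface-cli login);
# "your-username/falcon7binstruct-ecommercebot" is a placeholder repo id
merged_model.push_to_hub("your-username/falcon7binstruct-ecommercebot")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
tokenizer.push_to_hub("your-username/falcon7binstruct-ecommercebot")

Is this roughly correct? And does the free Colab tier even have enough RAM to reload the full fp16 base model for the merge, or do I need a higher-RAM runtime for that step?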