I am trying to fine-tune a model.
Here is the train-test split of my dataset:
Train: 4,746 examples (80%)
Test: 1,188 examples (20%)
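(For context, the split was produced along these lines; this is only a minimal sketch, assuming a Hugging Face datasets.Dataset and a seeded 80/20 split, with the data file name as a placeholder:)

from datasets import load_dataset

# Hypothetical reconstruction of the split, not the exact code used.
raw = load_dataset("json", data_files="data.jsonl", split="train")  # placeholder source
splits = raw.train_test_split(test_size=0.2, seed=42)
train_set, val_set = splits["train"], splits["test"]
print(len(train_set), len(val_set))  # roughly 4,746 and 1,188 for a ~5,934-row dataset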
Here is my code snippet:
from transformers import TrainingArguments
from peft import LoraConfig
from trl import SFTTrainer

training_args = TrainingArguments(
    bf16=True,  # use bf16 on GPUs that support it
    do_eval=True,
    evaluation_strategy="epoch",  #ch
    gradient_accumulation_steps=64,
    gradient_checkpointing=True,  #ch
    gradient_checkpointing_kwargs={"use_reentrant": False},
    learning_rate=2.0e-05,
    log_level="info",
    logging_steps=5,
    logging_strategy="steps",
    lr_scheduler_type="cosine",  #ch
    max_steps=-1,
    num_train_epochs=3,
    output_dir=output_dir,
    overwrite_output_dir=True,
    per_device_eval_batch_size=2,  # originally set to 8
    per_device_train_batch_size=2,  # originally set to 8
    save_strategy="no",  #ch
    save_total_limit=None,
    seed=42,
)
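(As a sanity check on these arguments, the effective batch size works out as follows, assuming a single GPU, which is consistent with the "Total train batch size = 128" line in the log further down:)

# Effective (total) train batch size:
per_device_train_batch_size = 2
gradient_accumulation_steps = 64
num_gpus = 1  # assumption: single-GPU training
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch)  # 128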
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
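(For reference, the trainable-parameter count reported in the log matches LoRA with r=64 on q/k/v/o for a Mistral-7B-style architecture; the projection shapes and layer count below are assumptions on my part, since the model isn't shown:)

# LoRA adds r * (d_in + d_out) parameters per adapted linear layer (A: r x d_in, B: d_out x r).
r = 64
shapes = {  # assumed Mistral-7B-style projection shapes (d_out, d_in), grouped-query attention
    "q_proj": (4096, 4096),
    "k_proj": (1024, 4096),
    "v_proj": (1024, 4096),
    "o_proj": (4096, 4096),
}
num_layers = 32  # assumed
per_layer = sum(r * (d_in + d_out) for d_out, d_in in shapes.values())
print(per_layer * num_layers)  # 54525952 -- matches "Number of trainable parameters" below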
trainer = SFTTrainer(
    model=model_id,
    model_init_kwargs=model_kwargs,
    args=training_args,
    train_dataset=train_set,
    eval_dataset=val_set,
    dataset_text_field="text",
    tokenizer=tokenizer,
    packing=True,
    peft_config=peft_config,
    max_seq_length=tokenizer.model_max_length,
)
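(Note that with packing=True, SFTTrainer concatenates the tokenized texts into fixed-length blocks of max_seq_length rather than feeding rows one by one. As a rough sketch of how the number of packed blocks relates to the raw row count, reusing tokenizer and train_set from the snippet above:)

# Rough estimate of how many packed blocks the 4,746 training rows become:
total_tokens = sum(len(tokenizer(ex["text"])["input_ids"]) for ex in train_set)
approx_packed_examples = total_tokens // tokenizer.model_max_length
print(approx_packed_examples)  # on the order of the 1,060 "Num examples" in the log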
But when I run training, this is what is logged:
***** Running training *****
Num examples = 1,060
Num Epochs = 3
Instantaneous batch size per device = 2
Total train batch size (w. parallel, distributed & accumulation) = 128
Gradient Accumulation steps = 64
Total optimization steps = 24
Number of trainable parameters = 54,525,952
***** Running Evaluation *****
Num examples = 264
Batch size = 2
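(The logged step count is at least internally consistent with the logged example count:)

# Optimization steps per epoch = logged examples // effective batch size
steps_per_epoch = 1060 // (2 * 64)  # = 8
total_steps = steps_per_epoch * 3   # = 24, matching "Total optimization steps"
print(steps_per_epoch, total_steps)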
Why do my actual train/test sizes (4,746 / 1,188) not match the logged ones (1,060 / 264)?