Thiết kế website giá rẻ

Question

I trained the llama3 model with the following code for a question and answer dataset (around 50 Q&A), but the retrained model does not give exact answers as per the trained dataset; either it gives a mix answer (combining multiple trained questions answers into a single) or it only gives exact answer to that question, when I ask the model to generate new content based on the question answer, it always gives the same answer (overfitting issue here). If the training loss is minimal, the model is overfitting; attempting to reduce the hyperparameter values yields incorrect results. I want the model to provide precise responses if the question is the same, and if the new question is about generating new content based on the answer to the trained question, the model must create new content. Though RAG was the best option for this question and answer(only 50 q&a’s)task because the responses are on the same topic and have similar content, similarity search extracts many questions answers into context. Retrieval is failing in that option.

Code:


from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/mistral-7b-v0.3-bnb-4bit",      # New Mistral v3 2x faster!
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    "unsloth/llama-3-8b-bnb-4bit",           # Llama-3 15 trillion tokens model 2x faster!
    "unsloth/llama-3-8b-Instruct-bnb-4bit",
    "unsloth/llama-3-70b-bnb-4bit",
    "unsloth/Phi-3-mini-4k-instruct",        # Phi-3 2x faster!
    "unsloth/Phi-3-medium-4k-instruct",
    "unsloth/mistral-7b-bnb-4bit",
    "unsloth/gemma-7b-bnb-4bit",             # Gemma 2.2x faster!
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
  )
  alpaca_prompt = """Below is a question with an answer that provides a clear explanation.

   ### Question:
   {}

   ### Response:
   {}
   """

   EOS_TOKEN = tokenizer.eos_token  # Must add EOS_TOKEN

   def formatting_prompts_func(examples):
    questions = examples["Question"]
    answers = examples["Answer"]
    texts = []
    for question, answer in zip(questions, answers):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = alpaca_prompt.format(question, answer) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}

   from datasets import load_dataset
   dataset = load_dataset("csv", data_files="training-data.csv")
   dataset = dataset.map(formatting_prompts_func, batched=True)
   from trl import SFTTrainer
   from transformers import TrainingArguments
   from unsloth import is_bfloat16_supported

   trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset['train'],
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 90,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = 

type here

"outputs",
    ),
   )
    trainer_stats = trainer.train()

Thiết kế website giá rẻ

Danh mục

LLama3 model fine tunning issue