Is it necessary for the torch_dtype used when loading a model and the precision of the trainable weights to be different? If so, why?
According to this comment in the huggingface/peft package, if a model is loaded in fp16, the trainable weights must be cast to fp32. From this comment, I understand that, in general, the torch_dtype used to load a model and the precision of the trainable weights must differ. Why is it necessary to change the precision? Also, does this principle apply to both fine-tuning and continual pretraining?
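For context, here is a minimal sketch of the setup I am asking about: the base model is loaded in fp16 and only the trainable (LoRA) parameters are upcast to fp32. The model name, LoRA hyperparameters, and the manual upcasting loop are illustrative assumptions, not the exact code from peft.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model with half-precision weights (illustrative model name)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
)

# Attach LoRA adapters; only these adapter weights will be trainable
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_config)

# Cast only the trainable parameters up to fp32, in the spirit of the
# comment in peft: the frozen base stays in fp16, the adapters train in fp32
for param in model.parameters():
    if param.requires_grad:
        param.data = param.data.to(torch.float32)
```

My question is why this upcasting of the trainable weights is needed at all, and whether the same reasoning applies when the trained weights are the full model (continual pretraining) rather than a small adapter.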
How to handle the loss decreasing at some points but bouncing back and continuing to increase when fine-tuning LLaMA?
Preface