Chunking a tokenized dataset
I am trying to experiment with the databricks-dolly-15k dataset to make it suitable for fine-tuning a Llama2 model, following this article by Phil Schmid. The initial part of building the dataset is quite clear.
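For context, my setup looks roughly like the sketch below: format each sample into an instruction prompt, tokenize, then concatenate and split into fixed-size blocks. The prompt template and `chunk_size` here are illustrative placeholders, not necessarily the exact values from the article.

```python
from itertools import chain

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")

def format_prompt(sample):
    # Illustrative instruction format; adapt to the template used in the article.
    sample["text"] = (
        f"### Instruction:\n{sample['instruction']}\n\n"
        f"### Response:\n{sample['response']}{tokenizer.eos_token}"
    )
    return sample

def tokenize(batch):
    return tokenizer(batch["text"])

def chunk(examples, chunk_size=2048):
    # Concatenate all tokenized sequences in the batch, then split into
    # equal-length chunks, dropping the remainder that does not fill a chunk.
    concatenated = {k: list(chain(*examples[k])) for k in examples.keys()}
    total_length = (len(concatenated["input_ids"]) // chunk_size) * chunk_size
    return {
        k: [v[i : i + chunk_size] for i in range(0, total_length, chunk_size)]
        for k, v in concatenated.items()
    }

dataset = dataset.map(format_prompt)
dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
lm_dataset = dataset.map(chunk, batched=True)
```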