We fine-tuned a model using MLX and successfully saved it; check this link for more details.
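For context, the save step was roughly the standard MLX fuse flow (the paths and names below are placeholders, not our exact setup; see the link above for details):
mlx_lm.fuse --model base_model --adapter-path adapters --save-path new_model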
So far, the generated model works well with a command like:
mlx_lm.generate --model new_model --prompt "tell me sth about sql" --temp 0.01 --ignore-chat-template
However, after converting it to GGUF format and running it through Ollama, the output varies and is no longer as expected.
The procedure to convert it to GGUF is:
python llama.cpp/convert_hf_to_gguf.py path/new_model --outfile path/new_model.gguf
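In case the output type matters, convert_hf_to_gguf.py also accepts an explicit --outtype flag; we believe something like this pins the conversion to fp16 (an untested variation on the command above):
python llama.cpp/convert_hf_to_gguf.py path/new_model --outfile path/new_model.gguf --outtype f16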
Then create a Modelfile with content like:
FROM ./new_model.gguf
# sets the temperature to 0.01 [higher is more creative, lower is more coherent]
PARAMETER temperature 0.01
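One thing we are unsure about: mlx_lm.generate was run with --ignore-chat-template, while Ollama applies a chat template by default. If that is relevant, the Modelfile can apparently pin a raw passthrough template; a sketch (this TEMPLATE value is a guess, not taken from our model):
FROM ./new_model.gguf
PARAMETER temperature 0.01
# hypothetical raw template that forwards the prompt unchanged
TEMPLATE """{{ .Prompt }}"""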
Use Ollama to create the final artifact:
ollama create new_model -f modelfile
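To double-check what was actually baked in, the generated Modelfile can be inspected with:
ollama show new_model --modelfile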
Launch the model with ollama run new_model and evaluate it.
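For a like-for-like comparison, the same prompt used under MLX can be passed directly:
ollama run new_model "tell me sth about sql"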
Any comments are welcome, thanks.