How do I replace the attention in the official Llama 3 code with FlashAttention for the chat completion example?
I am trying to integrate two codebases: the official meta-llama/llama3 repo (running the Meta-Llama-3-8B-Instruct weights) and the flash_attention.py implementation from https://github.com/shreyansh26/FlashAttention-PyTorch/blob/master/README.md. The goal is to replace Llama 3's standard attention with FlashAttention and then run the chat completion example (example_chat_completion.py). Which part of the Llama 3 code needs to change, and what should the change look like?
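
Here is the kind of change I think is needed, as a minimal sketch rather than a definitive patch. The function `flash_attention` below is my own pure-PyTorch stand-in for whatever `flash_attention.py` in that repo actually exports (I have not matched its exact name or signature), and the commented-out lines reflect how `Attention.forward` looks in the official `llama/model.py`:

```python
# Sketch only. Assumptions (not taken verbatim from either repo):
#  - `flash_attention` stands in for the function exported by
#    flash_attention.py in shreyansh26/FlashAttention-PyTorch; check its
#    real name and signature before wiring it in.
#  - Tensors follow the official llama/model.py layout after the
#    transpose(1, 2) calls: (bsz, n_heads, seqlen, head_dim).
import math
import torch

def flash_attention(q, k, v, mask=None, block_size=128):
    """Tiled attention with an online softmax (the FlashAttention idea).

    Numerically equivalent to softmax(q @ k^T / sqrt(d) + mask) @ v, but the
    (q_len x k_len) score matrix is never materialized in full: keys/values
    are processed in blocks while running per-row max/sum statistics are
    maintained. Assumes every query row attends to at least one key, which
    holds for causal masks.
    """
    bsz, n_heads, q_len, head_dim = q.shape
    k_len = k.shape[2]
    scale = 1.0 / math.sqrt(head_dim)

    out = torch.zeros_like(q)
    # Running per-row max and softmax normalizer for the online softmax.
    row_max = torch.full((bsz, n_heads, q_len, 1), float("-inf"),
                         device=q.device, dtype=q.dtype)
    row_sum = torch.zeros((bsz, n_heads, q_len, 1),
                          device=q.device, dtype=q.dtype)

    for start in range(0, k_len, block_size):
        end = min(start + block_size, k_len)
        k_blk = k[:, :, start:end]                  # (bsz, heads, blk, dim)
        v_blk = v[:, :, start:end]

        scores = torch.matmul(q, k_blk.transpose(-2, -1)) * scale
        if mask is not None:
            scores = scores + mask[..., start:end]  # slice matching this block

        blk_max = scores.amax(dim=-1, keepdim=True)
        new_max = torch.maximum(row_max, blk_max)
        correction = torch.exp(row_max - new_max)   # rescale old accumulators
        p = torch.exp(scores - new_max)

        out = out * correction + torch.matmul(p, v_blk)
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        row_max = new_max

    return out / row_sum

# In the official repo, llama/model.py:Attention.forward computes roughly:
#
#     scores = torch.matmul(xq, keys.transpose(2, 3)) / math.sqrt(self.head_dim)
#     if mask is not None:
#         scores = scores + mask
#     scores = F.softmax(scores.float(), dim=-1).type_as(xq)
#     output = torch.matmul(scores, values)
#
# My understanding is that those four statements are the only thing to swap:
#
#     output = flash_attention(xq, keys, values, mask=mask)
#
# and the surrounding code (the transpose(1, 2), self.wo(output), the KV
# cache, and example_chat_completion.py itself) stays unchanged.
```

One caveat I am aware of: a pure-PyTorch FlashAttention like the one in that repo is mainly educational and is unlikely to be faster than the stock attention, since the real speedups come from a fused kernel (e.g. `torch.nn.functional.scaled_dot_product_attention` or the `flash-attn` package). I still want to wire in this repo's version specifically, so I mainly need confirmation that `Attention.forward` in `llama/model.py` is the right place to make the swap.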