I am currently evaluating two GPT models, GPT-4 and GPT-4o, and I would like to hear from the community which one is more reliable for text-only applications. I am not interested in image generation or document/speech analysis.
My focus is on the following:
- Hyperparameters and their impact on performance
- Ability to avoid hallucinations
- Understanding and validity of output
Which model is more reliable for generating accurate and coherent text, based on your experience or any available benchmarks?