I am a first-year bachelor’s student in AI with over a year of experience in basic ML/DL projects. I started learning RL a few days ago and have been assigned a project on RLHF (Reinforcement Learning from Human Feedback).
The project is part of developing a local large language model.
How should I go about learning RLHF? Any advice or a list of topics to cover would be appreciated.
So far, all I know in RL is the basics: the basics of MDPs, and Q