bequiet-log

(1) RLHF LLM, Alignment in Deep Learning: Comprehensive Summary