IBM Technology
Reinforcement Learning from Human Feedback (RLHF) Explained
1 year ago - 11:29
HuggingFace
Reinforcement Learning from Human Feedback: From Zero to chatGPT
Streamed 2 years ago - 1:00:38
Stanford Online
Stanford CS224N | 2023 | Lecture 10 - Prompting, Reinforcement Learning from Human Feedback
1 year ago - 1:16:15
Serrano.Academy
Reinforcement Learning with Human Feedback (RLHF) - How to train and fine-tune Transformer Models
1 year ago - 15:31
Sebastian Raschka
Reinforcement Learning with Human Feedback (RLHF) in 4 minutes
6 months ago - 4:06
Umar Jamil
Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code.
1 year ago - 2:15:13
Shaw Talebi
Fine-tuning LLMs on Human Feedback (RLHF + DPO)
5 months ago - 28:53
CodeEmporium
Reinforcement Learning through Human Feedback - EXPLAINED! | RLHF
1 year ago - 10:17
SCALER
How RLHF Creates Human-Like AI
6 months ago - 0:57
Stanford Online
Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 15: Alignment - SFT/RLHF
1 month ago - 1:14:51
AI Free Forever
What is Reinforcement Learning from Human Feedback (RLHF)? Explained with Simple Examples
3 months ago - 5:20
Mark Hennings
RLHF & DPO Explained (In Simple Terms!)
1 year ago - 19:39
Serrano.Academy
Proximal Policy Optimization (PPO) - How to train Large Language Models
1 year ago - 38:24
DataMListic
RLHF: Training Language Models to Follow Instructions with Human Feedback - Paper Explained
1 year ago - 20:28
HiDevs
How AI Learns from Us: The Power of RLHF
4 months ago - 0:31
The AI Standard
AI Learns to Talk Like Humans: RLHF Explained!
3 months ago - 0:34
AI Foundation Learning
Reinforcement Learning from Human Feedback (RLHF) - Beginners Guide | AI Foundation Learning
1 year ago - 6:25
AssemblyAI
The "RLHF effect" on LLMs
1 year ago - 0:59
Logan Turing
How RLHF Teaches AI What We Want
5 days ago - 1:01
AssemblyAI
RLAIF vs. RLHF: the technology behind Anthropic’s Claude (Constitutional AI Explained)
2 years ago - 5:54
RAIL
CS 285: Eric Mitchell: Reinforcement Learning from Human Feedback: Algorithms & Applications
1 year ago - 54:29
Whispering AI
🐐 Llama 3 Fine-Tune with RLHF [Free Colab 👇🏽]
2 years ago - 14:30
DeepLearningAI
Mastering RLHF with AWS: A Hands-on Workshop on Reinforcement Learning from Human Feedback
Streamed 2 years ago - 1:01:01
Harper Carroll AI
How RLHF, Reinforcement Learning from Human Feedback, Works #ai #learnai #artificialintelligence #learn
1 year ago - 0:58
AI rules the world
LLM alignment (RLHF) DPO V.S. PPO which one is better? This paper finds out #llm #ai #rlhf #nlp
1 year ago - 0:31
AI Insight News
Unlocking the Power of RLHF: Creating AI Models that People Love
2 years ago - 2:28
Julia Turc
Proximal Policy Optimization (PPO) for LLMs Explained Intuitively
5 months ago - 22:03