Reinforcement learning from human feedback

In machine learning, reinforcement learning from human feedback (RLHF) is a technique to align an intelligent agent with human preferences. It involves training a reward model to represent preferences, which can then be used to train other models through reinforcement learning.
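As an illustration of the reward-model step, the sketch below fits a tiny linear reward model to pairwise preferences. It assumes a Bradley-Terry style pairwise loss, which is a common choice but is not specified in the text above. The names `preference_loss`, `linear_reward`, and `train_reward_model`, the feature vectors, and the hyperparameters are all hypothetical. The later reinforcement learning stage that uses the learned reward is not shown.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the chosen item is preferred.

    Assumes a Bradley-Terry model: P(chosen > rejected) = sigmoid(margin).
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), written to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def linear_reward(weights, features):
    """Hypothetical reward model: a dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit weights by gradient descent on the pairwise preference loss.

    `pairs` is a list of (chosen_features, rejected_features) tuples,
    standing in for human comparisons between two responses.
    """
    weights = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = linear_reward(weights, chosen) - linear_reward(weights, rejected)
            # d/d(margin) of log(1 + exp(-margin)) is -1 / (1 + exp(margin)).
            grad_scale = -1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                weights[i] -= lr * grad_scale * (chosen[i] - rejected[i])
    return weights


# Toy comparisons (made up for illustration): the annotator prefers
# responses that score higher on feature 0.
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.9, 0.2], [0.1, 0.8]),
]
weights = train_reward_model(pairs, dim=2)
```

After training, `linear_reward(weights, features)` assigns higher scores to responses resembling the preferred ones, and that score is what a reinforcement learning algorithm would then optimize.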

Source: Wikipedia — Reinforcement learning from human feedback (CC BY-SA 4.0)

