Reinforcement learning from human feedback
In machine learning, reinforcement learning from human feedback (RLHF) is a technique to align an intelligent agent with human preferences. It involves training a reward model to represent preferences, which can then be used to train other models through reinforcement learning.
Source: Wikipedia — Reinforcement learning from human feedback (CC BY-SA 4.0)