Reward function

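In reinforcement learning, the function that assigns a numeric score (“reward”) to behaviors, guiding the model toward preferred outcomes. How the reward function is designed shapes whether the learned behavior is aligned with what was intended, and the relative weight it gives to competing objectives encodes the tradeoffs between them.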
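
As an illustration of how a reward function can encode a tradeoff, here is a minimal Python sketch. The `reward` function, the `helpfulness` and `harm` scores, and the `harm_weight` parameter are all hypothetical names and values chosen for this example; they do not come from any particular system.

```python
def reward(helpfulness: float, harm: float, harm_weight: float = 2.0) -> float:
    """Hypothetical reward: helpfulness raises the score, harm lowers it.

    harm_weight is an assumed knob that sets how much helpfulness
    is given up to avoid one unit of harm.
    """
    return helpfulness - harm_weight * harm


def preferred(candidates: dict, harm_weight: float) -> str:
    """Return the name of the candidate behavior with the highest reward."""
    scores = {
        name: reward(helpfulness, harm, harm_weight)
        for name, (helpfulness, harm) in candidates.items()
    }
    return max(scores, key=scores.get)


# Illustrative (made-up) scores: (helpfulness, harm) for two behaviors.
candidates = {
    "helpful_and_safe": (0.8, 0.0),
    "very_helpful_but_risky": (1.0, 0.3),
}

strict_choice = preferred(candidates, harm_weight=2.0)
lenient_choice = preferred(candidates, harm_weight=0.5)
```

With the larger `harm_weight` the safer behavior scores higher, while with the smaller one the riskier behavior does. The same candidates are ranked differently depending only on how the reward weights the two objectives.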

See: Alignment; Reinforcement Learning