A variant of RLHF that uses AI-generated feedback in place of human feedback, reducing labeling cost and making preference collection easier to scale. Models trained with RLAIF may have different alignment properties than models trained on human feedback. RLAIF is often combined with Constitutional AI approaches.
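The substitution of an AI labeler for a human one can be illustrated with a minimal sketch. Everything named here (`PreferencePair`, `label_with_ai`, `toy_judge`) is hypothetical and chosen for illustration only. In a real system the judge would typically be a language model prompted with a rubric or constitution, not the word-overlap heuristic used here, and the resulting pairs would feed a reward model or a preference-optimization step.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class PreferencePair:
    """One preference label: which of two responses the judge preferred."""
    prompt: str
    chosen: str
    rejected: str


def toy_judge(prompt: str, response: str) -> float:
    # Hypothetical stand-in for an AI judge: scores a response by how many
    # distinct prompt words it reuses. A real judge would be an LLM call.
    prompt_words = set(re.findall(r"\w+", prompt.lower()))
    response_words = set(re.findall(r"\w+", response.lower()))
    return float(len(prompt_words & response_words))


def label_with_ai(
    prompt: str,
    response_a: str,
    response_b: str,
    judge: Callable[[str, str], float],
) -> PreferencePair:
    # The AI judge takes the place of the human annotator in RLHF:
    # it scores both responses and the higher-scoring one becomes "chosen".
    if judge(prompt, response_a) >= judge(prompt, response_b):
        return PreferencePair(prompt, response_a, response_b)
    return PreferencePair(prompt, response_b, response_a)


pair = label_with_ai(
    "What is RLHF?",
    "RLHF is reinforcement learning from human feedback.",
    "I like turtles.",
    toy_judge,
)
print(pair.chosen)
```

Because the judge is passed in as a function, the same labeling loop could in principle be run with a human annotator or a different AI judge, which is one way to compare the two feedback sources.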
See: Alignment; Constitutional AI