RL from Human Feedback

Properties
- sl:creationDate : 2022-12-10
- sl:creationTime : 2022-12-10T11:51:26Z
- rdf:type : sl:Tag
- skos:altLabel :
  - Reinforcement Learning from Human Feedback
  - RL from human preferences
  - RLHF
- skos:prefLabel : RL from Human Feedback