RL from Human Feedback
Properties
- sl:creationDate : 2022-12-10
- sl:creationTime : 2022-12-10T11:51:26Z
- rdf:type : sl:Tag
- skos:altLabel :
  - RLHF
  - RL from human preferences
  - Reinforcement Learning from Human Feedback
- skos:prefLabel : RL from Human Feedback