About this document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Amita Kamath
- sl:arxiv_num : 2006.09462
- sl:arxiv_published : 2020-06-16T19:13:21Z
- sl:arxiv_summary : To avoid giving wrong answers, question answering (QA) models need to know when to abstain from answering. Moreover, users often ask questions that diverge from the model's training data, making errors more likely and thus abstention more critical. In this work, we propose the setting of selective question answering under domain shift, in which a QA model is tested on a mixture of in-domain and out-of-domain data, and must answer (i.e., not abstain on) as many questions as possible while maintaining high accuracy. Abstention policies based solely on the model's softmax probabilities fare poorly, since models are overconfident on out-of-domain inputs. Instead, we train a calibrator to identify inputs on which the QA model errs, and abstain when it predicts an error is likely. Crucially, the calibrator benefits from observing the model's behavior on out-of-domain data, even if from a different domain than the test data. We combine this method with a SQuAD-trained QA model and evaluate on mixtures of SQuAD and five other QA datasets. Our method answers 56% of questions while maintaining 80% accuracy; in contrast, directly using the model's probabilities only answers 48% at 80% accuracy.@en
- sl:arxiv_title : Selective Question Answering under Domain Shift@en
- sl:arxiv_updated : 2020-06-16T19:13:21Z
- sl:bookmarkOf : https://arxiv.org/abs/2006.09462
- sl:creationDate : 2020-06-30
- sl:creationTime : 2020-06-30T10:59:53Z
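
The abstract above contrasts two abstention policies: thresholding the QA model's own softmax probability, and thresholding a separately trained calibrator's estimate that the model answered correctly. Below is a minimal sketch of both policies, assuming per-question softmax probabilities, correctness labels for a held-out calibration set, and simple per-input features are already available; the function names and the feature set are illustrative, not taken from the paper's code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def answer_mask_softmax(softmax_probs, threshold):
    """Baseline policy: answer only when the QA model's top softmax
    probability exceeds a threshold. Per the abstract, this fares
    poorly under domain shift because the model is overconfident
    on out-of-domain inputs."""
    return np.asarray(softmax_probs) >= threshold  # True = answer

def train_calibrator(features, was_correct):
    """Calibrator policy: fit a classifier that predicts whether the
    QA model answered a question correctly. `features` describe each
    input and the model's behavior on it (hypothetical examples: top
    softmax probability, question/passage lengths); `was_correct`
    are 0/1 labels from held-out data that, per the abstract, should
    include some out-of-domain examples."""
    calibrator = RandomForestClassifier(n_estimators=100, random_state=0)
    calibrator.fit(features, was_correct)
    return calibrator

def answer_mask_calibrator(calibrator, features, threshold):
    """Answer only when the calibrator's estimated probability that
    the model is correct exceeds the threshold; abstain otherwise."""
    p_correct = calibrator.predict_proba(features)[:, 1]
    return p_correct >= threshold
```

Sweeping `threshold` over [0, 1] traces a coverage/accuracy curve for each policy; the abstract's headline comparison (answering 56% vs. 48% of questions at 80% accuracy) is a single operating point on such curves. The random forest here is a placeholder choice; any probabilistic classifier fits the same interface.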