About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Jan Kocoń
- sl:arxiv_num : 2302.10724
- sl:arxiv_published : 2023-02-21T15:20:37Z
- sl:arxiv_summary : OpenAI has released the Chat Generative Pre-trained Transformer (ChatGPT) and
revolutionized the approach in artificial intelligence to human-model
interaction. The first contact with the chatbot reveals its ability to provide
detailed and precise answers in various areas. There are several publications
on ChatGPT evaluation, testing its effectiveness on well-known natural language
processing (NLP) tasks. However, the existing studies are mostly non-automated
and tested on a very limited scale. In this work, we examined ChatGPT's
capabilities on 25 diverse analytical NLP tasks, most of them subjective even
to humans, such as sentiment analysis, emotion recognition, offensiveness and
stance detection, natural language inference, word sense disambiguation,
linguistic acceptability and question answering. We automated ChatGPT's
querying process and analyzed more than 38k responses. Our comparison of its
results with available State-of-the-Art (SOTA) solutions showed that the
average loss in quality of the ChatGPT model was about 25% for zero-shot and
few-shot evaluation. We showed that the more difficult the task (lower SOTA
performance), the higher the ChatGPT loss. It especially refers to pragmatic
NLP problems like emotion recognition. We also tested the ability of
personalizing ChatGPT responses for selected subjective tasks via Random
Contextual Few-Shot Personalization, and we obtained significantly better
user-based predictions. Additional qualitative analysis revealed a ChatGPT
bias, most likely due to the rules imposed on human trainers by OpenAI. Our
results provide the basis for a fundamental discussion of whether the high
quality of recent predictive NLP models can indicate a tool's usefulness to
society and how the learning and validation procedures for such systems should
be established.@en
- sl:arxiv_title : ChatGPT: Jack of all trades, master of none@en
- sl:arxiv_updated : 2023-02-21T15:20:37Z
- sl:bookmarkOf : https://arxiv.org/abs/2302.10724
- sl:creationDate : 2023-02-22
- sl:creationTime : 2023-02-22T13:41:17Z