chatgpt-sentiment-evaluation's Introduction

Is ChatGPT a Good Sentiment Analyzer?

Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study [arXiv:2304.04339]

In this repo, we release the test sets we used for evaluation in our paper.

Introduction (TL;DR)

Recently, ChatGPT has drawn great attention from both the research community and the public. However, despite its huge success, we still know little about its capability boundaries, i.e., where it does well and where it fails. We are particularly curious how ChatGPT performs on sentiment analysis tasks, i.e., can it really understand the opinions, sentiments, and emotions contained in text?

To answer this question, we conduct a preliminary evaluation on 5 representative sentiment analysis tasks and 18 benchmark datasets under four settings: standard evaluation, polarity shift evaluation, open-domain evaluation, and sentiment inference evaluation. For reference, we compare ChatGPT with fine-tuned BERT-based models and the corresponding SOTA models on each task.

Through rigorous evaluation, our findings are as follows:

  1. ChatGPT exhibits impressive zero-shot performance in sentiment classification tasks and can rival fine-tuned BERT, although it falls slightly behind the domain-specific fully-supervised SOTA models.
  2. ChatGPT appears to be less accurate on sentiment information extraction tasks such as E2E-ABSA. Upon observation, we find that ChatGPT is often able to generate reasonable answers, even though they may not strictly match the textual expression. From this point of view, the exact matching evaluation in information extraction is not very fair for ChatGPT. In our human evaluation, ChatGPT can still perform well in these tasks.
  3. Few-shot prompting (i.e., equipping with a few demonstration examples in the input) can significantly improve performance across various tasks, datasets, and domains, even surpassing fine-tuned BERT in some cases but still being inferior to SOTA models.
  4. When coping with the polarity shift phenomenon (e.g., negation and speculation), a challenging problem in sentiment analysis, ChatGPT can make more accurate predictions than fine-tuned BERT.
  5. Compared to the conventional practice of training domain-specific models, which typically perform poorly when generalized to unseen domains, ChatGPT demonstrates powerful open-domain sentiment analysis ability in general. It is still worth noting, however, that its performance is quite limited in a few specific domains.
  6. ChatGPT exhibits impressive sentiment inference ability: on the emotion cause extraction task or the emotion-cause pair extraction task, it achieves performance comparable to the fully-supervised SOTA models we set up.

In summary, compared to training a specialized sentiment analysis system for each domain or dataset, ChatGPT can already serve as a universal and well-behaved sentiment analyzer.
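Finding 2 above turns on how strict exact matching is. The following is a minimal sketch of a micro-averaged exact-match F1 over (aspect term, polarity) pairs, written only to illustrate the effect; it is not the evaluation script used in the paper, and the function name `exact_match_f1` and the toy sentences are assumptions made for this example.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: one set of (aspect term, polarity) pairs per sentence.
Pair = Tuple[str, str]


def exact_match_f1(gold: Iterable[Set[Pair]], pred: Iterable[Set[Pair]]) -> float:
    """Micro-averaged F1 where a predicted pair counts only if it equals a gold pair."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)      # pairs that match a gold pair string-for-string
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0            # also covers empty gold or empty predictions
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy data (invented for illustration): the first prediction is reasonable,
# but the extra word "the" makes it fail the exact string comparison.
gold = [{("battery life", "positive")}, {("screen", "negative")}]
pred = [{("the battery life", "positive")}, {("screen", "negative")}]
score = exact_match_f1(gold, pred)
```

In this toy case only one of the two predicted pairs is credited, even though a human reader would likely accept both, which is the kind of gap the human evaluation is meant to capture.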

Citation

If you find this work helpful, please cite our paper as follows:

@article{wang2023chatgpt-sentiment,
  title={Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study},
  author={Zengzhi Wang and Qiming Xie and Zixiang Ding and Yi Feng and Rui Xia},
  journal={arXiv preprint arXiv:2304.04339},
  year={2023}
}

If you have any questions related to this work, you can open an issue with details or email Zengzhi ([email protected]) or Qiming ([email protected]).

Evaluation

Standard Evaluation

Zero-shot Results

Human Evaluation (still in zero-shot)

Few-shot Prompting
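As a rough illustration of what "equipping the input with a few demonstration examples" can look like, here is a minimal sketch that assembles a few-shot prompt string for sentiment classification. The template (the `Sentence:` / `Label:` fields), the function name `build_few_shot_prompt`, and the example sentences are all assumptions made for this sketch; they are not the prompts used in the paper.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    demos: List[Tuple[str, str]],
    query: str,
) -> str:
    """Concatenate an instruction, k labeled demonstrations, and the unlabeled query."""
    parts = [instruction.strip(), ""]
    for text, label in demos:
        parts.append(f"Sentence: {text}")
        parts.append(f"Label: {label}")
        parts.append("")  # blank line between demonstrations
    parts.append(f"Sentence: {query}")
    parts.append("Label:")  # left open for the model to complete
    return "\n".join(parts)


# Invented demonstrations, for illustration only.
demos = [
    ("The food was great.", "positive"),
    ("The service was slow.", "negative"),
]
prompt = build_few_shot_prompt(
    "Classify the sentiment of each sentence as positive or negative.",
    demos,
    "The room was spotless.",
)
```

Passing an empty `demos` list yields the corresponding zero-shot prompt, so the same helper can cover both settings.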

Polarity Shift Evaluation

Open-Domain Evaluation

Sentiment Inference Evaluation

We choose the ECE and ECPE tasks as the testbed.

Case Study

Standard Evaluation

Polarity Shift Evaluation

Open-Domain Evaluation

Sentiment Inference Evaluation

Emotion Cause Extraction (ECE)

Emotion-Cause Pair Extraction (ECPE)

Note that, for both ECE and ECPE, the right part is the English translation of the left part.
