Sentiment Analysis with ParsBERT
Sentiment analysis is the process of analyzing digital text to determine if the emotional tone of the message is positive, negative, or neutral. Today, companies have large volumes of text data like emails, customer support chat transcripts, social media comments, and reviews. Sentiment analysis tools can scan this text to automatically determine the author’s attitude towards a topic. Companies use the insights from sentiment analysis to improve customer service and increase brand reputation.
Dataset
The Digikala Comments Dataset consists of user comments from the Digikala website. The comments were scraped from Digikala.com and labeled according to the star ratings given by people who had bought the products. Many of the comments are noisy, so the star-based labels alone are not a fully reliable source. Adding a second label to the data helps improve the accuracy of the training data.
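As a rough illustration of star-based labeling, the sketch below maps a star rating to a sentiment label and keeps only the comments whose star-derived label agrees with a second label. The thresholds, the field names (`stars`, `second_label`), and the agreement rule are assumptions made for illustration, not the dataset's documented procedure.

```python
from typing import Dict, List


def label_from_stars(stars: int) -> str:
    """Map a 1-5 star rating to a coarse sentiment label (assumed thresholds)."""
    if not 1 <= stars <= 5:
        raise ValueError(f"stars must be between 1 and 5, got {stars}")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def keep_consistent(rows: List[Dict]) -> List[Dict]:
    """Keep rows whose star-derived label matches the (hypothetical) second label."""
    return [r for r in rows if label_from_stars(r["stars"]) == r["second_label"]]


# Toy rows with made-up content, only to exercise the functions.
rows = [
    {"comment": "comment A", "stars": 5, "second_label": "positive"},
    {"comment": "comment B", "stars": 5, "second_label": "negative"},  # noisy: labels disagree
    {"comment": "comment C", "stars": 1, "second_label": "negative"},
]
clean_rows = keep_consistent(rows)
print([r["comment"] for r in clean_rows])
```

Dropping rows where the two labels disagree trades dataset size for label quality; other strategies, such as relabeling by hand, are equally plausible.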
ParsBERT
ParsBERT is a monolingual language model based on Google's BERT architecture. This model is pre-trained on large Persian corpora with various writing styles from numerous subjects (e.g., scientific, novels, news, ...) with more than 3.9M documents, 73M sentences, and 1.3B words.
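A minimal sketch of loading ParsBERT for classification with the `transformers` library is shown here. The checkpoint name `HooshvareLab/bert-fa-base-uncased` and the default of two labels are assumptions, since this README does not state which checkpoint or label set the project uses. The helper name `load_parsbert` is hypothetical.

```python
# Assumed checkpoint name on the Hugging Face Hub; the README does not name one.
PARSBERT_NAME = "HooshvareLab/bert-fa-base-uncased"


def load_parsbert(model_name: str = PARSBERT_NAME, num_labels: int = 2):
    """Return (tokenizer, model) with a new classification head on top of ParsBERT."""
    # Imported lazily so that defining this function does not require the package.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=num_labels
    )
    return tokenizer, model
```

Calling `load_parsbert()` downloads the weights on first use and needs a backend such as PyTorch installed. The classification head starts out randomly initialized, so the model has to be fine-tuned on labeled comments before its predictions mean anything.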
Getting Started
Prerequisites
To run this project, install the required packages. The leading ! runs each command from a notebook cell; drop it when installing from a regular shell:
!pip install transformers
!pip install hazm
!pip install clean-text[gpl]
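The packages above suggest that comments are cleaned before tokenization, although this README does not show that step. The following is only a standard-library stand-in for the kind of cleanup such tools perform (removing URLs, unifying Arabic and Persian letter forms, collapsing whitespace). It does not reproduce the behavior of hazm or clean-text, and the function name `basic_clean` is hypothetical.

```python
import re

# Arabic code points that commonly appear in Persian text, mapped to Persian forms.
_CHAR_MAP = str.maketrans({
    "\u064a": "\u06cc",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06a9",  # Arabic kaf -> Persian keheh
})
_URL_RE = re.compile(r"https?://\S+|www\.\S+")


def basic_clean(text: str) -> str:
    """Strip URLs, unify yeh/kaf variants, and collapse runs of whitespace."""
    text = _URL_RE.sub(" ", text)
    text = text.translate(_CHAR_MAP)
    return re.sub(r"\s+", " ", text).strip()


# Sample: a two-word Persian phrase typed with Arabic kaf/yeh, followed by a URL.
sample = "\u0643\u064a\u0641\u064a\u062a   \u0639\u0627\u0644\u064a  https://example.com"
cleaned = basic_clean(sample)
print(cleaned)
```

Unifying letter variants matters because visually identical words typed with different code points would otherwise be tokenized differently.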
Setup
Clone this repository to your desired folder:
cd <your-folder>
git clone [email protected]:ZahraArshia/sentiment_analysis.git
Results
Author
Zahra
- GitHub: @githubhandle
- LinkedIn: LinkedIn
Contributing
Contributions, issues, and feature requests are welcome!
Feel free to check the issues page.
Show your support
If you like this project, please give it a ⭐