Cross-source-cross-domain-sentiment-analysis

This repository holds two pickled Python dictionaries containing labeled data for cross-source, cross-domain sentiment analysis. One file contains English texts and the other contains Italian texts.
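As a minimal sketch of how one of these pickled dictionaries might be opened and inspected, the snippet below loads a file and reports how many items sit under each top-level key. The file name `Dataset_ENG.pickle` and the helper names `load_dataset` and `summarize` are assumptions for illustration; the key names and the type of each value are not documented here, so check them after loading.

```python
import pickle
from pathlib import Path


def load_dataset(path):
    """Load one pickled dictionary from disk.

    Only unpickle files you trust: pickle can execute arbitrary code on load.
    """
    with Path(path).open("rb") as handle:
        data = pickle.load(handle)
    if not isinstance(data, dict):
        raise TypeError(f"expected a dict, got {type(data).__name__}")
    return data


def summarize(data):
    """Return {key: item count} for values that have a length, else None."""
    return {
        key: (len(value) if hasattr(value, "__len__") else None)
        for key, value in data.items()
    }


if __name__ == "__main__":
    # Hypothetical file name; replace it with the actual file in the repository.
    path = Path("Dataset_ENG.pickle")
    if path.exists():
        print(summarize(load_dataset(path)))
```

If the values were pickled as objects from a third-party library (for example pandas), that library must be installed before the file can be loaded.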

Dataset_ENG is composed of:

  1. Amazon: a sample of 75,000 reviews of various Amazon products (such as electronic devices, kitchen objects, clothes and house accessories), written in English and collected from January to February 2018. Each review is accompanied by its date, its short title and the sentiment (expressed as a 5-star rating) assigned by the user who wrote the review. For privacy reasons, the user name is omitted.

  2. Tripadvisor: a sample of 75,000 English reviews of hotels, restaurants and cities, downloaded from Tripadvisor.com between January and February 2018. Each review is accompanied by its date, its short title and the sentiment (expressed as a 5-star rating) assigned by the user who wrote the review. For privacy reasons, the user name is omitted.

  3. Facebook: 5,782 English Facebook posts. The posts come only from specific public pages that have a 5-star rating system. The reviews, sampled from January to February 2018, cover several topics, namely universities, events, famous people, locals, parties, shops and cities. Each item in the collection is accompanied by the sentiment (expressed as a 5-star rating) assigned by the user. For privacy reasons, the user name is omitted.

Dataset_ITA is composed of:

  1. Amazon: a sample of 75,000 reviews of various Amazon products (such as electronic devices, kitchen objects, clothes and house accessories), written in Italian and collected from January to February 2018. Each review is accompanied by its date, its short title and the sentiment (expressed as a 5-star rating) assigned by the user who wrote the review. For privacy reasons, the user name is omitted.

  2. Tripadvisor: a sample of 75,000 Italian reviews of hotels, restaurants and cities, downloaded from Tripadvisor.com between January and February 2018. Each review is accompanied by its date, its short title and the sentiment (expressed as a 5-star rating) assigned by the user who wrote the review. For privacy reasons, the user name is omitted.

  3. Facebook: 1,077 Italian Facebook posts. The posts come only from specific public pages that have a 5-star rating system. The reviews, sampled from January to February 2018, cover several topics, namely universities, events, famous people, locals, parties, shops and cities. Each item in the collection is accompanied by the sentiment (expressed as a 5-star rating) assigned by the user. For privacy reasons, the user name is omitted.

  4. Twitter: a sample of 937 manually labeled Italian tweets. The sample was collected in April 2018 and covers Italian television shows and other, more general topics. Each tweet has a three-class sentiment label: negative, neutral or positive.
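The Amazon, Tripadvisor and Facebook items carry 5-star ratings, while the tweets carry three-class labels, so comparing the sources requires a common label scale. One possible approach, sketched below, collapses star ratings into three classes. The thresholds (1-2 negative, 3 neutral, 4-5 positive) and the function name `stars_to_three_class` are assumptions for illustration and are not necessarily the mapping used by the dataset authors.

```python
def stars_to_three_class(stars):
    """Map a 1-5 star rating to 'negative', 'neutral' or 'positive'.

    Assumed thresholds (not taken from the dataset documentation):
    1-2 stars -> negative, 3 stars -> neutral, 4-5 stars -> positive.
    """
    rating = int(stars)
    if not 1 <= rating <= 5:
        raise ValueError(f"star rating out of range: {stars!r}")
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


# Example: convert a small list of star ratings.
labels = [stars_to_three_class(s) for s in (1, 3, 5)]
```

Different thresholds (for example, treating 3 stars as negative) would change the class balance, so the choice should be reported alongside any results.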

If you use these datasets, please cite:

Zola, P., Cortez, P., Ragno, C., & Brentari, E. (2019). Social Media Cross-Source and Cross-Domain Sentiment Classification. International Journal of Information Technology & Decision Making.

Thank you!
