crosslanguagesentimentanalysis's Introduction

Meaning of each column

| Column | Meaning | Remark |
| --- | --- | --- |
| F (6) | Ranking | |
| G (7) | Original Chinese data | |
| H (8) | Google translated data | |
| I (9) | Baidu Chinese sentiment analysis positive probability | based on the original Chinese data |
| J (10) | Baidu Chinese sentiment analysis confidence | based on the original Chinese data |
| K (11) | Baidu Chinese sentiment analysis negative probability | based on the original Chinese data |
| L (12) | Baidu Chinese sentiment analysis category | based on the original Chinese data |
| M (13) | Baidu English sentiment analysis positive probability | based on Google translated data |
| N (14) | Baidu English sentiment analysis confidence | based on Google translated data |
| O (15) | Baidu English sentiment analysis negative probability | based on Google translated data |
| P (16) | Baidu English sentiment analysis category | based on Google translated data |
| Q (17) | Google Chinese sentiment analysis score | based on the original Chinese data |
| R (18) | Google Chinese sentiment analysis magnitude | based on the original Chinese data |
| S (19) | Google English sentiment analysis score | based on Google translated data |
| T (20) | Google English sentiment analysis magnitude | based on Google translated data |
| U (21) | Yandex translated data | based on the original Chinese data |
| V (22) | Google English sentiment analysis score | based on Yandex translated data |
| W (23) | Google English sentiment analysis magnitude | based on Yandex translated data |
| X (24) | Baidu translated data | |
| Y (25) | Google English sentiment analysis score | based on Baidu translated data |
| Z (26) | Google English sentiment analysis magnitude | based on Baidu translated data |
| AA (27) | Baidu English sentiment analysis positive probability | based on Baidu translated data |
| AB (28) | Baidu English sentiment analysis confidence | based on Baidu translated data |
| AC (29) | Baidu English sentiment analysis negative probability | based on Baidu translated data |
| AD (30) | Baidu English sentiment analysis category | based on Baidu translated data |
| AE (31) | Baidu positive probability transformed to the Google score standard | based on the Baidu sentiment analysis of the original data |
| AF (32) | Baidu positive probability transformed to the Google score standard | based on column M |
| AG (33) | Baidu English sentiment analysis positive probability | based on Yandex translated data |
| AH (34) | Baidu English sentiment analysis confidence | based on Yandex translated data |
| AI (35) | Baidu English sentiment analysis negative probability | based on Yandex translated data |
| AJ (36) | Baidu English sentiment analysis category | based on Yandex translated data |
| AK (37) | Baidu positive probability transformed to the Google score standard | based on column AG |
| AL (38) | Baidu positive probability transformed to the Google score standard | based on column AA |

Notes

  • Baidu sentiment analysis category note
    • 2 means the positive category, 1 means the neutral category, and 0 means the negative category.
  • Google sentiment analysis Note
    • The score of a document’s sentiment indicates the overall emotion of a document. The magnitude of a document’s sentiment indicates how much emotional content is present within the document, and this value is often proportional to the length of the document.
    • A document with a neutral score (around 0.0) may indicate a low-emotion document, or may indicate mixed emotions, with both high positive and negative values which cancel each other out. Generally, you can use magnitude values to disambiguate these cases, as truly neutral documents will have a low magnitude value, while mixed documents will have higher magnitude values.
    • “Clearly positive” and “clearly negative” sentiment varies for different use cases and customers. You might find differing results for your specific scenario. We recommend that you define a threshold that works for you, and then adjust the threshold after testing and verifying the results. For example, you may define a threshold of any score over 0.25 as clearly positive, and then modify the score threshold to 0.15 after reviewing your data and results and finding that scores from 0.15-0.25 should be considered positive as well.
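
The interpretation rules in the Google note can be sketched as a small helper. This is only an illustration: the function name `interpret_sentiment` and the `magnitude_threshold` default of 1.0 are assumptions made for the example, and the 0.25 score threshold is just the example value quoted in the note, not a recommended setting.

```python
def interpret_sentiment(score, magnitude, score_threshold=0.25, magnitude_threshold=1.0):
    """Label a document from a Google-style sentiment score and magnitude.

    score_threshold follows the example value in the note above;
    magnitude_threshold is a hypothetical cut-off chosen for illustration.
    """
    if score > score_threshold:
        return "positive"
    if score < -score_threshold:
        return "negative"
    # Near-zero score: magnitude separates low-emotion text from mixed text.
    if magnitude >= magnitude_threshold:
        return "mixed"
    return "neutral"


print(interpret_sentiment(0.8, 3.0))
print(interpret_sentiment(0.0, 4.0))
print(interpret_sentiment(0.2, 0.3, score_threshold=0.15))
```

Lowering `score_threshold` from 0.25 to 0.15, as in the note's example, moves scores between 0.15 and 0.25 into the positive class.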

Baidu sentiment analysis

Baidu Chinese sentiment analysis

Baidu Chinese sentiment analysis positive probability histogram

./img/BaiduPositiveProbababilityHistogramForOriginData.jpg

Baidu Chinese sentiment analysis positive probability compared across rankings (original data)

| Ranking | Mean | Valid N | Std. deviation | Total N | Minimum | Maximum |
| --- | --- | --- | --- | --- | --- | --- |
| Ranking 10 | 0.239365965000 | 8525 | 0.2219412270000 | 8572 | 0.000106 | 1.000000 |
| Ranking 20 | 0.292751426000 | 13141 | 0.2357115580000 | 13226 | 0.000162 | 1.000000 |
| Ranking 30 | 0.394234 | 18821 | 0.273685 | 18974 | 0.000214 | 1.000000 |
| Ranking 40 | 0.511990 | 8717 | 0.300618 | 8790 | 0.001050 | 1.000000 |
| Ranking 50 | 0.568988 | 4271 | 0.312815 | 4307 | 0.000536 | 1.000000 |

./img/MarginalMeansOfBaiduPositiveProbabilityForOriginData.jpg

  • Baidu Chinese sentiment analysis positive probability values are valid.

Baidu Chinese sentiment analysis positive probability transformed to the Google score standard, compared across rankings (original data)

| Ranking | Mean | Valid N | Std. deviation | Total N | Minimum | Maximum | Variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ranking 10 | -0.598875 | 8525 | 0.557595 | | -0.999894 | 1.000000 | 0.310912 |
| Ranking 20 | -0.488772 | 13141 | 0.617021 | | -0.999838 | 1.000000 | 0.380715 |
| Ranking 30 | -0.236524 | 18821 | 0.728420 | | -0.999786 | 1.000000 | 0.530596 |
| Ranking 40 | 0.054493 | 8717 | 0.773410 | | -0.998950 | 1.000000 | 0.598164 |
| Ranking 50 | 0.188983 | 4271 | 0.774245 | | -0.999464 | 1.000000 | 0.599456 |
| Total | -0.274854 | 53475 | 0.733884 | | -0.999894 | 1.000000 | 0.538586 |

./img/MarginalMeansOfBaiduPositiveProbababilityToGoogleScoreStandardForOriginData.jpg

Baidu Chinese sentiment analysis category values compared across rankings (original data)

./img/MarginalMeansOfBaiduCategoryFroOriginData.jpg

  • Baidu Chinese sentiment analysis category values are valid.

Baidu Chinese sentiment analysis error rate

| Ranking | Error rate |
| --- | --- |
| Ranking 10 | 0.0054829678 |
| Ranking 20 | 0.0064267352 |
| Ranking 30 | 0.0080636661 |
| Ranking 40 | 0.0083048919 |
| Ranking 50 | 0.0083584862 |

  • Total error rate: 0.0073140396
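
The error rates above are consistent with the share of records that have no valid result, that is (Total N - Valid N) / Total N, using the Valid N and Total N counts from the Baidu Chinese positive probability table. That reading is inferred from the numbers rather than stated in the source, so the sketch below should be taken as a plausible reconstruction; the names `error_rate` and `baidu_chinese_counts` are made up for the example.

```python
def error_rate(valid_n, total_n):
    """Fraction of records without a valid sentiment result."""
    return (total_n - valid_n) / total_n


# (Valid N, Total N) per ranking group, copied from the Baidu Chinese
# positive probability table.
baidu_chinese_counts = {
    10: (8525, 8572),
    20: (13141, 13226),
    30: (18821, 18974),
    40: (8717, 8790),
    50: (4271, 4307),
}

rates = {rank: error_rate(v, t) for rank, (v, t) in baidu_chinese_counts.items()}

# The total rate pools the counts instead of averaging the per-group rates.
total_valid = sum(v for v, _ in baidu_chinese_counts.values())
total_n = sum(t for _, t in baidu_chinese_counts.values())
total_rate = error_rate(total_valid, total_n)

for rank, rate in rates.items():
    print(f"Ranking {rank}: {rate:.10f}")
print(f"Total: {total_rate:.10f}")
```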

Baidu Chinese sentiment analysis Summary

  • Baidu Chinese sentiment analysis positive probability values are valid.
  • Baidu Chinese sentiment analysis category values are valid.

Baidu English sentiment analysis

Baidu English sentiment analysis positive probability compared across rankings (based on Google translated data)

| Ranking | Mean | Valid N | Std. deviation | Total N | Minimum | Maximum | Variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ranking 10 | 0.517526 | 7968 | 0.134711 | | 0.005045 | 1.000000 | 0.018147 |
| Ranking 20 | 0.531020 | 12225 | 0.141214 | | 0.037275 | 1.000000 | 0.019941 |
| Ranking 30 | 0.540824 | 17457 | 0.137174 | | 0.014443 | 1.000000 | 0.018817 |
| Ranking 40 | 0.567782 | 8163 | 0.144971 | | 0.051860 | 1.000000 | 0.021016 |
| Ranking 50 | 0.589054 | 4006 | 0.150737 | | 0.086614 | 1.000000 | 0.022722 |

./img/MarginalMeansOfBaiduPositiveProbabilityForGoogleTranslatedData.jpg

  • Baidu English sentiment analysis positive probability values are valid.

Baidu English sentiment analysis positive probability transformed to the Google score standard (based on Google translated data)

| Ranking | Mean | Valid N | Std. deviation | Total N | Minimum | Maximum | Variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ranking 10 | 0.112029 | 7968 | 0.586700 | | -0.994955 | 1.000000 | 0.344216 |
| Ranking 20 | 0.150325 | 12225 | 0.587147 | | -0.962725 | 1.000000 | 0.344742 |
| Ranking 30 | 0.193858 | 17457 | 0.577416 | | -0.985557 | 1.000000 | 0.333410 |
| Ranking 40 | 0.276835 | 8163 | 0.564339 | | -0.948140 | 1.000000 | 0.318479 |
| Ranking 50 | 0.352409 | 4006 | 0.537822 | | -0.913386 | 1.000000 | 0.289253 |

./img/MarginalMeansOfBaiduPositiveProbabilityToGoogleStandardFroGoogleTranslatedData.jpg

Baidu English sentiment analysis category values compared across rankings (based on Google translated data)

./img/MarginalMeansOfBaiduCategoryFroGoogleTranslatedData.jpg

  • Baidu English sentiment analysis category values are valid.

Method for transforming the Baidu Chinese sentiment analysis positive probability to the Google score standard

./img/baiduPositiveProbabilityTranformToGoogleScoreStandard.png

Google sentiment analysis

Google Chinese sentiment analysis

Google Chinese sentiment analysis scores compared across rankings (original data)

| Ranking | Mean | Valid N | Std. deviation | Total N | Minimum | Maximum |
| --- | --- | --- | --- | --- | --- | --- |
| Ranking 10 | -0.238742 | 8567 | 0.445384 | 8572 | -0.900000 | 0.900000 |
| Ranking 20 | -0.118380 | 13210 | 0.448064 | 13226 | -0.900000 | 0.900000 |
| Ranking 30 | 0.117291 | 18940 | 0.462095 | 18974 | -0.900000 | 0.900000 |
| Ranking 40 | 0.315915 | 8778 | 0.458128 | 8790 | -0.900000 | 0.900000 |
| Ranking 50 | 0.361626 | 4305 | 0.441309 | 4307 | -0.900000 | 0.900000 |

./img/MarginalMeansOfGoogleScoreForOriginData.jpg

  • Google Chinese sentiment analysis score values are valid.

Google Chinese sentiment analysis Error Rate

| Ranking | Error rate |
| --- | --- |
| Ranking 10 | 0.0005832944 |
| Ranking 20 | 0.0012097384 |
| Ranking 30 | 0.0017919258 |
| Ranking 40 | 0.0013651877 |
| Ranking 50 | 0.0004643603 |

  • Total error rate: 0.0012808851

Google English sentiment analysis

Google English sentiment analysis scores compared across rankings (based on Google translated data)

| Ranking | Mean | Valid N | Std. deviation | Total N | Minimum | Maximum | Variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ranking 10 | -0.338431 | 8566 | 0.430581 | | -0.900000 | 0.900000 | 0.185400 |
| Ranking 20 | -0.244312 | 13204 | 0.437549 | | -0.900000 | 0.900000 | 0.191449 |
| Ranking 30 | -0.057978 | 18940 | 0.447353 | | -0.900000 | 0.900000 | 0.200125 |
| Ranking 40 | 0.147830 | 8777 | 0.455342 | | -0.900000 | 0.900000 | 0.207336 |
| Ranking 50 | 0.225000 | 4304 | 0.453471 | | -0.900000 | 0.900000 | 0.205636 |

./img/MarginalMeansOfGoogleScoreFroGoogleTranslatedData.jpg

  • Google English sentiment analysis score values are valid based on Google translated data.

Google English sentiment analysis scores compared across rankings (based on Yandex translated data)

| Ranking | Mean | Valid N | Std. deviation | Total N | Minimum | Maximum | Variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ranking 10 | -0.337873 | 8568 | 0.416416 | | -0.900000 | 0.900000 | 0.173403 |
| Ranking 20 | -0.233371 | 13221 | 0.422133 | | -0.900000 | 0.900000 | 0.178196 |
| Ranking 30 | -0.055703 | 18972 | 0.429758 | | -0.900000 | 0.900000 | 0.184692 |
| Ranking 40 | 0.138917 | 8788 | 0.447876 | | -0.900000 | 0.900000 | 0.200593 |
| Ranking 50 | 0.208268 | 4306 | 0.449598 | | -0.900000 | 0.900000 | 0.202138 |

  • Google English sentiment analysis score values are valid based on Yandex translated data.

Google English sentiment analysis scores compared across rankings (based on Baidu translated data)

| Ranking | Mean | Valid N | Std. deviation | Total N | Minimum | Maximum | Variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ranking 10 | -0.284984 | 8491 | 0.416185 | | -0.900000 | 0.900000 | 0.173210 |
| Ranking 20 | -0.192064 | 13092 | 0.417855 | | -0.900000 | 0.900000 | 0.174603 |
| Ranking 30 | -0.017125 | 18820 | 0.429167 | | -0.900000 | 0.900000 | 0.184185 |
| Ranking 40 | 0.167667 | 8734 | 0.432601 | | -0.900000 | 0.900000 | 0.187144 |
| Ranking 50 | 0.244657 | 4286 | 0.430004 | | -0.900000 | 0.900000 | 0.184904 |

  • Google English sentiment analysis score values are valid based on Baidu translated data.

Correlations between original data, Google translated data, Yandex translated data and Baidu translated data (each element)

./img/correlationsBetweenOriginGoogleTranslatedYandexTranslatedBaiduTranslatedUsingGoogleSentiment.png

  • Assumption: the Google English sentiment analysis tool and the Google Chinese sentiment analysis tool are equivalent.
    • Google translation sentence quality > Yandex translation sentence quality > Baidu translation sentence quality
    • Correlations within the same language are always larger than cross-language correlations.

Correlations between the means of original data, Google translated data, Yandex translated data and Baidu translated data

./img/correlationsBetweenOriginGoogleTranslatedYandexTranslatedBaiduTranslatedMeanUsingGoogleSentiment.png

  • The quality of the translation tool does NOT have a significant impact on sentiment analysis results: the same data translated with three different tools yields results that are all highly correlated with each other.
  • My guess is that the quality of keyword translation matters more than the quality of sentence-level translation.
  • Using sentiment analysis results to compare the quality of different translation tools is NOT reliable.

Baidu sentiment analysis VS Google sentiment analysis

Baidu Chinese sentiment analysis VS Google Chinese sentiment analysis

Mean Value Correlation

  • Pearson correlation: 0.991
  • Sig.: 0.001
  • N: 5
  • Conclusion: Baidu Chinese sentiment analysis and Google Chinese sentiment analysis have a strong linear relationship.
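
A correlation over N = 5 suggests it was computed on the five per-ranking means. As a rough check, the sketch below applies a plain Pearson formula to the Baidu means after transformation to the Google score standard and to the Google Chinese score means, both copied from the tables above. Which pair of columns the author actually correlated is an assumption here, and the helper name `pearson` is illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Per-ranking means (Ranking 10 to 50), original Chinese data.
# Assumption: the Baidu means are the ones on the Google score standard.
baidu_transformed_means = [-0.598875, -0.488772, -0.236524, 0.054493, 0.188983]
google_score_means = [-0.238742, -0.118380, 0.117291, 0.315915, 0.361626]

r = pearson(baidu_transformed_means, google_score_means)
print(f"r = {r:.3f}")
```

With these inputs the result should land close to the reported 0.991.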

Error Rate

  • Baidu Chinese sentiment analysis total error rate = 0.0073140396
  • Google Chinese sentiment analysis total error rate = 0.0012808851
  • Conclusion
    • The Baidu sentiment analysis error rate is higher than the Google sentiment analysis error rate.

Tendency

  • The Chinese sentiment analysis results given by both Baidu and Google are valid: as the ranking group increases from 10 to 50, the mean sentiment analysis score also strictly increases.
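
The tendency claim can be checked directly against the per-ranking means reported in the tables above. This is a minimal sketch; the helper name `strictly_increasing` is chosen for the example.

```python
def strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Means for Ranking 10, 20, 30, 40, 50 on the original Chinese data.
baidu_positive_probability = [0.239365965, 0.292751426, 0.394234, 0.511990, 0.568988]
google_score = [-0.238742, -0.118380, 0.117291, 0.315915, 0.361626]

print(strictly_increasing(baidu_positive_probability))
print(strictly_increasing(google_score))
```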

Baidu English sentiment analysis VS Google English sentiment analysis

Mean Value Correlation (based on Google translation)

  • Pearson correlation: 0.978
  • Sig.: 0.004
  • N: 5
  • Conclusion: Baidu English sentiment analysis and Google English sentiment analysis have a strong linear relationship.
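
With only five points, it helps to see why such a correlation still counts as significant. One common check, assumed here rather than confirmed as the test behind the reported sig. value, is the t statistic for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The sketch compares it with the standard two-tailed critical value for 3 degrees of freedom at the 0.01 level; the function and constant names are illustrative.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation against zero (n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Two-tailed critical value of Student's t for 3 degrees of freedom, alpha = 0.01.
CRITICAL_T_DF3_ALPHA_01 = 5.841

t = correlation_t_statistic(0.978, 5)
print(f"t = {t:.2f}, exceeds critical value: {t > CRITICAL_T_DF3_ALPHA_01}")
```

A t statistic above the critical value agrees with a significance level below 0.01, which is in line with the reported 0.004.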

Tendency

  • The English sentiment analysis results given by both Baidu and Google are valid: as the ranking group increases from 10 to 50, the mean sentiment analysis score also strictly increases.
