Comments (5)
Apologies! I misread your code. The category parameter should be 'positive'. The category_name is how the category will be labeled on the plot.
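As a rough sketch of what that change might look like, here are the keyword arguments from the question with only category changed to a value from the 'sentiment' column. The capitalized display labels, the explorer_kwargs dict, and the render_explorer helper are illustrative names of mine and not part of scattertext; the helper assumes you already have a built corpus.

```python
# Keyword arguments from the question, with only `category` changed.
# `category` must be a value that occurs in the 'sentiment' column;
# `category_name` and `not_category_name` are only display labels (assumed here).
explorer_kwargs = dict(
    category='positive',
    category_name='Positive',
    not_category_name='Negative',
    width_in_pixels=1000,
    minimum_term_frequency=5,
    include_term_category_counts=False,
)


def render_explorer(corpus, out_path='test.html'):
    """Hypothetical helper: render the plot for an already-built corpus."""
    # Imported lazily so the dict above can be inspected without the library.
    import scattertext as st

    html = st.produce_scattertext_explorer(
        corpus,
        term_significance=st.LogOddsRatioUninformativeDirichletPrior(),
        **explorer_kwargs,
    )
    # Write the HTML as UTF-8 text and close the file when done.
    with open(out_path, 'w', encoding='utf-8') as f:
        f.write(html)
    return out_path
```

Calling render_explorer(corpus) would then write test.html, which can be shown in a notebook with IFrame as in the original code.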
from scattertext.
It's hard to tell exactly what's going on without seeing the contents of data, which is presumably a pandas.DataFrame.
Your category_name parameter (currently 'Positive') must be a value in the 'sentiment' column of data. The error is saying that it isn't. The same holds for Negative.
Also, the metadata parameter should be an array-like object that's the same length as data and holds the titles for each of the documents shown.
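If it helps, both conditions can be checked before calling the explorer. This is a minimal sketch in plain Python; check_explorer_inputs is a hypothetical helper, not a scattertext function, and the sample labels and stars are made up. It should also accept pandas columns such as data['sentiment'] and data['stars'], since it only relies on iteration and len.

```python
def check_explorer_inputs(labels, category, metadata=None):
    """Return a list of problems; an empty list means the inputs look consistent."""
    problems = []
    known = sorted(set(labels))
    # The target category has to be one of the values in the label column.
    if category not in known:
        problems.append(f"category {category!r} is not one of {known}")
    # Metadata needs one entry per document.
    if metadata is not None and len(metadata) != len(labels):
        problems.append(
            f"metadata has {len(metadata)} entries but there are {len(labels)} documents"
        )
    return problems


# Made-up stand-ins for data['sentiment'] and data['stars'].
labels = ['positive', 'negative', 'positive']
stars = [5, 1, 4]

print(check_explorer_inputs(labels, 'sentiment', stars))  # reports the bad category
print(check_explorer_inputs(labels, 'positive', stars))   # → []
```

Passing the column name ('sentiment') instead of a value from the column is reported as a problem, which is the same condition the assertion in the traceback trips on.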
The Pandas DataFrame has 3 columns - reviews, stars and sentiment, where sentiment is a binary categorical variable (positive or negative). Reviews are strings and stars can range from 1-5.
I changed my code to the one below (seeing as metadata is an optional param I dropped it):
html = produce_scattertext_explorer(corpus,
                                    category='sentiment',
                                    category_name='positive',
                                    not_category_name='negative',
                                    width_in_pixels=1000,
                                    minimum_term_frequency=5,
                                    term_significance=st.LogOddsRatioUninformativeDirichletPrior(),
                                    include_term_category_counts=False)
file_name = 'test.html'
open(file_name, 'wb').write(html.encode('utf-8'))
IFrame(src=file_name, width=1200, height=700)
Getting this error:
AssertionError Traceback (most recent call last)
in ()
7 term_significance = st.LogOddsRatioUninformativeDirichletPrior(),
8 metadata = np.array([x for x in data['stars']]),
----> 9 include_term_category_counts=False)
10 file_name = 'test.html'
11 open(file_name, 'wb').write(html.encode('utf-8'))
~/anaconda3/lib/python3.6/site-packages/scattertext/__init__.py in produce_scattertext_explorer(corpus, category, category_name, not_category_name, protocol, pmi_threshold_coefficient, minimum_term_frequency, minimum_not_category_term_frequency, max_terms, filter_unigrams, height_in_pixels, width_in_pixels, max_snippets, max_docs_per_category, metadata, scores, x_coords, y_coords, original_x, original_y, rescale_x, rescale_y, singleScoreMode, sort_by_dist, reverse_sort_scores_for_not_category, use_full_doc, transform, jitter, gray_zero_scores, term_ranker, asian_mode, use_non_text_features, show_top_terms, show_characteristic, word_vec_use_p_vals, max_p_val, p_value_colors, term_significance, save_svg_button, x_label, y_label, d3_url, d3_scale_chromatic_url, pmi_filter_thresold, alternative_text_field, terms_to_include, semiotic_square, num_terms_semiotic_square, not_categories, neutral_categories, extra_categories, show_neutral, neutral_category_name, get_tooltip_content, x_axis_values, y_axis_values, color_func, term_scorer, show_axes, horizontal_line_y_position, vertical_line_x_position, show_cross_axes, show_extra, extra_category_name, censor_points, center_label_over_points, x_axis_labels, y_axis_labels, topic_model_term_lists, topic_model_preview_size, metadata_descriptions, vertical_lines, characteristic_scorer, term_colors, unified_context, show_category_headings, include_term_category_counts, div_name, alternative_term_func, return_data)
446 extra_categories=extra_categories,
447 background_scorer=characteristic_scorer,
--> 448 include_term_category_counts=include_term_category_counts)
449 if return_data:
450 return scatter_chart_data
~/anaconda3/lib/python3.6/site-packages/scattertext/ScatterChartExplorer.py in to_dict(self, category, category_name, not_category_name, scores, metadata, max_docs_per_category, transform, alternative_text_field, title_case_names, not_categories, neutral_categories, extra_categories, neutral_category_name, extra_category_name, background_scorer, include_term_category_counts)
108 neutral_categories=neutral_categories,
109 extra_categories=extra_categories,
--> 110 background_scorer=background_scorer)
111 docs_getter = self._make_docs_getter(max_docs_per_category, alternative_text_field)
112 if neutral_category_name is None:
~/anaconda3/lib/python3.6/site-packages/scattertext/ScatterChart.py in to_dict(self, category, category_name, not_category_name, scores, transform, title_case_names, not_categories, neutral_categories, extra_categories, background_scorer)
266
267 all_categories = self.term_doc_matrix.get_categories()
--> 268 assert category in all_categories
269
270 if not_categories is None:
AssertionError:
Thank you for helping me out!
scattertext_use.pdf
Here is an attached pdf of my code and output.
Works perfectly! Thanks!