cbail / textnets
R package to perform automated text analysis using network techniques
License: MIT License
Dear Professor Bail,
I hope you are doing well! I'm currently trying to do some further analyses using the adjacency matrix created by Textnets, so I would like to know how the values of the cells are calculated conceptually. The demo says that "the cells of the adjacency matrix are the transposed crossproduct of the term-frequency inverse-document frequency (TFIDF) for overlapping terms between two documents" for PrepText. Could you please elaborate on this a bit more, conceptually and mathematically?
Thank you for your help in advance!
Yue
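For anyone else puzzling over the same sentence, here is my reading as a minimal base-R sketch (an illustration of the described computation with made-up numbers, not the package's actual code):

```r
# Rows are documents, columns are terms, cells are TF-IDF scores.
tfidf <- matrix(c(0.5, 0.2, 0.0,
                  0.1, 0.0, 0.3,
                  0.4, 0.6, 0.0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("doc1", "doc2", "doc3"),
                                c("economy", "war", "trade")))

# Document-by-document adjacency: cell [i, j] is the sum, over the terms
# the two documents share, of tfidf_i(term) * tfidf_j(term).
adj <- tcrossprod(tfidf)   # equivalent to tfidf %*% t(tfidf)
```

Conceptually, two documents get a strong tie when they share many terms that are distinctive (high TF-IDF) in both of them.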
Hi,
Is there a manual or a vignette for the textnets R package please?
Are the help files of the package functions available, please?
Many thanks
From README.md:
sotu_firsts_sentiment <- PrepTextSent(sotu_firsts, groupvar = "president", textvar = "sotu_text", node_type = "groups",
tokenizer = "words", sentiment_lexicon = "afinn", language = "english", udmodel_lang = udmodel_lang, remove_numbers = NULL, compound_nouns = TRUE)
Error:
Error in PrepTextSent(sotu_firsts, groupvar = "president", textvar = "sotu_text", :
could not find function "PrepTextSent"
I'm just debugging our current signednets.R file in the development.
Line 84 throws an error "Error in cnlp_init_udpipe(lang) : object 'lang' not found"
I assume we want to first assign
lang = "english"
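For concreteness, the fix I have in mind is simply (a sketch; this assumes cleanNLP's cnlp_init_udpipe() and that "english" is the model name it expects):

```r
library(cleanNLP)

lang <- "english"                    # assign before line 84 uses it
cnlp_init_udpipe(model_name = lang)  # was: cnlp_init_udpipe(lang) with lang undefined
```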
Hello,
Thank you for a great package.
Unfortunately, I failed to install it.
I received the following error message from R version 3.6.0:
Error: Failed to install 'textnets' from GitHub:
(converted from warning) cannot remove prior installation of package ‘backports’
When I tried remotes::install_github("cbail/textnets") or from R version 4.0.1, I get the following error message:
Error: Failed to install 'textnets' from GitHub:
(converted from warning) installation of package ‘C:/Users/talim/AppData/Local/Temp/RtmpYlFzPY/file10605dbb36ec/textnets_0.1.1.tar.gz’ had non-zero exit status
Whenever I try to use the CreateTextNet function, I get an error message:
> data("sotu")
> sotu_first_speeches <- sotu %>% group_by(president) %>% slice(1L)
> sotu_firsts_network <- CreateTextnet(sotu_first_speeches)
Error in tapply(n, documents, sum) : arguments must have same length
Any ideas? Thanks!
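One likely culprit, judging from the README workflow (a guess, not a confirmed diagnosis): CreateTextnet() expects the tidy output of PrepText(), not the raw data frame. Something like:

```r
# Run PrepText() first; CreateTextnet() expects its tidy output,
# not the raw sotu data frame.
sotu_firsts_prepped <- PrepText(sotu_first_speeches,
                                groupvar = "president",
                                textvar = "sotu_text",
                                node_type = "groups",
                                tokenizer = "words")
sotu_firsts_network <- CreateTextnet(sotu_firsts_prepped)
```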
QQ: I have tried to run the README code on two samples of my data set. However, I received a number of warnings about deprecated features, and also two errors (below). I did not receive the same when I used the sample data. The instructions in R say to report the issue to this site.
cnn_firsts_nouns <- PrepText(cnn_firsts, groupvar = "command", textvar = "statement_text", node_type = "groups", tokenizer = "words", pos = "nouns", remove_stop_words = TRUE, compound_nouns = TRUE)
Downloading udpipe model from https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.5/master/inst/udpipe-ud-2.5-191206/english-ewt-ud-2.5-191206.udpipe to /Users/___________english-ewt-ud-2.5-191206.udpipe
Warning message:
group_by_() was deprecated in dplyr 0.7.0.
Please use group_by() instead.
Run lifecycle::last_lifecycle_warnings() to see where this warning was generated.

cnn_firsts_network <- CreateTextnet(cnn_firsts_nouns)
VisTextNet(cnn_firsts_network, label_degree_cut = 0)
Error in mutate():
ℹ In argument: alpha_in = (1 - (weight/w_in))^(k_in - 1).
Caused by error:
! object 'weight' not found
Run rlang::last_trace() to see where the error occurred.
View(unsc_firsts_network)
VisTextNet(unsc_firsts_network, label_degree_cut = 0)
Error in graph_to_tree():
! graph must be directed
Run rlang::last_trace() to see where the error occurred.
Dear Prof. Bail,
Thanks for sharing with us a great package and your detailed guidance on the R codes!
For my research, I am analyzing around 2,000 journal article abstracts and repeatedly run into the error "Error in if (!is.na(weights)) { : argument is of length zero" when trying to visualize the network with VisTextNet. (I finished the prep step with node_type = "words".) Changing other options (e.g., the alpha value or label_degree_cut) does not help, and VisTextNetD3 does not work either.
Further, for community detection: communities are detected, but interpreting the modularity classes with top_words_modularity_classes fails (the command returns 0 observations).
Any relevant information would be of great help. If you need any additional information, please leave comments in this thread.
When running PrepText, I am getting the error:
"Error in check_input(x) :
Input must be a character vector of any length or a list of character
vectors, each of which has a length of 1."
This seems to be an issue with unnest_tokens inside the function, similar to the problem reported here.
I am able to use unnest_tokens outside PrepText just fine on the same data frame.
Hi, I am getting an installation issue (R 3.5. Ubuntu 18)
library(devtools)
install_github("cbail/textnets")
Downloading GitHub repo cbail/textnets@master
from URL https://api.github.com/repos/cbail/textnets/zipball/master
Installation failed: error in running command
Thanks
Tried very hard to make this work, but
Error in check_input(x) :
Input must be a character vector of any length or a list of character
vectors, each of which has a length of 1.
In addition: Warning message:
'unnest_tokens_' is deprecated.
Use 'unnest_tokens' instead.
See help("Deprecated")
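For what it's worth, the warning suggests the deprecated standard-evaluation call inside PrepText is the trigger. On the package side the fix would look roughly like this (a sketch; the data frame and column names here are placeholders, not the package's internal names):

```r
library(tidytext)

# Old, deprecated standard-evaluation form:
#   tidy_df <- unnest_tokens_(df, "word", "text")

# Modern equivalent with bare column names:
tidy_df <- unnest_tokens(df, output = word, input = text)
```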
Currently, PrepText does run even if node_type
is not specified, producing this warning:
Warning messages:
1: In if (node_type == "groups") { :
the condition has length > 1 and only the first element will be used
2: In if (node_type == "words") { :
the condition has length > 1 and only the first element will be used
For these cases it would be more useful to either (1) stop with an informative error such as "node_type needs to be specified" or (2) set a default.
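Suggestion (2) could be implemented with base R's match.arg(), which gives both a default and an informative error on invalid input (a sketch of the signature, not the package's actual code):

```r
PrepText <- function(textdata, groupvar, textvar,
                     node_type = c("groups", "words"), ...) {
  # If node_type is not supplied, match.arg() picks "groups" as the default;
  # if an invalid value is supplied, it stops with an informative error.
  node_type <- match.arg(node_type)
  # ... rest of the function unchanged ...
}
```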
Hi,
your package looks superinteresting but I get the following error while installing the package:
Installing package into ‘D:/Documenti/R/win-library/3.6’
(as ‘lib’ is unspecified)
* installing *source* package 'textnets' ...
** using staged installation
** R
** data
** byte-compile and prepare package for lazy loading
Error: (converted from warning) package 'dplyr' was built under R version 3.6.3
Execution halted
ERROR: lazy loading failed for package 'textnets'
* removing 'D:/Documenti/R/win-library/3.6/textnets'
Error: Failed to install 'textnets' from GitHub:
(converted from warning) installation of package ‘C:/Users/sciur/AppData/Local/Temp/RtmpwxGSOm/file33c842677ac4/textnets_0.1.1.tar.gz’ had non-zero exit status
How could it be solved?
Thanks in advance!
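In case it helps: the "(converted from warning)" part means remotes promoted a warning (here, dplyr being built under a newer R patch release) into an installation error. Updating R and dplyr is the clean fix, but I believe this environment variable also works as a workaround:

```r
# Tell remotes not to turn warnings into errors during installation:
Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS = "true")
remotes::install_github("cbail/textnets")
```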
Greetings.
I tried to use textnets to map Chinese materials. Since Chinese carries meaning in n-grams, I have already segmented my data using other packages.
However, it seems that PrepText ignores that segmentation and cuts my n-grams into single words. Can I import the data without any further processing?
Furthermore, I would like to export the textnets to Gephi for further adjustment. Would that be possible?
Thanks.
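On the Gephi question: if CreateTextnet() returns an igraph object (as the tutorial suggests), you can export it directly with igraph; GraphML opens in Gephi with attributes intact. A sketch, with my_textnet standing in for your network object:

```r
library(igraph)

# GraphML preserves node and edge attributes and imports cleanly into Gephi.
write_graph(my_textnet, file = "textnet.graphml", format = "graphml")
```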
When I run:
sotu_firsts_nouns <- PrepText(sotu_firsts, groupvar = "president", textvar = "sotu_text", node_type = "groups", tokenizer = "words", pos = "nouns", remove_stop_words = TRUE, compound_nouns = TRUE)
Warning message:
group_by_()
is deprecated as of dplyr 0.7.0.
Please use group_by()
instead.
See vignette('programming') for more help
This warning is displayed once every 8 hours.
Call lifecycle::last_warnings()
to see where this warning was generated.
Using R Version 3.6.3 (OS Mohave).
Hi- I was using InterpretText function to see the driving words in each cluster. My node_type = "words". However, it generates 0 observations even though my text_network and tidytextobject both have observations. It didn't give me error messages, so I assume the code is correct. What could be the issue here? Where should I check? Thank you for your help in advance!
Hi,
I was trying to build some text networks with my data but received an error.
Then, when I tried to replicate the code on the package's GitHub page, I received the same error.
I am on R v3.6 and made a clean install of every package textnets needs.
The error I receive in both cases (for my own data and for the sotu data) is as follows:
library(textnets)
Loading required package: dplyr
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Loading required package: udpipe
Loading required package: ggraph
Loading required package: ggplot2
Loading required package: networkD3
Warning messages:
1: replacing previous import ‘dplyr::union’ by ‘igraph::union’ when loading ‘textnets’
2: replacing previous import ‘dplyr::as_data_frame’ by ‘igraph::as_data_frame’ when loading ‘textnets’
3: replacing previous import ‘dplyr::groups’ by ‘igraph::groups’ when loading ‘textnets’
data("sotu")
sotu_firsts <- sotu %>% group_by(president) %>% slice(1L)
sotu_firsts_nouns <- PrepText(sotu_firsts, groupvar = "president", textvar = "sotu_text", node_type = "groups", tokenizer = "words", pos = "nouns", remove_stop_words = TRUE, compound_nouns = TRUE)
Downloading udpipe model from https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.4/master/inst/udpipe-ud-2.4-190531/english-ewt-ud-2.4-190531.udpipe to C:/Users/.../english-ewt-ud-2.4-190531.udpipe
Visit https://github.com/jwijffels/udpipe.models.ud.2.4 for model license details
trying URL 'https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.4/master/inst/udpipe-ud-2.4-190531/english-ewt-ud-2.4-190531.udpipe'
Content type 'application/octet-stream' length 16477964 bytes (15.7 MB)
downloaded 15.7 MB
Error in check_input(x) :
Input must be a character vector of any length or a list of character
vectors, each of which has a length of 1.
In addition: Warning message:
'unnest_tokens_' is deprecated.
Use 'unnest_tokens' instead.
See help("Deprecated")
my input is a character vector....
Any help would be greatly appreciated.
best
E.
Noticed that community detection in the TextCommunities() and VisTextNet() functions is slightly different. From inspecting the code, it looks like communities <- cluster_louvain(text_network) in the TextCommunities() function should be communities <- cluster_louvain(pruned). Love this library. Thanks for making it.
The author of the udpipe R package referenced this package for network visualizations: https://github.com/iankloo/sigmaNet
It claims to be suited to quickly render large networks as well as provide interactive features, so might be interesting to look into.
hello:
i'm unable to install the package. i'm getting the sense that the issue is with dplyr, but i'm not sure what to do...
- installing source package 'textnets' ...
** using staged installation
** R
** data
** byte-compile and prepare package for lazy loading
Error: (converted from warning) package 'dplyr' was built under R version 4.0.3
Execution halted
ERROR: lazy loading failed for package 'textnets'
UPDATE: I just need a more recent version of R.
I’m getting a “check_input” error when I try to run PrepText using the sotu example, even though the text column is of type character. I tried creating my own tf-idf data frame using tidytext so I could still use the visualization functions in this package, but I wasn’t sure what the outputs of PrepText and CreateTextnet should look like, so I couldn’t troubleshoot. Thanks for your help!
Hi Chris,
Your textnets tutorial was very clear and, fortunately, exactly what I needed to explore my own dataset. However, I'm encountering a problem. When I try to install textnets, I get the following message:
package ‘cbail/textnets’ is not available (for R version 3.6.1)
This has been true on one computer running different versions of R (incl. 3.4), and also on two other computers, all running version 3.6. I've downloaded textnets and tried to install it from a local directory rather than pulling it from GitHub.
I'm afraid I'm not particularly savvy in this regard, but is there an R version textnets is available for? Has something happened at the GitHub end?
Thanks in advance,
Currently the PrepText function accepts either words or groups as the node type. It would be useful to be able to create and visualize two-mode networks by using both of these as nodes.
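+1 for this. As a stopgap, a two-mode (group-by-word) network can be built from the tidy PrepText() output with igraph. An illustration only: I'm assuming the output keeps the grouping variable (here "president") and a "lemma" column, which may not match the actual column names:

```r
library(igraph)

# Edge list: one row per (group, word) pair from the tidy PrepText() output.
edges <- unique(prepped_sotu[, c("president", "lemma")])

g <- graph_from_data_frame(edges, directed = FALSE)
# Bipartite marking: TRUE for group nodes, FALSE for word nodes.
V(g)$type <- V(g)$name %in% edges$president
```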
When I tried to visualize the network, I keep getting this error: Error in graph_to_tree(graph, mode = direction) : Graph must be directed. Any suggestion on what might be the problem?
I was trying to use the PrepTextSent function, but the error message says:
Error in PrepTextSent(whb_tn, groupvar = "speaker", textvar = "textvar", :
could not find function "PrepTextSent"
Is the function still under development and therefore not yet available?
Thank you for your help!
Running
library(textnets)
data(sotu)
sotu_first_speeches <- sotu %>%
group_by(president) %>%
slice(1L)
prepped_sotu <- PrepText(sotu_first_speeches,
groupvar = "president",
textvar = "sotu_text",
node_type = "groups",
tokenizer = "words",
pos = "nouns",
remove_stop_words = TRUE,
compound_nouns = TRUE)
sotu_text_network <- CreateTextnet(prepped_sotu)
VisTextNet(sotu_text_network, alpha=.1, label_degree_cut = 3)
works fine.
However, when I try and run the word-to-word network, I get an error.
library(textnets)
data(sotu)
sotu_first_speeches <- sotu %>%
group_by(president) %>%
slice(1L)
prepped_sotu <- PrepText(sotu_first_speeches,
groupvar = "president",
textvar = "sotu_text",
node_type = "words",
tokenizer = "words",
pos = "nouns",
remove_stop_words = TRUE,
compound_nouns = TRUE)
sotu_text_network <- CreateTextnet(prepped_sotu)
VisTextNet(sotu_text_network)
I get the error
Using 'sparse_stress' with 250 pivots as default layout
Error in if (!is.na(weights)) { : argument is of length zero