
memr's People

Contributors

adamgdobrakowski, karthik, pbiecek


memr's Issues

JOSS paper issue

Re: openjournals/joss-reviews#2482

State of the field: Do the authors describe how this software compares to other commonly-used packages?

What other packages have similar functionality to this one? Without memr, what would a researcher or doctor use to perform similar analysis on their data? What does memr add over someone using text2vec and performing cluster or PCA analysis themselves?
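To make the comparison in the question concrete, here is a minimal sketch of what doing it by hand might look like: GloVe term embeddings from text2vec followed by PCA and k-means from base R. The toy `visits` data is invented for illustration, the calls assume the text2vec 0.6-style API (`GlobalVectors$new(rank, x_max)`), and none of this is taken from memr itself.

```r
library(text2vec)

# Hypothetical toy input: one character vector of terms per visit.
visits <- list(
  c("fever", "cough", "rhinitis"),
  c("cough", "rhinitis", "eye"),
  c("thyroid", "eye", "fever"),
  c("thyroid", "cough", "fever")
)

# Build a vocabulary and a term co-occurrence matrix from the tokens.
it <- itoken(visits, progressbar = FALSE)
vocab <- create_vocabulary(it)
vectorizer <- vocab_vectorizer(vocab)
tcm <- create_tcm(it, vectorizer, skip_grams_window = 5L)

# Fit GloVe embeddings (assumes the text2vec >= 0.6 constructor arguments).
set.seed(1)
glove <- GlobalVectors$new(rank = 2, x_max = 10)
wv_main <- glove$fit_transform(tcm, n_iter = 20, n_threads = 1)
term_vectors <- wv_main + t(glove$components)

# Inspect the embeddings with PCA and a simple clustering.
pca <- prcomp(term_vectors)
clusters <- kmeans(term_vectors, centers = 2)
print(pca$x)
print(clusters$cluster)
```

A statement in the paper of what memr provides beyond these few steps would answer the question directly.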

compute embeddings error

The following code from README.Rmd results in errors when executed in RStudio Version 1.4.1106 with R version 4.0.4 (64-bit):

    embedding_size <- 5
    interview_term_vectors <- embed_terms(merged_terms = interviews, embedding_size = embedding_size,
                                          term_count_min = 1L)

    Error in initialize(...) :
      unused arguments (word_vectors_size = 5, vocabulary = list(c("fever", "rhinitis", "cough", "eye", "thyroid"), c(3, 3, 4, 4, 6), c(3, 3, 4, 4, 6)))

    examination_term_vectors <- embed_terms(merged_terms = examinations, embedding_size = embedding_size,
                                            term_count_min = 1L)

    Error in initialize(...) :
      unused arguments (word_vectors_size = 5, vocabulary = list(c("fever", "man", "mother", "cough", "heart", "patient", "thyroid", "eye", "rhinitis", "woman", "father"), c(2, 2, 2, 3, 3, 3, 3, 4, 5, 6, 7), c(2, 2, 2, 3, 3, 3, 3, 4, 5, 6, 7)))
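The message shows an `initialize()` method rejecting the arguments `word_vectors_size` and `vocabulary`. One possible cause, not confirmed anywhere in this report, is that the installed version of an embedding dependency has a different constructor signature than the one memr calls. The sketch below assumes the constructor in question is text2vec's `GlobalVectors`; it only prints the installed version and the argument names the constructor accepts, so they can be attached to the report.

```r
# Diagnostic sketch: assumes text2vec's GlobalVectors is the constructor
# behind the failing initialize() call (an assumption, not confirmed here).
init_args <- NULL
if (requireNamespace("text2vec", quietly = TRUE)) {
  # Installed version of the dependency.
  print(packageVersion("text2vec"))

  # Argument names accepted by the R6 constructor's initialize() method.
  init <- text2vec::GlobalVectors$public_methods$initialize
  init_args <- names(formals(init))
  print(init_args)
}

# R and platform details are also useful in a bug report.
print(R.version.string)
```

If `word_vectors_size` and `vocabulary` are absent from the printed names, that would be consistent with a version mismatch.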


Package installation instructions

Consider recommending remotes::install_git() instead of devtools::install_git() as the remotes package has fewer dependencies than devtools and is less likely to cause installation issues for people.
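As a sketch of what the suggested README instruction could look like: `repo_url` below is a placeholder, not the real repository address, and should be replaced with the URL the README already uses.

```r
# Hypothetical placeholder: replace with the actual memr repository URL.
repo_url <- "https://example.org/path/to/memr.git"

# Install remotes only if it is missing, then install memr from git.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_git(repo_url)
```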

Also consider submitting the package to CRAN or Bioconductor to make discovery and installation easier for more R users.

Re: openjournals/joss-reviews#2482

Add vignette/more documentation on data inputs

For this package to be useful for other researchers and to serve a purpose beyond capturing the method and code used for https://arxiv.org/pdf/1907.04152.pdf, it needs a vignette and more extensive documentation.

After reading the JOSS paper, the readme here, and the documentation, I'm not clear on how a researcher or doctor would start to use this package.

The readme references "medical free-text records written by doctors", but the example data sets are highly distilled and contain just a few terms. Given the description both here and in the arXiv paper, I expected a sample dataset that approximates the structure of the "dataset of free-text clinical records" referenced. I then expected to see documentation and examples showing how a user of the package would transform this raw data (or, really, their own similar data) into the distilled inputs expected by the functions of this package.

From https://arxiv.org/pdf/1907.04152.pdf, it seems that memr is not focused on this data processing. If this is correct, I'd suggest 1) editing the description of the package to reflect what type of data it can be used with, and 2) adding more documentation on the expected structure and contents of the data inputs to the functions, and on what characteristics the data should have (e.g. should terms be lowercase? restricted to certain parts of speech?). memr does not necessarily need to have all of the functionality to process medical free-text records into the format the package needs (although that would be helpful), but potential users need to know what type of data inputs they need to create. The sample data sets and vectors are insufficient to determine this.
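To illustrate the kind of example the documentation could include, here is a base-R sketch that turns free-text notes into lowercase terms, one row per visit and term. The notes, the `normalize_terms()` helper, and the output columns are all hypothetical; this is not memr's documented input format, which is exactly what the documentation would need to specify.

```r
# Hypothetical free-text visit notes (invented for illustration).
notes <- c(
  visit_1 = "Patient reports fever and cough.",
  visit_2 = "Cough persists; rhinitis noted."
)

# Hypothetical helper: lowercase, strip non-letters, split on spaces.
normalize_terms <- function(text) {
  text <- tolower(text)
  text <- gsub("[^a-z ]", " ", text)
  terms <- strsplit(text, " +")[[1]]
  terms[nzchar(terms)]
}

# One character vector of terms per visit.
terms_by_visit <- lapply(notes, normalize_terms)

# Long format: one row per (visit, term) pair.
terms_long <- data.frame(
  visit = rep(names(terms_by_visit), lengths(terms_by_visit)),
  term = unlist(terms_by_visit, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(terms_long)
```

A real pipeline would also need to decide on stop words, multi-word terms, and parts of speech, which are the choices the questions above ask the authors to document.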

Re: openjournals/joss-reviews#2482
