mi2datalab / memr

R package for Multisource Embeddings for Medical Records

Home Page: https://mi2datalab.github.io/memr/

License: Other

R 97.57% TeX 2.43%

embeddings medical-records rstats

memr's Issues

JOSS paper issue

re: openjournals/joss-reviews#2482

State of the field: Do the authors describe how this software compares to other commonly-used packages?

What other packages have similar functionality to this one? Without memr, what would a researcher or doctor use to perform similar analysis on their data? What does memr add over someone using text2vec and performing clustering or PCA themselves?

compute embeddings error

The following code from README.Rmd results in errors when executed in RStudio Version 1.4.1106 with R version 4.0.4 (64-bit):

embedding_size <- 5

interview_term_vectors <- embed_terms(merged_terms = interviews, embedding_size = embedding_size,
                                      term_count_min = 1L)

Error in initialize(...) :
  unused arguments (word_vectors_size = 5, vocabulary = list(c("fever", "rhinitis", "cough", "eye", "thyroid"), c(3, 3, 4, 4, 6), c(3, 3, 4, 4, 6)))

examination_term_vectors <- embed_terms(merged_terms = examinations, embedding_size = embedding_size,
                                        term_count_min = 1L)

Error in initialize(...) :
  unused arguments (word_vectors_size = 5, vocabulary = list(c("fever", "man", "mother", "cough", "heart", "patient", "thyroid", "eye", "rhinitis", "woman", "father"), c(2, 2, 2, 3, 3, 3, 3, 4, 5, 6, 7), c(2, 2, 2, 3, 3, 3, 3, 4, 5, 6, 7)))
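One plausible cause, not confirmed anywhere in this thread, is a text2vec API change: as far as I know, text2vec 0.6 and later construct the GloVe model with a rank argument and no longer accept word_vectors_size or vocabulary, which would produce exactly this kind of "unused arguments" error from initialize(). The sketch below shows the newer calling convention on a tiny made-up corpus. The docs vector and all variable names are illustrative assumptions and are not taken from memr's source.

```r
library(text2vec)

# Toy corpus (hypothetical); stands in for the merged terms of a few visits.
docs <- c("fever cough rhinitis",
          "cough eye thyroid",
          "thyroid eye fever cough")

# Build the vocabulary and the term co-occurrence matrix (TCM).
it <- itoken(docs, tokenizer = word_tokenizer, progressbar = FALSE)
vocab <- prune_vocabulary(create_vocabulary(it), term_count_min = 1L)
tcm <- create_tcm(it, vocab_vectorizer(vocab), skip_grams_window = 5L)

# Newer text2vec API (assumed 0.6+): `rank` replaces `word_vectors_size`,
# and the vocabulary is no longer passed to the constructor.
glove <- GlobalVectors$new(rank = 5, x_max = 10)
wv_main <- glove$fit_transform(tcm, n_iter = 10L)

# Combine main and context vectors into one embedding per term.
word_vectors <- wv_main + t(glove$components)
```

If this is indeed the cause, the fix would belong inside embed_terms(); pinning an older text2vec release would only be a workaround.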

Package installation instructions

Consider recommending remotes::install_git() instead of devtools::install_git() as the remotes package has fewer dependencies than devtools and is less likely to cause installation issues for people.
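A minimal sketch of what the suggested instructions could look like. The repository URL is an assumption inferred from the project name (mi2datalab / memr) and should be checked against the README.

```r
# Install the lightweight 'remotes' package only if it is missing.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Install memr from its Git repository (URL assumed, see note above).
remotes::install_git("https://github.com/mi2datalab/memr")
```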

Also consider submitting the package to CRAN or Bioconductor to make discovery and installation easier for more R users.

Re: openjournals/joss-reviews#2482

Add vignette/more documentation on data inputs

For this package to be useful for other researchers and to serve a purpose beyond capturing the method and code used for https://arxiv.org/pdf/1907.04152.pdf, it needs a vignette and more extensive documentation.

After reading the JOSS paper, the readme here, and the documentation, I'm not clear on how a researcher or doctor would start to use this package.

The readme references "medical free-text records written by doctors", but the example data sets are highly distilled and contain just a few terms. Given the description both here and in the arXiv paper, I expected a sample dataset that approximates the structure of the "dataset of free-text clinical records" referenced. I then expected to see documentation and examples of how a user of the package would transform this raw data (or, really, their own similar data) into the distilled inputs expected by the functions of this package.

From https://arxiv.org/pdf/1907.04152.pdf, it seems that memr is not focused on this data processing. If this is correct, I'd suggest 1) editing the package description to reflect what type of data it can be used with, and 2) documenting the expected structure of the data inputs to each function and the characteristics the data should have (e.g. should terms be lowercase? restricted to certain parts of speech?). memr does not necessarily need all of the functionality to process medical free-text records into the format the package needs (although that would be helpful), but potential users need to know what kind of data inputs they have to create. The sample data sets and vectors are insufficient to determine this.

Re: openjournals/joss-reviews#2482

Contributing, issue submission, and help guidelines

re: openjournals/joss-reviews#2482

memr needs:

Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

JOSS paper citations

Re: openjournals/joss-reviews#2482

The JOSS paper should cite the other R packages you use. It currently only cites text2vec.
