fiddle-experiments's Issues

Bugs in mimic3_experiments

Hi Shengpu,

I have summarized some bugs I found in the mimic3_experiments directory. Please take a look when you have time.

1_data_extraction

extract_data.py

Exceptions:

  1. Line 251: pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Result is too large for pandas.Timedelta. Convert inputs to datetime.datetime with 'Timestamp.to_pydatetime()' before subtracting.

Suggestions:

  1. Replace x.INTIME with x.INTIME.to_pydatetime().
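A minimal sketch of why this suggestion helps. pandas stores a Timedelta as 64-bit nanoseconds, which covers only about 292 years, while the standard-library datetime.timedelta has no such limit. The names `dob` and `age_in_years` and the example dates are hypothetical; the report does not show the expression on line 251, so this assumes it subtracts a much earlier timestamp from INTIME.

```python
from datetime import timedelta

import pandas as pd


def age_in_years(intime: pd.Timestamp, dob: pd.Timestamp) -> float:
    """Return the gap between two timestamps in years without pandas.Timedelta."""
    # to_pydatetime() yields datetime.datetime objects; their difference is a
    # datetime.timedelta, which is not capped at roughly 292 years.
    delta = intime.to_pydatetime() - dob.to_pydatetime()
    return delta / timedelta(days=365.25)


# Hypothetical values: two timestamps 300 years apart.
intime = pd.Timestamp("2150-01-01")
dob = pd.Timestamp("1850-01-01")
age = age_in_years(intime, dob)
print(round(age, 1))
```

Converting both operands, not only INTIME, keeps the subtraction entirely in standard-library types.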

LabelDistributions.ipynb

Exceptions:

  1. Line 44: FileNotFoundError
  2. Line 54: pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Result is too large for pandas.Timedelta. Convert inputs to datetime.datetime with 'Timestamp.to_pydatetime()' before subtracting.

Suggestions:

  1. Replace open('config.yaml') with open('../config.yaml')
  2. Replace x.INTIME with x.INTIME.to_pydatetime()
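As an alternative to hard-coding '../config.yaml', the notebook could look for the file in the working directory and then in each parent directory. This is only a sketch: `find_config` is a hypothetical helper, not something in the repository.

```python
from pathlib import Path
from typing import Optional


def find_config(start: Path, name: str = "config.yaml") -> Optional[Path]:
    """Search `start` and each of its parent directories for `name`."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


config_path = find_config(Path.cwd())
print(config_path if config_path else "config.yaml not found")
```

This way the same cell runs whether the notebook is started from 1_data_extraction or from the directory above it.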

InclusionExclusion.ipynb

Exceptions:

  1. Line 29: pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Result is too large for pandas.Timedelta. Convert inputs to datetime.datetime with 'Timestamp.to_pydatetime()' before subtracting.

Suggestions:

  1. Replace x.INTIME with x.INTIME.to_pydatetime()

PopulationSummary.ipynb

Exceptions:

  1. Line 24: KeyError
  2. Line 26: FileNotFoundError
  3. Line 68: pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Result is too large for pandas.Timedelta. Convert inputs to datetime.datetime with 'Timestamp.to_pydatetime()' before subtracting.

Suggestions:

  1. Replace set_index('ICUSTAY_ID') with set_index('D')
  2. The file pop.mortality_benchmark.csv does not exist
  3. Replace x.INTIME with x.INTIME.to_pydatetime()

2_apply_FIDDLE

Suggestion: I think it would be better to include the FIDDLE module in this directory. Once it is in place, some other bugs remain.

README.md

Exceptions:

  1. Line 41: FileNotFoundError

Suggestion:

  1. There is no file named make_features.py

run_make_all.sh

Exceptions:

  1. output_dir is required
  2. FileNotFoundError

Suggestion:

  1. Set output_dir for each run, since run.py requires it

  2. Since the directory features/outcome=mortality,T=48.0,dt=1.0 has been replaced by features/benchmark,outcome=mortality,T=48.0,dt=1.0 in 1_data_extraction/run_prepare_all.sh, this command cannot run:

    OUTCOME=mortality
    T=48.0
    dt=1.0
    python run.py \
        --data_fname="$DATAPATH/features/outcome=$OUTCOME,T=$T,dt=$dt/input_data.p" \

    Since the file pop.mortality_benchmark.csv does not exist, this command cannot run either:

    python run.py \
        --data_fname="$DATAPATH/features/benchmark,outcome=mortality,T=48.0,dt=1.0/input_data.p" \
        --population="$DATAPATH/population/pop.mortality_benchmark.csv" \

3_ML_models

lib/data.py

Exceptions:

  1. Lines 75, 121: FileNotFoundError
  2. Lines 123, 124: Directory does not exist

Suggestion:

  1. The file pop.mortality_benchmark.csv does not exist
  2. The directory features/outcome=mortality,T=48.0,dt=1.0 does not exist; it has been replaced by features/benchmark,outcome=mortality,T=48.0,dt=1.0
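A short preflight check can list missing inputs before training starts. This is a sketch only: `missing_paths` and the data root "data" are hypothetical, and the two relative paths are the ones named in this report.

```python
from pathlib import Path
from typing import Iterable, List


def missing_paths(root: Path, relative_paths: Iterable[str]) -> List[str]:
    """Return the entries of `relative_paths` that do not exist under `root`."""
    return [rel for rel in relative_paths if not (root / rel).exists()]


# Paths named in this report; "data" is a hypothetical data root.
expected = [
    "population/pop.mortality_benchmark.csv",
    "features/outcome=mortality,T=48.0,dt=1.0",
]
for rel in missing_paths(Path("data"), expected):
    print(f"missing: {rel}")
```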

config.yaml

Exceptions:

  1. Line 21: The feature_dimension of ARF 4.0 is not 4143

Suggestion:

  1. Set to 4381
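A mismatch like this can be caught early by comparing the configured value with the last axis of the feature array. The sketch below uses a toy array; `check_feature_dimension` is a hypothetical helper, and the real check would use the value from config.yaml and the generated features.

```python
import numpy as np


def check_feature_dimension(X: np.ndarray, configured: int) -> int:
    """Return the size of the last axis of X; raise if it differs from `configured`."""
    actual = X.shape[-1]
    if actual != configured:
        raise ValueError(
            f"feature_dimension is {configured} in the config, "
            f"but the data has {actual} features"
        )
    return actual


# Hypothetical toy array standing in for the N x L x D feature tensor.
X = np.zeros((2, 4, 6))
print(check_feature_dimension(X, configured=6))
```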

run_deep_eval.py

Exceptions:

  1. Line 57: import error

Suggestion:

  1. Replace from sklearn.externals.joblib import Parallel, delayed with from joblib import Parallel, delayed

Are FIDDLE features comparable across MIMIC and eICU?

I want to run an experiment to assess whether a model trained on MIMIC generalizes to eICU. Are the FIDDLE features comparable as they are? If not, is it possible to carve out a subset that is comparable across the two datasets?

FIDDLE output format

I tried to reproduce the FIDDLE experiments. However, the output X.npz is not a SciPy sparse matrix, so it won't load with scipy.sparse.load_npz(), and I used numpy.load() instead. X.npz contains:

X['data']: a long vector of only 1's
X['shape']: a vector describing the correct dimensions of the expected output tensor
X['fill_value']: a vector with just a single zero in it
X['coords']: a vector with 3 rows and the same number of columns as the length of X['data']

Is this an error or do I need to process this output first in order to get the sparse N x L x D tensor? I did not see anything in the documentation or paper regarding this. Cheers.
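The four arrays listed above look like a coordinate-list (COO) layout: one column of `coords` per stored value in `data`. Assuming that reading is right, a dense tensor can be rebuilt with NumPy alone, as in the sketch below. `coo_to_dense` is a hypothetical helper and the toy arrays are made up. The layout also resembles what the pydata `sparse` package writes, so `sparse.load_npz` from that package (not `scipy.sparse`) may load the file directly; that is an assumption, not something confirmed in this thread.

```python
import numpy as np


def coo_to_dense(coords, data, shape, fill_value=0):
    """Build a dense array from COO-style arrays (one coords column per value)."""
    coords = np.asarray(coords)
    data = np.asarray(data)
    # fill_value may arrive as a 0-d or 1-element array; take its single entry.
    fill = np.asarray(fill_value).reshape(-1)[0]
    dense = np.full(tuple(int(s) for s in shape), fill, dtype=data.dtype)
    # Each row of coords indexes one axis, so the tuple acts as a fancy index.
    dense[tuple(coords)] = data
    return dense


# Toy example: three stored ones in a 2 x 4 x 5 tensor.
coords = np.array([[0, 1, 1], [0, 2, 3], [1, 0, 4]])
data = np.ones(3, dtype=int)
shape = np.array([2, 4, 5])
X = coo_to_dense(coords, data, shape, fill_value=np.array([0]))
print(X.shape, X.sum())

# With the real file this would be roughly:
# with np.load("X.npz") as f:
#     X = coo_to_dense(f["coords"], f["data"], f["shape"], f["fill_value"])
```

A dense N x L x D array can be very large, so keeping the data in sparse form may be preferable when memory is tight.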

Dimension error reproduction eICU experiments

Hi!

I tried to replicate the eICU experiments with the discretize option turned off, but got an error in the FIDDLE code saying "TypeError: bad operand type for unary ~: 'float'". I adjusted the FIDDLE code and eventually it worked, but then I got a dimension error while training the CNN: it said that matrix 1 could not be multiplied with matrix 2 because their shapes did not match.

Do you have any idea in what direction I could go to fix this problem? Thank you so much!
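One common way this TypeError arises is applying `~` to an object array that mixes booleans with a float NaN. The failing line is not shown here, so this is an assumption about the cause rather than a diagnosis. The sketch reproduces the message and shows one possible workaround; treating missing entries as False is an illustrative choice, not necessarily what FIDDLE intends.

```python
import numpy as np

# Object array mixing a float NaN with booleans: a hypothetical stand-in
# for a column that contains missing values.
values = np.array([np.nan, True, False], dtype=object)

try:
    ~values
except TypeError as err:
    message = str(err)
    print(message)

# Treat missing entries as False, cast to a real boolean array, then negate.
is_missing = np.array([isinstance(v, float) and np.isnan(v) for v in values])
as_bool = np.where(is_missing, False, values).astype(bool)
negated = ~as_bool
print(negated)
```

The later matrix-shape error during CNN training would be consistent with the feature dimension changing once discretization is turned off, in which case the model's configured input size needs to match the new feature count. That too is a guess from the description above.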
