integrated-ehr-pipeline's Introduction

Integrated-EHR-Pipeline

  • A project that refines the pre-processing code from UniHPF

Install Requirements

  • NOTE: This repository requires Python >= 3.9 and Java >= 8.
  • NOTE: Because of a performance issue in the transformers library, transformers==4.29.1 is recommended.
pip install numpy pandas tqdm treelib transformers==4.29.1 pyspark

How to Use

main.py --ehr {eicu, mimiciii, mimiciv}
  • It automatically downloads the corresponding dataset from PhysioNet, but this requires a PhysioNet account with the appropriate credentials.
  • You can also use an already downloaded dataset with the --data {data path} option.
  • A sample implementation of a PyTorch dataset is available in sample_dataset.py.

integrated-ehr-pipeline's People

Contributors

starmpcc, ji-youn-kim, jwoo5

integrated-ehr-pipeline's Issues

PySpark Timezone Issue

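from pyspark.sql import SparkSession
import pandas as pd

# Start a local Spark session with 8 worker threads
spark = (
    SparkSession.builder.master("local[8]")
    .appName("Main_Preprocess")
    .getOrCreate()
)

# Load the first 1000 ICU stays; {MIMIC-IV} stands for the dataset root
df = pd.read_csv('/{MIMIC-IV}/icustays.csv.gz', nrows=1000)
df.head()
# Parse intime into a pandas datetime column
df['intime'] = pd.to_datetime(df['intime'])
# Convert to a Spark DataFrame and display it; compare intime with df.head()
df = spark.createDataFrame(df)
df.show()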

Problem

  • Occurs when the local time zone differs from the JVM (PySpark) time zone, i.e. when spark.conf.get('spark.sql.session.timeZone') does not match the time zone reported by the shell command date.

Effect

  • Intime/Outtime... are shifted but Charttime/Labtime... are not, so events are filtered incorrectly.
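
To make the effect concrete, here is a minimal sketch in plain Python. The filter function, the stay boundaries, the events, and the 9-hour shift are all hypothetical values chosen for illustration; the pipeline's real filtering logic may differ.

```python
from datetime import datetime, timedelta


def events_in_stay(events, intime, outtime):
    """Keep events whose charttime falls inside [intime, outtime]."""
    return [e for e in events if intime <= e["charttime"] <= outtime]


# Hypothetical ICU stay and events (made-up values, not taken from MIMIC-IV)
intime = datetime(2024, 1, 1, 8, 0)
outtime = datetime(2024, 1, 2, 8, 0)
events = [
    {"name": "first_lab", "charttime": datetime(2024, 1, 1, 9, 0)},
    {"name": "last_lab", "charttime": datetime(2024, 1, 2, 7, 0)},
]

# Expected behaviour: both events lie inside the stay
correct = events_in_stay(events, intime, outtime)

# Simulated bug: only the stay boundaries are shifted (assumed here to be +9 hours),
# while the event times stay as they were
shift = timedelta(hours=9)
shifted = events_in_stay(events, intime + shift, outtime + shift)

print([e["name"] for e in correct])  # ['first_lab', 'last_lab']
print([e["name"] for e in shifted])  # ['last_lab']
```

With the shifted boundaries, the event recorded one hour after admission falls outside the window and is dropped, even though it belongs to the stay.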

Solution (Temporary)

  • Set the session time zone to match the local time zone, e.g. spark.conf.set('spark.sql.session.timeZone', 'Asia/Seoul')
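
As a possible variation on this workaround, the sketch below derives the setting from the machine's current UTC offset rather than hard-coding a zone name. The helper names are hypothetical and not part of this repository, and `spark` is assumed to be an existing SparkSession. It relies on Spark accepting zone offsets such as '+09:00' for spark.sql.session.timeZone. A fixed offset does not follow daylight saving changes, so a region name is likely the safer choice where DST applies.

```python
from datetime import datetime, timedelta, timezone


def format_utc_offset(offset: timedelta) -> str:
    """Format a UTC offset as '+HH:MM' or '-HH:MM'."""
    total_minutes = int(offset.total_seconds() // 60)
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"


def local_offset_string() -> str:
    """Return the current UTC offset of the local (Python process) time zone."""
    local_now = datetime.now(timezone.utc).astimezone()
    return format_utc_offset(local_now.utcoffset())


def align_session_timezone(spark) -> str:
    """Set the Spark session time zone to the local UTC offset.

    `spark` is assumed to be an existing SparkSession (hypothetical helper).
    """
    tz = local_offset_string()
    spark.conf.set("spark.sql.session.timeZone", tz)
    return tz
```

Calling align_session_timezone(spark) right after the session is created, and before any pandas DataFrame is converted, keeps the setting in one place.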
