The scripts generate random data drawn from a uniform distribution. The data therefore cannot be used for clinical research and does not reflect the real distribution of disease across the population.
The CRF JSON is based on the CanCOGen CRF Excel file.
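As a rough illustration of what uniform sampling means here, the sketch below builds a few records by picking every value with equal probability. It is not the code in generator.py: the field names, the allowed values, and the helper functions are all hypothetical and stand in for whatever the real CanCOGen CRF defines.

```python
import json
import random

# Hypothetical fields and allowed values (assumption, not the real CRF schema).
FIELD_CHOICES = {
    "sex": ["female", "male", "unknown"],
    "hospitalized": ["yes", "no"],
    "outcome": ["recovered", "ongoing", "deceased"],
}


def generate_record(rng):
    """Build one record, picking each value with equal probability."""
    record = {field: rng.choice(values) for field, values in FIELD_CHOICES.items()}
    # Hypothetical numeric field: uniform over the integers 0..100 inclusive.
    record["age"] = rng.randint(0, 100)
    return record


def generate_records(number_of_patients, seed=None):
    """Return a list of independent, uniformly sampled records."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    return [generate_record(rng) for _ in range(number_of_patients)]


if __name__ == "__main__":
    print(json.dumps(generate_records(3, seed=0), indent=2))
```

Because every value is equally likely and fields are sampled independently, rare and common outcomes appear at the same rate and no correlations between fields are preserved, which is why such data is suitable only for testing software.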
Requirements:
Python 3.7
To install dependencies run:
pip install -r requirements.txt
Usage:
- Generate a list of CanCOGen CRFs in JSON:
python generator.py [--number_of_patients=100 --filename=output]
- Convert a list of CanCOGen CRFs to a list of Phenopackets:
python converter.py --input_file=cancogen_crf.json [--output_filename=output]