Extracts p values from .pdf, .doc, or .docx files in a folder and its subfolders.
- Put make_pcurve.py in a folder.
- In one or more subfolders of that folder, place the document files that report p values (these can be .doc, .docx, or .pdf files).
- Install Python 3, python-docx, and textract.
See: https://python-docx.readthedocs.io/en/latest/user/install.html and http://textract.readthedocs.io/en/stable/installation.html
If you are using pip, then
pip3 install python-docx
pip3 install textract
should work.
- Run make_pcurve.py, e.g. from the command line:
python3 make_pcurve.py
- Check the output. If the run succeeds, the data are saved in the CSV file scraped_pvalues.csv.
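
For readers who want a feel for what this kind of scraping involves, here is a minimal sketch in Python. It is not the code in make_pcurve.py. The regular expression, the helper names (find_documents, read_text, extract_p_values, write_csv, scrape) and the CSV column names are all assumptions made for illustration; only the file types, the use of textract and the output file name come from the steps above.

```python
import csv
import re
from pathlib import Path

# Assumed pattern: matches reports such as "p = .03", "p < 0.001" or "P=0.05".
# The real script may recognise a different set of formats.
P_VALUE_RE = re.compile(r"\bp\s*([<>=≤≥])\s*(0?\.\d+)", re.IGNORECASE)

# File types named in this README.
DOC_SUFFIXES = {".pdf", ".doc", ".docx"}


def find_documents(root):
    """Return every .pdf/.doc/.docx file under root, searching subfolders too."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in DOC_SUFFIXES
    )


def read_text(path):
    """Extract plain text from a document using textract."""
    import textract  # third-party; imported here so the other helpers work without it
    return textract.process(str(path)).decode("utf-8", errors="replace")


def extract_p_values(text):
    """Return a list of (comparator, value) pairs found in text."""
    return [(m.group(1), float(m.group(2))) for m in P_VALUE_RE.finditer(text)]


def write_csv(rows, out_path):
    """Write (file, comparator, p_value) rows to a CSV file with a header row."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "comparator", "p_value"])  # assumed column names
        writer.writerows(rows)


def scrape(root, out_csv="scraped_pvalues.csv"):
    """Collect p values from every document under root and save them to out_csv."""
    rows = []
    for path in find_documents(root):
        for comparator, value in extract_p_values(read_text(path)):
            rows.append((str(path), comparator, value))
    write_csv(rows, out_csv)
    return rows
```

Calling scrape(".") from the folder that holds the documents would write one row per p value found. A pattern this simple will miss values written in other styles (for example "p-value of 0.04"), so treat it as a starting point only.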