
consensus's Introduction

---
title: "Project Consensus"
Institution: |
  | University of Chicago
author: |
  | M.A. Camacho Jonathan
  | Dr. Valentin Danchev
---

We seek to understand how the use of p-values has evolved historically. Drawing on sociological theories of institutionalization and collective attention, together with computational methods, we hypothesize that the institutionalization of p-values led to a decrease in the level of their specification in research articles, signifying that p-values have come to be taken for granted. We expect to contribute to debates about the misuse of p-values in academic fields such as the life sciences, biomedical sciences, and psychology.

We manage, clean, and transform data from 300,000+ public health articles from JSTOR and Europe PMC, and mine p-values from the text by querying a relational database of 1,200,000+ entries using R, Python, and SQL.
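As a rough illustration of what mining p-values from article text can involve, here is a minimal Python sketch based on a regular expression. The pattern, the name `find_p_values`, and the sample sentence are assumptions made for this example; they are not taken from the project's scripts, and real article text will likely need a more forgiving pattern.

```python
import re

# Hypothetical pattern: matches forms such as "p < 0.05", "P = .32",
# "p-value = 0.01", or "p <= 1e-4". It is illustrative, not exhaustive.
P_VALUE_RE = re.compile(
    r"\bp\s*(?:-?\s*value\s*)?"                    # "p", "p value", "p-value"
    r"(<=|>=|<|>|=|≤|≥)\s*"                        # comparison operator
    r"(\d*\.\d+(?:e-?\d+)?|\d+(?:e-?\d+)?)",       # decimal or integer, optional exponent
    re.IGNORECASE,
)


def find_p_values(text):
    """Return an (operator, value) pair for each p-value found in text."""
    return [(op, float(num)) for op, num in P_VALUE_RE.findall(text)]


sample = "The effect was significant (p < 0.05); the control was not (P = .32)."
print(find_p_values(sample))
```

Keeping the comparison operator alongside the number preserves the difference between an exact report (`p = 0.032`) and a threshold report (`p < 0.05`), which matters when studying how fully p-values are specified.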

Files

All the files are contained in the scripts folder.

  1. database.connection.ipynb: Develops a function in Python and MySQL that queries a relational database of 1,200,000+ entries at the Knowledge Lab - University of Chicago. The function establishes the connection to the database using helper functions located in the functions.py file, uses SQL DML statements to select features or variables from multiple tables, performs a table merge, and creates a dataset in CSV format for further analysis.

  2. functions.py: Helper functions for database querying interface.
    File containing a series of helper functions coded in Python to aid in the analysis of p-values included in text files.

  • read_db_config: Read database configuration file and return a dictionary object.
  • iter_row: Read database configuration file and return a dictionary object.
  • extract_p_values: (Prototype) A function to extract p values from a string of text.
  3. GetPvaluesEuropePMC_jonathan.r: An R script that queries the Europe PMC API and extracts p-values from 170,000+ Europe PMC files, using a list of search terms applied to text extracted from a dataframe. It then saves an R data file with the pmcids.

  4. clean_journals.R: An R script that takes two datasets containing the full lists of articles in JSTOR and Europe PMC publications, cleans their features, and merges them into one master dataset.

  5. config.ini: Database access credentials. It is not included in the repo for security reasons.
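The workflow described for database.connection.ipynb (read credentials from a config file, select columns from several tables, merge them, and write a CSV) might be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for the MySQL connection so that it runs without a server, and the section name `mysql`, the table and column names (`articles`, `p_values`, `article_id`, `journal`, `p_value`), and both function names are hypothetical rather than taken from the repository.

```python
import configparser
import csv
import sqlite3


def read_config_section(path, section="mysql"):
    """Read one section of an INI file into a plain dict (section name assumed)."""
    parser = configparser.ConfigParser()
    if not parser.read(path):
        raise FileNotFoundError(f"config file not found: {path}")
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] missing from {path}")
    return dict(parser.items(section))


def export_joined_rows(conn, csv_path):
    """Join two hypothetical tables and write the merged rows to a CSV file."""
    query = """
        SELECT a.article_id, a.journal, p.p_value
        FROM articles AS a
        JOIN p_values AS p ON p.article_id = a.article_id
        ORDER BY a.article_id
    """
    cursor = conn.execute(query)
    # Column names come from the cursor so the CSV header matches the query.
    header = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

With a real MySQL server, the dictionary returned by `read_config_section` would typically be passed to the database driver's connect call instead of opening a SQLite connection; the join and CSV steps would stay essentially the same.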

