
metagwastoolkit's Introduction

MetaGWASToolKit

v1.1 "Willibrord"


A ToolKit to perform a Meta-analysis of Genome-Wide Association Studies (GWAS). Check out the wiki for more details.

MetaGWASToolKit is a set of scripts that executes a fully automated meta-analysis of GWAS. It is an extension of MANTEL, originally developed by Paul I.W. de Bakker, Sara L. Pulit, and Jessica van Setten; many of its features were described previously and were later extended by Winkler T.W. et al.

In the first step, MetaGWASToolKit automatically parses, harmonizes, and cleans the summary statistics of each individual GWAS. In the second step, the user inspects the summary plots for each individual GWAS, including Manhattan plots, QQ plots, Z-P plots, frequency plots, distributions of effect sizes, etc. In the third and fourth steps, the meta-analysis is prepared and subsequently executed. In the fifth step, the results of the meta-analysis can be inspected, as the filtered and annotated summary statistics and plots are created. Fixed-effects and random-effects models, as well as Z-score-based analyses, are executed by default. Heterogeneity among cohorts is quantified using the I² and Q statistics. When genome-wide significant hits are present, clumping is done automatically and regional association plots are generated.
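For reference, the two heterogeneity measures are directly related: with k cohorts, I² = 100% × max(0, (Q − df)/Q), where df = k − 1, so I² expresses the proportion of the variation in effect sizes across cohorts that is due to heterogeneity rather than chance.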

The necessary files for post-GWAS analyses are also produced, including those for Mendelian randomization through MR-Base and LD Score regression through LD Hub. Currently, meta-analyses using 1000G phase 1, 1000G phase 3, and HRC r1.1 as a reference are supported. Note that MetaGWASToolKit will accept multi-allelic variants coded as bi-allelic variants (each allele combination written on its own line/row); however, it adheres to strict rules: a variant is only valid when it can be precisely matched to the chosen reference. Variants that cannot be matched will be analyzed, but flagged. In principle it is possible to make it work for legacy references too, e.g. HapMap2; please raise an issue for support on this.
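For example, a tri-allelic variant at one position would enter the summary statistics as two bi-allelic rows, one allele combination per row (the column layout here is purely illustrative):

	CHR  POS     VariantID  EffectAllele  OtherAllele
	1    123456  rs1234     C             A
	1    123456  rs1234     T             A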

Future versions

Scripts to execute fine-mapping, create regional (Miami) plots to compare results across traits, and perform formal colocalization analyses, as well as PolarMorphism, will be added in future versions.


The MIT License (MIT)

Copyright (c) 2015-2022 Sander W. van der Laan | s.w.vanderlaan [at] gmail [dot] com | vanderlaanand.science.

metagwastoolkit's People

Contributors

moez-baksi, mvpuijk, swvanderlaan, xemz


metagwastoolkit's Issues

Meta-analysis: param-file generator

Add in a params-file generator (cohort name, lambda [after QC], average sample size, beta-correction factor) 🚧
If you have many cohorts (10+) or many (sub-)meta-analyses, creating the params-file by hand can be a nuisance. We should make a Perl/Python script (probably fastest) to print out these things in one go; a minimal sketch of the idea follows below.
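A minimal sketch of the idea in bash (the input file cohort.stats.txt is hypothetical, holding one line per cohort with name, lambda, average sample size, and beta-correction factor; the space-delimited params-file layout is an assumption for illustration):

	# Emit one params-file line per cohort from a pre-collected stats table.
	while read -r COHORT LAMBDA AVGN BETACOR; do
		printf "%s %s %s %s\n" "${COHORT}" "${LAMBDA}" "${AVGN}" "${BETACOR}"
	done < cohort.stats.txt > metagwastoolkit.params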

Clumps are 'cleared'

For some reason parseClumps.pl effectively empties the *.clumped files while reading them, so their contents are cleared. It is not really a problem, since the contents are parsed and printed to a new file, but ideally the original results should be kept in the input files (i.e. the *.clumped files).
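Until this is fixed, a minimal safeguard (a bash sketch; file locations are illustrative) is to back up the PLINK output before parsing:

	# Keep a pristine copy of each *.clumped file before parseClumps.pl touches it;
	# -n never overwrites an existing backup.
	for CLUMPFILE in *.clumped; do
		cp -n "${CLUMPFILE}" "${CLUMPFILE}.original"
	done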

Meta-analysis: add --verbose function

Add --verbose and other flags of METAGWAS.pl to metagwastoolkit.conf. Currently, as a default, the meta-analysis is done in --verbose mode, i.e. all relevant data of each cohort are added to the final meta-analysis output. This can be troublesome when tens or hundreds of GWAS datasets are analyzed. In the next version this behaviour can be changed by setting the appropriate flag in metagwastoolkit.conf; see the sketch below. Note: this script needs fixing. 🚧
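Something along these lines could go into metagwastoolkit.conf (a sketch only; METAVERBOSE is a hypothetical variable name, not an existing setting):

	# Hypothetical setting: pass --verbose to METAGWAS.pl to keep per-cohort
	# data in the final output; set to "" for compact output.
	METAVERBOSE="--verbose"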

Array jobs

Is your feature request related to a problem? Please describe.
Currently the main scripts chunk the files and submit jobs for each individual file. This increases the load on the server and eats into your fair-use quota.

Describe the solution you'd like
The individual submission scripts should be wrapped in an array job to reduce the load on the server; a minimal sketch is given at the end of this issue.

Describe alternatives you've considered
An alternative would be not to chunk the files, but this would be inefficient and significantly increase the time needed to run an analysis.

Additional context
None.
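A minimal sketch of the array-job approach, assuming the chunked files are listed one per line in a (hypothetical) chunk.list and handled by a (hypothetical) process_chunk.sh:

	#!/bin/bash
	#SBATCH --job-name=metagwastoolkit.chunks
	#SBATCH --output=chunk_%A_%a.log
	#SBATCH --array=1-100%25    # one task per chunk; at most 25 run at once
	#SBATCH --time=01:00:00
	#SBATCH --mem=8G
	# Each array task looks up its own chunk by task ID and processes it.
	CHUNK=$(sed -n "${SLURM_ARRAY_TASK_ID}p" chunk.list)
	bash process_chunk.sh "${CHUNK}"

One sbatch submission then covers all chunks, which schedulers typically count far more favorably against fair-use limits than hundreds of separate jobs.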

Stratified QQ plots with lambda per bin

Is your feature request related to a problem? Please describe.
To assess the trade-off between allele frequency and the imputation-quality (INFO) metric, it would be great to calculate the lambda per bin too, in addition to the number of variants.

Describe the solution you'd like
Add the lambda per bin, next to the number of SNPs per bin. In addition, increase the font size of the legend (specifically the diamonds).

Describe alternatives you've considered
No alternatives were considered, other than running QQ plots separately for pre-calculated bins.

Additional context
See the example script below.

## READS input options
rm(list=ls())
input=commandArgs()[7]
input=substr(input,2,nchar(input))

output=commandArgs()[8]
output=substr(output,2,nchar(output))

## Plot function: plots significant points individually, samples the rest ##
plotQQ <- function(z,color,cex){
	p <- 2*pnorm(-abs(z))
	p <- sort(p)
	expected <- c(1:length(p))
	lobs <- -(log10(p))
	lexp <- -(log10(expected / (length(expected)+1)))

	# plots all points with p < 1e-3
	p_sig = subset(p,p<0.001)
	points(lexp[1:length(p_sig)], lobs[1:length(p_sig)], pch=23, cex=.3, col=color, bg=color)

	# samples ~5000 points from p > 1e-3
	n=5001
	i <- c(length(p) - c(0,round(log(2:(n-1))/log(n)*length(p))),1)
	lobs_bottom=subset(lobs[i],lobs[i] <= 3)
	lexp_bottom=lexp[i[1:length(lobs_bottom)]]
	points(lexp_bottom, lobs_bottom, pch=23, cex=cex, col=color, bg=color)
}

## Plot function: plots all points ##
plotQQ2 <- function(z,color,cex){
	p <- 2*pnorm(-abs(z))
	p <- sort(p)
	expected <- c(1:length(p))
	lobs <- -(log10(p))
	lexp <- -(log10(expected / (length(expected)+1)))

	points(lexp[1:length(p)], lobs[1:length(p)], pch=23, cex=.3, col=color, bg=color)
}
## Reads data and bins variants by coded allele frequency (CAF)
S <- read.table(input,header=T)
z=qnorm(S$P/2)
z_lo00=subset(S, ( S$CAF > 0.99 | S$CAF < 0.01 ))
z_lo01=subset(S, ( S$CAF > 0.20 & S$CAF < 0.80 ))
z_lo02=subset(S, ( S$CAF < 0.20 & S$CAF > 0.05 ) | ( S$CAF > 0.80 & S$CAF < 0.95 ))
z_lo03=subset(S, ( S$CAF < 0.05 & S$CAF > 0.01 ) | ( S$CAF > 0.95 & S$CAF < 0.99 ))

z_lo0=qnorm(z_lo00$P/2)
z_lo1=qnorm(z_lo01$P/2)
z_lo2=qnorm(z_lo02$P/2)
z_lo3=qnorm(z_lo03$P/2)

## Calculates lambda (overall and per MAF bin)
lambda = round(median(z^2)/qchisq(0.5,df=1),3)
l0 = round(median(z_lo0^2)/qchisq(0.5,df=1),3)
l1 = round(median(z_lo1^2)/qchisq(0.5,df=1),3)
l2 = round(median(z_lo2^2)/qchisq(0.5,df=1),3)
l3 = round(median(z_lo3^2)/qchisq(0.5,df=1),3)

## Plots axes and null distribution
pdf(paste(output,"qqplot_maf.pdf",sep="."), width=6, height=6)
plot(c(0,8), c(0,8), col="red", lwd=3, type="l", xlab="Expected Distribution (-log10 of P value)", ylab="Observed Distribution (-log10 of P value)", xlim=c(0,8), ylim=c(0,8), las=1, xaxs="i", yaxs="i", bty="l",main=c(substitute(paste("QQ plot: ",lambda," = ", lam),list(lam = lambda)),expression()))

## Plots data per MAF bin
plotQQ(z,"black",0.4)
plotQQ(z_lo1,"olivedrab1",0.3)
plotQQ(z_lo2,"orange",0.3)
plotQQ(z_lo3,"lightskyblue",0.3)
plotQQ(z_lo0,"purple",0.3)

## Provides legend (per-bin lambdas; the commented version shows per-bin variant counts)
#legend(.25,8,legend=c("Expected (null)","Observed",
#paste("MAF > 0.20 [",length(z_lo1),"]"),
#paste("0.05 < MAF < 0.2 [",length(z_lo2),"]"),
#paste("0.01 < MAF < 0.05 [",length(z_lo3),"]"),
#paste("MAF < 0.01 [",length(z_lo0),"]")),
#pch=c((vector("numeric",6)+1)*23), cex=c((vector("numeric",6)+0.8)), pt.bg=c("red","black","olivedrab1","orange","lightskyblue","purple"))

legend(.25,8,legend=c("Expected (null)","Observed",
substitute(paste("MAF > 0.20 [", lambda," = ", lam, "]"),list(lam = l1)),expression(),
substitute(paste("0.05 < MAF < 0.20 [", lambda," = ", lam, "]"),list(lam = l2)),expression(),
substitute(paste("0.01 < MAF < 0.05 [", lambda," = ", lam, "]"),list(lam = l3)),expression(),
substitute(paste("MAF < 0.01 [", lambda," = ", lam, "]"),list(lam = l0)),expression()),
pch=c((vector("numeric",6)+1)*23), cex=c((vector("numeric",6)+0.8)), pt.bg=c("red","black","olivedrab1","orange","lightskyblue","purple"))

rm(z)
dev.off()
## (plotQQ and plotQQ2 as defined above)
## Bins variants by imputation quality (INFO); S is read above
z=qnorm(S$P/2)
z_lo01=subset(S, S$INFO > 0.75)
z_lo02=subset(S, ( S$INFO < 0.75 & S$INFO > 0.5 ) )
z_lo03=subset(S, ( S$INFO < 0.5 & S$INFO > 0.25 ) )
z_lo04=subset(S, ( S$INFO < 0.25 ) )

z_lo1=qnorm(z_lo01$P/2)
z_lo2=qnorm(z_lo02$P/2)
z_lo3=qnorm(z_lo03$P/2)
z_lo4=qnorm(z_lo04$P/2)

## Calculates lambda (overall and per INFO bin)
lambda = round(median(z^2)/qchisq(0.5,df=1),3)
l1 = round(median(z_lo1^2)/qchisq(0.5,df=1),3)
l2 = round(median(z_lo2^2)/qchisq(0.5,df=1),3)
l3 = round(median(z_lo3^2)/qchisq(0.5,df=1),3)
l4 = round(median(z_lo4^2)/qchisq(0.5,df=1),3)

## Plots axes and null distribution
pdf(paste(output,"qqplot_impq.pdf",sep="."), width=6, height=6)
plot(c(0,8), c(0,8), col="red", lwd=3, type="l", xlab="Expected Distribution (-log10 of P value)", ylab="Observed Distribution (-log10 of P value)", xlim=c(0,8), ylim=c(0,8), las=1, xaxs="i", yaxs="i", bty="l",main=c(substitute(paste("QQ plot: ",lambda," = ", lam),list(lam = lambda)),expression()))

## Plots data per INFO bin
plotQQ(z,"black",0.4)
plotQQ(z_lo1,"olivedrab",0.3)
plotQQ(z_lo2,"olivedrab1",0.3)
plotQQ(z_lo3,"orange",0.3)
plotQQ(z_lo4,"lightskyblue",0.3)

## Provides legend (per-bin lambdas; the commented version shows per-bin variant counts)
#legend(.25,8,legend=c("Expected (null)","Observed",
#paste("impq > 0.75 [",length(z_lo1),"]"),
#paste("0.5 < impq < 0.75 [",length(z_lo2),"]"),
#paste("0.25 < impq < 0.5 [",length(z_lo3),"]"),
#paste("impq < 0.25 [",length(z_lo4),"]")),
#pch=c((vector("numeric",6)+1)*23), cex=c((vector("numeric",6)+0.8)), pt.bg=c("red","black","olivedrab","olivedrab1","orange","lightskyblue"))
legend(.25,8,legend=c("Expected (null)","Observed",
substitute(paste("imp qual > 0.75 [", lambda," = ", lam, "]"),list(lam = l1)),expression(),
substitute(paste("0.5 < imp qual < 0.75 [", lambda," = ", lam, "]"),list(lam = l2)),expression(),
substitute(paste("0.25 < imp qual < 0.5 [", lambda," = ", lam, "]"),list(lam = l3)),expression(),
substitute(paste("imp qual < 0.25 [", lambda," = ", lam, "]"),list(lam = l4)),expression()),
pch=c((vector("numeric",6)+1)*23), cex=c((vector("numeric",6)+0.8)), pt.bg=c("red","black","olivedrab","olivedrab1","orange","lightskyblue"))

rm(z)
dev.off()
## Manhattan plot
#sig <- subset(S,S$P <= 1e-2)
#nonsig <- subset(S,S$P > 1e-2)
#sampled <- sample(seq(1,nrow(nonsig),1),500000, replace = FALSE, prob = NULL)
#nonsigout <- nonsig[sampled,]
#p <- rbind(sig,nonsigout)

p <- S

p$POS <- p$POS/100
offset <- 0
color="red"
pos <- c()
pos_odd <- c()
p_odd <- c()
pos_even <- c()
p_even <- c()
xAT <- c()
xE <- c(0)
xO <- c(0)
xEND <- c(0)

#pos <- subset(p$POS,p$CHR == 1)
cols <- rainbow(23)

# total plot width: sum of the maximum position per chromosome
maxX = 0
for (i in 1:23){
	pos_i <- subset(p$POS,p$CHR == i)
	maxX = maxX + max(pos_i)
}

#pdf(paste(output,"manhattan.pdf",sep="."), width=16, height=8)
png(paste(output,"manhattan.png",sep="."), width=1500, height=750)

# split at genome-wide significance (P = 5e-8): small squares below, large diamonds above
p1 <- subset(p, p$P >= 5e-8)
p2 <- subset(p, p$P < 5e-8)

for (i in 1:23){
	pos_i <- subset(p1$POS,p1$CHR == i)
	p_i <- subset(p1$P,p1$CHR == i)

	pos_j <- subset(p2$POS,p2$CHR == i)
	p_j <- subset(p2$P,p2$CHR == i)

	# alternate colors per chromosome; chromosome 1 also sets up the canvas
	if (i == 1){
		plot(pos_i,-log10(p_i), pch=15, cex=.5, col="#1E90FF",ylim=c(0,15),xlim=c(0,maxX),xlab="Chromosome",ylab="",main="Genome-wide results",axes=F)
		points(pos_j + offset,-log10(p_j), pch=18, cex=1, col="#1E90FF")
	} else if (i %% 2 == 0){
		points(pos_i + offset,-log10(p_i), pch=15, cex=.5, col="#104E8B")
		points(pos_j + offset,-log10(p_j), pch=18, cex=1, col="#104E8B")
	} else {
		points(pos_i + offset,-log10(p_i), pch=15, cex=.5, col="#1E90FF")
		points(pos_j + offset,-log10(p_j), pch=18, cex=1, col="#1E90FF")
	}

	if (color == "red"){
		color <- "blue"
		pos_odd <- c(pos_odd, pos_i + offset)
		p_odd <- c(p_odd, p_i)
		xO <- c(xO, (max(pos_i) + min(pos_i))/2 + offset)
	} else {
		color <- "red"
		pos_even <- c(pos_even, pos_i + offset)
		p_even <- c(p_even, p_i)
		xE <- c(xE, (max(pos_i) + min(pos_i))/2 + offset)
	}

	pos <- c(pos, pos_i + offset)
	xAT <- c(xAT, (max(pos_i) + min(pos_i))/2 + offset)
	xEND <- c(xEND, max(pos_i) + offset)

	offset <- max(pos)
}

# dotted line at genome-wide significance, -log10(5e-8) = 7.3
lines(c(min(pos), max(pos)), c(7.3,7.3), lty="dotted", lwd=1, col="black")
mtext("-log10 of P value",side=2, at=7.5, line=1)
for (i in 1:23){
	axis(1, at=xAT[i], labels=c(i), cex.axis=1.5, tick=FALSE)
}

axis(1, at=xEND, labels=rep("", length(xEND)), tick=TRUE, cex.axis=0.8)
axis(2, at=c(0,5,10,15), labels=c(0,5,10,15), pos=c(0,0), las=1)

dev.off()

Meta-analysis: total tallies over variants

In the summary of the meta-analysis (below) there is a total tally of variants per category. However, this seems to be a bit off (the error file reports many more variants that were not found in the reference) and therefore needs double-checking.

Example error-file output:

* chrX:86581529:A_G in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.
* chrX:86775594:C_T in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.
* chrX:86777100:C_A in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.
* chrX:86780351:T_C in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.
* chrX:86783054:G_T in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.
* chrX:86783867:G_A in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.
* chrX:86799180:G_T in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.
* chrX:86855260:T_C in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.
* chrX:86859763:T_C in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.
* chrX:86860584:C_A in [ DATA/MODEL1/META/TEMP/meta.all.unique.variants.reorder.split ] is not present in the Variant Annotation File  -- skipping it.

In this example 4,856 variants were skipped, but they are not reported in the summary below as uninformative or otherwise; a quick cross-check is sketched after the summary.

Example summary:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Summarizing this meta-analysis.

* Number of variants in meta-analysis       : 122019.
* Number of variants not in the Reference   : 0.
* Number of uninformative variants skipped  : 0.

          Study name     Allele flips     Sign [beta] flips     Informative variants
          ----------     ------------     -----------------     --------------------
          study1         0                0                     122019
          study2         0                0                     106835
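As a quick cross-check of these tallies, the skipped variants can simply be counted in the step's error log (the file shown above; the file name here is illustrative):

	# Count the "skipping it" messages in the annotation step's error log;
	# ideally this number would appear in the summary as skipped/uninformative.
	grep -c "skipping it" meta.annotation.errors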

Meta-analysis: automatic checking of each cohort

Add in automagical checking of each cohort after cleaning 🚧
Although this is typically something you'd want to check by hand, a basic reporting function covering general cohort statistics and whether certain steps were successful could be very useful. Again, going over each cohort manually when there are a lot of them is quite some work; a rough sketch is given below.
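A rough sketch of what such a report could look like (bash; the cleaned-file extension *.cdat.gz and the location are assumptions for illustration):

	# Hypothetical per-cohort report after cleaning: file name and variant count.
	for CLEANED in ${PROJECTDIR}/*.cdat.gz; do
		NVARIANTS=$(zcat "${CLEANED}" | tail -n +2 | wc -l)
		echo "$(basename "${CLEANED}"): ${NVARIANTS} variants after cleaning"
	done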

(Plotter) scripts that rely on generating data before queuing slurm command could be sped up

Any part of a script that first generates some data and then writes ".sh" files to queue with sbatch could instead generate that data inside the ".sh" files themselves.
For example:

	echo "- producing normal QQ-plots..." # P-value
	zcat ${PROJECTDIR}/${COHORTNAME}.${DATAEXT} | ${SCRIPTS}/parseTable.pl --col P | tail -n +2 > ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.txt

	printf "#!/bin/bash\nRscript ${SCRIPTS}/plotter.qq.R --projectdir ${PROJECTDIR} --resultfile ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.txt --outputdir ${PROJECTDIR} --stattype ${STATTYPE} --imageformat ${IMAGEFORMAT}" > ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.sh
	## qsub -S /bin/bash -N ${COHORTNAME}.${DATAPLOTID}.QQ -o ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.log -e ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.errors -l h_vmem=${QMEMPLOTTER} -l h_rt=${QRUNTIMEPLOTTER} -wd ${PROJECTDIR} ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.sh
	QQ_ID=$(sbatch --parsable --job-name=${COHORTNAME}.${DATAPLOTID}.QQ -o ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.log --error ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.errors --time=${QRUNTIMEPLOTTER} --mem=${QMEMPLOTTER} --mail-user=${QMAIL} --mail-type=${QMAILOPTIONS} --chdir=${PROJECTDIR}/ ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.sh)

Would become something like:

	echo "- producing normal QQ-plots..." # P-value
	printf "#!/bin/bash\nzcat ${PROJECTDIR}/${COHORTNAME}.${DATAEXT} | ${SCRIPTS}/parseTable.pl --col P | tail -n +2 > ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.txt\n"
	printf "Rscript ${SCRIPTS}/plotter.qq.R --projectdir ${PROJECTDIR} --resultfile ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.txt --outputdir ${PROJECTDIR} --stattype ${STATTYPE} --imageformat ${IMAGEFORMAT}" >> ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.sh
	## qsub -S /bin/bash -N ${COHORTNAME}.${DATAPLOTID}.QQ -o ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.log -e ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.errors -l h_vmem=${QMEMPLOTTER} -l h_rt=${QRUNTIMEPLOTTER} -wd ${PROJECTDIR} ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.sh
	QQ_ID=$(sbatch --parsable --job-name=${COHORTNAME}.${DATAPLOTID}.QQ -o ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.log --error ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.errors --time=${QRUNTIMEPLOTTER} --mem=${QMEMPLOTTER} --mail-user=${QMAIL} --mail-type=${QMAILOPTIONS} --chdir=${PROJECTDIR}/ ${PROJECTDIR}/${COHORTNAME}.${DATAPLOTID}.QQ.sh)

This should speed things up by making the generation of the necessary data part of the sbatch commands themselves, so it can run at the same time as other similar commands instead of stopping the script dead in its tracks until the data is generated.

WARNING: This approach will only work if the generated data is used by a single resulting sbatch command, like the QQ plot in "gwas.plotter.sh". If the generated data is used by multiple commands, like the Manhattan plots generated in "gwas.plotter.sh", then the data generation will need to stay separate (although it could still be turned into an sbatch command of its own and made a dependency of the Manhattan plots).

QQ-plot: maximize -log10(P)

Add an option to cap the maximum -log10(P): for instance, everything with p < 5.0e-10 is set to p = 5.0e-10, while the lambdas are still calculated from the original p-values.
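A pre-processing sketch of the capping idea (bash/awk; assumes the one-column P-value files produced for the QQ plots, and that lambda is computed upstream from the uncapped file):

	# Cap P at 5.0e-10 for plotting only; lambda should be computed
	# from the original, uncapped file.
	awk '{ p = $1 + 0; if (p < 5.0e-10) p = 5.0e-10; print p }' \
		cohort.QQ.txt > cohort.QQ.capped.txt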
