ilarsf / gwastools Goto Github PK
View Code? Open in Web Editor NEWBasic and fast GWAS functions for QQ and Manhattan plots (incl. gene names)
Basic and fast GWAS functions for QQ and Manhattan plots (incl. gene names)
Hi,
For a collaboration project, we need to make Manhattan plots from our GWAS data. I have been able to use the Rscript ManhattanPlot.r and that given the following message.
Please run the following code line by line in an interactive R session:
library(Map2NCBI)
library(data.table)
file.genelist37 <- path.expand("~/GeneList_Homo_sapiens_BUILD.37.3.txt")
genelist37 <- GetGeneList_v11("Homo sapiens",build="BUILD.37.3",savefiles=F,destfile=tempfile())
fwrite(genelist37,file.genelist37,sep="\t",quote=T)
Unfortunately, the ftp://ftp.ncbi.nih.gov/genomes/MapView for build37 is not available. Do you have other options for the annotation?
Your help will be much appreciated.
If I run the test data included with SAIGE I get the following results:
CHROM POS ID REF ALT AC MAC MAF AF CHISQ BETA SEBETA_GC PVALUE_GC LOG10P_GC
1 4 rs4 2 1 20 20 0.01 0.01 0.00900168133609557 -0.0748360120098867 0.788767156474349 0.924 0.0341342498337596
1 9 rs9 2 1 4 4 0.002 0.002 0.778767920480584 -1.36954622722431 1.5519319320999 0.378 0.42306208504815
1 10 rs10 2 1 4 4 0.002 0.002 0.198787281155206 0.655246816602172 1.46963884563484 0.656 0.183293820523815
1 11 rs11 2 1 2 2 0.001 0.001 0.0982969720215684 -1.10691964528169 3.53057974490197 0.754 0.122696058298277
1 12 rs12 2 1 30 30 0.015 0.015 0.0187906336497365 -0.0897892694469568 0.655018740034906 0.891 0.0501376964185931
1 14 rs14 2 1 75 75 0.0375 0.0375 0.0298836109133599 -0.0747761355054132 0.432560123012224 0.863 0.0641127257257747
1 15 rs15 2 1 244 244 0.122 0.122 1.32622747205384 -0.28363607944272 0.24629322403832 0.249 0.602968677777628
1 17 rs17 2 1 12 12 0.006 0.006 0.00243722719101353 -0.0411351808952507 0.833230970885822 0.961 0.0174457511091217
1 20 rs20 2 1 762 762 0.381 0.381 0.000120928933151418 -0.00181338599534426 0.164901705220366 0.991 0.00382730415769189
1 22 rs22 2 1 459 459 0.2295 0.2295 0.559747345486054 0.144922266593042 0.193704226331097 0.454 0.342597941239989
But if I run your script on these data I get the message:
Error in seq.default(2, length(chrLab), by = 2) : wrong sign in 'by' argument Calls: axis -> seq -> seq.default Execution halted
I do not know if fixing this is worth your time, but I thought I should report the error :)
I just downloaded the updated scripts (my old was probably several years old), but I have an issue with the Manhattan plot script.
The error message I get is
Error in candidateRegions$POS[r] <- rhits$POS[which.max(rhits$LOG10P)] :
replacement has length zero
Calls: ManhattanPlot
In addition: Warning message:
In as.data.table.list(x, keep.rownames = keep.rownames, check.names = check.names, :
Item 3 has 0 rows but longest item has 1; filled with NA
I have tried to run the script on two phenotypes, where it works on one, but not the other. I have gone through the files, but I am not able to find anything wrong; all columns have value within expected ranges, with no NA. They should also be sorted on chr - pos. For the phenotype giving an error, I tried with the old script I had on the cluster, and it works fine.
This is the first ten lines of the input file (which is tab-separated)
CHR POS P MAF
1 85022 0.99079011268437 0.00404269760474563
1 135195 0.52322468045994 0.00930232927203178
1 233476 0.44981101380145 0.00814521033316851
1 233582 0.286454369491075 0.00564469257369637
1 554968 0.294178441484736 0.00653086602687836
1 569994 0.615267901422401 0.00595780368894339
1 574151 0.622304870569572 0.0127126947045326
1 581812 0.302093715615789 0.0190441608428955
1 648843 0.643641380536429 0.00501822307705879
This is the code I am running: Rscript ManhattanPlot.r --input test.txt --prefix test --chr CHR --pos POS --pvalue P
It guess it must be something wrong with the input file, that is tolerated by the old script, but not the new one, but I have not been able to figure out by looking through the code or the inputfile...
Do you have any ideas what your updated script is not tolerating?
Thankful for any help!
Edit: The only difference I could find between the two phenotypes, is that the phenotype that did not work, did not have any significant SNPs. For the file where the script worked, I removed the significant SNP(s). Then running the Manhattan plot script again I get the error. Could it be a bug in the script that requires the resultfile to contain significant SNPs?
If you included the file ExampleGWAS.txt
in the repository it would be much easier for others to understand how to use the tool. I cannot see that the format of the input file is described anywhere.
Thanks for making the code available btw, I really appreciate it ๐
I have successfully included a hitregions file in an older version of the Manhattan plot, but in the newest version i get an error message:
Error in merge.data.table(plotdata[, .(CHROM, POS, x, y)], candidateRegions, :
Elements listed in `by` must be valid column names in x and y
Calls: ManhattanPlot -> merge -> merge.data.table
This is the tabseparated candidate region file (I tried to change the last column name from LEGENDTEXT to LABEL, as indicated in the --help function), but nothing changed
CHROM START END COL LABEL
8 65984044 66384044 green Novel_locus
7 20490738 20890738 green Novel_locus
12 22975219 24975219 blue Known_locus
8 129718772 131718772 blue Known_locus
18 49379032 51379032 blue Known_locus
This is the code I am running
Rscript ManhattanPlot.r --input plots/low_back_pain_H23_forplots.txt --prefix test --chr CHR --pos POS --pvalue p.value --hitregion hitregions.txt --maintitle 'test'
I might have missed an update on how to make the hitregion file, but the only difference I could see by comparing the two scripts, is the header of the last column (LEGENDTEXT vs. LABEL). The excact same code and files (except for changing the header of the last column) works when running the older script. It also works fine when dropping the --hitregion option.
Sorry if I have missed something obvious.
Thanks,
Sigrid
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.