magma_celltyping's Introduction


License: GPL-3

Authors: Brian Schilder, Alan Murphy, Julien Bryois, Nathan Skene

README updated: Apr-12-2023

Introduction

This R package contains code used for testing which cell types can explain the heritability signal from GWAS summary statistics. The method was described in our 2018 Nature Genetics paper.

This package takes GWAS summary statistics and single-cell transcriptome specificity data (in EWCE’s CellTypeData format) as input. It then calculates and returns the enrichment between the GWAS trait and each cell type.

Installation

R

Install MAGMA.Celltyping as follows:

if(!require("remotes")) install.packages("remotes")

remotes::install_github("neurogenomics/MAGMA_Celltyping")
library(MAGMA.Celltyping)

MAGMA

MAGMA.Celltyping now installs the command line software MAGMA automatically when you first use a function that relies on MAGMA (e.g. celltype_associations_pipeline). If you prefer, you can later install other versions of MAGMA with:

MAGMA.Celltyping::install_magma(desired_version="<version>",
                                update = TRUE)

Documentation

Using older versions

With the release of MAGMA_Celltyping 2.0 in January 2022, there have been a number of major updates and bug fixes.

  • Only R>4.0.0 is supported. To use this package with older versions of R, install with: remotes::install_github("neurogenomics/MAGMA_Celltyping@01a9e53")

Bugs/fixes

Having trouble? Search the Issues or submit a new one.

Want to contribute new features/fixes? Pull Requests are welcomed!

Both are most welcome; we want the package to be easy to use for everyone!

Citations

If you use the software then please cite:

Skene, et al. Genetic identification of brain cell types underlying schizophrenia. Nature Genetics, 2018.

The package utilises the MAGMA software developed in the Complex Trait Genetics Lab at VU University (not us!), so please also cite:

de Leeuw, et al. MAGMA: Generalized gene-set analysis of GWAS data. PLoS Comput Biol, 2015.

If you use the EWCE package as well then please cite:

Skene, et al. Identification of Vulnerable Cell Types in Major Brain Disorders Using Single Cell Transcriptomes and Expression Weighted Cell Type Enrichment. Front. Neurosci, 2016.

If you use MungeSumstats to format your summary statistics then please cite:

Murphy, Schilder, & Skene, MungeSumstats: a Bioconductor package for the standardization and quality control of many GWAS summary statistics, Bioinformatics, Volume 37, Issue 23, 1 December 2021, Pages 4593–4596, https://doi.org/10.1093/bioinformatics/btab665

If you use the cortex/hippocampus single cell data associated with this package then please cite the following paper:

Zeisel, et al. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science, 2015.

If you use the midbrain and hypothalamus single cell datasets associated with the 2018 paper then please cite the following papers:

La Manno, et al. Molecular Diversity of Midbrain Development in Mouse, Human, and Stem Cells. Cell, 2016.

Romanov, et al. Molecular interrogation of hypothalamic organization reveals distinct dopamine neuronal subtypes. Nature Neuroscience, 2016.


Contact

UK Dementia Research Institute
Department of Brain Sciences
Faculty of Medicine
Imperial College London
GitHub
DockerHub


magma_celltyping's People

Contributors

al-murphy, alexandruioanvoda, amcalejandro, bschilder, kant, nathanskene


magma_celltyping's Issues

error while processing mouse data

I am getting zero gene orthologs:

ctd2 = prepare.quantile.groups(ctd,specificity_species = "mouse",
                               gwas_species = "human", numberOfBins=40)
[1] "Dropping all genes that do not have 1:1 homologs between the two species"
[1] "Species for which homology classes are available:"
character(0)
[1] "Available species options:"
[1] all    simple
<0 rows> (or 0-length row.names)
Error in check.species(requestedSpecies = species1, allSpecies = allHomologs$`Common Organism Name`): ERROR: species does not match available options
Traceback:

1. prepare.quantile.groups(ctd, specificity_species = "mouse", gwas_species = "human", 
 .     numberOfBins = 40)
2. One2One::analyse.orthology(specificity_species, gwas_species, 
 .     allHomologs)   # at line 7 of file <text>
3. check.species(requestedSpecies = species1, allSpecies = allHomologs$`Common Organism Name`)
4. stop("ERROR: species does not match available options")
5. 

While checking the code, I found that this may be related to load.homologs.r in the One2One package:

    hom_vert = read.table("http://www.informatics.jax.org/downloads/reports/HOM_AllOrganism.rpt",sep="\t",stringsAsFactors = FALSE,quote="")

The column header appears to have changed. I am not sure what the previous header was or how that column is handled downstream, which makes this hard to debug.
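One way to make this kind of breakage easier to diagnose is to check for the expected column before using it. The following is a minimal sketch, not the One2One implementation: check_required_columns and the toy data frame are hypothetical, and only the column name `Common Organism Name` is taken from the error message above.

```r
# Hypothetical helper: fail early, and informatively, if an expected column is absent.
check_required_columns <- function(df, required) {
  missing_cols <- setdiff(required, colnames(df))
  if (length(missing_cols) > 0) {
    stop(sprintf("Expected column(s) not found: %s. Columns present: %s",
                 paste(missing_cols, collapse = ", "),
                 paste(colnames(df), collapse = ", ")),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Toy stand-in for the downloaded homology table (made-up rows).
toy_homologs <- data.frame(
  "Common Organism Name" = c("mouse, laboratory", "human"),
  "Symbol" = c("Gfap", "GFAP"),
  check.names = FALSE,      # keep the spaces in the column names
  stringsAsFactors = FALSE
)

check_required_columns(toy_homologs, "Common Organism Name")
```

With a check like this, a renamed upstream header would be reported by name instead of surfacing later as an empty list of species.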







Installation

Hi! I was installing the latest version on a node, and this error came up (using either devtools::install_github or remotes::install_github):

remotes::install_github("nathanskene/MAGMA_Celltyping")
Downloading GitHub repo nathanskene/MAGMA_Celltyping@master
These packages have more recent versions available.
Which would you like to update?
1: colorspace (1.4-0 -> 1.4-1 ) [CRAN]
2: XML (3.98-1.16 -> 3.98-1.19) [CRAN]
3: CRAN packages only
4: All
5: None
Enter one or more numbers separated by spaces, or an empty line to cancel
1: 5
✔ checking for file ‘/tmp/RtmpTMiZNJ/remotes14476b11eefb/NathanSkene-MAGMA_Celltyping-bdab1fa/DESCRIPTION’ ...
─ preparing ‘MAGMA.Celltyping’:
✔ checking DESCRIPTION meta-information ...
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:31: unexpected section header '\value'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:34: unexpected section header '\description'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:37: unexpected section header '\examples'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:45: unexpected END_OF_INPUT ' '
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/load.magma.results.file.Rd:13: unknown macro '\item'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/load.magma.results.file.Rd:15: unknown macro '\item'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/load.magma.results.file.Rd:17: unknown macro '\item'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/load.magma.results.file.Rd:19: unknown macro '\item'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/load.magma.results.file.Rd:21: unexpected section header '\value'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/load.magma.results.file.Rd:24: unexpected section header '\description'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/load.magma.results.file.Rd:27: unexpected section header '\examples'
Warning: /tmp/Rtmpm8tgPq/Rbuild188917afa83d/MAGMA.Celltyping/man/load.magma.results.file.Rd:31: unexpected END_OF_INPUT ' '
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
Warning: file ‘ctd_AIBS.rda’ has magic number 'versi'
Use of save versions prior to 2 is deprecated
unable to create a ‘datalist’ file: may need the package to be installed
─ building ‘MAGMA.Celltyping_0.99.0.tar.gz’
Warning: invalid uid value replaced by that for user 'nobody'
Installing package into ‘/gfs/devel/avoda/R/x86_64-pc-linux-gnu-library/3.5’
(as ‘lib’ is unspecified)
* installing source package ‘MAGMA.Celltyping’ ...
** R
** data
*** moving datasets to lazyload DB
Warning: file ‘ctd_AIBS.rda’ has magic number 'versi'
Use of save versions prior to 2 is deprecated
Error in load(zfile, envir = envir) : bad restore file magic number (file may be corrupted) -- no data loaded
ERROR: lazydata failed for package ‘MAGMA.Celltyping’
* removing ‘/gfs/devel/avoda/R/x86_64-pc-linux-gnu-library/3.5/MAGMA.Celltyping’
Error in i.p(...) : (converted from warning) installation of package ‘/tmp/RtmpTMiZNJ/file144722b7defb/MAGMA.Celltyping_0.99.0.tar.gz’ had non-zero exit status

The genomeLocFile now directs to /path/to/MAGMA.Celltyping/data//NCBI*

Sent by Mitchell:

While running some analyses in MAGMA.Celltyping v1.08, I came across something that I believe is a minor bug.

It's in the function map.snps.to.genes(), under the section

# Determine which genome build it uses & get path to gene loc file

genome_build = get_genomebuild_for_sumstats(path_formatted)
gene_loc_dir = sprintf("%s/data/",system.file(package="MAGMA.Celltyping"))
if(genome_build == "GRCh37"){genomeLocFile=sprintf("%s/NCBI37.3.gene.loc",gene_loc_dir)}
if(genome_build == "GRCh38"){genomeLocFile=sprintf("%s/NCBI38.gene.loc",gene_loc_dir)}

The genomeLocFile now directs to /path/to/MAGMA.Celltyping/data//NCBI*
Besides the double slashes, the file is actually located in /path/to/MAGMA.Celltyping/extdata/NCBI*
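A possible way to address both points (the doubled slash and the wrong sub-directory) is to build the path with file.path() and point it at extdata. This is only a sketch under assumptions: gene_loc_path is a hypothetical helper, and base_dir stands in for whatever system.file(package="MAGMA.Celltyping") returns on a given installation.

```r
# Hypothetical helper, not the package's code: build the gene location file path.
gene_loc_path <- function(genome_build, base_dir) {
  file_name <- switch(genome_build,
    GRCh37 = "NCBI37.3.gene.loc",
    GRCh38 = "NCBI38.gene.loc",
    stop("Unsupported genome build: ", genome_build, call. = FALSE)
  )
  # file.path() inserts exactly one separator between components,
  # so there is no need for a trailing slash on the directory.
  file.path(base_dir, "extdata", file_name)
}

gene_loc_path("GRCh37", base_dir = "/path/to/MAGMA.Celltyping")
gene_loc_path("GRCh38", base_dir = "/path/to/MAGMA.Celltyping")
```

In real use, base_dir would presumably come from system.file(package="MAGMA.Celltyping"), and it would be worth adding a file.exists() check on the result before passing it to MAGMA.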

Sample ctd datasets not linked

Hello, I was trying to load the ctd_AIBS dataset, but instead it loads ctd (with the 24 level 1 and 149 level 2 of the original ctd). Related, when trying to build (Ubuntu 18.04, R3.5.2, 64bit) I get the following pertinent error:

Warning: objects ‘ctd’, ‘ctd’, ‘ctd’ are created by more than one data call

Here is the full build log, which looks like it may encounter another issue with calculate_celltype_associations that perhaps supersedes this one:

> install_github('nathanskene/MAGMA_Celltyping', force = TRUE)
Downloading GitHub repo nathanskene/MAGMA_Celltyping@master
✔  checking for file ‘/tmp/RtmptlCvgD/remotes17616eeb5fc7/NathanSkene-MAGMA_Celltyping-6bb417d/DESCRIPTION’ ...
─  preparing ‘MAGMA.Celltyping’:
✔  checking DESCRIPTION meta-information ...
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:31: unexpected section header '\value'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:34: unexpected section header '\description'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:37: unexpected section header '\examples'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:45: unexpected END_OF_INPUT '
   '
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/load.magma.results.file.Rd:13: unknown macro '\item'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/load.magma.results.file.Rd:15: unknown macro '\item'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/load.magma.results.file.Rd:17: unknown macro '\item'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/load.magma.results.file.Rd:19: unknown macro '\item'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/load.magma.results.file.Rd:21: unexpected section header '\value'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/load.magma.results.file.Rd:24: unexpected section header '\description'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/load.magma.results.file.Rd:27: unexpected section header '\examples'
   Warning: /tmp/Rtmpiyr4Zp/Rbuild17a44f57a477/MAGMA.Celltyping/man/load.magma.results.file.Rd:31: unexpected END_OF_INPUT '
   '
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  looking to see if a ‘data/datalist’ file should be added
─  building ‘MAGMA.Celltyping_0.99.0.tar.gz’ (4.9s)
   
Installing package into ‘/home/dragon951/R/x86_64-pc-linux-gnu-library/3.5’
(as ‘lib’ is unspecified)
* installing *source* package ‘MAGMA.Celltyping’ ...
** R
** data
*** moving datasets to lazyload DB
Warning: objects ‘ctd’, ‘ctd’, ‘ctd’ are created by more than one data call
** byte-compile and prepare package for lazy loading
Warning: replacing previous import ‘data.table::last’ by ‘dplyr::last’ when loading ‘MAGMA.Celltyping’
Warning: replacing previous import ‘data.table::first’ by ‘dplyr::first’ when loading ‘MAGMA.Celltyping’
Warning: replacing previous import ‘data.table::between’ by ‘dplyr::between’ when loading ‘MAGMA.Celltyping’
Warning: replacing previous import ‘cowplot::ggsave’ by ‘ggplot2::ggsave’ when loading ‘MAGMA.Celltyping’
Warning: replacing previous import ‘dplyr::combine’ by ‘gridExtra::combine’ when loading ‘MAGMA.Celltyping’
** help
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:31: unexpected section header '\value'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:34: unexpected section header '\description'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:37: unexpected section header '\examples'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:45: unexpected END_OF_INPUT '
'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/load.magma.results.file.Rd:13: unknown macro '\item'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/load.magma.results.file.Rd:15: unknown macro '\item'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/load.magma.results.file.Rd:17: unknown macro '\item'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/load.magma.results.file.Rd:19: unknown macro '\item'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/load.magma.results.file.Rd:21: unexpected section header '\value'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/load.magma.results.file.Rd:24: unexpected section header '\description'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/load.magma.results.file.Rd:27: unexpected section header '\examples'
Warning: /tmp/RtmpDY2W3Z/R.INSTALL17b147322cb2/MAGMA.Celltyping/man/load.magma.results.file.Rd:31: unexpected END_OF_INPUT '
'
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
Warning: replacing previous import ‘data.table::last’ by ‘dplyr::last’ when loading ‘MAGMA.Celltyping’
Warning: replacing previous import ‘data.table::first’ by ‘dplyr::first’ when loading ‘MAGMA.Celltyping’
Warning: replacing previous import ‘data.table::between’ by ‘dplyr::between’ when loading ‘MAGMA.Celltyping’
Warning: replacing previous import ‘cowplot::ggsave’ by ‘ggplot2::ggsave’ when loading ‘MAGMA.Celltyping’
Warning: replacing previous import ‘dplyr::combine’ by ‘gridExtra::combine’ when loading ‘MAGMA.Celltyping’
No arguments specified. Please consult manual for usage instructions.

Exiting MAGMA. Goodbye.
Warning in system2("magma", "--v", stdout = TRUE, stderr = TRUE) :
  running command ''magma' --v 2>&1' had status 2
* DONE (MAGMA.Celltyping)
>

Error in if (genome_build == "GRCh37") { : argument is of length zero

When mapping SNPs to genes, I run into the following error message:
Error in if (genome_build == "GRCh37") { : argument is of length zero

I read your previous post on this issue and it seems the issue was resolved, but I am still getting the same error on my Mac. Any idea where I could be going wrong?
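For context, R raises "argument is of length zero" whenever the condition passed to if() has no elements, which is what happens when genome_build is NULL or character(0). Why the build comes back empty in this particular case is not shown here, but a guard such as the following sketch would turn the failure into a clearer message. validate_genome_build is a hypothetical helper, not part of the package.

```r
# Hypothetical guard: make sure genome_build is a single supported value before it is used in if().
validate_genome_build <- function(genome_build, supported = c("GRCh37", "GRCh38")) {
  is_valid <- length(genome_build) == 1 &&
    !is.na(genome_build) &&
    genome_build %in% supported
  if (!is_valid) {
    shown <- if (length(genome_build) == 0) "<empty>" else paste(genome_build, collapse = ", ")
    stop(sprintf("genome_build must be one of %s, but got: %s",
                 paste(supported, collapse = " or "), shown),
         call. = FALSE)
  }
  genome_build
}

validate_genome_build("GRCh37")
# validate_genome_build(character(0)) stops with an explicit message instead of
# the "argument is of length zero" error.
```

If the inferred build is empty, passing the build explicitly (where the function accepts a genome_build argument, as in the map.snps.to.genes call quoted elsewhere on this page) may be a workaround worth trying.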

Magma celltyping on snATAC data

Hi,

I have used the package on scRNA data. I am wondering if I can also use sc/snATAC-seq data with the package. I believe the package uses gene names at some point, so how should I proceed if I want to use ATAC data? Any insights would be helpful.

Thanks in advance...!!!

Add an error catch to prepare.quantile.groups

Add an error check to catch instances where the specificity_quantiles are not proper quantiles (e.g. you asked for 40; did you get 40?). This can occur when the mean expression matrix is far too sparse. For example, this is what the quantiles should look like:

image

This is what they shouldn't look like:

image

These plots are basically generated with:

hist(newCTD2[[1]]$specificity_quantiles[,"oligondendrocyte"])
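The check described above could be sketched roughly as follows. This is an illustration under assumptions, not the package's code: check_quantile_counts is a hypothetical helper, and it assumes that a value of 0 marks genes outside any bin, which can be switched off with ignore_zero = FALSE if that convention does not apply.

```r
# Hypothetical check: does every cell type column contain the requested number of bins?
check_quantile_counts <- function(quantile_matrix, numberOfBins, ignore_zero = TRUE) {
  n_bins_found <- apply(quantile_matrix, 2, function(column) {
    values <- unique(column[!is.na(column)])
    if (ignore_zero) values <- values[values != 0]
    length(values)
  })
  bad <- names(n_bins_found)[n_bins_found != numberOfBins]
  if (length(bad) > 0) {
    stop(sprintf("Expected %d quantile bins but found a different number in: %s",
                 numberOfBins, paste(bad, collapse = ", ")),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Toy example with 4 bins: the first matrix is well formed, the second is not.
good <- matrix(c(0, 1, 2, 3, 4,
                 0, 1, 2, 3, 4),
               ncol = 2, dimnames = list(NULL, c("celltypeA", "celltypeB")))
bad <- good
bad[, "celltypeB"] <- c(0, 1, 1, 1, 1)  # collapsed bins, as with a very sparse matrix

check_quantile_counts(good, numberOfBins = 4)
# check_quantile_counts(bad, numberOfBins = 4) stops and names celltypeB.
```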

Reduce functions in namespace

A lot of functions that most people wouldn't use in practice are exported. This crowds the namespace and makes it harder to find key functions.

Functions that are only ever called internally should be made internal (i.e. not exported).

Missing parameter error

Was running this:

library(MAGMA.Celltyping)
data(ctd_DRONC_human)
ctd <- ctd_DRONC_human
rm(ctd_DRONC_human)
ctd = prepare.quantile.groups(ctd,specificity_species="human",numberOfBins=40)

And got this error:

ctd = prepare.quantile.groups(ctd,specificity_species="human",numberOfBins=40)
Error in apply(ctdIN$specificity, 2, FUN = bin.columns.into.quantiles, :
argument "numberOfBins" is missing, with no default

Apparently it is because of this line, where numberOfBins isn't (and should be) passed:
https://github.com/NathanSkene/MAGMA_Celltyping/blob/ddc3c685cffdef9377660389bf87573cefd404dd/R/prepare.quantile.groups.r#L30
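The mechanism behind this error is that apply() only forwards extra arguments that are given to it explicitly. A small sketch of the failing and the corrected call pattern, using a hypothetical stand-in (bin_column) rather than the package's bin.columns.into.quantiles:

```r
# Hypothetical stand-in for a column-binning helper that needs numberOfBins.
bin_column <- function(column, numberOfBins) {
  breaks <- unique(stats::quantile(column, probs = seq(0, 1, length.out = numberOfBins + 1)))
  cut(column, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

# Toy specificity matrix: 8 genes x 2 cell types (made-up values).
toy_specificity <- matrix(c(1:8, 8:1), ncol = 2,
                          dimnames = list(paste0("gene", 1:8), c("celltypeA", "celltypeB")))

# Correct: numberOfBins is passed through apply() to the helper.
binned <- apply(toy_specificity, 2, FUN = bin_column, numberOfBins = 4)

# Reproduces the reported failure: numberOfBins is never forwarded.
missing_arg <- tryCatch(apply(toy_specificity, 2, FUN = bin_column),
                        error = function(e) conditionMessage(e))
```

Here missing_arg holds the "missing, with no default" message, while binned is an 8 x 2 matrix of bin labels.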

map.snps.to.genes error: object 'read_header' not found

Hello,

Thank you for this excellent package.

I got an error when I was mapping the SNPs to the genes:

Error in MAGMA.Celltyping::map.snps.to.genes(path_formatted = gwas_sumstats_path_formatted, :
object 'read_header' not found.

The formatted GWAS sum stats is:
SNP CHR BP P A1 A2 BETA
rs10070308 5 107281621 2.77e-09 C T 0.0994
rs10154963 3 150169559 1.809e-08 C T 0.0765
rs10198789 2 28968811 4.45e-14 C T 0.0929
rs10219645 12 48529737 6.906e-09 C T 0.0756
rs10255049 7 56121304 1.022e-26 G A 0.1404
rs10406053 19 13987186 1.972e-11 G C 0.1021
rs10409547 19 50402819 1.027e-09 G T 0.0926
rs1044595 1 180943529 1.273e-26 T C 0.1323
rs10477172 5 141682090 2.844e-14 T C 0.0919
rs10521305 16 53908484 9.616e-20 C T 0.2628
rs10734411 11 32541784 1.159e-26 G A 0.1294
rs10743271 12 7353790 7.42e-11 T C 0.0891
rs10743724 12 30776022 3.484e-10 C T 0.0759
rs10764106 10 37055384 1.593e-10 C T 0.0775
rs10769315 11 48110367 4.277e-08 T C 0.0815
rs10795520 10 5764160 6.655e-15 G A 0.0965
rs10804920 3 189438689 5.984e-12 T C 0.0842
rs10818873 9 126559993 1.241e-18 C T 0.2184
rs10823203 10 70219610 2.865e-26 G C 0.1589
rs10854167 20 61533039 1.439e-32 G C 0.1752

Thanks,

Chen

Bug in check_inputs_to_magma_celltype_analysis

Found a fatal bug (fortunately, easily fixable) in check_inputs_to_magma_celltype_analysis() while running the tutorial.

> ctAssocsLinear = calculate_celltype_associations(ctd,gwas_sumstats_path,genome_ref_path=genome_ref_path,specificity_species = "human")
Error in check_inputs_to_magma_celltype_analysis(ctd, gwas_sumstats_path,  : 
  CTD should have quantiles. Send to 'prepare.quantile.groups' before calling this function.

I was confused, because I already ran ctd = prepare.quantile.groups(ctd,specificity_species="human",numberOfBins=40) successfully, so I looked further:

> names(ctd[[2]])
[1] "specificity"                "mean_exp"                   "linear_normalised_mean_exp" "specificity_quantiles"      "spec_dist"                 
[6] "specDist_quantiles"        
> "quantiles" %in% names(ctd[[2]])
[1] FALSE
> "specificity_quantiles" %in% names(ctd[[2]])
[1] TRUE

So there you have it, https://github.com/NathanSkene/MAGMA_Celltyping/blob/a844adf590af9f25014a319626484f124e046a88/R/check_inputs_to_magma_celltype_analysis.r#L26

This line ^ needs to be modified into: if(!"specificity_quantiles" %in% names(ctd[[annotLevel]])){stop("CTD should have quantiles. Send to 'prepare.quantile.groups' before calling this function.")}

enrichment_p or P?

Thanks Nathan for sharing the code! I was able to run the pipeline successfully. However, I am confused about the output from LDSC. Would you use enrichment_p or P (1 - pnorm(Coefficient_z-score)) to assess whether a cell type has a significant association with the trait?
Many thanks!
Zhang
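For reference, the second quantity in the question is a one-sided p-value derived from the coefficient z-score. The sketch below only shows that arithmetic on made-up z-scores; it does not settle which of the two statistics is the appropriate one to report.

```r
# One-sided p-value from a z-score; equivalent to 1 - pnorm(z) but
# numerically more stable for large z.
z_to_one_sided_p <- function(z) stats::pnorm(z, lower.tail = FALSE)

z_scores <- c(0, 1.959964, 3)   # made-up example values
p_values <- z_to_one_sided_p(z_scores)
```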

Checking labelling of "-log10 P-value" on plot x-axis

Hi,

If I understand correctly, the association analysis results store a log10-transformed copy of the P-value in the output file. However, when these P-values are plotted, the x-axis label reads '-log10(pvalue)'. Shouldn't this be 'log10(pvalue)'?
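The distinction matters because log10 of a p-value below 1 is negative, so the two labels describe values of opposite sign. A tiny illustration on made-up p-values follows; whether the plotting code negates the stored value before plotting is not shown here and would need to be checked in the source.

```r
p_values <- c(0.05, 1e-4, 1e-8)   # made-up example values

log10_p <- log10(p_values)        # negative for every p < 1
minus_log10_p <- -log10_p         # positive; larger means more significant
```

If the plotted values are positive and grow with significance, the '-log10(pvalue)' label matches what is drawn even when the file itself stores log10(P).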

Avoid duplicated Roxygen notes

Rather than copying and pasting the same Roxygen notes for each function, document one main function and use the #' @inheritParams main_function syntax.

This makes it far less tedious to document functions and ensures that outdated documentation doesn't accidentally get kept after changes are made.
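As a sketch of the pattern, the parameters are documented once on a main function and inherited by a second one. Both functions and their bodies are hypothetical placeholders; only the @inheritParams tag itself is the roxygen2 feature being illustrated.

```r
#' Hypothetical main function whose parameters are documented once
#'
#' @param upstream_kb Number of kb upstream of each gene to include.
#' @param downstream_kb Number of kb downstream of each gene to include.
#' @return A named list describing the window around each gene.
main_function <- function(upstream_kb = 35, downstream_kb = 10) {
  list(upstream_kb = upstream_kb, downstream_kb = downstream_kb)
}

#' Hypothetical wrapper that reuses the parameter documentation
#'
#' @inheritParams main_function
#' @param label A label for this run (documented here because it is new).
#' @return The window settings with the label attached.
wrapper_function <- function(label, upstream_kb = 35, downstream_kb = 10) {
  c(list(label = label), main_function(upstream_kb, downstream_kb))
}

wrapper_function("example_run")
```

When the documentation is regenerated, roxygen2 fills in the upstream_kb and downstream_kb entries for wrapper_function from main_function, so a change to the description only needs to be made in one place.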

magma: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by magma)

Hello,

I installed MAGMA_Celltyping in an Anaconda environment on the HPC (Imperial College). The magma command line tool (the latest version) was downloaded and seems to be available. However, when I try to run the map.snps.to.genes function I get this error:

genesOutPath = map.snps.to.genes(path_formatted=gwas_sumstats_path_formatted,
                                 genome_build="GRCh37",
                                 genome_ref_path=genome_ref_path)


API: public: http://gwas-api.mrcieu.ac.uk/
Reading header.
[1] "GWAS Sumstats appear to come from genome build: GRCh37"
magma: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by magma)
magma: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by magma)
magma: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by magma)
magma: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by magma)
magma: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by magma)
magma: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by magma)

With my limited knowledge of these matters, my understanding is that magma is looking in the lib64 folder of the HPC system instead of the one in the Anaconda environment (which I believe includes the missing library versions that cause this error).

Thank you very much,

Stergios

R version 4.1.1 (2021-08-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /rds/general/user/stsartsa/home/anaconda3/envs/R_magma/lib/libopenblasp-r0.3.17.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] MAGMA.Celltyping_1.0.1 EWCE_1.1.1             RNOmni_1.0.0          

loaded via a namespace (and not attached):
  [1] colorspace_2.0-2                        
  [2] rjson_0.2.20                            
  [3] ellipsis_0.3.2                          
  [4] XVector_0.32.0                          
  [5] GenomicRanges_1.44.0                    
  [6] ggdendro_0.1.22                         
  [7] fs_1.5.0                                
  [8] rstudioapi_0.13                         
  [9] listenv_0.8.0                           
 [10] bit64_4.0.5                             
 [11] interactiveDisplayBase_1.30.0           
 [12] AnnotationDbi_1.54.1                    
 [13] fansi_0.4.2                             
 [14] xml2_1.3.2                              
 [15] codetools_0.2-18                        
 [16] R.methodsS3_1.8.1                       
 [17] One2One_0.1.1                           
 [18] cachem_1.0.6                            
 [19] jsonlite_1.7.2                          
 [20] Rsamtools_2.8.0                         
 [21] dbplyr_2.1.1                            
 [22] R.oo_1.24.0                             
 [23] png_0.1-7                               
 [24] SNPlocs.Hsapiens.dbSNP144.GRCh37_0.99.20
 [25] SNPlocs.Hsapiens.dbSNP144.GRCh38_0.99.20
 [26] shiny_1.6.0                             
 [27] BiocManager_1.30.16                     
 [28] compiler_4.1.1                          
 [29] httr_1.4.2                              
 [30] assertthat_0.2.1                        
 [31] Matrix_1.3-4                            
 [32] fastmap_1.1.0                           
 [33] gargle_1.2.0                            
 [34] cli_3.0.1                               
 [35] limma_3.48.3                            
 [36] later_1.2.0                             
 [37] htmltools_0.5.2                         
 [38] prettyunits_1.1.1                       
 [39] tools_4.1.1                             
 [40] gtable_0.3.0                            
 [41] glue_1.4.2                              
 [42] GenomeInfoDbData_1.2.6                  
 [43] reshape2_1.4.4                          
 [44] dplyr_1.0.7                             
 [45] rappdirs_0.3.3                          
 [46] Rcpp_1.0.7                              
 [47] Biobase_2.52.0                          
 [48] vctrs_0.3.8                             
 [49] Biostrings_2.60.2                       
 [50] ExperimentHub_2.0.0                     
 [51] rtracklayer_1.52.1                      
 [52] stringr_1.4.0                           
 [53] globals_0.14.0                          
 [54] MungeSumstats_1.1.24                    
 [55] mime_0.11                               
 [56] lifecycle_1.0.0                         
 [57] restfulr_0.0.13                         
 [58] XML_3.99-0.7                            
 [59] future_1.22.1                           
 [60] googleAuthR_1.4.0                       
 [61] AnnotationHub_3.0.1                     
 [62] zlibbioc_1.38.0                         
 [63] MASS_7.3-54                             
 [64] scales_1.1.1                            
 [65] BSgenome_1.60.0                         
 [66] VariantAnnotation_1.38.0                
 [67] hms_1.1.0                               
 [68] promises_1.2.0.1                        
 [69] MatrixGenerics_1.4.3                    
 [70] parallel_4.1.1                          
 [71] SummarizedExperiment_1.22.0             
 [72] yaml_2.2.1                              
 [73] curl_4.3.2                              
 [74] memoise_2.0.0                           
 [75] gridExtra_2.3                           
 [76] ggplot2_3.3.5                           
 [77] biomaRt_2.48.3                          
 [78] stringi_1.7.4                           
 [79] RSQLite_2.2.8                           
 [80] BiocVersion_3.13.1                      
 [81] S4Vectors_0.30.0                        
 [82] BiocIO_1.2.0                            
 [83] GenomicFeatures_1.44.2                  
 [84] BiocGenerics_0.38.0                     
 [85] filelock_1.0.2                          
 [86] BiocParallel_1.26.2                     
 [87] GenomeInfoDb_1.28.4                     
 [88] rlang_0.4.11                            
 [89] pkgconfig_2.0.3                         
 [90] matrixStats_0.60.1                      
 [91] bitops_1.0-7                            
 [92] lattice_0.20-44                         
 [93] purrr_0.3.4                             
 [94] GenomicAlignments_1.28.0                
 [95] cowplot_1.1.1                           
 [96] bit_4.0.4                               
 [97] tidyselect_1.1.1                        
 [98] parallelly_1.28.1                       
 [99] plyr_1.8.6                              
[100] magrittr_2.0.1                          
[101] R6_2.5.1                                
[102] IRanges_2.26.0                          
[103] generics_0.1.0                          
[104] ewceData_1.0.0                          
[105] DelayedArray_0.18.0                     
[106] DBI_1.1.1                               
[107] pillar_1.6.2                            
[108] KEGGREST_1.32.0                         
[109] RCurl_1.98-1.4                          
[110] tibble_3.1.4                            
[111] crayon_1.4.1                            
[112] utf8_1.2.2                              
[113] BiocFileCache_2.0.0                     
[114] progress_1.2.2                          
[115] usethis_2.0.1                           
[116] grid_4.1.1                              
[117] data.table_1.14.0                       
[118] blob_1.2.2                              
[119] digest_0.6.27                           
[120] xtable_1.8-4                            
[121] HGNChelper_0.8.1                        
[122] httpuv_1.6.3                            
[123] R.utils_2.10.1                          
[124] stats4_4.1.1                            
[125] munsell_0.5.0                           


Make upstream_kb/downstream_kb defaults

was previously

 upstream_kb = 10,
 downstream_kb = 1.5,

is now

 upstream_kb = 35,
 downstream_kb = 10,

This is more consistent with what we've used in different studies. I've made this the default consistently across all functions.

Going through the example

Hi Nathan,

As I am going through the section "Download summary statistics file & check it is properly formatted", I found that some of the files are missing, such as "https://www.dropbox.com/s/shsiq0brkax886j/20016.assoc.tsv.gz". Could you upload the file for us?

Also, for this snippet in the "Download and prepare the 'Prospective memory' GWAS summary statistics" section:

gwas_sumstats_path = "~/GWAS_Summary_Statistics/20018.assoc.tsv"
if(!file.exists(gwas_sumstats_path)){
    download.file("https://www.dropbox.com/s/shsiq0brkax886j/20016.assoc.tsv.gz?raw=1",destfile=sprintf("%s.gz",gwas_sumstats_path))
    gunzip(sprintf("%s.gz",gwas_sumstats_path),gwas_sumstats_path)
}

It seems like you are downloading 20018.assoc.tsv, but the Dropbox link points to 20016.assoc.tsv.gz. May I know which one is correct? It would also be great if you could upload 20018.

Another thing that I came across is that when I run this code in R:
ctd = prepare.quantile.groups(ctd,specificity_species="mouse",numberOfBins=40)
print(ctd[[1]]$quantiles[c("Gfap","Dlg4","Aif1"),])
print(table(ctd[[1]]$quantiles[,1]))

The second line returned NULL and the third line returned < table of extent 0 >. The ctd does contain information, and the output from prepare.quantile.groups seems to be correct:

> ctd = prepare.quantile.groups(ctd,specificity_species="mouse",numberOfBins=40)
[1] "Dropping all genes that do not have 1:1 homologs between the two species"
[1] "Species for which homology classes are available:"
[1] "mouse, laboratory" "human" "chimpanzee" "macaque, rhesus"
[5] "dog, domestic" "cattle" "rat" "frog, western clawed"
[9] "zebrafish" "chicken"
[1] "Selected species: mouse, laboratory"
[1] "Selected species: human"
[1] "Full dataset contains 18981 genes from mouse"
[1] "Full dataset contains 18708 genes from human"
[1] "2231 genes which are present in mouse are deleted in human"
[1] "1958 genes which are present in human are deleted in mouse"
[1] "605 genes are duplicated in mouse"
[1] "220 genes are duplicated in human"
[1] "16468 are shared 1:1 between the two species"

Let me know and thanks!
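A likely explanation for the NULL, judging from the names(ctd[[2]]) output quoted in another issue on this page, is that the element is called specificity_quantiles rather than quantiles. In R, asking a list for a name it does not have returns NULL without an error, and table(NULL) prints as a table of extent 0. The toy object below is made up purely to reproduce that behaviour; the cell type names and values are not from the package.

```r
# Toy stand-in for one level of a CTD object (made-up values).
toy_ctd <- list(list(
  specificity_quantiles = matrix(
    c(1, 2, 3, 3, 2, 1), nrow = 3,
    dimnames = list(c("Gfap", "Dlg4", "Aif1"), c("astrocytes", "microglia"))
  )
))

# `$quantiles` does not exist, so this is NULL rather than an error.
missing_element <- toy_ctd[[1]]$quantiles

# Tabulating a column of NULL gives an empty table ("table of extent 0").
empty_table <- table(toy_ctd[[1]]$quantiles[, 1])

# Using the full element name returns the expected rows.
found <- toy_ctd[[1]]$specificity_quantiles[c("Gfap", "Dlg4", "Aif1"), ]
```

Checking names(ctd[[1]]) on the real object would confirm which element name the installed version uses.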

Missing function `format_sumstats_for_magma_crossplatform()`

Hi there,

The tutorial references a user-provided function, format_sumstats_for_magma_crossplatform(), but I can't seem to find it in the package.

Also, when I run format_sumstats_for_magma() it can't find the SNP column in the example (called "variant"). I think this column name is common enough that it might be worth adding to the list of possible names for the SNP column.

Thanks,
Brian
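The column-name suggestion above could be sketched roughly as follows. This is only an illustration: `find_snp_column()` and its alias list are hypothetical and do not reflect how format_sumstats_for_magma() actually detects the SNP column.

```r
# Hypothetical helper: return the first column name that matches a list of
# accepted SNP aliases (case-insensitive), or NA if none match.
find_snp_column <- function(col_names,
                            aliases = c("SNP", "variant", "rsid", "MarkerName")) {
  hits <- col_names[toupper(col_names) %in% toupper(aliases)]
  if (length(hits) == 0) return(NA_character_)
  hits[1]
}

# Example header resembling the tutorial file, which uses "variant"
find_snp_column(c("variant", "minor_allele", "pval"))
```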

Add functions to find intersection of genes with high genetic load + cell type specificity (similar to approach used in Parkinson's Nature Genetics paper)

Dear Nathan,

Thank you for sharing your code. I am an iBSC student at UCL and have been using this pipeline to identify associations between different liver diseases and cell types, and have got some interesting data. I was wondering whether it would be possible to obtain the intersection of genes between the GWAS summary statistics and the cell type data, in essence the genes driving the association?

"There is no SNP column found within the data." Error in load(filePath) : empty (zero-byte) input file

Hi Nathan,

Thank you for this excellent package.

I am running into an issue whereby after running the formatting function and attempting to map the SNPs from the GWAS to the genes it is giving me the following error:

There is no N column within the sumstats file. What is the N value for this GWAS?280897
[1] "There is no SNP column found within the data. It must be inferred from CHR and BP information."
Error in load(filePath) : empty (zero-byte) input file

I am using the most up-to-date R 4.1 and packages for MAGMA.Celltyping and EWCE

Regards,

Angus

How to build the single-cell expression matrix

Dear Nathan,

I read the MAGMA_Celltyping example, but I cannot understand how to build the expression matrix (such as the ctd in your data). I only use one brain region and do not need to merge datasets. Could you give me some advice? Thank you so much!

Best,

Bo

Handling other matrix types

MAGMA.Celltyping works for dense base class matrices, but not for sparse (dgCMatrix) and/or DelayedArray matrices.

For example in prepare.quantile.groups(), this subfunction only works for dense matrices:

bin.specificityDistance.into.quantiles <- function(spcMatrix){
    spcMatrix$specDist_quantiles = apply(spcMatrix$spec_dist,2,FUN=EWCE::bin.columns.into.quantiles)
    rownames(spcMatrix$specDist_quantiles) = rownames(spcMatrix$spec_dist)
    return(spcMatrix)
}

But this works for matrices more generally (note the c() step):

bin.specificityDistance.into.quantiles <- function(spcMatrix) {
    spcMatrix$specDist_quantiles = apply(spcMatrix$spec_dist,
      2, FUN = function(x){
        EWCE::bin.columns.into.quantiles(c(x))
      })
    rownames(spcMatrix$specDist_quantiles) = rownames(spcMatrix$spec_dist)
    return(spcMatrix)
}

So MAGMA.Celltyping could either be updated to handle other matrix types, or some code could be added to prepare.quantile.groups() to ensure all matrices are dense base matrix format.
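The second option (forcing everything to a dense base matrix up front) might look something like the sketch below. `ensure_dense()` is a hypothetical helper, not an existing function in MAGMA.Celltyping or EWCE, and the toy matrix is made up for illustration. It assumes the Matrix package is installed.

```r
# Hypothetical helper: coerce any matrix-like object to a dense base matrix.
ensure_dense <- function(x) {
  if (is.matrix(x)) return(x)  # already a base matrix, nothing to do
  as.matrix(x)                 # Matrix classes provide an as.matrix() method
}

# Toy sparse matrix: 3 genes x 2 cell types
sparse <- Matrix::Matrix(c(0, 1, 0, 2, 0, 3), nrow = 3, sparse = TRUE,
                         dimnames = list(c("g1", "g2", "g3"), c("ct1", "ct2")))
dense <- ensure_dense(sparse)
class(sparse)     # an S4 sparse class from the Matrix package
is.matrix(dense)  # TRUE for the converted object
```

Note that densifying a large single-cell matrix can use a lot of memory, which is the main argument for the first option of supporting sparse inputs directly.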

Installation Error: package ‘EWCE’ could not be loaded

Hi,

I used this package a few months back and had no problem installing or working with it. Recently we updated to R 3.6.0 and I had to install the package again. For some reason I am now getting an error: the message says "package ‘EWCE’ could not be loaded", but I have the EWCE package installed and loading it with library(EWCE) works fine. I do not know why the installation of MAGMA_Celltyping is not working. The error message is attached below.

Note: while installing it also says "Skipping 1 packages not available: XML", but I already have the XML package installed (in the version that supports R 3.6.0).

Please do let me know if you need any further details.

`> devtools::install_github("nathanskene/MAGMA_Celltyping")
Downloading GitHub repo nathanskene/MAGMA_Celltyping@HEAD
Skipping 1 packages not available: XML
✓ checking for file ‘/home/sama/Desktop/tmp/RtmpnJ0NgC/remotesddfe569d5550/NathanSkene-MAGMA_Celltyping-7300011/DESCRIPTION’ (342ms)
─ preparing ‘MAGMA.Celltyping’:
✓ checking DESCRIPTION meta-information ...
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:38: unexpected section header '\value'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:41: unexpected section header '\description'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:44: unexpected section header '\examples'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/calculate_celltype_associations.Rd:52: unexpected END_OF_INPUT '
'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/load.magma.results.file.Rd:19: unknown macro '\item'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/load.magma.results.file.Rd:21: unknown macro '\item'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/load.magma.results.file.Rd:23: unknown macro '\item'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/load.magma.results.file.Rd:25: unknown macro '\item'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/load.magma.results.file.Rd:27: unexpected section header '\value'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/load.magma.results.file.Rd:30: unexpected section header '\description'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/load.magma.results.file.Rd:33: unexpected section header '\examples'
Warning: /home/sama/Desktop/tmp/Rtmp2l4fnt/Rbuildc94465be3c32/MAGMA.Celltyping/man/load.magma.results.file.Rd:37: unexpected END_OF_INPUT '
'
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
─ building ‘MAGMA.Celltyping_0.99.0.tar.gz’ (19.6s)

Installing package into ‘/home/sama/R/x86_64-redhat-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)

  • installing source package ‘MAGMA.Celltyping’ ...
    ** using staged installation
    ** R
    ** data
    *** moving datasets to lazyload DB
    Warning: replacing previous import ‘Matrix::cov2cor’ by ‘stats::cov2cor’ when loading ‘EWCE’
    Warning: replacing previous import ‘Matrix::toeplitz’ by ‘stats::toeplitz’ when loading ‘EWCE’
    Warning: replacing previous import ‘Matrix::update’ by ‘stats::update’ when loading ‘EWCE’
    Warning: replacing previous import ‘Matrix::tail’ by ‘utils::tail’ when loading ‘EWCE’
    Warning: replacing previous import ‘Matrix::head’ by ‘utils::head’ when loading ‘EWCE’
    ** byte-compile and prepare package for lazy loading
    Error: package or namespace load failed for ‘EWCE’:
    (converted from warning) replacing previous import ‘Matrix::cov2cor’ by ‘stats::cov2cor’ when loading ‘EWCE’
    Error: package ‘EWCE’ could not be loaded
    Execution halted
    ERROR: lazy loading failed for package ‘MAGMA.Celltyping’
  • removing ‘/home/sama/R/x86_64-redhat-linux-gnu-library/3.6/MAGMA.Celltyping’
    Error: Failed to install 'MAGMA.Celltyping' from GitHub:
    (converted from warning) installation of package ‘/home/sama/Desktop/tmp/RtmpnJ0NgC/fileddfe41b693a0/MAGMA.Celltyping_0.99.0.tar.gz’ had non-zero exit status`

Thanks in advance...!!!

plot_celltype_associations using old 'quantiles' variable

Hi,

I was running the 2018 CLOZUK GWAS against the KI dataset, and when trying to plot the cell type associations with plot_celltype_associations() I noticed the method still uses the old 'quantiles' variable.

# Preparing ctd_allKI dataset
ctd = prepare.quantile.groups(ctd_allKI,
                              specificity_species = 'mouse',
                              numberOfBins = 40)

# Mapping SNPs to genes
scz_genes_out_path = map.snps.to.genes(scz_gwas_sum_stats_path,
                                       genome_ref_path = genome_ref_path)

# Cell type association test
scz_ct_assoc_linear = calculate_celltype_associations(ctd,
                                                      scz_gwas_sum_stats_path,
                                                      genome_ref_path = genome_ref_path,
                                                      specificity_species = 'mouse')

# Plot associations
scz_figs_linear = plot_celltype_associations(scz_ct_assoc_linear,
                                             ctd = ctd)

Up until the last step everything works, SNPs are mapped etc., however after the last step this error occurs:

Error in t.default(ctd[[annotLevel]]$quantiles) : 
  argument is not a matrix

I could circumvent this by changing the names, e.g.:

names(ctd[[1]])[names(ctd[[1]]) == "specificity_quantiles"] <- 'quantiles'
names(ctd[[2]])[names(ctd[[2]]) == "specificity_quantiles"] <- 'quantiles'

However I think this is just a case of the source code still using an old variable name, can this be updated?
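Until the source is updated, the workaround above can be applied to every annotation level at once. The sketch below uses a toy list in place of a real ctd; `rename_ctd_element()` is a hypothetical helper, not a package function.

```r
# Hypothetical helper: rename one element in every annotation level of a ctd.
rename_ctd_element <- function(ctd,
                               old = "specificity_quantiles",
                               new = "quantiles") {
  lapply(ctd, function(level) {
    names(level)[names(level) == old] <- new
    level
  })
}

# Toy stand-in for a two-level ctd object
toy_ctd <- list(
  list(specificity = matrix(0.5), specificity_quantiles = matrix(1L)),
  list(specificity = matrix(0.2), specificity_quantiles = matrix(2L))
)
toy_ctd <- rename_ctd_element(toy_ctd)
names(toy_ctd[[1]])
```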

Formatting GWAS summary statistics function giving a sed command error

Hello,
I have been trying to use your package. I get a sed command error while trying to format my GWAS summary statistics files using your function. I wondered if the sed -i command was written for the macOS version of sed rather than Linux GNU sed?

Is it possible to get an alternative fix? Even though the function corrects some of the formatting, it does not correct everything, and I get an error when running MAGMA afterwards.

Thanks
Devika
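GNU sed and the BSD sed shipped with macOS do handle `-i` differently (BSD sed expects a backup-suffix argument after `-i`, GNU sed does not), so a call written for one can fail on the other. One platform-independent alternative is to do the in-place edit in base R. The sketch below is only an illustration: `replace_in_file()` is a hypothetical helper, and the header substitution is a made-up example rather than one of the edits the package performs.

```r
# Hypothetical helper: regex replacement on every line of a file, written back
# in place, without shelling out to sed.
replace_in_file <- function(path, pattern, replacement) {
  lines <- readLines(path)
  writeLines(gsub(pattern, replacement, lines), path)
  invisible(path)
}

# Toy summary-statistics file
tmp <- tempfile(fileext = ".tsv")
writeLines(c("variant\tpval", "1:100\t0.05"), tmp)
replace_in_file(tmp, "^variant", "SNP")
readLines(tmp)
```

This reads the whole file into memory, so for very large summary statistics files a streaming approach may be preferable.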

error: magma does not appear to be located on your path

Error: package or namespace load failed for 'MAGMA.Celltyping':
.onLoad failed in loadNamespace() for 'MAGMA.Celltyping', details:
call: fun(libname, pkgname)
error: magma does not appear to be located on your path
Please download it from https://ctg.cncr.nl/software/magma
The executable should then be copied to /usr/local/bin

Hi,
I am trying to install MAGMA.Celltyping on the Windows version of R. I have also downloaded the Windows version of magma.
But I don't think Windows has /usr/local/bin.
Could you help modify the script so that the package can also be installed on a Windows machine?
Thank you so much.
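On systems without /usr/local/bin, one possible approach is to add the folder containing the magma executable to PATH for the current R session before loading the package. This is a sketch under assumptions: `add_dir_to_path()` is a hypothetical helper, `magma_dir` is a placeholder for wherever magma was unpacked, and whether the package's load-time check picks this up has not been verified here.

```r
# Hypothetical helper: append a directory to PATH for this R session only.
add_dir_to_path <- function(dir) {
  sep <- .Platform$path.sep  # ";" on Windows, ":" on Unix-alikes
  current <- strsplit(Sys.getenv("PATH"), sep, fixed = TRUE)[[1]]
  if (!dir %in% current) {
    Sys.setenv(PATH = paste(c(current, dir), collapse = sep))
  }
  invisible(Sys.getenv("PATH"))
}

magma_dir <- file.path(tempdir(), "magma")  # placeholder location
add_dir_to_path(magma_dir)
Sys.which("magma")  # returns "" if the executable is still not found
```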

Avoid accidentally renaming columns

When you turn an object into a data.frame, it by default edits the column/row names (e.g. replaces spaces with "."s). This makes it difficult to later query a particular cell-type of interest because the user doesn't know how the names were changed exactly.

I've gone through and added the following arguments whenever data.frame is called:

df <- data.frame(dat,
                 check.rows = FALSE,
                 check.names = FALSE)
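The effect on column names can be seen in a small self-contained example. The cell-type names below are made up for illustration.

```r
# Toy matrix whose column names contain spaces
dat <- matrix(1:4, nrow = 2,
              dimnames = list(c("gene1", "gene2"),
                              c("Pyramidal CA1", "Medium Spiny Neuron")))

df_default <- data.frame(dat)               # check.names = TRUE by default
df_kept    <- data.frame(dat,
                         check.rows = FALSE,
                         check.names = FALSE)

colnames(df_default)  # spaces replaced with "."
colnames(df_kept)     # original names preserved
```

With the names preserved, a lookup such as `df_kept[["Pyramidal CA1"]]` works with the cell-type label exactly as it appears in the source data.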

Identifying the gene load

Thank you for this great pipeline and detailed wiki.
Is there any way to identify the genes that are driving the enrichment, and their weight in the model?

`Error in if (genome_build == "GRCh37") { : argument is of length zero`

When mapping SNPs to genes with the example GWAS summary statistics, I run into the following error message.

genesOutPath <- MAGMA.Celltyping::map.snps.to.genes(gwas_sumstats_path,
                                                    genome_ref_path=genome_ref_path,
                                                    N = 117131)
[1] "There is no SNP column found within the data. It must be inferred from CHR and BP information."
Error in if (genome_build == "GRCh37") { : argument is of length zero

Not sure if this is relevant, but during the format_sumstats_for_magma() step I selected 1 for GRCh37.
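The "argument is of length zero" message is what R reports when the condition of an `if` evaluates to a zero-length value, for example when the variable being compared is NULL. The sketch below reproduces that behaviour in isolation and shows one defensive pattern; `is_build()` is a hypothetical helper, and this does not claim to identify why genome_build ends up empty inside map.snps.to.genes().

```r
# Comparing NULL with a string yields logical(0), which if() cannot evaluate.
genome_build <- NULL
result <- try(if (genome_build == "GRCh37") "matched", silent = TRUE)
inherits(result, "try-error")  # the if() above fails

# Hypothetical guard: treat a missing/empty build as "no match" instead of erroring.
is_build <- function(genome_build, expected) {
  isTRUE(genome_build == expected)
}

is_build(NULL, "GRCh37")
is_build("GRCh37", "GRCh37")
```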
