meth_atlas's Issues

Option to not show plot on screen

Hi,

I am running deconvolve.py iteratively across groups, but with --plot it also displays the plot on screen, which stops the iteration. Would it be possible for --plot to write only the PNG, without displaying it? Or am I missing something?

Thanks,

Bruce.
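
One possible workaround, sketched under assumptions: matplotlib honours the MPLBACKEND environment variable, and its non-interactive Agg backend renders figures to files without opening a window. Whether deconvolve.py plots through matplotlib, and the argument order used below, are assumptions; only --plot is taken from this issue. The helper names and sample file names are hypothetical.

```python
import os
import subprocess
import sys


def headless_env(base=None):
    """Return a copy of the environment that asks matplotlib for the Agg backend."""
    env = dict(os.environ if base is None else base)
    env["MPLBACKEND"] = "Agg"  # non-interactive: figures can be saved, no window opens
    return env


def build_command(sample_csv, script="deconvolve.py"):
    """Build the command line; the positional sample path is an assumption."""
    return [sys.executable, script, "--plot", sample_csv]


def run_all(sample_csvs, script="deconvolve.py"):
    """Run the script once per group without blocking on a plot window."""
    env = headless_env()
    for sample_csv in sample_csvs:
        subprocess.run(build_command(sample_csv, script), env=env, check=True)


if __name__ == "__main__":
    # Hypothetical file names; the commands are printed here, not executed.
    for path in ["group_a.csv", "group_b.csv"]:
        print(" ".join(build_command(path)))
```

Exporting MPLBACKEND=Agg in the shell before starting the loop should have the same effect as passing the modified environment from Python.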

Deconvolve with ONT Reads

Hello Everyone!

I am working with samples from heart rejection patients. I used ONT to sequence two samples: one with slight rejection and the other with no rejection.
After that I used Guppy to generate .bam files and made .bed files from the BAMs using Modbam2bed. I match each CpG, using its start or end index, to the position in the reference atlas, so that each CpG is assigned a specific Illumina ID. After that I run the deconvolve script.

Currently I am not seeing any left.atrium cells or Adipocytes in the samples. I only see neutrophils, Erythrocyte_progenitors and Monocytes_EPIC. The only difference between the samples is the amount of neutrophils. So now I am wondering: is there an error in my mapping approach or in my basic pipeline? Am I forgetting any steps? The deconvolution results do not match what I would expect to see.

Maybe someone has experience with using ONT data for this algorithm and can help me out.

Thanks in advance!
kind regards,
Azlan
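
A common source of silent mismatches in this kind of mapping is the coordinate convention, so a sketch of the lookup step may be useful. It assumes, without confirmation from this thread, that the BED start is 0-based, that the array annotation stores the 1-based position of the cytosine on the forward strand, and that a reverse-strand call sits one base to the right of that cytosine. The function names, the probe ID, and the positions are all hypothetical.

```python
def bed_to_probe(chrom, bed_start, strand, probes):
    """Map one BED record to a probe ID, or None if no probe matches.

    probes: dict keyed by (chrom, 1-based forward-strand C position).
    """
    pos = bed_start + 1  # 0-based BED start -> 1-based coordinate
    if strand == "-":
        pos -= 1  # assumed: the reverse-strand call sits on the G of the CpG
    return probes.get((chrom, pos))


def map_records(records, probes):
    """Return {probe_id: [methylation values]} for records that hit a probe."""
    hits = {}
    for chrom, start, strand, value in records:
        probe = bed_to_probe(chrom, start, strand, probes)
        if probe is not None:
            hits.setdefault(probe, []).append(value)
    return hits


if __name__ == "__main__":
    probes = {("chr1", 1000): "cg_example_1"}  # hypothetical annotation entry
    records = [
        ("chr1", 999, "+", 0.8),   # forward-strand call on the C
        ("chr1", 1000, "-", 0.6),  # reverse-strand call on the G
        ("chr1", 5000, "+", 0.1),  # no probe at this position
    ]
    print(map_records(records, probes))
```

Counting how many atlas CpGs receive a hit under each convention is a quick way to tell whether an off-by-one error is discarding most sites.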

Deconvolution of whole genome bisulfite sequence data using meth_atlas

Hi, I was wondering whether you or anyone else has used your algorithm and reference atlas to deconvolve whole-genome bisulfite sequencing (WGBS) data. I tried it on some samples, but it predicts incorrect cell or tissue types. For example, for lung tissue (from a healthy control) it predicts that a large portion of the sample consists of cortical neurons, breast, etc., with no indication of lung. In addition, I tried it on a sample from adipose tissue and it predicts the same composition as for the lung. There is sufficient overlap of CpGs with your reference atlas (at various coverages), so I don't think this is the issue. Is there something inherent in the deconvolution algorithm that is tailored towards array data? Just curious; any help will be appreciated. Thank you.

adding another reference to the reference atlas

Hi,

I came across your deconvolution protocol here and was very much interested in applying it to my tumor samples. However, there is one reference missing in the reference atlas that I would need: alpha cells of pancreas. I wish to incorporate the reference into the atlas. Do you provide a protocol (similar to the sample preparation script) that one can use to optimize the reference file with a new file?

I noticed in the corresponding paper that one can do feature selection and, technically, remake the reference. However, the pairwise-specific CpG calculation was not clear to me. Could you elaborate on exactly how you did it? That way, if you do not have a script readily available, I can prepare it myself.
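
To make the question concrete, here is one generic way a pairwise selection could be written. This is not the procedure from the paper, which this thread does not describe. It is a sketch that, for every pair of cell types, keeps the k CpGs with the largest absolute beta difference. The function name, the value of k, and the toy atlas are hypothetical, and every CpG is assumed to have a beta value for every cell type.

```python
from itertools import combinations


def pairwise_top_cpgs(atlas, k):
    """atlas: {cpg_id: {cell_type: beta}}; return the set of selected CpG IDs.

    For every pair of cell types, keep the k CpGs with the largest
    absolute beta difference between the two types.
    """
    cell_types = sorted({ct for betas in atlas.values() for ct in betas})
    selected = set()
    for a, b in combinations(cell_types, 2):
        ranked = sorted(
            atlas,
            key=lambda cpg: abs(atlas[cpg][a] - atlas[cpg][b]),
            reverse=True,
        )
        selected.update(ranked[:k])
    return selected


if __name__ == "__main__":
    toy_atlas = {  # hypothetical beta values
        "cg1": {"A": 0.9, "B": 0.1, "C": 0.2},
        "cg2": {"A": 0.5, "B": 0.5, "C": 0.5},
        "cg3": {"A": 0.1, "B": 0.1, "C": 0.9},
    }
    print(sorted(pairwise_top_cpgs(toy_atlas, k=1)))
```

With a new cell type added as an extra column, rerunning a selection like this over all pairs would pick up CpGs that separate the new type from each existing one.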

No fraction of tissue found in any samples...

Hi,

Running deconvolve.py on prostate samples (benign, tumour, and cfDNA) shows no prostate in any of them. I also tested a tumour/normal sample pair from TCGA_PRAD (far right of image, prad_1, 2), which shows the same profile. What is your take on this? NB: I tried a few lung and colon samples and they seem to work, so possibly this is a prostate-specific issue?

Thanks,

Bruce.

1359_Perry_EPIC Roadmap450K TCGA_PRAD ra_deconv_plot

output.csv from the preprocessing script

Hi Netanel,
Great work on this! Would it be possible to share the output CSV for the data in the publication? Basically, I'm asking for the signature matrix before it is filtered by differential methylation etc.

Best,
Altuna

Failed to replicate results SRX175350

I downloaded the SRR641640 run of SRX175350 as a FASTQ file from the NCBI website.
I then used the following commands to convert the FASTQ file to pat format:

  1. Index the hg19.fa file
    bwa index -a bwtsw hg19.fa
  2. Generate a .sam file using reference and input fasta/fastq files:
    bwa mem reference_genome.fasta input.fastq > output.sam
  3. Convert the output.sam file to sorted bam file:
    samtools sort -o sorted_output.bam output.sam
  4. Index the sorted_output.bam file:
    samtools index sorted_output.bam
  5. Convert the .bam file to .pat.gz format:
    wgbstools bam2pat sorted_output.bam

The pat file I got should ideally contain all the entries of the pat file provided in the tutorial section of UXM tools for SRX175350. That was not the case, and therefore I also got different results when deconvolving the pat file.

I am attaching the pat file and output given in the tutorial for SRX175350 (files starting with Given_) and the ones I got from the SRR641640 run of SRX175350 (files starting with MyExp_):
Given_Lung_STL002.pat.gz
Given_Lung_STL002.pdf
MyExp_Lung_STLOO2.pdf
MyExp_Lung_STLOO2.pat.gz
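
To quantify how far the two files diverge, a line-level comparison may help. The sketch below makes no assumption about the column layout of pat files: it treats each non-empty line as one entry, so two lines that differ in any field count as different entries. The function names are hypothetical.

```python
import gzip


def entry_set(lines):
    """Return the set of non-empty lines, with trailing newlines removed."""
    return {line.rstrip("\n") for line in lines if line.strip()}


def compare_entries(given_lines, mine_lines):
    """Summarise how two collections of lines overlap."""
    given = entry_set(given_lines)
    mine = entry_set(mine_lines)
    return {
        "shared": len(given & mine),
        "only_given": len(given - mine),
        "only_mine": len(mine - given),
    }


def read_gz_lines(path):
    """Read all lines of a gzip-compressed text file into a list."""
    with gzip.open(path, "rt") as handle:
        return list(handle)


if __name__ == "__main__":
    # Toy lines stand in for the two files; real use would call, for example,
    # compare_entries(read_gz_lines(given_path), read_gz_lines(mine_path)).
    print(compare_entries(["a\t1\n", "b\t2\n"], ["b\t2\n", "c\t3\n"]))
```

If "only_given" is zero, the reproduced file contains every entry of the provided one; a large value points to a difference upstream, for instance in alignment or filtering.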

implementation of WGBS mode

Hi,
Thank you for the awesome tool and reference. I'm enjoying them greatly.
Are you interested in a WGBS mode? deconvolve.py rejects any first column that does not contain cgxxxx IDs. How about adding a --wgbs option that accepts a first column recording chromosome and position (e.g. chr14:68790171 instead of cg08169020), or that uses the first two columns for chrom and pos?
If you like the idea, I would open a PR.
Best, Yoshiaki
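
As a rough illustration of what such an option would need to parse, the sketch below classifies a first-column label as either a cg probe ID or a chrom:pos locus. It is not code from deconvolve.py. The function name and the accepted patterns are assumptions, and the two labels in the demo are the ones quoted above.

```python
import re

CG_ID = re.compile(r"^cg\d+$")
LOCUS = re.compile(r"^(chr[0-9A-Za-z_]+):(\d+)$")


def parse_site(label):
    """Classify a first-column label as a probe ID or a genomic locus."""
    if CG_ID.match(label):
        return ("probe", label)
    match = LOCUS.match(label)
    if match:
        return ("locus", (match.group(1), int(match.group(2))))
    raise ValueError(f"unrecognised site label: {label!r}")


if __name__ == "__main__":
    for label in ["cg08169020", "chr14:68790171"]:
        print(label, "->", parse_site(label))
```

Loci parsed this way would still need a lookup table to translate them into the atlas's probe IDs, or an atlas keyed by position.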

Can I use recent Infinium arrays (EPICv2 arrays) in pre_process?

Hello, thank you for making such a good tool. I'm sorry to ask this question, but I looked into it and could not find a way.

I ran my idat files through pre_process using the script and reference sample you uploaded there. However, since the EPIC array has been upgraded, I think some information has been added to the probe IDs, and as a result pre_process does not work on my idat files. Is there any way to pre-process idat files produced by the EPIC array version 2?

Thank you.
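
If the added information is a suffix on the probe name, one workaround is to collapse the new IDs back to their cg prefix before deconvolution. The sketch below assumes IDs of the form cgNNNNNNNN_XXXX, where everything after the first underscore is the added part, and it averages the beta values of probes that collapse to the same ID. Both the ID layout and the averaging are assumptions to check against the array manifest. The function names are hypothetical, and the code works on a table of beta values, not on idat files directly.

```python
from collections import defaultdict


def base_probe_id(probe_id):
    """Drop everything after the first underscore (assumed suffix layout)."""
    return probe_id.split("_", 1)[0]


def collapse_probes(betas):
    """betas: {probe_id: beta}; return {base_id: mean beta of its replicates}."""
    grouped = defaultdict(list)
    for probe_id, beta in betas.items():
        grouped[base_probe_id(probe_id)].append(beta)
    return {base: sum(vals) / len(vals) for base, vals in grouped.items()}


if __name__ == "__main__":
    # Hypothetical probe names and beta values.
    example = {"cg1_TC11": 0.25, "cg1_TC12": 0.75, "cg2_BC21": 0.5}
    print(collapse_probes(example))
```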

duplicated rows in the reference_atlas.csv

Some rows are duplicated in the reference_atlas.csv file. For example, the probe cg19442545 shows up 5 times with the same beta values across all the samples. Is this expected?
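
For anyone who wants to check or clean their own copy, exact duplicate rows can be detected with the standard csv module. The sketch below is a minimal version: the function name is hypothetical, and the inline CSV text is made-up data, not the contents of reference_atlas.csv.

```python
import csv
import io


def drop_duplicate_rows(rows):
    """Return (unique_rows, duplicate_counts), preserving first-seen order.

    duplicate_counts maps each repeated row (as a tuple) to its occurrence count.
    """
    seen = {}
    unique = []
    for row in rows:
        if not row:  # skip blank lines
            continue
        key = tuple(row)
        if key in seen:
            seen[key] += 1
        else:
            seen[key] = 1
            unique.append(row)
    duplicate_counts = {key: n for key, n in seen.items() if n > 1}
    return unique, duplicate_counts


if __name__ == "__main__":
    text = "CpGs,A,B\ncg1,0.1,0.2\ncg1,0.1,0.2\ncg2,0.3,0.4\n"  # made-up data
    unique, dups = drop_duplicate_rows(csv.reader(io.StringIO(text)))
    print(len(unique), "unique rows;", dups)
```

Rows that share a probe ID but differ in any value are kept as distinct rows, so they would need to be resolved separately.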
