exogene's People

Contributors

zstephens

Forkers

tgjohnst

exogene's Issues

Interpret result

Hi.
I ran exogene on WGS data. I believe the final output file is "integration.tsv".
Here is my example output:
image
image

The results show many integration sites for Encephalomyocarditis virus.
I think these are false positives because this sample is liver cancer. Perhaps the HBV hits are real.

How can I filter out the false positives?
Also, many entries in the results show a SOFTCLIP_MAPQ of 0%. Is that OK?

Thanks.

Custom viral fasta must also be indexed

Thanks so much for making this workflow available and dockerizing it!

When testing it with a custom viral genome file (-v), I noticed that the workflow would run, but an early, suspicious message appeared: "[E::bwa_idx_load_from_disk] fail to locate the index files". The rest of the run would continue and eventually fail to find any integration sites.

It turns out this was because the viral fasta I was supplying had not been indexed with bwa index. init_ref.sh indexes the joint reference but not the viral one alone, and the viral fasta is used as the target of the initial mapping step (presumably your included reference is already indexed). This is easy enough to do, but it took a while to figure out: the README does not mention that this file needs to be indexed, and I was trying to work out whether the joint indexing had failed.

As far as solutions, I was thinking of either:

  1. Including a note in the README that custom viral genome files must be indexed with bwa index (this wouldn't require any repackaging of the docker container)
  2. Adding a behavior to init_ref.sh that also indexes the supplied viral reference fasta with samtools and bwa if the -v flag is specified. If you'd prefer this not be the default behavior, there could be an additional command-line flag to enable it, or a check for matching bwa index files with the appropriate suffixes so the fasta is not reindexed if those files already exist.
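
The check described in option 2 could look roughly like the sketch below. It is not taken from init_ref.sh: the function names and the VIRAL_FA variable are hypothetical, and it only relies on the fact that bwa index writes files with the suffixes .amb, .ann, .bwt, .pac and .sa next to the fasta.

```shell
#!/usr/bin/env bash
# Sketch only: VIRAL_FA and both function names are hypothetical,
# they are not part of init_ref.sh.

# Return 0 (true) if any of the five files written by `bwa index` is missing.
bwa_index_missing() {
    local fasta="$1" suffix
    for suffix in amb ann bwt pac sa; do
        [ -e "${fasta}.${suffix}" ] || return 0
    done
    return 1
}

# Index the fasta only when needed, so an existing index is reused.
ensure_bwa_index() {
    local fasta="$1"
    if bwa_index_missing "$fasta"; then
        echo "indexing ${fasta}" >&2
        bwa index "$fasta"
    else
        echo "index found for ${fasta}, skipping" >&2
    fi
}

# Only acts when a viral fasta path has been supplied in VIRAL_FA.
if [ -n "${VIRAL_FA:-}" ]; then
    ensure_bwa_index "$VIRAL_FA"
fi
```

Checking for the files rather than always reindexing keeps repeated runs cheap, at the cost of not noticing an index that is stale relative to an edited fasta.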

Cheers!
Tim J

"No such file or directory" error when searching Test_data

Hi Stephen, I am not an experienced Docker user and couldn't solve this issue by myself. I am running Docker on WSL (Cygwin and Ubuntu). After running exogene in Docker, I use this command as you described:
./Exogene-SR.sh -f1 test_data/SRR3104446_1.fq.gz -f2 test_data/SRR3104446_2.fq.gz -r refs/HumanViral_Reference_12-12-2018.fa -o output

Even though the path is correct, gzip can't open or read the files:

gzip: test_data/SRR3104446_1.fq.gz: No such file or directory
gzip: test_data/SRR3104446_2.fq.gz: No such file or directory
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: /usr/bin/bwa mem -k 30 -t 4 /home/refs/HumanViral_Reference_12-12-2018.fa -
[main] Real time: 0.018 sec; CPU: 0.020 sec
Traceback (most recent call last):
  File "/home/exogene/dev/readlist_2_fq.py", line 38, in <module>
    fi_1 = get_file_handle(IN_R1, 'r')
  File "/home/exogene/dev/readlist_2_fq.py", line 17, in get_file_handle
    return open(fn, rw)
IOError: [Errno 2] No such file or directory: 'test_data/SRR3104446_1.fq'
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] EOF marker is absent. The input is probably truncated.
We were unable to grab template length stats from bwa.log, making a complete guess...
estimated template length: 350 50
=== BREAKPOINT DEVIATIONS:

CHR INTEGRATION_POS #READS VIRUS ANNOTATION SOFTCLIP_POS #SOFTCLIP DISCORDANT_POS #DISCORDANT LONGREAD_POS #LONGREAD NEAREST_GENE

mv: cannot stat '*_hits.ids': No such file or directory

Thank you for your time.
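
One way to narrow this down is to test the input paths from the same working directory before launching the pipeline. The sketch below assumes the relative paths simply do not resolve from where the command is run inside the container; the issue does not confirm that this is the cause. check_inputs is a hypothetical helper, not part of Exogene.

```shell
#!/usr/bin/env bash
# Sketch only: check_inputs is a hypothetical helper, not part of Exogene.

# Report each path that does not exist relative to the current directory.
# Returns non-zero if at least one input is missing.
check_inputs() {
    local missing=0 path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing: ${path} (working directory: $(pwd))" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the command above, this would be called as: check_inputs test_data/SRR3104446_1.fq.gz test_data/SRR3104446_2.fq.gz refs/HumanViral_Reference_12-12-2018.fa, run from the directory in which ./Exogene-SR.sh is started.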
