biobin's People

Contributors

dnasko, mrbiota

biobin's Issues

mga2fasta.pl fails when there are duplicate read IDs in the input fasta file

Hey Dan,

I was using the mga2fasta.pl script and realized that if the input FASTA file has duplicate sequence IDs, the script will fail or give an incorrect answer.

Here is what is going wrong (I think):

This line is taking just the ID part of the header

$h =~ s/ .*//;

so the $FASTA hash table is keyed on just that part.

So if you have multiple sequences with the same ID, only one of them will be in the hash table.

Then you'll get an error in the get_nt_orf function, because the length of the stored sequence does not match the coordinates in the mga output file, and so the substr call fails here:

my $orf = substr $seq, $start, $length;

Then that function will return an empty $orf, which will then cause the gc function to get a divide by zero error here:

my $gc_content = $gcs / length($str);

That was kind of a confusing example, but hopefully you get what I'm saying.
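One possible way to surface the problem early is sketched below. It is not taken from mga2fasta.pl: the names `load_fasta` and `gc` as written here, and the choice to die on a duplicate ID, are assumptions for illustration. The sketch keys the hash on the truncated ID as described above, refuses to silently overwrite an existing entry, and makes the GC calculation return 0 for an empty string rather than dividing by zero.

```perl
use strict;
use warnings;

# Hypothetical helper (not the script's actual code): read FASTA records from
# a filehandle into a hash keyed on the ID, i.e. the header up to the first
# space, and die as soon as the same ID appears twice.
sub load_fasta {
    my ($fh) = @_;
    my (%fasta, $id);
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        if ($line =~ /^>(.*)/) {
            $id = $1;
            $id =~ s/ .*//;    # same truncation as quoted in the issue
            die "Duplicate sequence ID '$id' in input FASTA\n"
                if exists $fasta{$id};
            $fasta{$id} = '';
        }
        elsif (defined $id) {
            $fasta{$id} .= $line;    # append wrapped sequence lines
        }
    }
    return \%fasta;
}

# Hypothetical GC-fraction helper: returns 0 for an empty string instead of
# dividing by zero.
sub gc {
    my ($str) = @_;
    my $len = length $str;
    return 0 unless $len;
    my $gcs = ($str =~ tr/GCgc//);    # count G and C, either case
    return $gcs / $len;
}
```

Dying on the duplicate is only one option; renaming the later records or warning and skipping them would also avoid the silent overwrite.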

Confirm that use of BLAST's `-max_target_seqs` is intentional

Hi there,

This is a semi-automated message from a fellow bioinformatician. Through a GitHub search, I found that the following source files make use of BLAST's -max_target_seqs parameter:

Based on the recently published report "Misunderstood parameter of NCBI BLAST impacts the correctness of bioinformatics workflows", there is a strong chance that this parameter is misused in your repository.

If the use of this parameter was intentional, please feel free to ignore and close this issue, but I would highly recommend adding a comment to your source code to notify others about this use case. If this is a duplicate issue, please accept my apologies for the redundancy, as this simple automation is not smart enough to identify such issues.

Thank you!
-- Arman (armish/blast-patrol)

`demultiplex.pl` fails when trimming low identity adapters

Try running:

./biobin/pcr_products/adapter_searching/demultiplex.pl \
-fasta 0-inputs/2nd_batch_anotop_rtpr_fullpasses_3_accuracy_98_length_250_5000.txt \
-adapter 0-inputs/forward_primers.txt \
-window 40 \
-identity 0.85 \
-outdir 1-demultiplex/ \
--trim

And you'll find:

substr outside of string at ./biobin/pcr_products/adapter_searching/demultiplex.pl line 529.
Use of uninitialized value in subroutine entry at /home/wommacklab/library/lib64/perl5/String/Approx.pm line 254.
Use of uninitialized value in subroutine entry at /home/wommacklab/library/lib64/perl5/String/Approx.pm line 254.
Use of uninitialized value $sequence in substr at ./biobin/pcr_products/adapter_searching/demultiplex.pl line 529.
substr outside of string at ./biobin/pcr_products/adapter_searching/demultiplex.pl line 529.
Use of uninitialized value in subroutine entry at /home/wommacklab/library/lib64/perl5/String/Approx.pm line 254.
Use of uninitialized value in subroutine entry at /home/wommacklab/library/lib64/perl5/String/Approx.pm line 254.
 
 Error! Unable to trim this sequence: m181205_052558_42157_c101475712550000001823318702141957_s1_p0/103922/ccs
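The "substr outside of string" warnings suggest the trim coordinates can fall outside the read. A minimal sketch of a bounds-checked extraction follows. `safe_substr` is a hypothetical helper, not code from demultiplex.pl, and how the script computes its start and length values is not shown in this report.

```perl
use strict;
use warnings;

# Hypothetical helper: return the requested substring, or undef when the
# sequence is undefined or the requested range does not fit inside it.
sub safe_substr {
    my ($sequence, $start, $length) = @_;
    return unless defined $sequence;
    return if $start < 0 || $length < 0;
    return if $start + $length > length $sequence;
    return substr $sequence, $start, $length;
}
```

A caller could then check `defined` on the result and skip or report the read, rather than passing an undefined value on to String::Approx.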
