dnasko / biobin
Central repository for bioinformatic software and code
Hey Dan,
I was using the mga2fasta.pl script and realized that if the input FASTA file has duplicated sequence IDs, the script will fail or give an incorrect answer.
Here is what is going wrong (I think):
This line takes just the ID part of the header (biobin/conversion/mga2fasta.pl, line 125 in 670e3f1), so the $FASTA hash table is keyed on just that part.
So if you have multiple sequences with the same ID, only one of them will be in the hash table.
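To make the collision concrete, here is a minimal sketch in Python rather than the script's Perl. It is not the code from mga2fasta.pl: the function names and the example records are hypothetical, and it assumes the ID is the first whitespace-delimited token of the header and that a later record replaces an earlier one with the same ID.

```python
from collections import Counter


def header_id(header_line):
    """Return the ID part of a FASTA header: the first token after '>'."""
    fields = header_line[1:].split()
    return fields[0] if fields else ""


def duplicated_ids(lines):
    """Return the sorted list of IDs that occur on more than one header."""
    counts = Counter(header_id(l) for l in lines if l.startswith(">"))
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)


def load_fasta(lines):
    """Load sequences into a dict keyed on the ID only.

    Assumption: a repeated ID overwrites the earlier record, which is
    one way a hash keyed on the ID can end up holding a single entry.
    """
    records = {}
    seq_id = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            seq_id = header_id(line)
            records[seq_id] = ""
        elif seq_id is not None:
            records[seq_id] += line
    return records


# Hypothetical input: two different sequences share the ID "contig_1".
example = [
    ">contig_1 sample=A",
    "ATGCATGC",
    ">contig_1 sample=B",
    "GG",
    ">contig_2",
    "TTTT",
]

dupes = duplicated_ids(example)
records = load_fasta(example)
```

Checking for duplicated IDs up front and stopping with a clear message would turn the silent collision into an explicit error.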
Then you'll get an error in the get_nt_orf function, because the length of the stored sequence does not match the coordinates in the mga output files, so the substr function fails here (biobin/conversion/mga2fasta.pl, line 235 in 670e3f1).
That function then returns an empty $orf, which causes the gc function to hit a divide-by-zero error here (biobin/conversion/mga2fasta.pl, line 217 in 670e3f1).
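The failure chain described above (out-of-range substring, empty ORF, division by zero) could be guarded against along these lines. This is a hedged Python sketch, not the script's Perl: `extract_orf` and `gc_fraction` are hypothetical stand-ins for get_nt_orf and gc, and the 1-based inclusive coordinates are an assumption.

```python
def extract_orf(sequence, start, stop):
    """Return the subsequence for 1-based inclusive coordinates.

    Returns None instead of a truncated or empty string when the
    coordinates do not fit the sequence, so the caller can report
    the mismatch rather than pass bad data downstream.
    """
    if start < 1 or stop > len(sequence) or start > stop:
        return None
    return sequence[start - 1:stop]


def gc_fraction(orf):
    """Return the G+C fraction of a sequence, or None if it is empty."""
    if not orf:
        return None  # avoids dividing by a length of zero
    gc_count = sum(1 for base in orf.upper() if base in "GC")
    return gc_count / len(orf)
```

With guards like these, a sequence whose length does not match the predicted coordinates is reported at the point of extraction instead of surfacing later as a divide-by-zero.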
That was kind of a confusing example, but hopefully you get what I'm saying.
Hi there,
This is a semi-automated message from a fellow bioinformatician. Through a GitHub search, I found that the following source files make use of BLAST's -max_target_seqs parameter:
Based on the recently published report, Misunderstood parameter of NCBI BLAST impacts the correctness of bioinformatics workflows, there is a strong chance that this parameter is misused in your repository.
If the use of this parameter was intentional, please feel free to ignore and close this issue, but I would highly recommend adding a comment to your source code to notify others about this use case. If this is a duplicate issue, please accept my apologies for the redundancy, as this simple automation is not smart enough to identify such issues.
Thank you!
-- Arman (armish/blast-patrol)
Try running:
./biobin/pcr_products/adapter_searching/demultiplex.pl \
-fasta 0-inputs/2nd_batch_anotop_rtpr_fullpasses_3_accuracy_98_length_250_5000.txt \
-adapter 0-inputs/forward_primers.txt \
-window 40 \
-identity 0.85 \
-outdir 1-demultiplex/ \
--trim
And you'll find:
substr outside of string at ./biobin/pcr_products/adapter_searching/demultiplex.pl line 529.
Use of uninitialized value in subroutine entry at /home/wommacklab/library/lib64/perl5/String/Approx.pm line 254.
Use of uninitialized value in subroutine entry at /home/wommacklab/library/lib64/perl5/String/Approx.pm line 254.
Use of uninitialized value $sequence in substr at ./biobin/pcr_products/adapter_searching/demultiplex.pl line 529.
substr outside of string at ./biobin/pcr_products/adapter_searching/demultiplex.pl line 529.
Use of uninitialized value in subroutine entry at /home/wommacklab/library/lib64/perl5/String/Approx.pm line 254.
Use of uninitialized value in subroutine entry at /home/wommacklab/library/lib64/perl5/String/Approx.pm line 254.
Error! Unable to trim this sequence: m181205_052558_42157_c101475712550000001823318702141957_s1_p0/103922/ccs