This script removes redundancy from BLAST output. Matches are grouped by qseqid and by a range of qstart and qend, and only the match with the best bitscore in each group is kept. It also creates a column called sense that indicates the DNA sense of each match.
This script was built with Python 3.6.5+ and has these dependencies:
To use the conda environment:
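The filtering idea described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code in blast_filter.py. It assumes that "a range of qstart and qend" means hits whose query intervals overlap, and that the sense is read from the subject coordinates (sstart greater than send meaning the minus sense, as in BLAST tabular output for nucleotide hits). The function names `query_span`, `add_sense`, and `filter_redundant`, and the sample hits, are hypothetical.

```python
from collections import defaultdict


def query_span(hit):
    """Return the query interval as (low, high), whatever the orientation."""
    return min(hit["qstart"], hit["qend"]), max(hit["qstart"], hit["qend"])


def add_sense(hit):
    """Copy the hit and add a 'sense' column.

    Assumption: '-' when sstart > send, otherwise '+'.
    """
    tagged = dict(hit)
    tagged["sense"] = "-" if hit["sstart"] > hit["send"] else "+"
    return tagged


def filter_redundant(hits):
    """For each qseqid, keep the best-bitscore hit among overlapping hits."""
    by_query = defaultdict(list)
    for hit in hits:
        by_query[hit["qseqid"]].append(hit)

    kept = []
    for group in by_query.values():
        accepted = []
        # Visit hits from best to worst bitscore; a hit is dropped when its
        # query interval overlaps a better hit that was already accepted.
        for hit in sorted(group, key=lambda h: h["bitscore"], reverse=True):
            low, high = query_span(hit)
            overlaps = any(
                low <= a_high and a_low <= high
                for a_low, a_high in map(query_span, accepted)
            )
            if not overlaps:
                accepted.append(hit)
        kept.extend(add_sense(h) for h in accepted)
    return kept


# Made-up hits, for illustration only.
hits = [
    {"qseqid": "q1", "sseqid": "sA", "qstart": 1, "qend": 100,
     "sstart": 10, "send": 109, "bitscore": 180.0},
    {"qseqid": "q1", "sseqid": "sB", "qstart": 5, "qend": 95,
     "sstart": 300, "send": 210, "bitscore": 150.0},
    {"qseqid": "q1", "sseqid": "sC", "qstart": 200, "qend": 300,
     "sstart": 500, "send": 400, "bitscore": 90.0},
    {"qseqid": "q2", "sseqid": "sA", "qstart": 1, "qend": 50,
     "sstart": 1, "send": 50, "bitscore": 60.0},
]

filtered = filter_redundant(hits)
for hit in filtered:
    print(hit["qseqid"], hit["sseqid"], hit["bitscore"], hit["sense"])
```

In this sample, the sB hit overlaps the higher-scoring sA hit on query q1, so it is dropped, while the sC hit covers a different region of q1 and is kept. The greedy best-first pass is one simple way to resolve overlaps; the actual script may group hits differently.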
conda env create -f blast_filter.yml
conda activate blast_filter
- python blast_filter.py -in blast_output
- I'm not a computer engineer or a related professional; I wrote this script to study Python and to automate some bioinformatics tasks. Feel free to contribute changes that make the code more efficient or cleaner.
- This script will continue to be developed to cover other functions, such as filtering by subject.