phac-nml / staramr
Scans genome contigs against the ResFinder, PlasmidFinder, and PointFinder databases.
License: Apache License 2.0
Add documentation to our user guide and update the tutorial to include the use of MLST results.
Set the --pointfinder-organism based on results of organism id instead of relying on the user to provide this information on the command-line.
The Excel spreadsheet looks great! The freezing of panes is nice and so is having all the parameters used to run staramr under its own sheet.
I would recommend one enhancement: auto-fitting the column widths to the contents of each column for readability (and since users will likely do it themselves anyway).
So after you create a worksheet in the workbook, iterate through each column in the df and adjust the corresponding column's width in the worksheet:
df.to_excel(writer, sheet_name=fixed_sheetname, **pd_to_excel_kwargs)
worksheet = writer.book.get_worksheet_by_name(fixed_sheetname)
for i, width in enumerate(get_col_widths(df, index=args.write_index)):
    worksheet.set_column(i, i, width)
where get_col_widths is:
import numpy as np

def get_col_widths(df, index=False):
    """Calculate column widths based on column headers and contents"""
    if index:
        # width of the index column: longest index value or the index name
        idx_max = max([len(str(s)) for s in df.index.values] + [len(str(df.index.name))])
        yield idx_max
    for c in df.columns:
        # get max length of column contents and length of column header
        yield np.max([df[c].astype(str).str.len().max(), len(str(c))])
It's not perfect (assumes uniform character length), but might save users some time.
Currently, we don't have a predefined standard for how staramr is maintained or how someone can contribute to this project. I propose implementing a contribution guide that contains some of the following:
--verbose, logging, etc.
Add the Sequence Type and Organism to the Summary sheets.
Only Salmonella is currently supported from the PointFinder database. Add support for all other organisms: https://bitbucket.org/genomicepidemiology/pointfinder_db/src
Invert the behaviour of --include-negatives to include negative matches to the ResFinder/PointFinder databases in the final report by default. Add an option --exclude-negatives instead.
Using an existing set of genomes along with plasmid results obtained from the PlasmidFinder webserver, let's compare these to the results generated from staramr.
When running with output to single files (--output-summary summary.tsv), I get the following exception:
2019-04-12 11:00:20,475 ERROR: 'Namespace' object has no attribute 'output_detailed_summary'
Traceback (most recent call last):
File "/home/CSCScience.ca/apetkau/workspace/staramr/bin/staramr", line 68, in <module>
args.run_command(args)
File "/home/CSCScience.ca/apetkau/workspace/staramr/staramr/subcommand/Search.py", line 375, in run
output_detailed_summary = args.output_detailed_summary
AttributeError: 'Namespace' object has no attribute 'output_detailed_summary'
I think an argument for --output-detailed-summary is missing.
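For illustration, here is a minimal sketch of the likely fix, assuming the options are registered with argparse. The parser below is hypothetical (the prog name and defaults are assumptions, not staramr's real code); it only shows that every attribute read from the namespace needs a matching add_argument call.

```python
import argparse


def build_parser():
    """Build a hypothetical parser; this is not staramr's real argument set."""
    parser = argparse.ArgumentParser(prog="search-sketch")
    parser.add_argument("--output-summary", dest="output_summary", default=None)
    # Without this call, reading args.output_detailed_summary raises
    # AttributeError: 'Namespace' object has no attribute 'output_detailed_summary'.
    parser.add_argument("--output-detailed-summary", dest="output_detailed_summary", default=None)
    return parser


args = build_parser().parse_args(["--output-summary", "summary.tsv"])
```

With the default set to None, code that reads args.output_detailed_summary can treat "not given" as "do not write this file".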
Hi,
I installed staramr in Galaxy on a native Ubuntu, and got an error after using Shovill-assembled contigs (in a multi-fasta file). I was wondering what I did wrong and would appreciate any suggestions. Thank you!
Fatal error: Exit code 1 ()
2018-06-10 14:51:10,922 INFO: No --pointfinder-organism specified. Will not search the PointFinder databases
2018-06-10 14:51:10,922 INFO: --output-dir not set. Files will be output to the respective --output-[type] setting
2018-06-10 14:51:10,931 INFO: Making BLAST databases for input files
2018-06-10 14:51:10,939 ERROR: Command '['makeblastdb', '-in', '/home/phemarajata/galaxy/database/tmp/tmpuuzos1zh/input-genomes/Shovill on data 11 and data 10: Contigs.fasta', '-dbtype', 'nucl', '-parse_seqids']' returned non-zero exit status 1.
Traceback (most recent call last):
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/bin/staramr", line 68, in <module>
args.run_command(args)
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/site-packages/staramr/subcommand/Search.py", line 356, in run
files=args.files)
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/site-packages/staramr/subcommand/Search.py", line 216, in _generate_results
plength_threshold_pointfinder, report_all_blast)
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/site-packages/staramr/detection/AMRDetection.py", line 61, in run_amr_detection
self._amr_detection_handler.run_blasts(files)
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/site-packages/staramr/blast/BlastHandler.py", line 96, in run_blasts
db_files = self._make_db_from_input_files(self._input_genomes_tmp_dir, files)
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/site-packages/staramr/blast/BlastHandler.py", line 120, in _make_db_from_input_files
future_blastdb.result()
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/site-packages/staramr/blast/BlastHandler.py", line 196, in _make_blast_db
subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE).check_returncode()
File "/home/phemarajata/galaxy/database/dependencies/_conda/envs/[email protected]/lib/python3.6/subprocess.py", line 369, in check_returncode
self.stderr)
subprocess.CalledProcessError: Command '['makeblastdb', '-in', '/home/phemarajata/galaxy/database/tmp/tmpuuzos1zh/input-genomes/Shovill on data 11 and data 10: Contigs.fasta', '-dbtype', 'nucl', '-parse_seqids']' returned non-zero exit status 1.
Update the documentation in the README.md to include plasmidfinder. That is, we should add:
staramr.
There are some different approaches to integrating typing information into the staramr reports.
In this case we can run programs for MLST/organism identification outside of staramr (e.g., in a Galaxy workflow) and integrate this information into the staramr report in Galaxy.
We can have staramr run MLST or organism identification (e.g., Mash) internally, and directly integrate the results into a report.
We are currently using --threads in the mlst program, but this may not be the fastest way. Look at running separate mlst instances instead of using the --threads parameter.
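One way to try this out is sketched below: launch one process per genome from a thread pool and compare the wall-clock time against a single multi-threaded run. The helper names (run_one, run_per_file, mlst_command, demo_command) are hypothetical, and the exact mlst invocation is an assumption; demo_command is a stand-in so the sketch runs without mlst installed.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run one command and return its standard output as text."""
    completed = subprocess.run(command, stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE, check=True)
    return completed.stdout.decode()


def run_per_file(files, build_command, max_workers=2):
    """Launch one process per input file, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = pool.map(run_one, [build_command(f) for f in files])
        return dict(zip(files, outputs))


def mlst_command(fasta):
    # Assumed invocation: one single-threaded mlst process per genome.
    return ["mlst", "--threads", "1", fasta]


def demo_command(fasta):
    # Stand-in command that only echoes the file name.
    return [sys.executable, "-c", "import sys; print(sys.argv[1])", fasta]
```

Calling run_per_file(files, mlst_command, max_workers=nprocs) would then give one mlst process per genome, which can be timed against the current --threads approach on the same inputs.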
There is a Python library called coloredlogs that outputs colored log messages in the terminal. I think this would be really helpful, especially for debugging code in the project.
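To give a rough idea of the effect, here is a small stand-in that uses only the standard logging module and ANSI escape codes. It is not the coloredlogs package itself; the ColorFormatter class and the color choices are assumptions made for illustration.

```python
import logging

# ANSI escape codes per log level (illustrative choices).
COLORS = {
    logging.DEBUG: "\033[36m",    # cyan
    logging.INFO: "\033[32m",     # green
    logging.WARNING: "\033[33m",  # yellow
    logging.ERROR: "\033[31m",    # red
}
RESET = "\033[0m"


class ColorFormatter(logging.Formatter):
    """Wrap each formatted record in an ANSI color chosen by level."""

    def format(self, record):
        message = super().format(record)
        color = COLORS.get(record.levelno, "")
        return f"{color}{message}{RESET}" if color else message
```

Attaching it is a matter of handler.setFormatter(ColorFormatter("%(asctime)s %(levelname)s: %(message)s")) on a StreamHandler. Colors should probably be disabled when the output is not a terminal, since Galaxy captures the log to a file.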
Revert ResFinder/PointFinder default databases to previous release versions for this next release. That is, back to versions found in staramr 0.4.0.
My reason for this is our mapping between the AMR gene and drug resistance is only completed for these ResFinder/PointFinder database versions.
This would likely involve disabling support for Enterococcus faecalis (#35) since I don't believe this is available in the earlier PointFinder database.
Support will be re-added in a later release.
Hello, I'm having issues with version 0.2.1. One of my fasta files crashes staramr at the parse results stage. The input assembly file can be located here
(mob_suite) kirill@Discovery20:~/Desktop$ staramr --verbose search --nprocs 2 --pid-threshold 98.0 --percent-length-overlap-resfinder 60.0 --percent-length-overlap-pointfinder 95.0 --output-summary dataset_588.dat --output-resfinder dataset_589.dat --output-settings dataset_590.dat --output-excel dataset_591.dat.xlsx --output-hits-dir staramr_hits "N18.fasta"
2018-08-10 10:29:01,343 INFO Search.run,292: No --pointfinder-organism specified. Will not search the PointFinder databases
2018-08-10 10:29:01,343 INFO Search.run,322: --output-dir not set. Files will be output to the respective --output-[type] setting
2018-08-10 10:29:01,344 DEBUG Search.run,337: Found --output-hits-dir [staramr_hits] and is a directory. Will write hits here
2018-08-10 10:29:01,429 DEBUG BlastHandler.run_blasts,90: Resfinder Databases: ['colistin', 'tetracycline', 'quinolone', 'fusidicacid', 'glycopeptide', 'rifampicin', 'trimethoprim', 'beta-lactam', 'aminoglycoside', 'oxazolidinone', 'macrolide', 'phenicol', 'fosfomycin', 'sulphonamide', 'nitroimidazole']
2018-08-10 10:29:01,430 INFO BlastHandler._make_db_from_input_files,108: Making BLAST databases for input files
2018-08-10 10:29:01,430 DEBUG BlastHandler._make_db_from_input_files,114: Creating symlink from [N18.fasta] to [/var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta]
2018-08-10 10:29:01,431 DEBUG BlastHandler._make_blast_db,200: makeblastdb -in /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -dbtype nucl -parse_seqids
2018-08-10 10:29:01,659 DEBUG BlastHandler.run_blasts,99: Done making blast databases for input files
2018-08-10 10:29:01,660 INFO BlastHandler.run_blasts,102: Scheduling blast for N18.fasta
2018-08-10 10:29:01,663 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.colistin.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/colistin.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:01,670 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.tetracycline.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/tetracycline.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:01,960 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.quinolone.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/quinolone.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,129 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.fusidicacid.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/fusidicacid.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,154 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.glycopeptide.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/glycopeptide.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,170 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.rifampicin.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/rifampicin.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,212 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.trimethoprim.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/trimethoprim.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,259 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.beta-lactam.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/beta-lactam.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,290 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.aminoglycoside.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/aminoglycoside.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,505 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.oxazolidinone.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/oxazolidinone.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,573 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.macrolide.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/macrolide.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,730 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.phenicol.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/phenicol.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,818 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.fosfomycin.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/fosfomycin.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,853 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.sulphonamide.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/sulphonamide.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,889 DEBUG BlastHandler._launch_blast,193: blastn -out /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.nitroimidazole.resfinder.blast.xml -outfmt "6 qseqid sseqid pident length qstart qend sstart send slen qlen sstrand sseq qseq" -query /Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/databases/data/dist/resfinder/nitroimidazole.fsa -db /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/input-genomes/N18.fasta -evalue 0.001
2018-08-10 10:29:02,950 DEBUG BlastResultsParser.parse_results,58: /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.aminoglycoside.resfinder.blast.xml
2018-08-10 10:29:03,372 DEBUG BlastResultsParser.parse_results,58: /var/folders/b7/frgczw4n53xd4nlczrjd3jwc0000gq/T/tmpp1hv_jse/N18.fasta.beta-lactam.resfinder.blast.xml
2018-08-10 10:29:03,405 DEBUG ResfinderHitHSP.__init__,25: record=qseqid blaTEM-108_1_AF506748
sseqid 4
pident 99.414
length 853
qstart 9
qend 861
sstart 39632
send 40484
slen 83930
qlen 861
sstrand plus
sseq TCAACATTTTCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGC...
qseq TCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGC...
plength 99.0708
Name: 108, dtype: object
2018-08-10 10:29:03,425 ERROR staramr.<module>,75: expected string or bytes-like object
Traceback (most recent call last):
File "/Users/kirill/miniconda/envs/mob_suite/bin/staramr", line 68, in <module>
args.run_command(args)
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/subcommand/Search.py", line 356, in run
files=args.files)
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/subcommand/Search.py", line 216, in _generate_results
plength_threshold_pointfinder, report_all_blast)
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/detection/AMRDetection.py", line 65, in run_amr_detection
plength_threshold_resfinder, report_all)
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/detection/AMRDetectionResistance.py", line 36, in _create_resfinder_dataframe
return resfinder_parser.parse_results()
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/blast/results/BlastResultsParser.py", line 61, in parse_results
self._handle_blast_hit(file, database_name, blast_out, results, hit_seq_records)
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/blast/results/BlastResultsParser.py", line 93, in _handle_blast_hit
partitions.append(self._create_hit(in_file, database_name, blast_record))
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/blast/results/BlastHitPartitions.py", line 38, in append
partition = self._get_existing_partition(hit)
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/blast/results/BlastHitPartitions.py", line 56, in _get_existing_partition
partition_name = hit.get_genome_contig_id()
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/site-packages/staramr/blast/results/AMRHitHSP.py", line 101, in get_genome_contig_id
re_search = re.search(r'^(\S+)', self._blast_record['sseqid'])
File "/Users/kirill/miniconda/envs/mob_suite/lib/python3.6/re.py", line 182, in search
return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object
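The debug record above shows sseqid as 4, which suggests (this is an inference, not something confirmed in the report) that a contig named with a bare number was parsed as an integer before reaching re.search. A minimal sketch of a defensive version of the lookup, written as a standalone function rather than the real AMRHitHSP method:

```python
import re


def get_genome_contig_id(sseqid):
    """Return the first whitespace-delimited token of a BLAST subject id."""
    # A contig named "4" may have been parsed as the integer 4,
    # so coerce to str before applying the regular expression.
    match = re.search(r"^(\S+)", str(sseqid))
    if match is None:
        raise ValueError(f"Could not extract contig id from {sseqid!r}")
    return match.group(1)
```

An alternative would be to force the id columns to be read as strings when the BLAST table is loaded, so that every downstream consumer sees text.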
I did a quick comparison between the staramr output and the output from CGE's ResFinder web interface. For the most part things matched, but I did find one issue. On the CGE site, I got a hit to aadA1 at bps 22917-23708 with 99.75% identity. With staramr, ant(3'')-Ia_X02340 was identified in the same region, but extending an extra 176 bps (22917-23884 bp), with 99.49% identity. There appears to be something different about the way staramr and CGE are "choosing" their top hit for a given open reading frame. Can anyone give me any insight into this? The percent identity threshold was set at 90% for both tools and I made sure that the same ResFinder database (2019-01-29) was used for both.
Incorporation of plasmids in the final summary.tsv. That is, summary.tsv should look like:
Isolate ID | Genotype | Predicted Phenotype | Plasmid Genes |
---|---|---|---|
SRR1952908 | aadA1, blaTEM-57 | streptomycin | IncX1, IncFIB(S) |
Test the performance of the MLST scheme selection for organism detection. For example, will MLST select the scheme senterica for only Salmonella enterica? How likely is a mistake to occur?
To assist with diagnosing issues like #20
It would be nice if staramr could support multiple types of input files (such as Genbank) and also compressed versions of each of these files (e.g., gzipped fasta). As an example, see the description of input for Abricate.
Conversion between different formats can likely use BioPython's SeqIO functionality.
Detection of file formats should also not depend on the extension (e.g., .fasta for fasta, .gz for gzipped) since this tool is integrated into Galaxy, which internally names all input files as .dat. Ideally, the file contents should be used to detect the type of file passed to staramr instead of the extension.
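A rough sketch of content-based detection is below, using only the standard library. The function names are hypothetical; the checks rely on the gzip magic bytes (0x1f 0x8b), FASTA records starting with ">", and GenBank records starting with "LOCUS". Real files may need more careful handling.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"


def open_maybe_gzipped(path):
    """Open a file as text, transparently handling gzip compression."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == GZIP_MAGIC
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def detect_format(path):
    """Guess 'fasta' or 'genbank' from the first non-blank line, else None."""
    with open_maybe_gzipped(path) as handle:
        for line in handle:
            if not line.strip():
                continue
            if line.startswith(">"):
                return "fasta"
            if line.startswith("LOCUS"):
                return "genbank"
            return None
    return None
```

The detected name could then be passed as the format argument to BioPython's SeqIO.parse, so a Galaxy input named dataset_1.dat is handled the same way as a .fasta or .gbk file.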
I get the following error when trying to run the staramr development branch against a couple of genomes:
$ staramr search -o out SRR19529*.fasta
2018-05-14 12:42:27,227 INFO: Scheduling blast for SRR1952908.fasta
2018-05-14 12:42:27,261 INFO: Scheduling blast for SRR1952926.fasta
2018-05-14 12:42:31,589 INFO: Finished. Took 0.07 minutes.
2018-05-14 12:42:31,591 ERROR: 'NoneType' object has no attribute 'to_csv'
Traceback (most recent call last):
File "../staramr/bin/staramr", line 68, in <module>
args.run_command(args)
File "../staramr/staramr/subcommand/Search.py", line 197, in run
self._print_dataframe_to_text_file_handle(amr_detection.get_pointfinder_results(), fh)
File "../staramr/staramr/subcommand/Search.py", line 108, in _print_dataframe_to_text_file_handle
dataframe.to_csv(file_handle, sep="\t", float_format="%0.2f", na_rep=self.BLANK)
AttributeError: 'NoneType' object has no attribute 'to_csv'
It doesn't seem like the pointfinder db is being searched (--verbose shows only resfinder results being parsed).
Here's the db info:
$ staramr --verbose db info
resfinder_db_dir = ../staramr/staramr/databases/data/dist/resfinder
resfinder_db_url = https://bitbucket.org/genomicepidemiology/resfinder_db.git
resfinder_db_commit = dc33e2f9ec2c420f99f77c5c33ae3faa79c999f2
resfinder_db_date = Tue, 20 Mar 2018 16:49
pointfinder_db_dir = ../staramr/staramr/databases/data/dist/pointfinder
pointfinder_db_url = https://bitbucket.org/genomicepidemiology/pointfinder_db.git
pointfinder_db_commit = ba65c4d175decdc841a0bef9f9be1c1589c0070a
pointfinder_db_date = Fri, 06 Apr 2018 09:02
pointfinder_gene_drug_version = 111317
resfinder_gene_drug_version = 041318
Doing a fresh staramr db build after clearing out the existing db doesn't seem to help.
This affects #12 as well.
Let me know if you need any other info!
Add an option --plasmidfinder-database-type [name] which takes as input enterobacteriaceae or gram_positive.
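A possible shape for this option, sketched with argparse. The parser is hypothetical; the dest name, the help text, and the idea that leaving the option out means "search all databases" are assumptions. The family name is spelled enterobacteriaceae in the choices.

```python
import argparse


def build_parser():
    """Build a hypothetical parser; this is not staramr's real argument set."""
    parser = argparse.ArgumentParser(prog="search-sketch")
    parser.add_argument(
        "--plasmidfinder-database-type",
        dest="plasmidfinder_database_type",
        choices=["enterobacteriaceae", "gram_positive"],
        default=None,  # assumption: None means all PlasmidFinder databases
        help="Restrict the PlasmidFinder search to one database type.",
    )
    return parser
```

Using choices lets argparse reject a misspelled database name with a usage message instead of failing later during the BLAST step.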
In case of an empty (or invalid) input file, add a better error message than just a stack trace.
Also, add an option like --ignore-invalid-files to force staramr to skip over invalid input files.
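Something along these lines could run before any BLAST databases are built. This is only a sketch: the function names are hypothetical, and the validity check (file exists, is non-empty, and its first non-blank line starts with ">") is a simplifying assumption about what counts as a valid FASTA file.

```python
import logging
import os

logger = logging.getLogger(__name__)


def is_valid_fasta(path):
    """Return True if the file exists, is non-empty, and starts with a '>' header."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.strip():
                return line.startswith(">")
    return False


def select_input_files(files, ignore_invalid=False):
    """Return the valid files; raise a readable error unless ignore_invalid is set."""
    valid = [f for f in files if is_valid_fasta(f)]
    invalid = [f for f in files if f not in valid]
    if invalid and not ignore_invalid:
        raise ValueError("Invalid or empty input files: " + ", ".join(invalid))
    for f in invalid:
        logger.warning("Skipping invalid input file [%s]", f)
    return valid
```

The same check would also give a clearer message for a non-existent input path than a failure inside blastn or makeblastdb.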
I have never had problems when using staramr, but suddenly I am getting this error message:
2018-06-29 11:45:58,535 ERROR: Command '['makeblastdb', '-in', '/tmp/tmp426ql20e/input-genomes/Patient_2_A.fasta', '-dbtype', 'nucl', '-parse_seqids']' returned non-zero exit status 1
What may I do?
I noticed that the formatting in detailed_summary.tsv looks like:
Isolate ID | Gene | %Identity | %Overlap | Start | End |
---|---|---|---|---|---|
A | gyrA (S83F) | 99.92399999999999 | 100.0 | 2361282.0 | 2358646.0 |
That is, the %Identity is not being rounded to 2 decimal places, while the Start and End are printed as float (they should be int).
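The intended formatting can be sketched in plain Python as below. The format_row helper is hypothetical and works on a dict per row rather than on the real DataFrame; the column names are taken from the table above.

```python
def format_row(row, float_columns=("%Identity", "%Overlap"), int_columns=("Start", "End")):
    """Return a copy of row with floats rounded to 2 places and positions as ints."""
    formatted = dict(row)
    for name in float_columns:
        formatted[name] = f"{float(row[name]):.2f}"
    for name in int_columns:
        formatted[name] = str(int(float(row[name])))
    return formatted


row = {"Isolate ID": "A", "Gene": "gyrA (S83F)", "%Identity": 99.92399999999999,
       "%Overlap": 100.0, "Start": 2361282.0, "End": 2358646.0}
formatted = format_row(row)
```

In pandas, integer columns are promoted to float once they contain missing values, which may be why Start and End come out as floats here; that is a guess about the cause, not something verified against the code.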
I should specify (in docs, possible check in software) which versions of BLAST staramr will work with. I suspect older versions of BLAST have a slightly different output format. I should do a bit of testing to determine the minimum BLAST version required.
Add the ability to detect the MLST type for the input fasta files and integrate this into the staramr report.
Add ability to load and update plasmid databases (particularly, the plasmidfinder database https://bitbucket.org/genomicepidemiology/plasmidfinder_db.git) similar to updating ResFinder/PointFinder databases
When running tests, the following warning is displayed: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'. Remove all read_table calls and replace with read_csv.
When trying to run staramr with non-existent input:
$ staramr search nofile
2018-05-07 14:00:39,855 INFO: Scheduling blast for nofile
Traceback (most recent call last):
File "../staramr/venv/bin/staramr", line 11, in <module>
load_entry_point('staramr', 'console_scripts', 'staramr')()
File "../staramr/staramr/main.py", line 70, in main
args.run_command(args)
File "../staramr/staramr/subcommand/Search.py", line 170, in run
args.plength_threshold_pointfinder, args.report_all_blast)
File "../staramr/staramr/detection/AMRDetection.py", line 63, in run_amr_detection
resfinder_blast_map = self._amr_detection_handler.get_resfinder_outputs()
File "../staramr/staramr/blast/BlastHandler.py", line 119, in get_resfinder_outputs
future_blast.result()
File "../miniconda3/lib/python3.6/concurrent/futures/_base.py", line 398, in result
return self.__get_result()
File "../miniconda3/lib/python3.6/concurrent/futures/_base.py", line 357, in __get_result
raise self._exception
File "../miniconda3/lib/python3.6/concurrent/futures/thread.py", line 55, in run
result = self.fn(*self.args, **self.kwargs)
File "../staramr/staramr/blast/BlastHandler.py", line 139, in _launch_blast
stdout, stderr = blastn_command()
File "../staramr/venv/lib/python3.6/site-packages/Bio/Application/__init__.py", line 523, in __call__
stdout_str, stderr_str)
Bio.Application.ApplicationError: Non-zero return code 1 from 'blastn -out /tmp/tmpb6o46_2w/nofile.blast.xml -outfmt 5 -query nofile -db ../staramr/staramr/databases/data/dist/resfinder/aminoglycoside.fsa -evalue 0.001', message 'Command line argument error: Argument "query". File is not accessible: `nofile\''
Spaces in a Galaxy dataset name will currently cause staramr to fail. This should be fixed so staramr in Galaxy can handle spaces in the dataset name.
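One possible approach, sketched below, is to give the temporary symlink a sanitized name while keeping the original name for the report. The safe_link_name helper is hypothetical, and the set of allowed characters is an assumption about what makeblastdb handles reliably.

```python
import os
import re


def safe_link_name(path, taken=()):
    """Return a file name with whitespace and other unsafe characters replaced."""
    base = os.path.basename(path)
    # Collapse every run of characters outside the allowed set into one underscore.
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", base)
    candidate, n = safe, 1
    while candidate in taken:
        # Two different inputs can map to the same safe name, so add a counter.
        stem, ext = os.path.splitext(safe)
        candidate = f"{stem}_{n}{ext}"
        n += 1
    return candidate
```

For example, "Shovill on data 11 and data 10: Contigs.fasta" becomes "Shovill_on_data_11_and_data_10_Contigs.fasta". A mapping from safe name back to the original name would be needed so the Isolate ID column still shows what the user uploaded.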
Related to issue #18
This report should incorporate both resistance genes and plasmids, but unlike summary.tsv, each resistance gene/plasmid should be on a separate line of the table.
There are command-line options for --pointfinder-commit and --resfinder-commit when building the database (https://github.com/phac-nml/staramr#database-build).
We will need to add a --plasmidfinder-commit option in the following file: https://github.com/phac-nml/staramr/blob/development/staramr/subcommand/Database.py
Hi, I was wondering how I can update the databases used by the tool in Galaxy. Thank you!
It looks like the command-line option --exclude-resistance-phenotypes is not quite working. This should have the behaviour of excluding the Predicted Phenotype columns, but they are still present for the Summary and Detailed_Summary results.
This command-line option works by selecting whether or not we are using the AMRDetectionSummary.py or AMRDetectionSummaryResistance.py classes (in
To fix, you may need to shift some of the code in the AMRDetectionSummary.py class which adds the Predicted Phenotype column down to the subclass AMRDetectionSummaryResistance.py.
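The suggested split could look roughly like the sketch below. Only the two class names come from the issue; the methods, constructor, and the gene-to-drug mapping are hypothetical and much simpler than the real classes.

```python
class AMRDetectionSummary:
    """Base summary: genotype only, with no phenotype information."""

    def columns(self):
        return ["Isolate ID", "Genotype"]

    def create_row(self, isolate, genes):
        return {"Isolate ID": isolate, "Genotype": ", ".join(genes)}


class AMRDetectionSummaryResistance(AMRDetectionSummary):
    """Subclass that alone is responsible for the Predicted Phenotype column."""

    def __init__(self, gene_to_drug):
        # Illustrative mapping from gene name to drug name.
        self._gene_to_drug = gene_to_drug

    def columns(self):
        return super().columns() + ["Predicted Phenotype"]

    def create_row(self, isolate, genes):
        row = super().create_row(isolate, genes)
        drugs = [self._gene_to_drug[g] for g in genes if g in self._gene_to_drug]
        row["Predicted Phenotype"] = ", ".join(drugs)
        return row
```

With this layout, choosing the base class when --exclude-resistance-phenotypes is given is enough to keep the column out of both summary tables.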
Map the MLST scheme to an organism name. So, we would go from senterica to Salmonella enterica.
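A simple lookup table may be enough to start with. In the sketch below only the senterica entry comes from this issue; the other entries and the function name are illustrative assumptions and would need to be checked against the scheme names the mlst program actually reports.

```python
SCHEME_TO_ORGANISM = {
    "senterica": "Salmonella enterica",  # from the issue text
    "ecoli": "Escherichia coli",         # illustrative assumption
    "saureus": "Staphylococcus aureus",  # illustrative assumption
}


def organism_for_scheme(scheme, default=None):
    """Return the organism name for an MLST scheme, or default if unknown."""
    return SCHEME_TO_ORGANISM.get(scheme, default)
```

Returning a default for unknown schemes lets the report show a blank organism rather than failing when mlst reports a scheme that is not in the table.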