
Comments (20)

apetkau commented on August 11, 2024

Hello @PHemarajata, this isn't currently supported in the Galaxy wrapper but I will look into adding this in the future. I can let you know how to do it by hand, but this requires you to run some commands to update the databases in the conda environment used by Galaxy.

from staramr.

apetkau commented on August 11, 2024

Specifically, you could activate the conda environment for staramr (conda activate [email protected]) and then run staramr db update -d to update to the latest revisions.

However, note that any newly added (or modified) genes will show up as unknown in the phenotype/drug resistance columns of the output, since the links between genes and drugs have not been defined in these cases.

suzukimicro commented on August 11, 2024

I have a different question. In the Galaxy version we can modify settings of "Percent identity threshold for BLAST" and "Percent length overlap of BLAST hit for ResFinder database". How do you configure these in the command line version?

suzukimicro commented on August 11, 2024

Sorry, I figured it out myself.
--pid-threshold for "Percent identity threshold for BLAST"
--percent-length-overlap-resfinder for "Percent length overlap of BLAST hit for ResFinder database"

Another question. Can we use the --recursive option?

apetkau commented on August 11, 2024

I'm glad you were able to figure it out yourself.

What do you mean by --recursive? Do you mean recursively scan directories for input files?

suzukimicro commented on August 11, 2024

@apetkau Thank you for your reply.

Do you mean recursively scan directories for input files?

Yes. Good to have that option when we are dealing with a lot of input files.

apetkau commented on August 11, 2024

@suzukimicro okay, I understand. There is no built-in way to recursively search for input files in a directory, but if you have the correct shell you can make use of shell wildcards to do exactly what you want. In particular, you can use the globstar option in bash.

First you need to enable globstar:

shopt -s globstar

Then, to search for all files ending in .fasta under the directory input you can do:

staramr search -o output input/**/*.fasta

Here your shell (bash) will expand input/**/*.fasta to any files ending in .fasta within the input/ directory and all sub-directories. You can pass multiple patterns as input to staramr, for example:

staramr search -o output input-1/**/*.fasta input-2/**/*.fasta
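Before launching a long run it can help to check what a pattern actually expands to. The following is a minimal sketch, assuming bash 4.0 or newer (globstar does not exist in older versions, including the bash 3.2 shipped with macOS); preview_fasta is a hypothetical helper name, not part of staramr:

```shell
# Print every *.fasta file under a directory, one per line, using globstar.
# The body runs in a subshell, so the shopt settings do not leak into your session.
preview_fasta() (
  if ! shopt -s globstar nullglob 2>/dev/null; then
    echo "globstar is not supported by this bash" >&2
    exit 1
  fi
  files=("$1"/**/*.fasta)   # expands to an empty list when nothing matches (nullglob)
  if [ "${#files[@]}" -gt 0 ]; then
    printf '%s\n' "${files[@]}"
  fi
)
```

For example, preview_fasta input | wc -l shows how many files the pattern input/**/*.fasta would hand to staramr.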

Other shells may have their own recursive wildcard expansions available. Alternatively you could likely use the find command to recursively search for fasta files and pass to staramr:

staramr search -o output `find input/ -iname '*.fasta' -printf '%p '`

This will search under the input/ directory for any files ending in .fasta and print out the names separated by a space (the -printf '%p ' part). This list of files is then passed to staramr.
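Two caveats with the find variant above: -printf is a GNU find extension that BSD find (for example on macOS) does not provide, and a space-separated list breaks on paths that contain spaces. A sketch that avoids both, assuming bash 3.2 or newer; collect_fasta and FASTA_FILES are hypothetical names chosen for illustration:

```shell
# Collect all *.fasta files (case-insensitive) under a directory into the
# global array FASTA_FILES. NUL-separated names keep paths with spaces intact.
collect_fasta() {
  FASTA_FILES=()
  local f
  while IFS= read -r -d '' f; do
    FASTA_FILES+=("$f")
  done < <(find "$1" -type f -iname '*.fasta' -print0)
}

# Example use (not run here):
#   collect_fasta input
#   staramr search -o output "${FASTA_FILES[@]}"
```

Quoting "${FASTA_FILES[@]}" passes each path as its own argument, so names with spaces reach staramr unchanged.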

I hope this helps you out.

suzukimicro commented on August 11, 2024

@apetkau Thank you for your kind help.

I was able to pass in fasta files from folders recursively, but somehow the resulting output folder contained only the hits folder and none of the other outputs, such as results.xlsx and summary.tsv. I simply ran the following command. Can you think of possible reasons?

staramr search -o output *.fasta

apetkau commented on August 11, 2024

Which version of staramr are you using (staramr --version)? And were there any error messages printed out?

suzukimicro commented on August 11, 2024

@apetkau Thank you for your reply.
I use the latest version of staramr (v0.7.1) in Terminal.app on macOS Catalina v10.15.6. With three or fewer fasta files there was no problem; however, with four or more fasta files, nothing except the hits folder was output. No errors other than 'Predicted Phenotype' appeared to be displayed. Are there any possible causes?

apetkau commented on August 11, 2024

@suzukimicro would it be possible to post the messages you do get here? I am uncertain as to what's going on.

suzukimicro commented on August 11, 2024

@apetkau Thank you for your reply.
I have found that if a fasta file that triggers the 'Predicted Phenotype' error is included in the run, none of the results are output except for the hits folder. Do you have any solutions?

suzukimicro commented on August 11, 2024

The fourth file triggered the 'Predicted Phenotype' error.

apetkau commented on August 11, 2024

@suzukimicro when you say a 'Predicted Phenotype' error do you mean this error #115? If so, then the solution is to change the version of the pandas library (version 0.25.3 worked for me). Changing the version depends on how you installed staramr. If you used conda it would be conda install pandas==0.25.3. If it was with pip it would be pip install pandas==0.25.3.

suzukimicro commented on August 11, 2024

@apetkau Thank you for your reply.
The 'Predicted Phenotype' error was avoided after updating the pandas library. However, I ran into another error, as follows.
ERROR: Index length not within range, for [rep21_24_rep(CN1_plasmid2)_NC_022227]
If a file that triggers this error is included in the run, only the hits folder is output. Do you have any solutions?

apetkau commented on August 11, 2024

That's great news @suzukimicro, getting closer to having all the files work.

For the Index length error, I suspect it could be related to Windows newlines in the file (which use two characters instead of one to represent a newline) - https://en.wikipedia.org/wiki/Newline#Issues_with_different_newline_formats

You should be able to check for this by using the file command on the file:

file file.fasta

You can convert to unix-style line endings with dos2unix:

dos2unix file.fasta

However, I don't know if these programs exist by default on a Mac. dos2unix may have to be installed via Homebrew (https://formulae.brew.sh/formula/dos2unix).
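If dos2unix is not installed, carriage returns can also be detected and removed with standard utilities. For reference, file normally adds "with CRLF line terminators" to its description of such files. A minimal sketch; has_crlf and strip_cr are hypothetical helper names:

```shell
# Succeeds (exit status 0) if the file contains at least one carriage return.
has_crlf() {
  grep -q "$(printf '\r')" "$1"
}

# Write a copy of the file with all carriage returns removed.
# The original file is left untouched.
strip_cr() {
  tr -d '\r' < "$1" > "$2"
}

# Example use (not run here):
#   has_crlf file.fasta && strip_cr file.fasta file.unix.fasta
```

Writing to a separate output file keeps the original available in case the conversion turns out not to be the cause.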

suzukimicro commented on August 11, 2024

@apetkau Thank you for your reply.
Using my MacBook Pro, I converted the fsa files in staramr/databases/data/update/plasmidfinder (just in case) and the input fasta files to Unix format with dos2unix, but I still got this error after running staramr.
ERROR: Index length not within range, for [rep21_24_rep(CN1_plasmid2)_NC_022227]
Could there be a problem with some of the input fasta files?

suzukimicro commented on August 11, 2024

The input fasta files were reported as "ASCII text" by file.

apetkau commented on August 11, 2024

@suzukimicro I'm not really sure what the issue is then. If you look at the rep21_24_rep(CN1_plasmid2)_NC_022227 entry in the fasta file, is there any sequence data associated with it? Or is the sequence data stored on lines that are longer than other lines? Are there any special characters in the sequence data? Or maybe there are issues with the ( and ) symbols in the sequence name.

Without seeing the fasta file I don't think I can offer much more in the way of answers.
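Some of these checks can be roughly automated. The sketch below is an illustration only: check_fasta is a hypothetical helper, and the set of allowed sequence characters (letters, * and -) is an assumption. It reports records with no sequence data and sequence lines containing any other character:

```shell
# Report FASTA records that have no sequence data, and sequence lines that
# contain characters other than letters, '*' or '-' (spaces and carriage
# returns are flagged too).
check_fasta() {
  awk '
    # Print the previous record if it had a header but no sequence.
    function report() {
      if (name != "" && len == 0) print FILENAME ": empty record: " name
    }
    /^>/ { report(); name = substr($0, 2); len = 0; next }
    {
      if ($0 ~ /[^A-Za-z*-]/)
        print FILENAME ": line " FNR ": unexpected character in " name
      len += length($0)
    }
    END { report() }
  ' "$1"
}
```

Running check_fasta file.fasta prints nothing when neither condition is found.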

suzukimicro commented on August 11, 2024

@apetkau Thank you for your reply.
I will try to find out which fasta file(s) caused the error. About 12,000 fasta files were being analyzed at the time. When I analyzed them with staramr in Galaxy, the run completed without problems.
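One way to narrow this down is to run the files one at a time and note which ones fail. A rough sketch, assuming bash; find_failing and run_one are hypothetical helpers, and it is an assumption that a failing staramr run either exits with a non-zero status or prints a line containing ERROR:

```shell
# Run a command once per file and print the files for which it either
# exits non-zero or prints a line containing "ERROR".
# Usage: find_failing <command> <file>...
find_failing() {
  local cmd=$1 f out
  shift
  for f in "$@"; do
    if ! out=$("$cmd" "$f" 2>&1) || printf '%s\n' "$out" | grep -q 'ERROR'; then
      printf '%s\n' "$f"
    fi
  done
}

# Hypothetical wrapper: run staramr on a single file, writing to a fresh
# output path inside a temporary directory.
run_one() {
  local tmp
  tmp=$(mktemp -d) || return 1
  staramr search -o "$tmp/out" "$1"
}

# Example use (not run here):
#   find_failing run_one input/*.fasta > failing-files.txt
```

With many thousands of files this is slow, so running it on batches first and only then on single files may be more practical.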
