
Comments (6)

epruesse commented on July 21, 2024

When I do this it seems to progress as expected up until alignment of the 16th sequence, at which point it aborts with the following error message:

I'm assuming you meant 16th file. The loop you posted should not affect SINA in any way at all. Does that file fail if run directly? Or really only in the loop?

Time for alignment phase: 41.081814s
Terminating PT server…
ARB_PT_SERVER: received shutdown message

That's not an error; it should always be the last bit of output. You can in theory start an ARB PT server on your own, and point SINA to the server using "--pt-port" (and "--search-db-port"), to save on startup time with small files. If you don't, SINA will start one itself and terminate it once SINA is finished. That's the output you are seeing: SINA saying "Terminating PT server" and the PT server then saying "received shutdown message".

If the file is empty, that may just mean that nothing in there was sufficiently similar to 16S to even have an alignment. Try without the classifier; that should get you more results.

I tried using --search-all within the loop and that worked fine, but was too slow. I’d like to run the loop with the PT server, so any suggestions would be much appreciated!

Yes, --search-all is more of a debug feature. SINA uses a k-mer heuristic to find the most similar sequences (top 1000 by default) and then uses the alignment to compute a score on those. With --search-all it checks each input sequence against each reference sequence, which with a big database indeed takes forever. It doesn't gain you much either. You can test this by changing the number of candidates the heuristic returns and watching the results (not) change: --search-kmer-candidates 10000 shouldn't get you much beyond the default, and --search-kmer-candidates 100 should give only a minor performance benefit.

from sina.

larusnz commented on July 21, 2024

No, I was meaning the 16th sequence in the first file (which is why it seemed very strange). I tried running the file directly and it worked fine, it only aborts early when in the loop.

I'll try a few other things and see if I can resolve what is going on. Thanks.


epruesse commented on July 21, 2024

Ok. Please close this if you figure out what went wrong. It does sound to me like SINA terminated normally after 16 sequences. Perhaps the command line wasn't exactly the same (a forgotten \ at the end of a line in your script, or something similar).


larusnz commented on July 21, 2024

Yes, I see now, you are correct: the loop ran file 10 (containing 16 sequences) before file 1 (containing 600 sequences). However, the PT server terminated at the end of running the first file, so the loop failed. Is there any way to run multiple files without the server terminating?


epruesse commented on July 21, 2024

However, the PT Server terminated at the end of running the first file, so the loop failed.

No. The PT server terminated, as did SINA, because they were finished. That was not an error. Put echo "SINA exited with code $?" into your loop, directly after the SINA command, to have bash print the exit code; it should be 0.
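To make the exit-code check concrete, here is a minimal sketch of such a loop. The helper run_and_report and the placeholder align_one are hypothetical names, not part of SINA, and the file names are made up; align_one only echoes, so swap in your real SINA command line there.

```shell
# run_and_report (hypothetical helper): runs the given command on one
# input file and prints the exit code the shell stored in $?.
run_and_report() {
    file=$1
    shift
    "$@" "$file"
    status=$?   # capture right away; any later command overwrites $?
    echo "SINA exited with code $status for $file"
    return "$status"
}

# Placeholder standing in for the real SINA invocation (assumption):
# replace the body with your actual command line.
align_one() {
    echo "aligning $1"
}

# Example file names are made up for illustration.
for f in sample1.fasta sample2.fasta; do
    run_and_report "$f" align_one
done
```

If every iteration reports code 0, each run finished normally, and the "Terminating PT server" lines are just the regular end of that run.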

Is there any way to run multiple files without the Server terminating?

Quoting myself from above:

That's not an error; it should always be the last bit of output. You can in theory start an ARB PT server on your own, and point SINA to the server using "--pt-port" (and "--search-db-port"), to save on startup time with small files. If you don't, SINA will start one itself and terminate it once SINA is finished. That's the output you are seeing: SINA saying "Terminating PT server" and the PT server then saying "received shutdown message".

However, it's not really necessary.

Here's a script for running many instances of SINA in parallel on a single large file: https://github.com/epruesse/SINA/blob/master/src/psina

Works just fine for me. If you use that, watch out for memory: no more than one thread per 16 GB if you use e.g. the SILVA DB, as the PT server is quite memory hungry.
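As a rough illustration of that memory rule of thumb, the sketch below derives a job count from the available RAM. The function name max_jobs is hypothetical, the 16 GB default is just the figure mentioned above, and falling back to a single job on smaller machines is my assumption rather than anything SINA enforces.

```shell
# max_jobs (hypothetical helper): number of parallel instances that fit
# into MEM_GB of RAM, assuming each one needs PER_JOB_GB (default 16).
max_jobs() {
    mem_gb=$1
    per_job_gb=${2:-16}
    n=$((mem_gb / per_job_gb))   # integer division rounds down
    if [ "$n" -lt 1 ]; then
        n=1                      # assumption: always allow one job
    fi
    echo "$n"
}

max_jobs 64   # a 64 GB machine at 16 GB per job
```

Pass the result as the number of parallel jobs to whatever wrapper you use; a second argument overrides the per-job estimate if your reference database is smaller or larger.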


larusnz commented on July 21, 2024

Thanks!

