jeffkimbrel commented on August 20, 2024

I see this is similar to #11. I'll close this and wait to see if test_suite.py gets fixed in the future.

from drep.

MrOlm commented on August 20, 2024

Hi Jeff,

Sorry about the inconvenience. All that's needed is an empty folder called test_backend inside the tests folder.

You can either make it yourself, or download the new version (1.1.2) which will include it.

Best,
-Matt
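
For anyone who wants to script the workaround above, here is a minimal sketch in Python. The helper name `ensure_test_backend` is hypothetical, and the path passed to it is an assumption: point it at wherever the dRep tests folder lives on your machine.

```python
from pathlib import Path


def ensure_test_backend(tests_dir):
    """Create an empty test_backend folder inside tests_dir if it is missing.

    tests_dir is assumed to be the folder that contains test_suite.py.
    """
    backend = Path(tests_dir) / "test_backend"
    # exist_ok=True makes the call safe to repeat
    backend.mkdir(parents=True, exist_ok=True)
    return backend


# Example (the path is an assumption, adjust it to your checkout):
# ensure_test_backend("tests")
```

Upgrading to a version that ships the folder, as suggested above, avoids the need for this step.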

jeffkimbrel commented on August 20, 2024

Thanks, that worked to get it started. I did get an error, however:

Traceback (most recent call last):
  File "test_suite.py", line 633, in <module>
    test_short()
  File "test_suite.py", line 617, in test_short
    cluster_test()
  File "test_suite.py", line 581, in cluster_test
    verifyCluster.run()
  File "test_suite.py", line 328, in run
    self.skipsecondary_test()
  File "test_suite.py", line 367, in skipsecondary_test
    assert compare_dfs(db1, db2), "{0} is not the same!".format('Mdb')
AssertionError: Mdb is not the same!

It isn't clear to me at which calculation this failed, but it was step4 after functional test 1 passed.

Let me know if I should open this up in a new issue.

Jeff

MrOlm commented on August 20, 2024

jeffkimbrel commented on August 20, 2024

Yep, it looks like there are dependencies missing, but the missing ones are listed as optional. I suppose that the test python script uses these dependencies?

$ dRep bonus test --check_dependencies
Loading work directory
Checking dependencies
mash.................................... all good        (location = /usr/local/bin/mash)
nucmer.................................. all good        (location = .../scripts/MUMmer3.23/nucmer)
checkm.................................. !!! ERROR !!!   (location = None)
ANIcalculator........................... !!! ERROR !!!   (location = None)
prodigal................................ all good        (location = .../scripts/prodigal)
centrifuge.............................. !!! ERROR !!!
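
A rough way to reproduce this kind of check outside dRep is to look each program up on PATH. The sketch below is not dRep's own implementation; the function names are hypothetical, the program names are simply copied from the output above, and the report layout only imitates it.

```python
import shutil

# Program names copied from the output above; which of them dRep
# treats as optional is not asserted here.
PROGRAMS = ["mash", "nucmer", "checkm", "ANIcalculator", "prodigal", "centrifuge"]


def find_programs(names):
    """Map each program name to its location on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


def report(locations):
    """Format one status line per program, loosely imitating the output above."""
    lines = []
    for name, loc in locations.items():
        status = "all good" if loc else "!!! ERROR !!!"
        # Pad the name with dots to a fixed width of 40 characters
        lines.append("{0} {1} (location = {2})".format(name.ljust(40, "."), status, loc))
    return "\n".join(lines)


# Example: print(report(find_programs(PROGRAMS)))
```

A program that is installed but not on PATH (for instance one that is only reachable inside a different conda environment) will show up as missing here too.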

MrOlm commented on August 20, 2024

Hmm, yes, some of those dependencies are used by the test suite, but with the ones you have working I wouldn't really expect it to crash there.

Is the program working when you use it on your own data?

jeffkimbrel commented on August 20, 2024

It seems like I keep running into problems with checkM not being usable. I am trying to run all of this on my laptop, and therefore don't have a working checkM because it requires >16 GB of memory. I do actually have it installed, but don't have all of the data files downloaded.

Running on my own files...

$ dRep dereplicate_wf dRep_test -g bins/*.fasta --skipCheckM

When I use the --skipCheckM flag I can get through the Filtering and Clustering steps, but it fails on the Choose step after it also attempts to run checkM (disregarding the flag).

Also, I manage Python environments using Anaconda rather than pyenv. My default Python is 3.4.5. I wonder if that is also messing something up... I think checkM is the only thing requiring Python 2.X, correct?

Thanks for your help.

MrOlm commented on August 20, 2024

jeffkimbrel commented on August 20, 2024

I have figured out how to run checkM on NERSC... is it possible to use an "external" checkM results dataset with dRep?

My goals are pretty much what the advertised purpose is. I have tons of metagenomes that would be too computationally expensive to combine and co-assemble. So I want to take bins from either single metagenomes, or co-assembled replicates, and "merge" the bins.

MrOlm commented on August 20, 2024

Yes, there is a way to use "external" checkM results.

When using the dereplicate_wf (probably what you want), there's an option --Chdb which can be used to provide external checkM results.

They need to be in the --tab_table format, though. The checkM command to generate this is:

checkm qa --tab_table -o 2

An example of how it should look is attached.

Finally, make sure that for the "Bin Id" column, you have the name of the genome WITH the file extension, and WITHOUT the path to the genome (as is the case in the example sheet provided).

Best,
-Matt

Chdb.csv.zip
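
The "Bin Id" requirement described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the `.fasta` extension is taken from the `bins/*.fasta` command earlier in the thread, the tab delimiter and the `Completeness` column in the usage example are placeholders, and the attached example sheet remains the authority on what the final file should look like.

```python
import csv
import io
import os


def normalize_bin_id(bin_id, extension=".fasta"):
    """Drop any directory part and make sure the name ends with the extension."""
    name = os.path.basename(bin_id)
    if not name.endswith(extension):
        name += extension
    return name


def normalize_table(text, delimiter="\t", extension=".fasta"):
    """Rewrite the "Bin Id" column of a delimited table given as a string.

    The delimiter and extension defaults are assumptions; other columns
    are passed through unchanged.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["Bin Id"] = normalize_bin_id(row["Bin Id"], extension)
        writer.writerow(row)
    return out.getvalue()
```

For example, `normalize_table("Bin Id\tCompleteness\n/data/bins/bin1\t98.5\n")` rewrites the first column to `bin1.fasta` and leaves the second column alone.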
