Comments (9)

shadowwalkersb avatar shadowwalkersb commented on August 27, 2024

Please see the list used to build eman-deps: https://github.com/cryoem/eman-deps-feedstock/blob/0a4718059822ff88bd5778249f4bdb7555e5175f/recipe/meta.yaml#L11-L41. This link is now available in the wiki instructions for source builds.

from eman2.

samfux84 avatar samfux84 commented on August 27, 2024

Thank you, this is exactly what I was looking for.

sludtke42 avatar sludtke42 commented on August 27, 2024

samfux84 avatar samfux84 commented on August 27, 2024

@sludtke42: Thank you for your reply.

I fully understand that Anaconda simplifies software installation for many users. But I still think it is important to provide the list of dependencies for people who do not use Anaconda. There is no need to provide installation instructions for any of the dependencies, but without knowing what a piece of software depends on, it can be quite difficult to install it.

I work as an application specialist managing more than 300 applications and libraries on the HPC cluster of our university (we use the SPACK package manager, https://spack.readthedocs.io/, for installations).

I prefer to not use Anaconda as a simple installation can easily contain 100'000 small files, which is not optimal for high-performance file systems that are optimized for big files.

I would like to add some comments regarding some of your points A)-G)

A) Our users have a quota of 100'000 files/directories for their home directory (because there is a nightly backup running for ca. 2500 users, and before we introduced the quota, the backup was not finishing overnight because some users had several million files in their home). An Anaconda (or even a Miniconda) installation easily exceeds the quota and then the users complain that they can no longer write new files to their home directory.

Therefore this is not really helping the cluster users as long as Conda installations contain that many files.
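
For readers who want to check an installation against a files/directories quota like the one described above, the count can be estimated with a short script. This is a minimal sketch and not part of the original comment: the function names `count_entries` and `fits_quota` are hypothetical, and how a given cluster actually counts entries toward its quota is an assumption that may differ from site to site.

```python
import os


def count_entries(root):
    """Count files and directories under root, including root itself.

    Symbolic links to directories are counted as one entry and are not
    followed, so nothing outside the tree is counted.
    """
    total = 1  # the root directory itself
    for _dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        total += len(dirnames) + len(filenames)
    return total


def fits_quota(root, quota=100_000):
    """Return True if the tree under root stays within the given quota.

    The default of 100,000 mirrors the home-directory quota mentioned in
    the comment; adjust it for other systems.
    """
    return count_entries(root) <= quota
```

Calling `count_entries` on the directory of a Conda installation (for example a Miniconda directory in the home directory, wherever it was installed) gives a rough idea of how much of the quota it would consume.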

B) We already provide a large number of centrally installed packages, so there is no need for users to do redundant installations in Anaconda:

https://scicomp.ethz.ch/wiki/Leonhard_applications_and_libraries
https://scicomp.ethz.ch/wiki/Euler_applications_and_libraries

E) For this purpose, we use SPACK, https://spack.readthedocs.io

G) On an HPC cluster, install size is not too important, as storage space has become very cheap. What matters on HPC file systems is the number of files, as random I/O with a large number of files kills the performance of every HPC file system.

sludtke42 avatar sludtke42 commented on August 27, 2024

samfux84 avatar samfux84 commented on August 27, 2024

@sludtke42: Thank you for your reply and for taking the time for this discussion.

Please do not get me wrong, I don't want you to change anything with regard to EMAN2 (except for publishing the list of dependencies on the wiki, which has since been done).

"Not my business to get into your institution's cluster management policies, but I'll say that in many fields, like genomics/bioinformatics, a large number of small files is simply how everything is configured to operate. There are cluster configuration strategies and policies that can deal with this sort of thing, but everyone's user base is different."

I think there is a misunderstanding, and I am sorry for not being clearer about this. The 100'000 files/directories quota only applies to home directories. For guest users on our cluster, this is the only permanent storage they have. For research groups that invested in our HPC cluster, there are other file systems (NetApp, Lustre) where they can have hundreds of terabytes of data and millions of inodes.

My point with regard to the quota is just that on many clusters, guest users have limits, and if the software they would like to use is not installed centrally, they can hardly install anything in their home directory that requires a Conda installation for the dependencies. For everybody else I don't see any problem.

Best regards

Sam

sludtke42 avatar sludtke42 commented on August 27, 2024

Icecream-blue-sky avatar Icecream-blue-sky commented on August 27, 2024

"Please see the list used to build eman-deps: https://github.com/cryoem/eman-deps-feedstock/blob/0a4718059822ff88bd5778249f4bdb7555e5175f/recipe/meta.yaml#L11-L41. This link is now available in the wiki instructions for source builds."
Hi, shadowwalkersb!
Are these dependencies complete? I followed https://blake.bcm.edu/emanwiki/EMAN2/COMPILE_EMAN2_ANACONDA and ran conda create -n eman2 eman-deps-dev -c cryoem -c defaults -c conda-forge to install the EMAN2 dependencies on Windows (just to find out what the dependencies are, so that I can then install them manually on Linux). The dependencies I got are different from the list you provided (https://github.com/cryoem/eman-deps-feedstock/blob/0a4718059822ff88bd5778249f4bdb7555e5175f/recipe/meta.yaml#L11-L41). Why is that?
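
One way to see exactly where two dependency lists differ is to reduce both to bare package names and compare them as sets. The sketch below is illustrative only and was not posted in the thread: the helper names `package_names` and `compare_lists` are hypothetical, and it assumes one list is copied from the requirement lines of the recipe (`- name version`) and the other comes from `conda list --export` (`name=version=build`). An installed environment normally also contains the dependencies of the dependencies, which is one possible reason for seeing more packages than the recipe lists; platform-specific requirements are another.

```python
import re


def package_names(lines):
    """Extract lowercase package names from requirement-style lines.

    Accepts recipe entries such as "- numpy 1.13.*" and export entries
    such as "numpy=1.13.3=py27_0"; blank lines and "#" comments are skipped.
    """
    names = set()
    for line in lines:
        line = line.strip().lstrip("-").strip()
        if not line or line.startswith("#"):
            continue
        # The name is everything before the first space or version operator.
        name = re.split(r"[\s=<>!]", line, maxsplit=1)[0]
        names.add(name.lower())
    return names


def compare_lists(recipe_lines, installed_lines):
    """Return (only_in_recipe, only_in_installed) as sorted lists of names."""
    recipe = package_names(recipe_lines)
    installed = package_names(installed_lines)
    return sorted(recipe - installed), sorted(installed - recipe)
```

The first returned list shows packages named in the recipe that are missing from the environment; the second shows packages that were pulled in without being listed explicitly.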

shadowwalkersb avatar shadowwalkersb commented on August 27, 2024

The link on the Wiki is the correct one.
