Comments (12)

KatharinaHoff commented on August 13, 2024

Dear @exnx, thank you so much for trying to fix it. Sadly, the pip install of git-lfs did not fix the problem. Sorry, my bad, I had not tested my first suggestion. I have now figured out how to fix it. It is possibly not the most elegant way, but if you append the following to your Dockerfile, then both the Docker and the Singularity builds contain git lfs and models can be loaded from Huggingface:

# I have not tested the last two lines (cd .. and rm), but I think it makes sense to delete the archive; my build still has it.
RUN wget https://github.com/git-lfs/git-lfs/releases/download/v3.4.0/git-lfs-linux-amd64-v3.4.0.tar.gz && \
    tar -xvf git-lfs-linux-amd64-v3.4.0.tar.gz && \
    cd git-lfs-3.4.0 && \
    ./install.sh && \
    cd .. && \
    rm git-lfs-linux-amd64-v3.4.0.tar.gz

I have tested this solution with both Docker and Singularity. git lfs works.

Maybe you also want to add the Singularity instructions (adapted from my initial post here) to the Readme.md? Just an idea to save other people some time. I tested it with singularity-ce version 3.11.3, and everything works well.
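
One quick way to confirm the result is to look for the git-lfs binary from inside the container. The following is a minimal sketch and not part of the Dockerfile above; the function name have_git_lfs is hypothetical, and it only assumes that the installer places a git-lfs executable somewhere on PATH.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a git-lfs executable is on PATH.
have_git_lfs() {
    command -v git-lfs >/dev/null 2>&1
}

if have_git_lfs; then
    echo "git-lfs found: $(command -v git-lfs)"
else
    echo "git-lfs not found on PATH"
fi
```

Saved as a script (the name is up to you), this could be run through docker run or singularity exec against the freshly built image.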

from hyena-dna.

salvatoreloguercio commented on August 13, 2024

No worries, thank you for all the help. I think I found the culprit: my cloned hyenaDNA folder was on a mounted drive (/mnt/ etc.) that for some reason wasn't accessible from the container image. I have now moved everything to my home folder and it seems to work.
If I have further questions I will reach out on Discord. Thanks again!
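
For anyone who hits the same problem and cannot move the folder, a bind mount is one way to make such a directory visible inside the container. This is a sketch under assumptions: the /mnt path is hypothetical, the helper build_exec_cmd is made up for illustration, and it only prints the command (using Singularity's --bind and --pwd options) so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: print a singularity command that bind-mounts the
# cloned repo into the container and starts in that directory.
# $1 = path to the .sif image, $2 = path to the cloned hyena-dna folder
build_exec_cmd() {
    image="$1"
    repo_dir="$2"
    printf 'singularity exec --nv --bind %s:%s --pwd %s %s python -m train wandb=null experiment=hg38/genomic_benchmark_scratch\n' \
        "$repo_dir" "$repo_dir" "$repo_dir" "$image"
}

# The repo path below is an assumption, not taken from the thread.
build_exec_cmd "$HOME/images/hyena-dna.sif" /mnt/data/hyena-dna
```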

exnx commented on August 13, 2024

Hi @KatharinaHoff, thanks so much! I forgot this dependency, which I believe is just for loading the pretrained weights from Huggingface. Good catch :) I've made the change and uploaded a new Docker image (reflected in the readme). You can now pull with (I removed the 'public' name):

docker pull hyenadna/hyena-dna:latest

Enjoy!

exnx commented on August 13, 2024

Thanks for the update! I haven't been able to test it out myself, but I'll report back when I do.

exnx commented on August 13, 2024

I ended up making a second Docker image with the Nucleotide Transformer datasets and the weights to reproduce the results from our paper. This new image includes the correct git-lfs dependency for pulling in weights from Huggingface. You can find the image here:

# pull image
docker pull hyenadna/hyena-dna-nt6:latest 

# run container
docker run --gpus all -it -p80:3000 hyenadna/hyena-dna-nt6 /bin/bash

To build the image, I used tips from this thread, which basically just means adding:

RUN curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
RUN sudo apt-get install git-lfs

Eventually I'll add this to the main Dockerfile in the repo, but for now there are 2 Docker images.

salvatoreloguercio commented on August 13, 2024

Hi @KatharinaHoff and @exnx , thanks a lot for posting directions on how to generate and use a Singularity image of HyenaDNA! I tried as Katharina suggested:

singularity build hyena-dna.sif docker://hyenadna/hyena-dna-public:latest
git clone https://github.com/HazyResearch/hyena-dna.git
cd hyena-dna
SINGULARITYENV_CUDA_VISIBLE_DEVICES=1 singularity exec --nv ~/images/hyena-dna.sif python -m train wandb=null experiment=hg38/genomic_benchmark_scratch

But I am getting: /opt/conda/bin/python: No module named train
Is anything missing on my side?
Thanks!
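
Because python -m train searches the current directory (along with the rest of the module search path), one thing worth checking is whether the training module is actually visible from where the command runs. A minimal sketch, assuming train is either a train.py file or a train/ package at the top of the cloned repo; the helper name train_module_visible is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given directory (default: current
# directory) contains train.py or a train/ package.
train_module_visible() {
    dir="${1:-.}"
    [ -f "$dir/train.py" ] || [ -f "$dir/train/__init__.py" ]
}

if train_module_visible; then
    echo "train module found in $(pwd)"
else
    echo "no train module in $(pwd); check that the repo is visible here"
fi
```

Running the same check through singularity exec shows whether the container sees the same files as the host.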

exnx commented on August 13, 2024

The steps above by Katharina didn't work for me. Instead I used a different set of commands, which you can find in the readme of the image below, on Docker Hub. I'm not especially familiar with Singularity, but there's a different step when starting the image that tunnels your local directory to the container; otherwise you're just getting the environment, but not any of the code.

hyenadna/hyena-dna-nt6:latest

salvatoreloguercio commented on August 13, 2024

exnx commented on August 13, 2024

The nt6 image should be using commands from here. I forget whether nt7 did too; I might've been testing other things.

But specifically, the commands you want in the Dockerfile are:

RUN curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | bash
RUN apt-get install -y git-lfs

I updated the Dockerfile in this repo; it now includes these commands if you build your own image.

salvatoreloguercio commented on August 13, 2024

Thanks! I re-ran with hyena-dna-nt6 and got:

singularity build hyena-dna_nt6.sif docker://hyenadna/hyena-dna-nt6:latest
cd hyena-dna
singularity exec --nv hyena-dna_nt6.sif python -m train wandb=null experiment=hg38/genomic_benchmark_scratch
13:4: not a valid test operator: (
13:4: not a valid test operator: 510.47.03
/usr/bin/python: No module named train

This strange 'not a valid test operator' message is the same one I was getting with the sif image of hyena-dna-nt7, actually.

I wish I could use Docker, but I am stuck with Singularity on the HPC A100 nodes I have available.

exnx commented on August 13, 2024

Maybe try ChatGPT? I.e., ask how to turn the provided docker command into an equivalent singularity command. Sorry, we don't support Singularity on our end; we just don't use it.

exnx commented on August 13, 2024

For example:

apptainer pull docker://hyenadna/hyena-dna-nt7:latest
apptainer exec --nv docker://hyenadna/hyena-dna-nt7:latest /bin/bash
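
As a side note, the pull step normally leaves a local image file that exec can use directly instead of the docker:// reference. The sketch below assumes Apptainer's default output naming of name_tag.sif for docker:// pulls; the helper default_sif_name is hypothetical and only derives that file name and prints the resulting command.

```shell
#!/bin/sh
# Hypothetical helper: derive the file name that `apptainer pull` is
# assumed to create by default for a docker:// reference.
default_sif_name() {
    ref="${1#docker://}"      # drop the URI scheme
    name="${ref##*/}"         # keep the part after the last slash
    case "$name" in
        *:*) tag="${name##*:}"; name="${name%%:*}" ;;
        *)   tag="latest" ;;  # no tag given, so assume latest
    esac
    printf '%s_%s.sif\n' "$name" "$tag"
}

sif=$(default_sif_name docker://hyenadna/hyena-dna-nt7:latest)
echo "apptainer exec --nv $sif /bin/bash"
```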
