
Comments (8)

v-iashin commented on July 19, 2024

Hi, @xanthan011

Thanks for submitting the issue.

The Google Colab environment is not supported. I decided to check it out because I have seen this error before in #21, which was troublesome to debug without an MWE which, admittedly, you provided. However, again, there is nothing wrong with this code; it is a problem with the environment you are trying to use.

Anyway, the main problem is the lack of disk space on Google Colab. When you run

CAPTION.build_vocab(dataset.caption, min_freq=cfg.min_freq_caps, vectors=cfg.word_emb_caps)

it unpacks the pre-trained GloVe zip, which expands dramatically in size, and torchtext re-saves it (2.1G -> 5.3G + 2.6G). Since there is no disk space left, only a part of the file is saved to disk. Google Colab shows no error and continues execution.
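You can see this happening by checking the free disk space right before and after that call. A minimal sketch (assuming the default Colab layout where everything lives under /content):

import shutil

# the unpacked GloVe (5.3G) plus torchtext's re-saved copy (2.6G)
# land on top of the 2.1G zip, so you need roughly 10G of headroom
total, used, free = shutil.disk_usage('/content')
print(f'free: {free / 2**30:.1f} GiB of {total / 2**30:.1f} GiB')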

To fix it, you need to free up some disk space. Here is what I did:

  1. I removed the second captioning checkpoint (best_cap_model.pt.1), which you are mistakenly downloading a second time.
  2. I removed the extracted features, as you don't need them for the single-video example. These features are downloaded and unpacked in

    BMT/download_data.sh

    Lines 1 to 62 in d45ad8f

    # checking if wget is installed on a computer
    if ! command -v wget &> /dev/null
    then
        echo "wget: command not found"
        echo ""
        echo "wget command could not be found on your computer. Please, install it first."
        echo "If you cannot/don't want to install wget, you may try to download the features manually."
        echo "You may find the links and correct paths in this file."
        echo "Make sure to check the md5 sums after manual download:"
        echo "./data/i3d_25fps_stack64step64_2stream_npy.zip d7266e440f8c616acbc0d8aaa4a336dc"
        echo "./data/vggish_npy.zip 9a654ad785e801aceb70af2a5e1cffbe"
        echo "./.vector_cache/glove.840B.300d.zip 2ffafcc9f9ae46fc8c95f32372976137"
        exit
    fi

    echo "Downloading i3d features"
    cd data/
    wget https://a3s.fi/swift/v1/AUTH_a235c0f452d648828f745589cde1219a/bmt/i3d_25fps_stack64step64_2stream_npy.zip -q --show-progress
    echo "Downloading vggish features"
    wget https://a3s.fi/swift/v1/AUTH_a235c0f452d648828f745589cde1219a/bmt/vggish_npy.zip -q --show-progress
    cd ../

    echo "Downloading GloVe embeddings"
    mkdir .vector_cache
    cd .vector_cache
    wget https://a3s.fi/swift/v1/AUTH_a235c0f452d648828f745589cde1219a/bmt/glove.840B.300d.zip -q --show-progress
    cd ../

    echo "Checking for correctness of the downloaded files"
    i3d_md5=($(md5sum ./data/i3d_25fps_stack64step64_2stream_npy.zip))
    if [ "$i3d_md5" == "d7266e440f8c616acbc0d8aaa4a336dc" ]; then
        echo "OK: i3d features"
    else
        echo "ERROR: .zip file with i3d features is corrupted"
        exit 1
    fi
    vggish_md5=($(md5sum ./data/vggish_npy.zip))
    if [ "$vggish_md5" == "9a654ad785e801aceb70af2a5e1cffbe" ]; then
        echo "OK: vggish features"
    else
        echo "ERROR: .zip file with vggish features is corrupted"
        exit 1
    fi
    glove_md5=($(md5sum ./.vector_cache/glove.840B.300d.zip))
    if [ "$glove_md5" == "2ffafcc9f9ae46fc8c95f32372976137" ]; then
        echo "OK: glove embeddings"
    else
        echo "ERROR: .zip file with glove embeddings is corrupted"
        exit 1
    fi

    echo "Unpacking i3d (~1 min)"
    cd ./data
    unzip -q i3d_25fps_stack64step64_2stream_npy.zip
    echo "Unpacking vggish features"
    unzip -q vggish_npy.zip
    echo "Done"
%%bash
rm -r /content/BMT/data/i3d_25fps_stack64step64_2stream_npy
  3. Run conda clean --all after installing the conda environments: conda caches the package tarballs after installation, and these usually take up a lot of space. After doing so, step 2 should be optional. A consolidated sketch of all three steps follows below.
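For reference, here is the whole cleanup as one cell; a minimal sketch, assuming the duplicate checkpoint was downloaded to ./sample (adjust the paths to your notebook):

import os, shutil, subprocess

# 1. drop the duplicate captioning checkpoint (path is an assumption, see above)
dup_ckpt = '/content/BMT/sample/best_cap_model.pt.1'
if os.path.exists(dup_ckpt):
    os.remove(dup_ckpt)

# 2. drop the extracted features -- not needed for the single-video example
shutil.rmtree('/content/BMT/data/i3d_25fps_stack64step64_2stream_npy', ignore_errors=True)

# 3. clear conda's cached package tarballs
subprocess.run(['conda', 'clean', '--all', '--yes'], check=True)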

There is another error in your code:

conda install -c conda-forge spacy
python -m spacy download en

Likely, it does not do what you expect it to do. python here is aliased to Colab's internal default Python interpreter, not conda's, so you are installing the language model there. Plus, spacy is already installed in the bmt environment – you only need to install the language model. You can do it as follows instead:

%%bash
source activate bmt
/usr/local/envs/bmt/bin/python -m spacy download en
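To double-check that the model landed in the right environment and not in Colab's default one, here is a quick sanity check (spacy.load('en') resolves the shortcut that the download command creates):

import subprocess

# if this prints 'en model OK', the bmt interpreter can see the language model
subprocess.run(
    ['/usr/local/envs/bmt/bin/python', '-c',
     "import spacy; spacy.load('en'); print('en model OK')"],
    check=True,
)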

I managed to run the notebook after these fixes on ./sample/women_long_jump.mp4.

I am planning to add a Google Colab notebook to the repo which will support the single-video example. If you want to author a PR, please create a .ipynb notebook and I will merge it. If you decide to do so, please clean and reorganize your current version a bit.


I'm trying to replicate the instructions mentioned in this article
Just to be clear, many improvisations had to be done after following the article to run this repository, but I'm stuck at the very last step and I hope you can help me.

Just to be clear, I didn't write the article you needed to improvise upon; I wrote the paper and released the code that the article uses. By the way, I checked your Colab file and there is no evidence of improvisation. It just copies stuff from the README.md, which I provided already, and tries to adapt it to the Google Colab environment.

I EVEN have read your readme file

This is cute!

You need to understand that this code was not designed to run in Google Colab, yet you are still trying to run it there and asking for my help. I am not obliged to help you here, or even to release this repo to the public. Please be respectful not only in how you write but also of others' time. You did not manage to debug your work properly, and you are now asking me to do it, which shows exactly the opposite.

xanthan011 commented on July 19, 2024

Hello Vladimir,

Firstly, thank you for the response.

The Google Colab environment is not supported. I decided to check it out because I have seen this error before in #21, which was troublesome to debug without an MWE which, admittedly, you provided. However, again, there is nothing wrong with this code; it is a problem with the environment you are trying to use.

Yes, I knew this before I created the Colab file, but CUDA isn't supported on my local machine, so I had no choice.

it unpacks the pre-trained GloVe zip, which expands dramatically in size, and torchtext re-saves it (2.1G -> 5.3G + 2.6G). Since there is no disk space left, only a part of the file is saved to disk. Google Colab shows no error and continues execution.

I'm wondering if having Colab Pro would help in this case. Please share your thoughts.

So, even I had a 60% guess that this might be the issue, as Colab did throw a warning saying that the disk was full, and I wasn't able to locate the .vector_cache folder.

I removed the second captioning checkpoint (best_cap_model.pt.1), which you are mistakenly downloading a second time.

Actually, this wasn't a mistake. Somehow the best_cap_model.pt file was getting corrupted when I ran the last cell; it threw an error like EOF expected or corrupt file. I don't know why, but I thought the best solution was to replace it by downloading a new one.

Just to be clear, I didn't write the article you needed to improvise upon; I wrote the paper and released the code that the article uses. By the way, I checked your Colab file and there is no evidence of improvisation. It just copies stuff from the README.md, which I provided already, and tries to adapt it to the Google Colab environment.

I totally understand. I saw the author of the article beforehand. The only reason I mentioned the article was so that you might have an idea why the structure of the notebook is different from your README; by improvisation, I meant the structure of the code and a few minor bits here and there. Nothing new that would contribute.

You need to understand that this code was not designed to run in Google Colab, yet you are still trying to run it there and asking for my help. I am not obliged to help you here, or even to release this repo to the public. Please be respectful not only in how you write but also of others' time. You did not manage to debug your work properly, and you are now asking me to do it, which shows exactly the opposite.

Firstly, I have the utmost respect for the time of you and every developer out there who provides the code for their papers. I never intended to waste your time, as I already knew that there was probably nothing wrong with the code, but I was really stuck, so I thought it was worth mentioning since only one similar issue, #21, was there. Secondly, I mentioned that I read your README so that you would know I had gone through your explanation of the code and tried everything else present there.

Thirdly, somehow GitHub exaggerated my line; it was supposed to be this:

I even have read your readme file

but somehow it capitalized the word to make it look dramatic. 😂 But anyway, I have respect for your time and I appreciate that you took some time to go through my problem.

I hope there is now no misunderstanding. I understand your duty only extends to solving errors under the specifications and conditions you provide in which the code will work, and as a developer myself, I respect that.

I am planning to add a Google Colab notebook to the repo which will support the single-video example. If you want to author a PR, please create a .ipynb notebook and I will merge it. If you decide to do so, please clean and reorganize your current version a bit.

If I'm able to make this notebook run successfully, then sure, I would love to contribute back; after all, the effort is all yours.

v-iashin commented on July 19, 2024

somehow it capitalized the word

I capitalized the word to point out what bothered me there.

xanthan011 commented on July 19, 2024

Hello Vladimir, I was able to run the Colab file, so thank you for the suggestion.

it unpacks the pre-trained GloVe zip, which expands dramatically in size, and torchtext re-saves it (2.1G -> 5.3G + 2.6G)

I wanted to ask whether this 5.3G + 2.6G addition to the disk space will happen every time we input a different video, or is it a one-time thing?

I am planning to add a Google Colab notebook to the repo which will support the single-video example. If you want to author a PR, please create a .ipynb notebook and I will merge it. If you decide to do so, please clean and reorganize your current version a bit.

And now, since the Colab file is running, I will clean it up, add text cells explaining the code, and send a merge request in a couple of days.

v-iashin commented on July 19, 2024

Hi,

I haven't looked into this much, but what torchtext does is check whether the GloVe model is in .vector_cache: if not, it downloads it from its own servers; if it is present (as *.txt or *.txt.pt – I don't know which), it loads it from the disk.

If you are curious, it is triggered by specifying cfg.word_emb_caps='glove.840B.300d'

CAPTION.build_vocab(dataset.caption, min_freq=cfg.min_freq_caps, vectors=cfg.word_emb_caps)

This might remind you of the way torchvision.models.* are initialized.

Since the default server is quite slow, we mirrored the pre-trained GloVe model on our premises, so you can download it at high speed. This is just to give you a rough idea of what is happening there.
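For a rough picture, the same caching behaviour can be reproduced with torchtext's vectors API directly; this is only an illustrative sketch, not a step you need to run:

from torchtext.vocab import GloVe

# if .vector_cache already holds the re-saved vectors, they are loaded from disk;
# otherwise torchtext downloads the zip from its default (slow) server first
vectors = GloVe(name='840B', dim=300, cache='.vector_cache')
print(vectors.vectors.shape)  # about 2.2M tokens x 300 dims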

I am glad it worked out for you. Looking forward to seeing your PR.

xanthan011 commented on July 19, 2024

Hello Vladimir,

I wanted to ask whether the repository works on short videos as I'm facing a problem.

You see, I wanted to get the captions for a GIF (converted into an mp4 and then run through the code).

So, while running this:

python main.py \
    --feature_type i3d \
    --on_extraction save_numpy \
    --device_ids 0 \
    --extraction_fps 25 \
    --video_paths /content/BMT/test/1419275450975539203.mp4 \
    --output_path /content/BMT/test

It gives the rgb.npy and flow.npy files as output. But when I run this:

python main.py \
    --feature_type vggish \
    --on_extraction save_numpy \
    --device_ids 0 \
    --video_paths /content/BMT/test/1419275450975539203.mp4 \
    --output_path /content/BMT/test

There is no vggish.npy file in the output, which usually appears. So, I'm wondering whether this is an issue with the video being short or a Colab issue (seems unlikely).
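For reference, this is how I'm checking what the extractor actually wrote (just listing the .npy files in the output folder from the commands above):

from pathlib import Path

# a missing *_vggish.npy here means the audio branch produced nothing
for p in sorted(Path('/content/BMT/test').glob('*.npy')):
    print(p.name)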

I will also attach 2 GIF (in .mp4 format) files for reproducing.

Thank you

1418473181350793216.mp4
1419275450975539203.mp4

v-iashin commented on July 19, 2024

😅

VGGish extracts audio features.
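A GIF has no audio track, so an mp4 converted from one gives VGGish nothing to extract. If you want to check whether a clip has an audio stream, here is a quick sketch (assuming ffprobe is available, as it normally is on Colab):

import subprocess, json

# an empty 'streams' list means the clip has no audio, hence no vggish .npy
probe = subprocess.run(
    ['ffprobe', '-v', 'error', '-select_streams', 'a',
     '-show_entries', 'stream=codec_type', '-of', 'json',
     '/content/BMT/test/1419275450975539203.mp4'],
    capture_output=True, text=True,
)
print(json.loads(probe.stdout).get('streams') or 'no audio stream')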

By the way, a PR with a notebook that uses the sample video (./sample) would do just fine.

xanthan011 commented on July 19, 2024

VGGish extracts audio features.

Ohh okay, my bad then.

Then can you please give me an idea of how I should run it without that file? Removing the --vggish_features_path flag throws an error when running ./sample/single_video_prediction.py, and since there is no vggish.npy file, I can't provide it 😅
