Comments (16)

yunjey commented on May 22, 2024

I remember that the required GPU memory for batch_size=128 is less than 5GB (it may be much smaller).

What are your Python and PyTorch versions? I guess you are using Python 2.7. Am I right?
This memory issue occurs when requires_grad=False does not take effect.

There are two options for solving this problem.

  1. Upgrade PyTorch version
  2. Use Python 3.5 instead of 2.7
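
For anyone who wants to check whether requires_grad=False is actually taking effect, here is a minimal sketch. It assumes a recent PyTorch (0.4 or later), and the tiny `backbone` and `head` modules are hypothetical stand-ins for the frozen ResNet layers and the trainable layer, not the tutorial's real model.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: a frozen backbone and a trainable head.
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
head = nn.Linear(16, 4)

# Freeze every backbone parameter.
for p in backbone.parameters():
    p.requires_grad = False

x = torch.randn(2, 8)
features = backbone(x)   # no graph is recorded for the frozen part
out = head(features)
out.sum().backward()

# Frozen parameters should have no gradient; head parameters should.
frozen_grads = [p.grad for p in backbone.parameters()]
head_grads = [p.grad for p in head.parameters()]
print(features.requires_grad, frozen_grads, [g is not None for g in head_grads])
```

If `features.requires_grad` comes out True, or the frozen parameters end up with gradients, the freeze is not working and the intermediate activations are being kept in memory.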

from pytorch-tutorial.

karandwivedi42 commented on May 22, 2024

This might be related to #26

This is a known issue which will be resolved in the next release.

Till then as a workaround, just change L56 to

               images = Variable(images, volatile=True)

and L66 to

                features = encoder(images)
                features = Variable(features.data)
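
On newer PyTorch releases (0.4 and later) the volatile flag no longer exists, so the same workaround would look roughly like the sketch below. This is only an approximation of the idea above: torch.no_grad() plays the role of volatile=True, detach() plays the role of Variable(features.data), and the small `encoder` and `decoder` modules are hypothetical stand-ins for the tutorial's models.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the tutorial's encoder and decoder.
encoder = nn.Linear(8, 16)
decoder = nn.Linear(16, 4)

images = torch.randn(2, 8)

# Run the encoder without recording a graph (analogue of volatile=True).
with torch.no_grad():
    features = encoder(images)

# Analogue of Variable(features.data): a tensor cut off from any history.
# It is redundant after no_grad() but mirrors the original two-step workaround.
features = features.detach()

# The decoder still trains normally on the detached features.
loss = decoder(features).sum()
loss.backward()
```

Note that with this arrangement nothing inside the encoder receives gradients, including any layer in it that was meant to be trained.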

jtoy commented on May 22, 2024

@yunjey I am on Python 2.7 and PyTorch 0.12. I will try your changes.
@karandwivedi42 I will also test your fix.
I will let you both know.

yunjey commented on May 22, 2024

@jtoy I recommend installing PyTorch from source. This will give you the latest version of PyTorch.

jtoy commented on May 22, 2024

I tried PyTorch built from source on Python 2.7 and PyTorch on Python 3.5; both died with the same issue.

jtoy commented on May 22, 2024

@karandwivedi42 your changes work! @yunjey will the code need to be updated? Building from source doesn't seem to fix the issue. I can do more testing if needed.

yunjey commented on May 22, 2024

@jtoy Ok. Thanks.

yunjey commented on May 22, 2024

@karandwivedi42 That does not work.

images = Variable(images, volatile=True)

The code above also makes requires_grad=False in resnet.fc, so that layer no longer receives gradients. See here for the details of volatile.
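
To illustrate the point about resnet.fc, here is a small sketch on a recent PyTorch (0.4 or later), where torch.no_grad() is the closest analogue of volatile=True. The `backbone` and `fc` modules are hypothetical stand-ins, and the second arrangement is just one possible way to keep fc trainable, not necessarily the fix used in the tutorial.

```python
import torch
import torch.nn as nn

backbone = nn.Linear(8, 16)  # hypothetical stand-in for the pretrained ResNet layers
fc = nn.Linear(16, 4)        # hypothetical stand-in for the trainable resnet.fc

x = torch.randn(2, 8)

# Case 1: the whole encoder runs without a graph, so fc cannot be trained.
with torch.no_grad():
    out_all = fc(backbone(x))

# Case 2: only the backbone runs without a graph; fc stays in the graph.
with torch.no_grad():
    feats = backbone(x)
out_fc = fc(feats)
out_fc.sum().backward()

print(out_all.requires_grad, out_fc.requires_grad)
```

In the first case the output has no gradient history at all, which matches the objection above. In the second case fc gets gradients while the backbone does not.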

karandwivedi42 commented on May 22, 2024

@yunjey You are right. I don't know how important it is though because this linear layer is followed by another linear layer in the decoder with no non-linearity in between.

jtoy commented on May 22, 2024

So what is the right code to use? I was able to train a model with @karandwivedi42's change, and training completed in 155 minutes. Does that time seem right? I trained the original Show and Tell model and I remember it taking at least a day.

karandwivedi42 commented on May 22, 2024

jtoy commented on May 22, 2024

@karandwivedi42 I don't fully understand; I'm just starting to play with PyTorch. Is there any way to see it as a diff?

karandwivedi42 commented on May 22, 2024

@jtoy This fork is a very hacky way to do exactly what the original code does.
https://github.com/karandwivedi42/pytorch-tutorial/tree/master/tutorials/09%20-%20Image%20Captioning

@yunjey Can you please check this one? (Thanks for the amazing tutorials btw :) )

yunjey commented on May 22, 2024

@jtoy @karandwivedi42 I will fix the code by this weekend.

jtoy commented on May 22, 2024

yunjey commented on May 22, 2024

@jtoy @karandwivedi42 I modified the code. Try it. Thanks :)
