Comments (11)

JingqingZ commented on August 17, 2024

Hi Alex, I am afraid we can't share the output summaries because of some limitations. Please run the model to produce the outputs yourself. Sorry for the inconvenience.

from pegasus.

alexgaskell10 commented on August 17, 2024

Thanks for coming back to me. I'm trying to do this now and have some follow-up questions:

  1. Is the model fine-tuned on CNN/DM available for download?
  2. If the answer to the above is no, how long does fine-tuning take?
  3. How do I get the model to write output summaries?

Thanks!

JingqingZ commented on August 17, 2024

Fine-tuned models for all 12 datasets, including CNN/DM, are available. Please follow the README to download them.

alexgaskell10 commented on August 17, 2024

Thanks for your help so far.

I have everything set up and am running eval on the test set. It is running slowly, even with a GPU: it takes 30s per sample, and this doesn't fall if I increase the batch size. Is this normal?

JingqingZ commented on August 17, 2024

I suspect the model may actually be running on the CPU. Please run nvidia-smi and check whether the model is using the GPU. The model should take at least 10000MB of GPU memory (or the whole GPU memory).

Other things that may be worth checking:

  • CUDA 10.0 is the installed version
  • tensorflow-gpu is installed instead of tensorflow
  • The printout log of the model shows nothing unexpected regarding the GPU
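
A minimal sketch of the nvidia-smi check described above, assuming Python 3.7+ and that `nvidia-smi` is on the PATH. The helper names (`parse_gpu_rows`, `gpus_in_use`, `query_gpus`) are hypothetical and not part of pegasus; the 10000 threshold is the figure quoted in this comment, compared against the MiB values that nvidia-smi reports.

```python
import shutil
import subprocess

# nvidia-smi query that prints one CSV row per GPU, without header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(text):
    """Parse the CSV rows printed by the query above into dictionaries."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, name, used, total = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(index),
            "name": name,
            "used_mib": int(used),
            "total_mib": int(total),
        })
    return rows


def gpus_in_use(rows, min_used_mib=10000):
    """Return the GPUs whose used memory reaches the given threshold."""
    return [row for row in rows if row["used_mib"] >= min_used_mib]


def query_gpus():
    """Run nvidia-smi if it is available; return an empty list otherwise."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    busy = gpus_in_use(query_gpus())
    if busy:
        for gpu in busy:
            print(f"GPU {gpu['index']} ({gpu['name']}): {gpu['used_mib']} MiB used")
    else:
        print("No GPU is holding 10000 MiB or more; the model may be on the CPU.")
```

Run it in a second terminal while the eval loop is active. An empty result only suggests, rather than proves, that the model fell back to the CPU.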

alexgaskell10 commented on August 17, 2024

All looks to be using the GPU normally... Do you have an estimate for how long running an eval loop should take?

nvidia-smi gives:
[Screenshot 2020-06-16 at 22:37:10]

And the following logs on loading:
[Screenshot 2020-06-16 at 22:36:45]

CUDA: release 10.0, V10.0.130

I have both tensorflow and tensorflow-gpu installed, but it seems to be importing tensorflow-gpu.
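
For anyone who wants to confirm which of the two packages is actually installed, here is a small sketch that assumes Python 3.8+ (for `importlib.metadata`). The function name `installed_versions` is hypothetical. It only reports the distributions pip has recorded; it does not tell you which one `import tensorflow` resolves to.

```python
from importlib import metadata


def installed_versions(names):
    """Map each installed distribution name to its version, skipping missing ones."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found


if __name__ == "__main__":
    versions = installed_versions(["tensorflow", "tensorflow-gpu"])
    for name, version in versions.items():
        print(f"{name}=={version}")
    if len(versions) == 2:
        print("Both packages are installed; consider keeping only tensorflow-gpu.")
```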

JingqingZ commented on August 17, 2024

FYI, I used batch_size=8 and it took around 10min for 100 CNN/DM examples on a single TITAN X.
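
To compare this figure with the 30s per sample reported earlier in the thread, the arithmetic can be sketched as follows. The helper name `seconds_per_example` is hypothetical; the only inputs are the numbers quoted in this thread.

```python
def seconds_per_example(total_seconds, num_examples):
    """Average wall-clock time spent on each example."""
    return total_seconds / num_examples


# Reference run from this comment: about 10 minutes for 100 CNN/DM examples.
reference = seconds_per_example(10 * 60, 100)

# Slow run reported earlier in the thread: 30 seconds per sample.
observed = 30.0

# Ratio between the two runs.
slowdown = observed / reference

print(f"reference: {reference:.1f} s/example")
print(f"observed:  {observed:.1f} s/example")
print(f"slowdown:  {slowdown:.1f}x")
```

So the reference run works out to about 6s per example, and the slow run is roughly 5 times slower.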

alexgaskell10 commented on August 17, 2024

Hmmm, weird how it's taking 5x as long for me as it did for you. Have had a bit of a look but nothing obvious I can see. I will have to park this now and move on.

alexgaskell10 commented on August 17, 2024

Gave this one final try on a different GPU machine and it is working much better, so I am closing the issue. Thanks for your help!

jacob-parnell-rozetta commented on August 17, 2024

@alexgaskell10 out of curiosity, what were the specs of the new GPU machine you tested this on? I am running into a similar issue: it takes quite a while to run an eval loop on CNN/DM, and even increasing batch_size doesn't help.

alexgaskell10 commented on August 17, 2024

So the slow version was using Azure's NC6 series (I believe they are Tesla K80s) and the faster version was using a GeForce GTX TITAN X. Both have 12 GB of memory. The switch made it much quicker, but it was still only half the speed Jingqing was reporting.
