Any way to fit a large data set ( 17M samples ) in low VRAM ( 4 GB ) ? about nmt-chatbot (closed)

daniel-kukiela commented on June 23, 2024

Any way to fit a large data set ( 17M samples ) in low VRAM ( 4 GB ) ?

from nmt-chatbot.

Comments (10)

Sentdex commented on June 23, 2024

We haven't made a setting for it yet, but the other major factor in VRAM usage during training is the batch size. At the moment the batch size is 128; it could be 64 or even 32 and still produce decent results. Maybe we should also expose batch_size in settings.py.
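
As a rough illustration of the suggestion above, the sketch below lists the batch sizes one might try in turn, halving from the current 128 down to 32. The helper name `candidate_batch_sizes` and the `hparams` dictionary are hypothetical; they only stand in for however the project actually stores its settings.

```python
def candidate_batch_sizes(start=128, minimum=32):
    """Return batch sizes to try, halving from `start` down to `minimum`."""
    sizes = []
    size = start
    while size >= minimum:
        sizes.append(size)
        size //= 2
    return sizes


# Hypothetical settings fragment; the real settings.py may be organised differently.
hparams = {"batch_size": candidate_batch_sizes()[0]}

if __name__ == "__main__":
    for size in candidate_batch_sizes():
        print("try batch_size =", size)
```

Starting with the largest value that fits in memory is a reasonable default, since smaller batches mean more steps per epoch.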

daniel-kukiela commented on June 23, 2024

Did you change anything in settings? If so, did you either uncomment 'override_loaded_hparams' or remove everything inside the 'model' folder? The hparams section from settings is saved with the model on the first train.py run, so later edits to it are not picked up otherwise.
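
One way to do the "remove everything inside the 'model' folder" step from a script is sketched below. The function name `clear_model_dir` is hypothetical and not part of nmt-chatbot; the folder path is whatever your checkout uses. It deletes files irreversibly, so back up any model you want to keep.

```python
import shutil
from pathlib import Path


def clear_model_dir(model_dir):
    """Delete everything inside `model_dir` but keep the folder itself.

    Returns the number of top-level entries removed.
    """
    model_dir = Path(model_dir)
    removed = 0
    for entry in model_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a sub-folder and its contents
        else:
            entry.unlink()  # remove a regular file or a symlink
        removed += 1
    return removed
```

After clearing the folder, the next train.py run should save the hparams afresh, as described above.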

kunwar31 commented on June 23, 2024

Every time I change my settings, I remove everything inside the 'model' folder and re-run prepare_data.py (if I changed vocab_size).

kunwar31 commented on June 23, 2024

Okay, I see that num_units is directly proportional to that, so I can reduce that number and make it fit. Should I run more epochs if I go with a very low num_units value?
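
To get a feel for how num_units affects model size, here is a minimal sketch that counts the weights and biases of one standard LSTM layer. It assumes standard LSTM cells and a layer input as wide as the hidden state; the actual cell type and layer layout used by the model are not stated in this thread, and `lstm_param_count` is a hypothetical helper.

```python
def lstm_param_count(input_size, num_units):
    """Weights and biases of one standard LSTM layer (4 gates).

    Each gate has an input matrix, a recurrent matrix and a bias vector.
    """
    return 4 * num_units * (input_size + num_units + 1)


if __name__ == "__main__":
    for units in (512, 256, 128):
        # Assumes the layer input has the same width as the hidden state.
        print(units, lstm_param_count(units, units))
```

Under these assumptions, halving num_units cuts the per-layer parameter count roughly fourfold, while per-step activations shrink about linearly.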

daniel-kukiela commented on June 23, 2024

One more question for now: did you clone the repo in its current state, or are you using a fork you made at some point? There was a bug in training set creation.

daniel-kukiela commented on June 23, 2024

With 4 GB of VRAM you should probably use the defaults. I trained models on my GTX 970 with 4 GB of VRAM using those settings (but with a slightly bigger vocab of 17k tokens).

kunwar31 commented on June 23, 2024

I cloned it a few days back... 3 Dec, I guess. The problem with the defaults is that they work on the sample data set but not on my data set. Does the size of each line matter as well? For example, if all comments are shorter than, let's say, 1000 characters, would that reduce how much memory is required?
How many training samples did you use in your data set?
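
On the line-length question: in sequence-to-sequence training, each batch is typically padded to its longest sequence, so very long lines tend to raise memory use. A minimal sketch of capping length before training follows. `filter_pairs` and the 50-token limit are hypothetical choices for illustration, not settings taken from nmt-chatbot, and tokens here are simply whitespace-separated words.

```python
def filter_pairs(pairs, max_tokens=50):
    """Keep (source, target) pairs where both sides have at most
    `max_tokens` whitespace-separated tokens."""
    kept = []
    for source, target in pairs:
        if len(source.split()) <= max_tokens and len(target.split()) <= max_tokens:
            kept.append((source, target))
    return kept


if __name__ == "__main__":
    sample = [("how are you", "fine thanks"), ("a b c d e f", "ok")]
    print(filter_pairs(sample, max_tokens=3))
```

Filtering on tokens rather than characters matches what the model actually processes, though a character cap would work along the same lines.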

daniel-kukiela commented on June 23, 2024

I fixed that bug on December 3rd, so please do a fresh clone and try again.

kunwar31 commented on June 23, 2024

@daniel-kukiela I'll do that and try again. Thank you :)
@Sentdex Thank you for the tip, I'll try using different batch_size(s). BTW, your videos are awesome :)

kunwar31 commented on June 23, 2024

@daniel-kukiela @Sentdex I tried both things at the same time, and lowering batch_size works. As for that bug, I no longer need to remove empty lines from the vocab manually. Thank you for the help.
Also, I've made a pull request #9 for adding batch_size in settings.py
