Comments (4)
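Yes, it has a small effect, but given your current hardware, getting it to run at all is already good. Don't expect much more than that.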
from gpt2-chinese.
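As a side note on why batch size has some effect: each optimizer step averages the gradient over the batch, so a batch of 4 gives a different (noisier) step than a batch of 8. One common workaround when only the smaller batch fits in memory is gradient accumulation. The sketch below is a toy illustration with a hypothetical one-parameter model and made-up data; it is not taken from the gpt2-chinese training script, and it assumes the batch splits into equally sized micro-batches.

```python
# Toy illustration only: a 1-parameter linear model y = w * x with squared-error loss.
# The model, data, and function names are hypothetical, not part of gpt2-chinese.

def grad_mean_squared_error(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, batch, micro_batch_size):
    """Average the gradients of equally sized micro-batches (gradient accumulation)."""
    micro_batches = [batch[i:i + micro_batch_size]
                     for i in range(0, len(batch), micro_batch_size)]
    total = 0.0
    for micro in micro_batches:
        # Dividing by the number of micro-batches makes the sum equal the full-batch mean.
        total += grad_mean_squared_error(w, micro) / len(micro_batches)
    return total

data = [(float(x), 3.0 * x + 1.0) for x in range(1, 9)]  # 8 made-up samples
w = 0.5

full = grad_mean_squared_error(w, data)                 # one step with batch size 8
accum = accumulated_grad(w, data, micro_batch_size=4)   # two micro-batches of 4
single = grad_mean_squared_error(w, data[:4])           # one step with batch size 4 only

print("batch 8:", full)
print("2 x batch 4, accumulated:", accum)
print("batch 4 alone:", single)
```

The accumulated gradient matches the batch-of-8 gradient, while a single batch of 4 does not. Whether the training script in use exposes an accumulation option should be checked against its own argument list.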
Reduce the parameters in config_small.json further until the model fits in GPU memory. It is still too large as it stands.
from gpt2-chinese.
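To get a feel for how much each setting shrinks the model, a rough parameter count can help. The sketch below assumes config_small.json follows the Hugging Face GPT-2 config naming (`n_embd`, `n_layer`, `n_head`, `n_positions`, `vocab_size`); the concrete values are placeholders rather than the file's real contents, and the memory figure is a rule of thumb that ignores activations.

```python
def estimate_gpt2_params(cfg):
    """Approximate parameter count of a GPT-2 style model with tied output embeddings."""
    n_embd, n_layer = cfg["n_embd"], cfg["n_layer"]
    # Token embeddings plus learned position embeddings.
    embeddings = cfg["vocab_size"] * n_embd + cfg["n_positions"] * n_embd
    # Per block: attention (4*n^2 + 4n), MLP (8*n^2 + 5n), two layer norms (4n).
    per_layer = 12 * n_embd * n_embd + 13 * n_embd
    final_layer_norm = 2 * n_embd
    return embeddings + n_layer * per_layer + final_layer_norm

def training_memory_mib(n_params, bytes_per_param=16):
    """Rule of thumb (assumption): fp32 weights + gradients + two Adam moments,
    about 16 bytes per parameter, excluding activations."""
    return n_params * bytes_per_param / (1024 ** 2)

# Placeholder values; load the real ones with json.load(open("config_small.json")).
base = {"vocab_size": 20000, "n_positions": 1024, "n_embd": 768, "n_layer": 10, "n_head": 12}
smaller = dict(base, n_embd=384, n_layer=6, n_head=6, n_positions=512)

for name, cfg in [("base", base), ("smaller", smaller)]:
    # The embedding width must split evenly across attention heads.
    assert cfg["n_embd"] % cfg["n_head"] == 0
    n = estimate_gpt2_params(cfg)
    print(f"{name}: {n / 1e6:.1f}M params, ~{training_memory_mib(n):.0f} MiB before activations")
```

Halving `n_embd` cuts the per-layer cost roughly fourfold, so it usually saves more than removing a layer or two. Shortening `n_positions` mainly reduces activation memory, which this estimate does not capture.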
Setting batch_size to 4 (it was originally 8) gets training past the error. Does batch_size affect the output?
Current environment usage is as follows...
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M60           On   | 00000000:00:1B.0 Off |                    0 |
| N/A   61C    P0   128W / 150W |   5633MiB /  7618MiB |     89%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla M60           On   | 00000000:00:1C.0 Off |                    0 |
| N/A   74C    P0   123W / 150W |   3857MiB /  7618MiB |     32%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla M60           On   | 00000000:00:1D.0 Off |                    0 |
| N/A   65C    P0   126W / 150W |   3857MiB /  7618MiB |     50%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla M60           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   73C    P0   121W / 150W |   3857MiB /  7618MiB |     69%      Default |
+-------------------------------+----------------------+----------------------+
With batch_size set to 4, it runs.
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      4291      C   python3                                     5622MiB |
|    1      4291      C   python3                                     3846MiB |
|    2      4291      C   python3                                     3846MiB |
|    3      4291      C   python3                                     3846MiB |
+-----------------------------------------------------------------------------+
from gpt2-chinese.
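The numbers above show GPU 0 holding noticeably more memory than the other three, so the fullest card is the one that limits batch size. If the script uses `torch.nn.DataParallel` (an assumption, not confirmed in this thread), one common cause is that outputs are gathered on the first device. A minimal sketch for computing per-GPU headroom from `nvidia-smi` CSV output follows; the helper name is hypothetical and the sample text simply restates the figures posted above.

```python
import csv
import io

def memory_headroom(csv_text):
    """Parse 'index, used, total' rows (MiB, no header, no units) into per-GPU stats."""
    report = {}
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        index, used, total = (int(value) for value in row)
        report[index] = {
            "used_mib": used,
            "free_mib": total - used,
            "used_pct": round(100.0 * used / total, 1),
        }
    return report

# Same figures as the nvidia-smi table in this thread, in CSV form.
sample = "\n".join([
    "0, 5633, 7618",
    "1, 3857, 7618",
    "2, 3857, 7618",
    "3, 3857, 7618",
])

report = memory_headroom(sample)
for gpu, stats in sorted(report.items()):
    print(gpu, stats)

# The GPU with the least free memory is the one that limits batch size.
tightest = min(report, key=lambda gpu: report[gpu]["free_mib"])
print("tightest GPU:", tightest)
```

On a live machine the same text can be produced with `nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader,nounits`.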
Understood, thanks for the guidance!
from gpt2-chinese.