sohu2021-baseline's People

Contributors

bojone

sohu2021-baseline's Issues

Does evaluate() always return 0?

During training, evaluate() seems to always return 0.

Is there a problem with how these counts are calculated?
total_a += ((y_pred + y_true) * (flag == 0)).sum()
right_a += ((y_pred * y_true) * (flag == 0)).sum()
total_b += ((y_pred + y_true) * (flag == 1)).sum()
right_b += ((y_pred * y_true) * (flag == 1)).sum()
Should they be the following instead?
total_a += (flag == 0).sum()
right_a += ((y_pred == y_true) * (flag == 0)).sum()
total_b += (flag == 1).sum()
right_b += ((y_pred == y_true) * (flag == 1)).sum()

With that change, the "2*" would also need to be removed from the F1 calculation:
f1_a = right_a / total_a
f1_b = right_b / total_b
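
For context, here is a small sketch of what the two variants measure. It assumes y_pred and y_true hold 0/1 values (the thread does not show how they are built) and uses plain Python lists in place of arrays. Under that assumption, the original counts are "predicted positives + actual positives" and "true positives", so 2 * right / total is the usual F1 score, and it is 0 whenever no positive is predicted correctly. The proposed replacement would measure accuracy instead. The function names and sample labels are made up for illustration.

```python
def f1_from_counts(y_pred, y_true):
    """F1 in the style of the quoted lines: 2 * right / total."""
    # total = predicted positives + actual positives (= 2*TP + FP + FN)
    total = sum(p + t for p, t in zip(y_pred, y_true))
    # right = true positives: prediction and label are both 1
    right = sum(p * t for p, t in zip(y_pred, y_true))
    return 2.0 * right / total if total else 0.0


def f1_from_precision_recall(y_pred, y_true):
    """Textbook F1 from precision and recall, for comparison."""
    pairs = list(zip(y_pred, y_true))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_pred, y_true):
    """What the proposed replacement would compute."""
    return sum(1 for p, t in zip(y_pred, y_true) if p == t) / len(y_true)


# Hypothetical labels and predictions
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
all_negative = [0] * len(y_true)

print(f1_from_counts(y_pred, y_true))
print(f1_from_precision_recall(y_pred, y_true))
print(f1_from_counts(all_negative, y_true), accuracy(all_negative, y_true))
```

If this reading is right, an F1 of 0 from evaluate() would point to a model that predicts no positives at that stage, rather than to a bug in the counting.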

Why can't I do multi-GPU training?

My environment:
keras==2.3.1, tensorflow-gpu==2.2.0

I am trying to use multiple GPUs on one machine, so I wrapped all of the model-related code in a strategy scope, as below:

strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))
with strategy.scope():
    xxx xxx
    xxx xxx

I also tried setting os.environ['TF_KERAS'] to "0" and to "1".

The process shows up on both GPUs, but GPU-Util on the second GPU is always 0%, as shown below:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P40           Off  | 00000000:00:0E.0 Off |                    0 |
| N/A   40C    P0   155W / 250W |  21699MiB / 22919MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P40           Off  | 00000000:00:0F.0 Off |                    0 |
| N/A   31C    P0    49W / 250W |  21659MiB / 22919MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

What else do I need to do to train on multiple GPUs?

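OOM during training

I tried both 16 GB and 32 GB GPUs. In both cases GPU memory is already full once the model has loaded, before training starts, and an OOM error occurs in the evaluate stage after one batch has been trained.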

model_form

Is there a link to the pretrained model in ckpt format? Everything I can find comes as .data, .index, and .meta files.
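
A note that may help here: a TensorFlow checkpoint is normally stored as exactly this set of files (.data-*, .index, and optionally .meta), and loaders are given the shared prefix rather than a single file. The sketch below shows how that prefix can be read off a directory listing. The file names are hypothetical, not taken from any actual download, and checkpoint_prefixes is an illustrative helper, not part of any library.

```python
def checkpoint_prefixes(filenames):
    """Return the checkpoint prefixes for which an .index file is present."""
    suffix = ".index"
    prefixes = []
    for name in filenames:
        if name.endswith(suffix):
            # Strip the suffix; what remains is the prefix loaders expect
            prefixes.append(name[: -len(suffix)])
    return sorted(prefixes)


# Hypothetical contents of an unpacked pretrained-model directory
files = [
    "bert_model.ckpt.data-00000-of-00001",
    "bert_model.ckpt.index",
    "bert_model.ckpt.meta",
    "bert_config.json",
    "vocab.txt",
]

print(checkpoint_prefixes(files))  # ['bert_model.ckpt']
```

Under these assumptions, the checkpoint path to pass would be the directory plus bert_model.ckpt, even though no file with exactly that name exists on disk.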

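F1 is only 0.3

I tried the author's baseline without changing any code, but for some reason the F1 on submission is only a little over 0.3, which is close to random prediction. During training the loss stays almost constant at about 0.6, accuracy is a little over 0.6, and F1 is a little over 0.4.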
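
To put "close to random" in numbers, the following sketch estimates the F1 of a predictor that guesses positive at random, independently of the true label. In that case the expected precision is roughly the share of positive labels and the expected recall is roughly the share of positive guesses. This is a large-sample approximation, and the positive rates used below are hypothetical, not measured on the competition data.

```python
def expected_f1_random(pos_rate, pred_rate=0.5):
    """Approximate F1 of a random guesser.

    pos_rate:  fraction of examples whose true label is positive (assumed)
    pred_rate: fraction of examples the guesser marks positive
    """
    precision = pos_rate   # guesses are independent of the labels
    recall = pred_rate     # each positive is caught with probability pred_rate
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical positive rates
for rate in (0.25, 0.5):
    print(rate, round(expected_f1_random(rate), 3))
```

With a quarter of the labels positive, a coin-flip guesser lands near 0.33, so an F1 in the low 0.3 range is consistent with a model that has learned very little.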
