Hello. A question came up while I was training KoCLIP.
- When I train, Loss and Eval Loss always stay the same. (Learning Rate keeps decreasing.)
This happens not only with my own data but also with the coco data provided as an example.
Could you confirm whether this is normal training behavior?
1-1. Training with the coco data provided by KoCLIP and train.sh
- Eval Loss is identical from Epoch 2 onward. The training Loss is identical from Epoch 3 onward.
09/04/2023 11:01:46 - INFO - main - ***** Running training *****
09/04/2023 11:01:46 - INFO - main - Num examples = 413915
09/04/2023 11:01:46 - INFO - main - Num Epochs = 40
09/04/2023 11:01:46 - INFO - main - Instantaneous batch size per device = 64
09/04/2023 11:01:46 - INFO - main - Total train batch size (w. parallel & distributed) = 64
09/04/2023 11:01:46 - INFO - main - Total optimization steps = 258680
Epoch... (1/40 | Loss: 4.158902168273926, Learning Rate: 4.8750189307611436e-05)
Epoch... (1/40 | Eval Loss: 4.158883094787598)
Epoch... (2/40 | Loss: 4.158882141113281, Learning Rate: 4.7500190703431144e-05)
Epoch... (2/40 | Eval Loss: 4.1588826179504395)
Epoch... (3/40 | Loss: 4.158883094787598, Learning Rate: 4.625019209925085e-05)
Epoch... (3/40 | Eval Loss: 4.1588826179504395)
Epoch... (4/40 | Loss: 4.158883094787598, Learning Rate: 4.5000189857091755e-05)
Epoch... (4/40 | Eval Loss: 4.1588826179504395)
Epoch... (5/40 | Loss: 4.158883094787598, Learning Rate: 4.375019125291146e-05)
Epoch... (5/40 | Eval Loss: 4.1588826179504395)
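The step count and learning rate in this log can be sanity-checked with a little arithmetic, which suggests the scheduler itself is behaving and the flat loss is not a logging problem. A rough sketch, assuming a peak learning rate of 5e-5 with linear decay to zero, incomplete final batches being dropped, and the rate being logged at the last step of each epoch (all three are guesses from the numbers, not read from train.sh or run.py):

```python
# Values copied from the log above.
num_examples = 413915
batch_size = 64
num_epochs = 40

# Assumption: the last, incomplete batch of each epoch is dropped.
steps_per_epoch = num_examples // batch_size
total_steps = steps_per_epoch * num_epochs  # 258680, matching the log


def linear_decay_lr(step, peak_lr=5e-5):
    """Assumed schedule: linear decay from peak_lr to 0 over total_steps."""
    return peak_lr * (1 - step / total_steps)


# Assumption: the logged value is taken at the last step of epoch 1.
lr_end_of_epoch_1 = linear_decay_lr(steps_per_epoch - 1)
print(total_steps, lr_end_of_epoch_1)
```

Under these assumptions the computed rate agrees with the logged 4.8750189e-05 to within float32 precision.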
1-2. Training with my own prepared training data and train.sh
- Loss and Eval Loss are both identical from Epoch 1 onward (only the Eval Loss at Epoch 4 differs)
08/31/2023 15:16:15 - INFO - main - ***** Running training *****
08/31/2023 15:16:15 - INFO - main - Num examples = 2474242
08/31/2023 15:16:15 - INFO - main - Num Epochs = 40
08/31/2023 15:16:15 - INFO - main - Instantaneous batch size per device = 64
08/31/2023 15:16:15 - INFO - main - Total train batch size (w. parallel & distributed) = 64
08/31/2023 15:16:15 - INFO - main - Total optimization steps = 1546400
Epoch... (1/40 | Loss: 4.158883094787598, Learning Rate: 4.8750029236543924e-05)
Epoch... (1/40 | Eval Loss: 4.1588826179504395)
Epoch... (2/40 | Loss: 4.158883094787598, Learning Rate: 4.750003063236363e-05)
Epoch... (2/40 | Eval Loss: 4.1588826179504395)
Epoch... (3/40 | Loss: 4.158883094787598, Learning Rate: 4.625003202818334e-05)
Epoch... (3/40 | Eval Loss: 4.1588826179504395)
Epoch... (4/40 | Loss: 4.158883094787598, Learning Rate: 4.500002978602424e-05)
Epoch... (4/40 | Eval Loss: 4.158883094787598)
Epoch... (5/40 | Loss: 4.158883094787598, Learning Rate: 4.375003118184395e-05)
Epoch... (5/40 | Eval Loss: 4.1588826179504395)
Epoch... (6/40 | Loss: 4.158883094787598, Learning Rate: 4.250002893968485e-05)
Epoch... (6/40 | Eval Loss: 4.1588826179504395)
Epoch... (7/40 | Loss: 4.158883094787598, Learning Rate: 4.125003033550456e-05)
Epoch... (7/40 | Eval Loss: 4.1588826179504395)
Epoch... (8/40 | Loss: 4.158883094787598, Learning Rate: 4.000003173132427e-05)
Epoch... (8/40 | Eval Loss: 4.1588826179504395)
Epoch... (9/40 | Loss: 4.158883094787598, Learning Rate: 3.875002948916517e-05)
Epoch... (9/40 | Eval Loss: 4.1588826179504395)
Epoch... (10/40 | Loss: 4.158883094787598, Learning Rate: 3.750003088498488e-05)
Epoch... (10/40 | Eval Loss: 4.1588826179504395)
I let it run like this up to epoch 25, then stopped it because it clearly was not working.
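One thing that stands out in both logs: the plateau value 4.158883 is almost exactly ln(64), and 64 is the batch size. That is the value a symmetric CLIP-style contrastive loss takes when every image-text pair in a batch gets the same score, so the model may be stuck at chance level. A minimal sketch in plain Python, assuming the loss is the usual average of image-to-text and text-to-image cross-entropy (I have not confirmed this against run.py; the function names are only for illustration):

```python
import math


def cross_entropy_rows(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_loss(logits):
    """Average of image-to-text and text-to-image cross-entropy."""
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_rows(logits) + cross_entropy_rows(transposed))


batch_size = 64
# Every pair scores the same: the model cannot tell any pair apart.
uniform_logits = [[0.0] * batch_size for _ in range(batch_size)]
chance_loss = clip_loss(uniform_logits)
print(chance_loss, math.log(batch_size))
```

If that reading is right, the identical Loss and Eval Loss would simply be two measurements of the same chance-level value rather than a coincidence.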
- Saving the configuration file and weight file
With the current train.sh and run.py setup,
the following is logged at the end of every epoch:
Configuration saved in /home/test/koclip/checkpoint/config.json
Model weights saved in /home/test/koclip/checkpoint/flax_model.msgpack
As shown, the files are always written to the same path, overwriting the previous ones.
Are they overwritten unconditionally after every epoch, or only when a new best result is found?
Thank you in advance for your answer!
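To make the two behaviors I am asking about concrete, here is a small sketch. `epochs_that_save` and `best_only` are hypothetical names of mine, not anything from run.py, and the eval losses are the first three values from the coco log above:

```python
def epochs_that_save(eval_losses, best_only):
    """Return the 1-based epochs at which a checkpoint would be written."""
    saved = []
    best = float("inf")
    for epoch, loss in enumerate(eval_losses, start=1):
        # best_only=False: overwrite every epoch.
        # best_only=True: overwrite only when eval loss strictly improves.
        if not best_only or loss < best:
            saved.append(epoch)
        best = min(best, loss)
    return saved


eval_losses = [4.158883094787598, 4.1588826179504395, 4.1588826179504395]
always = epochs_that_save(eval_losses, best_only=False)
improved_only = epochs_that_save(eval_losses, best_only=True)
print(always, improved_only)
```

Since my logs show a save message after every epoch even though Eval Loss does not change, I suspect the first behavior, but I would like to confirm.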