Hello,
First of all, I have been making good use of the code you released. Thank you very much.
I am writing because I have a question about fine-tuning the XLM-RoBERTa-large model.
The released code only goes up to XLM-RoBERTa-base, but I wanted to run some additional experiments, so I slightly modified the config and the run_*.py code to apply the XLM-RoBERTa-large model.
On the other tasks I got performance at or above XLM-RoBERTa-base, but on korquad the results came out absurdly low, close to 0.
The changes I made are as follows.
- I added config/korquad/xlm-robert-large.json. Its contents are taken from XLM-RoBERT-base.json with only the model changed to xlm-roberta-large; all other settings are kept the same.
{
  "task": "korquad",
  "data_dir": "data",
  "ckpt_dir": "ckpt",
  "train_file": "KorQuAD_v1.0_train.json",
  "predict_file": "KorQuAD_v1.0_dev.json",
  "threads": 4,
  "version_2_with_negative": false,
  "null_score_diff_threshold": 0.0,
  "max_seq_length": 512,
  "doc_stride": 128,
  "max_query_length": 64,
  "max_answer_length": 30,
  "n_best_size": 20,
  "verbose_logging": true,
  "overwrite_output_dir": true,
  "evaluate_during_training": true,
  "eval_all_checkpoints": true,
  "save_optimizer": false,
  "do_lower_case": false,
  "do_train": true,
  "do_eval": false,
  "num_train_epochs": 7,
  "weight_decay": 0.0,
  "gradient_accumulation_steps": 1,
  "adam_epsilon": 1e-8,
  "warmup_proportion": 0,
  "max_steps": -1,
  "max_grad_norm": 1.0,
  "no_cuda": false,
  "model_type": "xlm-roberta-large",
  "model_name_or_path": "xlm-roberta-large",
  "output_dir": "xlm-roberta-large-korquad-ckpt",
  "seed": 42,
  "train_batch_size": 4,
  "eval_batch_size": 16,
  "logging_steps": 4000,
  "save_steps": 4000,
  "learning_rate": 5e-5
}
- In run_squad.py, I added xlm-roberta-large to the part that removes token_type_ids.
if args.model_type in ["xlm", "roberta", "distilbert", "distilkobert", "xlm-roberta", "xlm-roberta-large"]:
    del inputs["token_type_ids"]
- In /src/util.py, I added "xlm-roberta-large": AutoConfig, "xlm-roberta-large": AutoTokenizer, and "xlm-roberta-large": AutoModelForQuestionAnswering to CONFIG_CLASSES, TOKENIZER_CLASSES, and MODEL_FOR_QUESTION_ANSWERING, respectively.
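To double-check the first change (that the large config differs from the base config only in the model-related fields), a comparison along these lines can be used. This is only a minimal sketch: `diff_configs` is a hypothetical helper name, and the base-config values shown are assumed for illustration rather than copied from the repository.

```python
import json

MISSING = "<missing>"  # placeholder for a key present in only one config


def diff_configs(base: dict, other: dict) -> dict:
    """Return {key: (base_value, other_value)} for every key whose value differs."""
    diff = {}
    for key in sorted(set(base) | set(other)):
        a = base.get(key, MISSING)
        b = other.get(key, MISSING)
        if a != b:
            diff[key] = (a, b)
    return diff


# Hypothetical excerpt of the base config; these values are assumptions.
base_cfg = json.loads("""
{"model_type": "xlm-roberta", "model_name_or_path": "xlm-roberta-base",
 "output_dir": "xlm-roberta-base-korquad-ckpt",
 "max_seq_length": 512, "train_batch_size": 4, "learning_rate": 5e-5}
""")

# Matching excerpt of the large config posted above.
large_cfg = json.loads("""
{"model_type": "xlm-roberta-large", "model_name_or_path": "xlm-roberta-large",
 "output_dir": "xlm-roberta-large-korquad-ckpt",
 "max_seq_length": 512, "train_batch_size": 4, "learning_rate": 5e-5}
""")

changed = diff_configs(base_cfg, large_cfg)
for key, (old, new) in changed.items():
    print(f"{key}: {old!r} -> {new!r}")
```

With the real files, the two dictionaries would come from `json.load` on each config file; any key outside the model name, model type, and output directory showing up in the diff would point to an unintended change.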
Here are the model's evaluation results.
12/24/2020 11:49:46 - INFO - __main__ - ***** Official Eval results *****
12/24/2020 11:49:46 - INFO - __main__ - official_exact_match = 0.05195704883962591
12/24/2020 11:49:46 - INFO - __main__ - official_f1 = 6.183122770846761
12/24/2020 11:49:47 - INFO - __main__ - HasAns_exact = 0.05195704883962591
12/24/2020 11:49:47 - INFO - __main__ - HasAns_f1 = 1.1654331969028693
12/24/2020 11:49:47 - INFO - __main__ - HasAns_total = 5774
12/24/2020 11:49:47 - INFO - __main__ - best_exact = 0.05195704883962591
12/24/2020 11:49:47 - INFO - __main__ - best_exact_thresh = 0.0
12/24/2020 11:49:47 - INFO - __main__ - best_f1 = 1.1654331969028693
12/24/2020 11:49:47 - INFO - __main__ - best_f1_thresh = 0.0
12/24/2020 11:49:47 - INFO - __main__ - exact = 0.05195704883962591
12/24/2020 11:49:47 - INFO - __main__ - f1 = 1.1654331969028693
I also checked whether the inputs are fed in correctly during training (for example, whether a tokenizer problem converts everything to [UNK]), but found no problem there.
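For reference, the kind of input check described above can be sketched as follows. This is an illustrative version, not the exact check that was run: `unk_ratio` is a hypothetical helper, the token lists are made-up stand-ins for real tokenizer output, and the unknown-token string is assumed to be `<unk>` (pass `[UNK]` instead for BERT-style vocabularies).

```python
from typing import Iterable, Sequence


def unk_ratio(token_batches: Iterable[Sequence[str]], unk_token: str = "<unk>") -> float:
    """Fraction of all tokens that equal the unknown token."""
    total = 0
    unknown = 0
    for tokens in token_batches:
        total += len(tokens)
        unknown += sum(1 for tok in tokens if tok == unk_token)
    return unknown / total if total else 0.0


# Made-up token lists; in practice they could be obtained with something like
# tokenizer.convert_ids_to_tokens(input_ids) for each training feature.
healthy = [["▁한국", "어", "▁질문"], ["▁정답", "<unk>", "▁문서"]]
broken = [["<unk>", "<unk>"], ["<unk>"]]

print(f"healthy batch: {unk_ratio(healthy):.3f}")
print(f"broken batch:  {unk_ratio(broken):.3f}")
```

A ratio near 1.0 would indicate the tokenizer/vocabulary mismatch mentioned above; a small ratio suggests the inputs themselves are not the cause.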
Do you have any idea where the problem might be coming from?
Thank you.