LLM Finetuning Project: an implementation of Context Distillation as described in https://arxiv.org/abs/2209.15189. The model is applied to multiple-choice prompts from the MMLU dataset.
- "Teacher_model_context_distillation" extracts the teacher's probability distribution over the answer choices.
- "Context_Distillation_Final" contains the finetuned student model and its generated outputs.
- "Baseline Model" evaluates the baseline performance of the teacher model on raw MMLU inputs.