The goal is to develop a conversational chatbot based on the GPT-2 model from the Keras NLP library.
It will be able to hold natural conversations with users, answer their questions in a specific field (see the Dataset part), and provide useful information.
The chatbot will answer in English.
You can refer to the project brief, SAE-IA.pdf (in French).
A manager is chosen for each milestone (Livrable).
The team members are listed below, with the milestone each one managed in parentheses. There is no manager for Livrable 1, since that milestone only consisted of forming the team.
- Yassine BELLAGRAA (Livrable 6)
- Amadou DIA
- Salma BOUSSERHANE (Livrable 5)
- Walid OUBELLA
- Maxime NGUYEN (Livrable 3)
- Selma MAZGAR (Livrable 2)
- Chrinovic KIBANGU TSIMBA (Livrable 4)
When a student is the manager, they have the following tasks to perform:
- Plan or replan tasks
- Communicate milestones to team members and the teachers
- Evaluate task completion
- Create a management report
- GPT-2 model from the Keras NLP library for fine-tuning and text generation: https://keras.io/keras_nlp/
- Gradio for the UI/frontend of the chatbot: https://www.gradio.app/
- Python, with Jupyter and Google Colab notebooks, for testing the chatbot
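As a rough illustration of how these tools could fit together, here is a minimal sketch, not the project's actual code. It assumes the `gpt2_base_en` preset of Keras NLP's `GPT2CausalLM` and a plain Gradio `Interface`. The helper names (`build_prompt`, `extract_answer`, `make_chat_fn`, `load_gpt2_generator`, `launch_demo`) and the question/answer prompt format are hypothetical. The heavy libraries are imported lazily, so the text-handling helpers can be tried without downloading the model.

```python
from typing import Callable


def build_prompt(question: str) -> str:
    """Wrap the user's question in a simple Q/A template (hypothetical format)."""
    return f"Question: {question.strip()}\nAnswer:"


def extract_answer(generated: str, prompt: str) -> str:
    """The generated text starts with the prompt; keep only the continuation."""
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    return generated.strip()


def make_chat_fn(generate: Callable[[str], str]) -> Callable[[str], str]:
    """Turn any prompt -> text generator into a question -> answer function."""
    def chat(question: str) -> str:
        prompt = build_prompt(question)
        return extract_answer(generate(prompt), prompt)
    return chat


def load_gpt2_generator(max_length: int = 200) -> Callable[[str], str]:
    """Load the pre-trained GPT-2 model (assumes keras-nlp is installed)."""
    import keras_nlp  # lazy import: triggers a model download on first use

    model = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
    return lambda prompt: model.generate(prompt, max_length=max_length)


def launch_demo() -> None:
    """Serve the chatbot in a basic Gradio text-in, text-out interface."""
    import gradio as gr  # lazy import: assumes gradio is installed

    chat = make_chat_fn(load_gpt2_generator())
    gr.Interface(fn=chat, inputs="text", outputs="text").launch()
```

Calling `launch_demo()` from a notebook or script would start the interface. Because `make_chat_fn` accepts any generator function, a fine-tuned model could later be swapped in without touching the UI code.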
The dataset we will use comes from Kaggle and contains records about people affected by cancer: https://www.kaggle.com/datasets/falgunipatel19/biomedical-text-publication-classification
The code and the Gradio application will be submitted to a Hugging Face repository, which can be found here:
- (October 20th, 2023) - Livrable 1/
- Team composition, choice of managers, creating a Git repository accessible by team members and teachers
- (November 17th, 2023) - Livrable 2/
- Understand the GPT-2 model and ask the pre-trained model some questions
- Analyze the legal conditions for using the initial data
- (December 18th, 2023) - Livrable 3/
- Data analysis with a word cloud and data retrieval
- (January 19th, 2024) - Livrable 4/
- Gradio prototype of the application and fine-tuning of the pre-trained model
- (February 16th, 2024) - Livrable 5/
- Report on the comparison between the pre-trained model and the fine-tuned model (first fine-tuning)
- (March 4th, 2024) - Livrable 6/
- Optimize the performance of the fine-tuning based on the results of Livrable 5
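The word-cloud analysis planned for Livrable 3 relies on word frequencies. The sketch below shows one possible way to compute them with the standard library only. The function name `word_frequencies`, the tiny stop-word list, and the sample sentences are illustrative assumptions, not the project's actual code or data.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real analysis would use a fuller one).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "with", "on"}


def word_frequencies(texts, top_n=10):
    """Count the most frequent words across texts, ignoring stop words and very short words."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep alphabetic tokens only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top_n)


# Made-up sample sentences, not taken from the Kaggle dataset.
sample = ["Lung cancer is a cancer of the lung.", "Colon cancer screening"]
print(word_frequencies(sample, top_n=2))  # [('cancer', 3), ('lung', 2)]
```

The resulting (word, count) pairs could then be passed to a word-cloud renderer, where more frequent words are drawn larger.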
- Understand the concepts of conversational chatbots and language models
- Skills in data preparation, model fine-tuning, and chatbot performance evaluation
- Skills in UI development for a great user experience
- Ability to document and present a complete chatbot project
- Create a personalized conversational chatbot, understand the challenges of consistent text generation, and develop practical skills in interactive chatbot development