As students use platforms like Khanmigo and MagicSchool.ai for tasks such as getting feedback on their essays, and "chat" with these tools to improve their learning, teachers will quickly find it difficult to keep up with the volume of chat data.
This project experiments with methods for aggregating that chat data into a form teachers can use to improve instruction.
For the first experiment we will focus on a tool that gives feedback on essays.
I used ChatGPT to generate the student essays and the follow-up chat data. The AI messages come from MagicSchool.ai's essay feedback tool.
I've used a simple Evaluation data model to analyse ONLY the AI messages. I know that this approach has drawbacks, but I want to keep it simple for now to see if it's remotely useful.
For each of the criteria below, we extract the following data:
- strengths
- weaknesses
- suggestions
Criteria:
- Introduction
  - Clarity of thesis statement
  - Engagement and relevance of opening statements
- Structure
  - Organization and clarity of paragraphs
  - Logical flow of ideas
- Argumentation
  - Strength and clarity of arguments
  - Use of critical reasoning
- Evidence
  - Relevance and quality of evidence
  - Use of citations and references
- Conclusion
  - Restatement of thesis
  - Summary of main points
  - Closing statements
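
The criteria and the three extracted fields above could be represented roughly as follows. This is only a minimal sketch and not the notebook's actual definition: the names `Criterion`, `CriterionFeedback`, the `feedback` field on `Evaluation`, and the helper `count_weaknesses` are all hypothetical, chosen for illustration.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Criterion(str, Enum):
    """The five top-level criteria listed above (hypothetical naming)."""

    INTRODUCTION = "Introduction"
    STRUCTURE = "Structure"
    ARGUMENTATION = "Argumentation"
    EVIDENCE = "Evidence"
    CONCLUSION = "Conclusion"


@dataclass
class CriterionFeedback:
    """What is extracted from an AI message for one criterion."""

    strengths: list[str] = field(default_factory=list)
    weaknesses: list[str] = field(default_factory=list)
    suggestions: list[str] = field(default_factory=list)


@dataclass
class Evaluation:
    """Assumed shape: one evaluation per analysed AI message."""

    feedback: dict[Criterion, CriterionFeedback] = field(default_factory=dict)


def count_weaknesses(evaluations: list[Evaluation]) -> Counter:
    """Count, per criterion, how many evaluations mention at least one weakness."""
    counts: Counter = Counter()
    for evaluation in evaluations:
        for criterion, fb in evaluation.feedback.items():
            if fb.weaknesses:
                counts[criterion] += 1
    return counts
```

One simple way to aggregate across a class would then be to call `count_weaknesses` on every student's evaluations and look at which criteria come up most often; whether that is a useful signal for teachers is exactly what the experiment is meant to find out.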
To install the dependencies, run:

    poetry install
The only experiment so far is in exp1.ipynb, which is self-explanatory.