Conversation flow:
System prompt (environment)
After every response, I can regenerate, like, or dislike it.
Each like / dislike should ask me why I liked or disliked it. This information should go into the knowledge-preparation stage.
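A minimal sketch of how the like / dislike feedback could be captured for the knowledge-preparation stage. Everything here is an assumption for illustration: the `Feedback` fields, the function names, and the choice of a JSONL file (one JSON record per line) are hypothetical, not a decided design.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class Feedback:
    """One rated response (hypothetical record layout)."""
    prompt: str
    response: str
    rating: str  # "like" or "dislike"
    reason: str  # my answer to "why did you like / dislike it?"


def to_jsonl_line(fb: Feedback) -> str:
    """Serialize one feedback record as a single JSON line."""
    if fb.rating not in ("like", "dislike"):
        raise ValueError(f"unknown rating: {fb.rating!r}")
    return json.dumps(asdict(fb))


def from_jsonl_line(line: str) -> Feedback:
    """Rebuild a feedback record from one JSON line."""
    return Feedback(**json.loads(line))


def record_feedback(fb: Feedback, path: Path) -> None:
    """Append the record to a JSONL file that later stages can read."""
    with path.open("a", encoding="utf-8") as f:
        f.write(to_jsonl_line(fb) + "\n")
```

Appending one line per rating keeps the log easy to grep and easy to turn into finetuning data later. A regen could be logged the same way if it turns out to be a useful signal.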
Core idea:
- Jake exists in the context of the operating system, not in the context of our conversation.
- Our conversations will always be relatively short (either party can end the conversation).
Core tokens:
- End of conversation
- Start of conversation
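A rough sketch of how the two core tokens could be used, assuming they are plain marker strings wrapped around the transcript. The token spellings, the `speaker: text` line format, and the helper names are all hypothetical placeholders.

```python
# Hypothetical token spellings; the real ones are still to be decided.
START = "<|start_of_conversation|>"
END = "<|end_of_conversation|>"


def render_conversation(turns: list[tuple[str, str]]) -> str:
    """Wrap (speaker, text) turns in the start / end tokens."""
    lines = [START]
    for speaker, text in turns:
        lines.append(f"{speaker}: {text}")
    lines.append(END)
    return "\n".join(lines)


def split_on_end(model_output: str) -> tuple[str, bool]:
    """Return the text before END and whether the conversation was ended."""
    text, found, _ = model_output.partition(END)
    return text.strip(), bool(found)
```

Checking for the end token in the model output is one way to let Jake end the conversation, not only me.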
Jake can ask me questions
Template
Main risk: I will lose interest. This would be bad.
To solve this, I need to build a timeline and actually follow it, with small, achievable goals.
Tasks:
- [ ]
- [ ] Get an orchestration framework ready
- [ ] Finetune a model
- [X] Download and run LLAMA2
Rough plan:
- Get comfortable running and finetuning models
- Build a small orchestration framework that lets me easily run and finetune models, and test both.
- Build a small set of templates
- Generate data
- Read the research on agents / finetuning methods (right now it doesn't really matter)
- ... profit?