The nfcapstone-chatgpt-promptoptimizer from hendrik2319

Variables

As a User
I need variables in prompts
to be able to test the prompts with different inputs.

Description

The expected use case for such ChatGPT prompts is to get answers / texts for specific input texts. These input texts have to be inserted into the prompt (template), and the final prompt is then sent to ChatGPT to get the answer. To define the insert position, a variable or placeholder in the prompt is needed. It would be useful to be able to define more than one variable. The default input type should be text / string, but it could also be a list of texts / strings.

Acceptance Criteria

Define variables
Define placeholders in prompt with "... {varname} ..."
Each placeholder can be used multiple times in a prompt
More than one variable can be defined for one prompt
Placeholders should be highlighted visually in display of prompt (e.g. by text color or background color)

Development Tasks

Definition of Variables
- define a name
- [ value and type of value (single text or list of texts) will be defined in each test case ]
Storage of definitions in DB
- each ~~Series of Tests~~ TestRun has its own Variables definition
Display of placeholders in prompt
- highlighting of placeholders with colors (text and/or background, fixed predefined colors)
  -> displaying of prompt (text with <span>s) is different to editing prompt (<textarea>)
- visual: editing component replaces display component and vice versa
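
The placeholder handling described above could be sketched as follows. This is a minimal sketch, not the project's implementation: the `Segment` type and the functions `splitPrompt` and `fillPrompt` are hypothetical names, and it assumes that a variable name consists only of letters, digits and underscores.

```typescript
// Hypothetical segment type: a prompt is split into plain text and placeholders.
type Segment =
  | { kind: "text"; value: string }
  | { kind: "placeholder"; name: string };

// Assumption: a variable name consists of letters, digits and underscores.
const PLACEHOLDER_SOURCE = "\\{(\\w+)\\}";

// Split a prompt template so that a display component can wrap each
// placeholder segment in a highlighted <span>.
function splitPrompt(template: string): Segment[] {
  const pattern = new RegExp(PLACEHOLDER_SOURCE, "g");
  const segments: Segment[] = [];
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(template)) !== null) {
    if (match.index > last) {
      segments.push({ kind: "text", value: template.slice(last, match.index) });
    }
    segments.push({ kind: "placeholder", name: match[1] });
    last = match.index + match[0].length;
  }
  if (last < template.length) {
    segments.push({ kind: "text", value: template.slice(last) });
  }
  return segments;
}

// Replace every occurrence of each placeholder with its value.
// Placeholders without a value are left untouched.
function fillPrompt(template: string, values: Record<string, string>): string {
  const pattern = new RegExp(PLACEHOLDER_SOURCE, "g");
  return template.replace(pattern, (whole: string, name: string) =>
    Object.prototype.hasOwnProperty.call(values, name) ? values[name] : whole
  );
}
```

Because the same pattern drives both functions, a placeholder that is highlighted in the display is exactly one that gets replaced before the prompt is sent.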

Datastructures:

Additional Things

Cost Control

As a User
I need a kind of cost control
to be able to limit the charges caused by using the API.

Description

Each call to the API of OpenAI causes charges to the owner of the used organization key.
To limit these costs while using the app, some sort of cost control and documentation of these costs is required.
These charges are based on the number of language tokens the AI has processed. The response to each call reports the number of tokens used for processing the prompt, for the answer (completion) and in total. These tokens should be documented for each call and aggregated up to the Test Scenarios and overall for each user or the entire app.

Acceptance Criteria

document tokens
* for each call
aggregate the tokens
* for a test run
* for a whole scenario
* for a user
* for the whole app (visible for Admins)

Development Tasks

extend response of ChatGptService by token amounts (Usage)
store these values in DB together with the API answer
implement aggregations
display values and aggregations in frontend
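
The aggregation task could look roughly like this. It is a sketch under assumptions: the `Usage` interface and `sumUsage` are hypothetical names, and the three fields mirror the prompt / completion / total split described above.

```typescript
// Hypothetical shape of the token counts stored together with each API answer.
interface Usage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

// Sum up a list of usages. An empty list yields all zeros.
function sumUsage(usages: Usage[]): Usage {
  return usages.reduce(
    (acc, u) => ({
      promptTokens: acc.promptTokens + u.promptTokens,
      completionTokens: acc.completionTokens + u.completionTokens,
      totalTokens: acc.totalTokens + u.totalTokens,
    }),
    { promptTokens: 0, completionTokens: 0, totalTokens: 0 }
  );
}
```

Since the result has the same shape as its inputs, the levels can be stacked: the sums of all test runs give the scenario total, the sums of all scenarios give the user total, and so on.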

Create First Minimal Solution

As a User
I need a minimal solution of the project
to be able to get a first impression of the tool.

Description

Build a minimal solution with access to the ChatGPT API. A simple frontend with one input field for the prompt, a send button and an output field for ChatGPT's answer is sufficient.

Acceptance Criteria

access to ChatGPT API
frontend with input, output and send button
all tests

Development Tasks

Answer Rating

As a User
I need a sort of rating and evaluation of the API answer
to be able to see easily, if the answer meets my expectations or not.

Description

The app should do some simple rating or evaluation of the API's answers. For example, it could check whether an answer consists of more than one word, or similar things. I should be able to switch these checks on and off.
If an answer passes all active checks, it should be displayed in green (text, background or an icon). If not, a red mark should be displayed.

Acceptance Criteria

define some checks (number of words)
display for each answer if all checks are passed or not (green / red)

Development Tasks

implement the "number of words" check
extend the display of an answer with this coloring or add an appropriate icon
add a selection panel for checks, where user can choose, which checks are active for test run
add a summary for a test run showing how many answers passed all checks
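
A possible shape for the checks, as a minimal sketch: the `Check` interface, the id `"min-words"` and all function names are hypothetical, and words are assumed to be separated by whitespace.

```typescript
// Hypothetical description of a single check the user can switch on or off.
interface Check {
  id: string;
  label: string;
  passes: (answer: string) => boolean;
}

// Count whitespace-separated words; an empty or blank answer has zero words.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// The "number of words" check, here with the threshold "more than one word".
const moreThanOneWord: Check = {
  id: "min-words",
  label: "Answer has more than one word",
  passes: (answer) => countWords(answer) > 1,
};

// An answer is green if it passes every check that is currently active.
function passesAllActive(answer: string, checks: Check[], activeIds: Set<string>): boolean {
  return checks
    .filter((check) => activeIds.has(check.id))
    .every((check) => check.passes(answer));
}

// Summary value for a test run: how many answers passed all active checks.
function countPassed(answers: string[], checks: Check[], activeIds: Set<string>): number {
  return answers.filter((answer) => passesAllActive(answer, checks, activeIds)).length;
}
```

With no active checks every answer counts as passed, so the selection panel may want to make that state visible to the user.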

Scenario & Test Log

As a User
I need logging of tests
to be able to see the evolution of a prompt and document the progress.

Description

Each completed test run should produce a log entry with all data of the test run. This data should contain everything needed for the test run, so that the user can change all settings (variable definitions, test cases, ...) at any time and these changes are documented for each test run. Each log entry should contain the date and time of the run.
If there are some benchmark values for a test run, a graph of these values should be displayable.

All these log entries together build a Test Scenario.
A Test Scenario should have a name and an author / owner / assigned user.

Acceptance Criteria

input form to create a Test Scenario
a view to list all test runs
a view to configure the next test run

Development Tasks

Datastructures:

Feature Planning - Testing Procedure

As a User
I need this app
to be able to optimize my prompts

Description

The basic idea of this task is to clarify the basic testing procedure and to determine the features needed to support it.

Testing Procedure

At first I want to test a prompt multiple times (Series of Tests), that means:

I test it --> see that changes in prompt are needed
change the prompt
test it again
....
All these tests will be stored for documentation.

The second feature is the ability to define Variables. Each Variable can be a single string or a list of strings. Variables have placeholders in the prompt to define the position, where to insert the value of the Variable. Multiple Variables can be defined and each Variable can have multiple placeholders in the prompt.

The third feature is the definition of Test Cases. Each Test Case is a set of values of Variables, one value for each Variable. As said before, a value can be a single string or a list of strings. If a Test Case contains a list of strings for a Variable, multiple answers will be generated for this Test Case.
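
How a Test Case with list values turns into multiple prompts is not specified here. One plausible reading, sketched below, is that every combination of list entries produces one prompt (a Cartesian product). That interpretation, as well as the names `TestCase` and `expandTestCase`, are assumptions.

```typescript
// A value is either a single string or a list of strings.
type VariableValue = string | string[];
type TestCase = Record<string, VariableValue>;

// Assumption: every combination of list entries yields one set of values,
// and therefore one prompt and one answer.
function expandTestCase(testCase: TestCase): Record<string, string>[] {
  let combinations: Record<string, string>[] = [{}];
  for (const name of Object.keys(testCase)) {
    const value = testCase[name];
    const options = Array.isArray(value) ? value : [value];
    const next: Record<string, string>[] = [];
    for (const combination of combinations) {
      for (const option of options) {
        next.push({ ...combination, [name]: option });
      }
    }
    combinations = next;
  }
  return combinations;
}
```

Under this reading, a Test Case with one list of two strings and one single string yields two value sets. An empty list yields none, so it would have to be rejected or treated as a missing value elsewhere.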

These Series of Tests, Variable definitions and list of Test Cases build a Test Scenario, that will be stored as a document in a database.

To ensure that no unauthorized access to the ChatGPT API is possible, a User Management is needed too. Otherwise, unrestricted access to the app, and thus to the API, would cause an uncontrolled rise in API usage costs. The User Management should be based on widely used accounts like GitHub or Google. Each user should log in with his/her account and has to be granted access by an admin. So three roles are needed: Admin, User and Unknown Account.

To keep costs in perspective, the tokens used by each API call should be documented. These tokens should be documented for each call and aggregated up to the Test Scenarios and overall for each user or the entire app.

We need optimization criteria for prompts and also some kind of evaluation of API responses.

Features

User Management

As an Administrative Person
I need a user management
to be able to restrict the access to the app.

Description

The user management should be based on GitHub accounts. There should be Admins and Users. A newly signed-up account is initially an Unknown Account, which has to be raised to a User by an Admin. Only Users and Admins can have access to the app.

Acceptance Criteria

Login with GitHub account
Admin interface (/admin)
Role management: Admin, User, Unknown Account
Initial Admin defined via environment variable
only Users and Admins can use the app
Admins have to raise an Unknown Account to a User
Users can see their own data
Admins are additionally able to see all data
Admins can set up Users from scratch (maybe with use of GitHub API)
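
The role rules listed above could be expressed as a few small functions. This is a sketch only: the role identifiers, the `Account` interface and the idea of passing the initial Admin's account id as a parameter (read from an environment variable elsewhere) are assumptions.

```typescript
// The three roles from the acceptance criteria (identifiers are hypothetical).
type Role = "ADMIN" | "USER" | "UNKNOWN_ACCOUNT";

interface Account {
  id: string;
  role: Role;
}

// A newly signed-up account is an Unknown Account, unless it is the initial
// Admin. initialAdminId is assumed to come from an environment variable.
function initialRole(accountId: string, initialAdminId: string | undefined): Role {
  return initialAdminId !== undefined && accountId === initialAdminId
    ? "ADMIN"
    : "UNKNOWN_ACCOUNT";
}

// Only Users and Admins can use the app.
function canUseApp(account: Account): boolean {
  return account.role === "ADMIN" || account.role === "USER";
}

// Users see their own data; Admins additionally see all data.
function canSeeDataOf(viewer: Account, ownerId: string): boolean {
  if (viewer.role === "ADMIN") return true;
  return viewer.role === "USER" && viewer.id === ownerId;
}

// Only an Admin can raise an Unknown Account to a User.
function raiseToUser(admin: Account, target: Account): Account {
  if (admin.role !== "ADMIN" || target.role !== "UNKNOWN_ACCOUNT") return target;
  return { ...target, role: "USER" };
}
```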

Development Tasks

Test Cases

As a User
I need Test Cases
to be able to define the values of variables for a single test run.

Description

In a single test run, a prompt should be tested with multiple values for each variable to obtain a wide range of possible responses from ChatGPT and to determine whether a prompt is stable or not. A stable prompt is one that provides usable responses for all expected inputs.
These values for the variables are summarized in test cases. Each variable can be set with a single string or a list of strings.

Acceptance Criteria

Each Test Case expects an input for all defined variables
Input of a single string or a list of string should be possible
multiple test cases should be definable

Development Tasks

an input form component for the value of a variable
- a compact variant with the list of defined values and a single input field to add a new value
- a bigger variant to modify all values and an empty input field at end to add a new value
an input component for a Test Case
- a select for the variables
- an input component for the selected variable
- a display to show which variables need a value definition or that all variables have values
a compact component to show the content of a Test Case to be used in the list of Test Cases
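
The display that shows which variables still need a value could be backed by a helper like the one below. It is a sketch with hypothetical names (`missingVariables`, `TestCase`). It assumes that an absent entry or an empty list counts as missing, while an empty string counts as a deliberate value.

```typescript
// A value is either a single string or a list of strings.
type VariableValue = string | string[];
// A Test Case maps variable names to values; entries may still be missing.
type TestCase = Record<string, VariableValue | undefined>;

// Return the names of all defined variables that have no value yet.
function missingVariables(definedNames: string[], testCase: TestCase): string[] {
  return definedNames.filter((name) => {
    const value = testCase[name];
    if (value === undefined) return true;
    // Assumption: an empty list counts as "no value defined".
    return Array.isArray(value) && value.length === 0;
  });
}
```

An empty result means that all variables have values and the Test Case is complete.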

Datastructures:

Prompt Optimization Criteria

As a User
I need some optimization criteria
to be able to evolve a prompt.

Description

The main purpose of the app is to optimize a ChatGPT prompt. So it's essential to have some optimization criteria for each prompt. A criterion could be a numeric value, a rating or a boolean.

An obvious criterion would be the charges per API call.
Each call to the API of OpenAI causes charges to the owner of the used organization key. These charges are based on the number of language tokens the AI has processed. The response to each call reports the number of tokens used for processing the prompt, for the answer (completion) and in total.
An average value for charges per API call can be computed for each test run. This value can be stored for the corresponding prompt. A chart of this value over all prompt versions in a Test Scenario could be useful.

If more criteria are found in the future, they can be added here.

Acceptance Criteria

Compute average tokens per call for a whole test run
Display the value in a summary
Display a chart

Development Tasks

aggregate "tokens per call" over all test results in a test run
create summary
chart
* add ~~vis.js~~ react-chartjs-2 to project
* add chart to view of a test scenario
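
The "tokens per call" aggregation and the chart input could be prepared as sketched below. The `RunSummary` interface and both function names are hypothetical. The returned object only follows the `labels` / `datasets` layout that Chart.js uses for its `data` option. Wiring it into a react-chartjs-2 component is not shown.

```typescript
// Hypothetical summary of one test run: a label (e.g. the prompt version)
// and the total token count of every API call in that run.
interface RunSummary {
  label: string;
  totalTokensPerCall: number[];
}

// Average tokens per call; null if the run contains no calls.
function averageTokensPerCall(totals: number[]): number | null {
  if (totals.length === 0) return null;
  const sum = totals.reduce((acc, tokens) => acc + tokens, 0);
  return sum / totals.length;
}

// Build a data object with one point per test run, in run order.
function toChartData(runs: RunSummary[]) {
  return {
    labels: runs.map((run) => run.label),
    datasets: [
      {
        label: "Average tokens per call",
        data: runs.map((run) => averageTokensPerCall(run.totalTokensPerCall)),
      },
    ],
  };
}
```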

API State Display

As a User
I need an API State Display
to be able to see if the API is disabled or not.

Description

Next to the page headline I need a small display that shows whether the API is enabled or disabled.

Acceptance Criteria

a simple indicator that shows "API: Enabled" or "API: Disabled"
green for enabled and red for disabled

Development Tasks

a small component
* with its own update call to backend
* with some text
a new endpoint in controller: /api/apistate
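
The logic behind the indicator can be kept separate from the component and the backend call. In the sketch below, the response shape `{ "enabled": true }` for /api/apistate is an assumption, as are the names `parseApiState` and `apiStateIndicator`.

```typescript
// What the small component needs to render.
interface ApiStateIndicator {
  text: string;
  color: "green" | "red";
}

// Assumption: the endpoint answers with JSON like { "enabled": true }.
// Anything else is treated as "disabled" to stay on the safe side.
function parseApiState(body: unknown): boolean {
  return (
    typeof body === "object" &&
    body !== null &&
    (body as { enabled?: unknown }).enabled === true
  );
}

// Map the state to the text and color from the acceptance criteria.
function apiStateIndicator(enabled: boolean): ApiStateIndicator {
  return enabled
    ? { text: "API: Enabled", color: "green" }
    : { text: "API: Disabled", color: "red" };
}
```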

Dark Mode Switch

As a User
I need a dark mode switch
to be able to adapt the look of the app to my browser or system settings

Description

This component should switch between Dark, Light and System mode.

Acceptance Criteria

component determines the current state
it stores the value across sessions
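
The two criteria could be covered by a few helpers like these. This is a sketch: the storage key `"themeMode"`, the `KeyValueStore` interface and the function names are hypothetical. In a browser, `window.localStorage` satisfies `KeyValueStore`, and the system preference can be read with `window.matchMedia("(prefers-color-scheme: dark)").matches`.

```typescript
type ThemeMode = "dark" | "light" | "system";

// Minimal storage abstraction; window.localStorage has these two methods.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key.
const THEME_STORAGE_KEY = "themeMode";

// Read the stored mode; fall back to "system" for missing or invalid values.
function loadMode(store: KeyValueStore): ThemeMode {
  const raw = store.getItem(THEME_STORAGE_KEY);
  if (raw === "dark") return "dark";
  if (raw === "light") return "light";
  return "system";
}

// Store the mode so that it survives the end of the session.
function saveMode(store: KeyValueStore, mode: ThemeMode): void {
  store.setItem(THEME_STORAGE_KEY, mode);
}

// Decide which theme to show for the selected mode.
function effectiveTheme(mode: ThemeMode, systemPrefersDark: boolean): "dark" | "light" {
  if (mode === "system") return systemPrefersDark ? "dark" : "light";
  return mode;
}
```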
