instructlab / ui

Place to hack on UI for InstructLab

License: Apache License 2.0

Dockerfile 0.25% TypeScript 91.68% JavaScript 0.45% CSS 1.69% Makefile 2.59% PostScript 0.16% Go 3.18%

ui's Introduction

InstructLab UI

The project aims to provide a UI-based interface for contributors and reviewers to submit and review contributions to instructlab/taxonomy. The intention is to simplify the contribution process with a user-friendly interface that doesn't require a deep understanding of the tools needed to contribute skills and knowledge to the taxonomy. The project also aims to give reviewers a platform to efficiently review contributions and provide feedback to contributors.

Overview

The current scope of the project covers the following personas:

Taxonomy Contributor: A person who wants to contribute a skill or knowledge to the taxonomy.
Taxonomy Triager: A person who has the expertise to review contributions and provide feedback to contributors.

The technical overview and developer docs for getting started can be found here.

Contributing

If you have suggestions for how instructlab/ui could be improved, or want to report a bug, open an issue! We'd love all and any contributions.

For more, check out the InstructLab UI Contribution Guide and InstructLab Community Guide.

Community Meeting

We have a weekly community meeting to discuss the project and contributions. The meeting takes place every Wednesday at 10 AM PST. Please subscribe to the InstructLab Community Calendar following the instructions here. The UI project meeting details are in the calendar event.

Slack channel

Please subscribe to the InstructLab Slack workspace by following the instructions here. Once you are part of the workspace, you can join the #ui channel where most of the project related topics are discussed.

License

Apache 2.0

ui's People

Contributors

Stargazers

Watchers

Forkers

vishnoianil nerdalert mingxzhao toraponibm boosey aevo98765 gregory-pereira j3din00b nouveau muhammadkurdi-cs memalhot dominikkawka artreimus

ui's Issues

The chat 'Model Selector' defaults back to Granite-7b every time the page is refreshed

For users that are not using the Granite-7b model for chat inference, every time users visit another page and come back to http://localhost:3000/playground/chat or just refresh the page, the model selector defaults back to Granite-7b. This then causes the chat inference requests to fail because the incorrect model name is passed to the chat model server.

Possible solutions:

Let the user mark a model as their default option.
Keep the selected model across page reloads, possibly by saving it in the browser's local storage.
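
A minimal sketch of the local storage idea, assuming the selector can read and write through a small key-value interface that the browser's localStorage happens to satisfy. The storage key, the KeyValueStore type, and both function names are hypothetical and not taken from the UI code.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key, not taken from the UI code base.
const SELECTED_MODEL_KEY = "chat.selectedModel";

// Remember the model the user picked so a refresh can restore it.
function saveSelectedModel(store: KeyValueStore, modelName: string): void {
  store.setItem(SELECTED_MODEL_KEY, modelName);
}

// Restore the saved model if it is still offered; otherwise use the fallback.
function loadSelectedModel(
  store: KeyValueStore,
  available: string[],
  fallback: string
): string {
  const saved = store.getItem(SELECTED_MODEL_KEY);
  return saved !== null && available.indexOf(saved) !== -1 ? saved : fallback;
}
```

Checking the saved name against the list of available models avoids sending a stale model name to the chat server if the configured models change between visits.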

Broaden OAuth user pool beyond just ILab org members

Only allowing instructlab org members to OAuth won't scale.
There is a cost to org members along with it being too restrictive. E.g. a workshop shouldn't have to add everyone into the org.
The obvious answer would be to open it to anyone. This would require accelerating the plan for a daily limit on chats. If that is too much to get done before v1 we could limit chat based on ilab org membership.

Add the ability for the user to specify the taxonomy tree path for their submission

We got confirmation in the taxonomy standup that it is on the user to define the taxonomy tree file path in their submission.

Write high level triager workflow

Write a document that contains

define the triager persona
current triaging workflow
relevant challenges with the current triager workflow
possible solutions to address these challenges

If a Skill context is empty, do not include it in the submission

Only add a context if it is not null.
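
A small sketch of that rule, assuming a seed example is a plain object with question, answer, and an optional context (the field names follow the YAML shown elsewhere in these issues). The SeedExample type and the buildSeedExample helper are hypothetical, and treating a whitespace-only context as empty is an assumption beyond the null check the issue asks for.

```typescript
// Shape of one seed example in a skill submission (assumed from the YAML in these issues).
interface SeedExample {
  question: string;
  answer: string;
  context?: string;
}

// Build a seed example, leaving the context key out entirely when it is
// null, undefined, or blank, so the serialized YAML has no empty context field.
function buildSeedExample(
  question: string,
  answer: string,
  context?: string | null
): SeedExample {
  const example: SeedExample = { question, answer };
  if (context != null && context.trim() !== "") {
    example.context = context;
  }
  return example;
}
```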

Refresh a PR after the PR is edited

I think there is a state where, after a PR is edited and submitted a couple of times without refreshing the page, the GitHub API eventually sends back error code 422. I'm guessing it's a state thing. If you go back to the dashboard and edit again it won't happen, but it's worth the time to fix even if it's a fairly unlikely workflow.

Adding dev chat server running instructions

It would be really useful as a dev contributor if you could quickly spin up a model inference server.

Two approaches could be taken here:

ilab CLI setup and ilab serve instructions - link to existing documentation here.
A Docker-contained service that exposes ports 8000 and 8001, in case users already have another server running on 8000.

Acceptance Criteria:

Update https://github.com/instructlab/ui/blob/main/docs/development.md
Add a Dockerfile and config.yaml example.
Optional - push an instructlab serve container to Docker Hub for easy running.

Bug: Fix Questions field to format long lines into multiline YAML

Example, see line 11:

https://github.com/brents-pet-robot/taxonomy-sub-testing/pull/98/files

CSS issue with build vs dev in chat

Similar to the button CSS issue, the button under npm run build looks smallish compared to npm run dev. Here is how it should look:

Setup prod and qa deployment for community UI

Nail down details around

Explore the available options to host the community UI
Write a proposal to share with the oversight committee for approval
Setup the prod and qa deployments

depends-on : #9

Editing submission Q&As adds quotes around the edited fields

For example:

  - question: h2424
    context: h244h
    answer: h24222
  - question: fds
    context: '123456789'
    answer: '1234567890'
  - question: '12345'
    context: ''
    answer: '6789'

Modifying yaml.dump in src/app/edit-submission/skill/[id]/page.tsx should resolve it.

Issue Running InstructLab UI - Need Help with .env Variables

Hi team,

Thank you for developing the InstructLab UI! I'm excited to start using it, but I'm currently having trouble getting it up and running. The main issue seems to be understanding and setting the .env variables.

Here's what I have in my .env file:

IL_UI_ADMIN_USERNAME=admin
IL_UI_ADMIN_PASSWORD=password
OAUTH_GITHUB_ID=<OAUTH_APP_ID>
OAUTH_GITHUB_SECRET=<OAUTH_APP_SECRET>
NEXTAUTH_SECRET=your_super_secret_random_string
NEXTAUTH_URL=http://localhost:3000
IL_GRANITE_API=<GRANITE_HOST>
IL_GRANITE_MODEL_NAME=<GRANITE_MODEL_NAME>
IL_MERLINITE_API=<MERLINITE_HOST>
IL_MERLINITE_MODEL_NAME=<MERLINITE_MODEL_NAME>
GITHUB_TOKEN=<TOKEN FOR OAUTH INSTRUCTLAB MEMBER LOOKUP>
TAXONOMY_DOCUMENTS_REPO=github.com/<USER_ID>/<REPO_NAME>
NEXT_PUBLIC_TAXONOMY_REPO_OWNER=<GITHUB_ACCOUNT>
NEXT_PUBLIC_TAXONOMY_REPO=<REPO_NAME>

Could you please provide a brief explanation of what each variable does? Some of them seem self-explanatory (like OAUTH_GITHUB_ID/OAUTH_GITHUB_SECRET), but others are less clear.
Where can I obtain the values for each variable? For example, I'm not sure where to get the IL_GRANITE_API key or how to generate the NEXTAUTH_SECRET.

Any guidance would be greatly appreciated!

Thanks,
Jan

Show a warning pop-up when the user logs in

Show a warning pop-up when the user logs in to the UI. The warning text should explain how skill and knowledge contributions are made on their behalf and what changes the UI may make to the user's GitHub account, such as cloning the taxonomy into the user's own GitHub account or cloning the doc repo into the user's GitHub account to publish a document publicly.

Integrate storage component to the UI stack

Determine a strategy for taxonomy tree placement of submissions

Work with triagers to determine where to place UI submissions.
I think there is consensus that expecting the user to place the submission in a proper directory is generally not ideal. There has been mention in the taxonomy standup that placing the submission could be a triager function.
We can't do a flat directory since attribution.txt is a fixed filename and not referenced in qna.yml.

Add Dependabot to the repo

Add Dependabot to the repo for merging. Most upstream repos already have the bot set up and can be used as a reference for how to do it.

Enhancement: Decide key technology choices to use for UI

The current code base uses Next.js + TypeScript. Is this enough for the features that we would like to deliver? Or do we need to explore other UI technologies that are required, or that we would prefer to use, for building a better UI?

It's good to nail this down, so we have some guidance on what we should use or what we should avoid.

When submitting a knowledge submission, make a clear error if the knowledge file hasn't been submitted yet

Right now you just get a notification that not all fields have been filled out. It's not super clear that a user needs to click the Submit Files button. I'm reluctant to do it automatically since a user may go back and attach a new file.

Trim trailing yaml whitespaces

trailing spaces (trailing-spaces) https://github.com/instructlab-public/test-taxonomy/actions/runs/9897923104/job/27343539058?pr=12
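
One way to satisfy the trailing-spaces lint rule is to clean the generated YAML text just before it is written. The sketch below assumes the YAML is available as a single string with LF line endings; the function name is hypothetical. It also strips trailing whitespace inside block scalars, which is assumed to be acceptable here.

```typescript
// Strip trailing spaces and tabs from every line of a YAML document.
// Leading indentation is left untouched because it is significant in YAML.
function trimTrailingWhitespace(yamlText: string): string {
  return yamlText
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, ""))
    .join("\n");
}
```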

[FEAT] PDF to Markdown convert

It would be nice to have a feature to enable user file uploads and conversion of PDF to markdown. This would enable us to iterate on the UX of submitting your own Taxonomy, as well as it could be leveraged to improve the Triager experience creating and auditing QNA submissions.

More info: https://instruct-lab.slack.com/archives/C076C9RNKSA/p1718381515763659

/cc @nerdalert @mingxzhao @vishnoianil

Add the ability to amend a user submission

The submission process has most of the kinks worked out. The back half of the process that needs addressing is updating the PR based on triager feedback.

Update project Readme with meeting/slack/maintainers details.

Pull request creation fails if the taxonomy repo fork doesn't exist in the user's account

It seems that if the taxonomy repo fork doesn't exist in the user's account, the UI tries to create the repo, but then fails while fetching the base SHA with the following error message:

Failed to create pull request: Error: Failed to get base branch SHA
    at O (/app/.next/server/app/api/pr/skill/route.js:8:1259)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async w (/app/.next/server/app/api/pr/skill/route.js:6:5)
    at async /app/node_modules/next/dist/compiled/next-server/app-route.runtime.prod.js:6:36258

I can see, however, that the repo is cloned successfully. Once the repo is cloned, retrying the skill/knowledge submission works.

cc: @nerdalert

Communicate to the user when a chat request is made with no prompt

At present, the handleSubmit function in the chat page.tsx handles an empty prompt by simply returning early. An improvement would be to tell the user what went wrong with an Alert.

Acceptance Criteria:

When a user submits a chat request with no prompt communicate this via an alert.
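
A minimal sketch of the check, kept separate from any UI framework so it can be unit tested. The function name and the message text are hypothetical. The submit handler could call it first and, when a message comes back, show it in an alert instead of returning silently.

```typescript
// Return a user-facing message when the prompt is unusable, or null when it is fine.
function validatePrompt(prompt: string): string | null {
  if (prompt.trim() === "") {
    return "Please enter a prompt before sending a chat request.";
  }
  return null;
}
```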

Amend skill doesn't populate the yaml properly

Amending an existing skill PR doesn't generate the qna.yaml in the same way (although the YAML seems okay). Please see this example PR.

https://github.com/instructlab-public/test-taxonomy/pull/9/files

I amended the first set of question/answer/context and it updated the PR okay, but removed the quotes around string values.

https://github.com/instructlab-public/test-taxonomy/pull/9/files#diff-9a5614e04ed4f578174cb55c837d49280a2240e2a22b3d59853e0d4b83e61ea0R5

Add some basic Playwright UI tests

Feel free to ping @nerdalert for how to start with this. I will do it if no one picks it up.

Acting as a domain expert to add knowledge

I have domain expertise in Cell Biology (https://scholar.google.com/citations?user=ZvIBifMAAAAJ&hl=en). I will emulate a research scientist and anticipate a typical domain expert user interaction with the knowledge addition part of the UI. Issues will be raised if any improvements can be recommended.

Discuss first release of the UI project

We need to address a few major questions to drive the community focus.

Which features do we want to include in our first release?
What is the approximate timeline for the first release?
Where should we host the first release?
Do we want to enable any cost control mechanism?

If you have more questions that we should discuss in this context, please feel free to add them to the list.

Image publishing github workflow is broken

16.74 
16.74 https://nextjs.org/docs/messages/module-not-found
16.74 
16.74 ./src/components/Contribute/Skill/index.tsx
16.74 Module not found: Can't resolve 'js-yaml'
16.74 
16.74 https://nextjs.org/docs/messages/module-not-found
16.74 
16.75 
16.75 > Build failed because of webpack errors

List of recently failed jobs
https://github.com/instructlab/ui/actions/workflows/images.yml

Fix the default location of the skill and knowledge files

All skill contributions should put the relevant files under the "compositional_skills" directory tree, and similarly all knowledge contributions should put the relevant files under the "knowledge" directory tree. Otherwise, the labeler job won't tag these PRs with the appropriate tags. If the PRs are not tagged with skill or knowledge, the UI won't show the user's contributions on the dashboard.
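
A sketch of how the submission path could be forced under the right root, assuming the user supplies a relative directory and the file name is known. Only the two root directory names come from the issue; the type, constant, and function names are hypothetical.

```typescript
type ContributionKind = "skill" | "knowledge";

// Root directories named in the issue above.
const ROOT_DIR: Record<ContributionKind, string> = {
  skill: "compositional_skills",
  knowledge: "knowledge",
};

// Prefix a user-supplied relative directory with the root for its contribution kind.
function taxonomyFilePath(
  kind: ContributionKind,
  relativeDir: string,
  fileName: string
): string {
  const root = ROOT_DIR[kind];
  // Drop empty segments caused by leading, trailing, or doubled slashes.
  const segments = relativeDir.split("/").filter((s) => s !== "" && s !== ".");
  // Avoid doubling the root if the user already typed it.
  if (segments[0] === root) {
    segments.shift();
  }
  return [root, ...segments, fileName].join("/");
}
```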

Publish container image of ui to ghcr.io

We need the container image to deploy it in the kubernetes/docker-compose deployments.

ui started in prod mode doesn't pickup proper styling for download buttons

This is how the button looks when npm is started in the prod setting; it doesn't happen in dev mode.

Make changes to the 'Knowledge' section of the 'Knowledge Contribution Form'

Changes to be made:

Change the section name from 'Knowledge' to 'Question and Answers'.
Ensure/strongly advise a minimum of 5 Q and As are provided by the user.
Each qna.yaml file requires a minimum of five question-answer pairs. The qna.yaml format must include the following fields:
https://github.com/instructlab/taxonomy?tab=readme-ov-file#getting-started-with-knowledge-contributions
Encourage the users to add more Q and As to get better synthetically generated data.
Reduce the span of the + Add Question and Answer button.

Happy for feedback on this. Also I will split these tasks out into individual issues if this is thought to be better.
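
The minimum of five pairs could be checked with a small helper like the one below. This is a sketch under assumptions: the QuestionAnswer shape and the function name are hypothetical, and counting only pairs where both fields are filled is a design choice rather than something the issue specifies.

```typescript
interface QuestionAnswer {
  question: string;
  answer: string;
}

// Minimum stated in the issue above.
const MIN_QNA_PAIRS = 5;

// Count only pairs where both fields have content and
// return how many more complete pairs are still needed.
function missingPairs(pairs: QuestionAnswer[]): number {
  const complete = pairs.filter(
    (p) => p.question.trim() !== "" && p.answer.trim() !== ""
  ).length;
  return Math.max(0, MIN_QNA_PAIRS - complete);
}
```

A form could use the returned number either to block submission or only to show an advisory message, depending on whether the minimum ends up enforced or strongly advised.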

Create an Oauth App in the instructlab account to be used by the UI

Once we have the public address and/or ideally a DNS we have all we need for the OAuth callback address.

Write manifest files to deploy UI stack in kubernetes/openshift cluster

Add schema version to submissions

Add version: 2 to knowledge/skill submissions.

Write a document to capture the various personas that the UI workflow should support

PoC for uploading Knowledge Docs to a repo and extracting the SHA for a submission.

Attempt at automating the knowledge document submissions to simplify the UX, instead of expecting a user to upload docs to their own repo and find a SHA for referencing in the knowledge submission.

Write high level contributor workflow

Write a document that contains

the target contributor
current contributor workflow
relevant challenges with the current contributor workflow
possible solutions to address these challenges

Create a template repo in upstream for forking for knowledge store and reference

We need a template repo for knowledge document submissions in instructlab.
This will allow for users to upload docs by automating the fork and uploading, PR, SHA retrieval by the UI. It also satisfies the constraint to not store documents themselves in upstream for liability reasons.
This will probably require a dev-doc submission to get a repo cut in the UI directory there.
See workflow:

Add description for skill and knowledge

It would be beneficial to users to understand a breakdown of what a knowledge and skill contribution are before choosing which one to submit. I think this could go right below the "Knowledge Contribution Form" or "Skill Contribution Form" that way a new user knows which is the proper one to submit.

Improve the github login UI

Currently the login page has a small GitHub icon that takes you to GitHub authentication. This is not very intuitive; people generally end up trying their GitHub credentials in the username/password fields, and that obviously fails.

We need to either add text "Login with (github icon)" or something better to make it more intuitive.

Environment variables are not picked up in the client-side code

Deployment of the UI app in a container (in Kubernetes) mounts the .env file as a secret. All the environment variables defined in .env are exported as env variables in the container. Client-side code is built statically, so it inlines the env variable values at build time. But the container image is published through GitHub Actions with no .env file, so the client-side code is statically built with empty values, and that results in failure of some UI features.

We can locally build images with the relevant .env files for each deployment environment (qa, prod) and use those images for deployment, but that's too cumbersome and not good practice.

The other option is to ensure we always use the app router or getServerSideProps to get the environment variables, as mentioned in the Next.js documentation here.
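
A rough sketch of the server-side option, assuming a server handler (for example an app router route) calls a helper like this with process.env on each request and returns the result to the client as JSON. The two variable names come from the .env example earlier on this page; the PUBLIC_KEYS list, the PublicConfig type, and the readPublicConfig function are hypothetical.

```typescript
// Variable names taken from the .env example earlier on this page.
const PUBLIC_KEYS = [
  "NEXT_PUBLIC_TAXONOMY_REPO_OWNER",
  "NEXT_PUBLIC_TAXONOMY_REPO",
] as const;

type PublicConfig = Record<(typeof PUBLIC_KEYS)[number], string>;

// Read the values from the given environment when called, rather than
// relying on values that were inlined into the client bundle at build time.
function readPublicConfig(env: Record<string, string | undefined>): PublicConfig {
  const config = {} as PublicConfig;
  for (const key of PUBLIC_KEYS) {
    const value = env[key];
    if (value === undefined || value === "") {
      throw new Error(`Missing environment variable: ${key}`);
    }
    config[key] = value;
  }
  return config;
}
```

Failing loudly on a missing variable makes a misconfigured deployment visible at the first request instead of surfacing later as a broken UI feature.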

Adjust error message

When clicking an action ("Download YAML", "Download Attribution", or "Download Skill/Knowledge") while the form is incomplete, the error message names only the first problematic field detected, which can be slightly confusing.
For example, "Please make sure all the email fields are filled!" would be better as "The email field is empty. Please make sure all the fields are filled!"
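
A sketch of a helper that produces the suggested wording, assuming the form fields are available as an ordered list of label and value pairs. The function name and the input shape are hypothetical.

```typescript
// Return a message naming the first empty field, or null when every field is filled.
function firstEmptyFieldMessage(fields: Array<[string, string]>): string | null {
  for (const [label, value] of fields) {
    if (value.trim() === "") {
      return `The ${label} field is empty. Please make sure all the fields are filled!`;
    }
  }
  return null;
}
```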

DCO is not signed for submission edits

Edits do not have a DCO and since the commit amend is a new tree we need to add one back.

Support Dynamic taxonomies (QNA document generators)

This is a new feature request.

Status Quo

Right now, taxonomies are defined by putting answers in qna.yaml files. This is human work and at times requires domain knowledge AND a good eye to spot typos and other mistakes. For some tasks, it may be beneficial to rely on a program to produce QNA to then feed it into the model for synthetic generation.

Some examples of tasks that could benefit from a programmable approach to generating seed samples:

formulaic transliteration between alphabets (e.g. Cyrillic-to-Latin: instructlab/taxonomy#312)
different data format conversion (e.g. extracting data from calendar events: instructlab/taxonomy#232)

Proposal

In addition to qna.yaml files stored directly on disk, also allow taxonomies to be defined as programs that, when executed, produce a qna.yaml document that complies with the InstructLab taxonomy format.

My attempt at implementing both cli and taxonomy bits for this feature (currently closed but I am happy to revive and rebase):

(old)

cli: instructlab/instructlab#489
taxonomy: instructlab/taxonomy#312

(new drafts - rebased old code against current main)

cli: instructlab/instructlab#1005
taxonomy: instructlab/taxonomy#768

Considerations

The final qna.yaml may or may not be stored in the taxonomy repo. (I'd prefer to not store it since it's directly derived from the program.)
The exact format for the program. I suggest not assuming a particular execution environment or language but using Dockerfiles / Containerfiles, plus defining a basic operational interface (e.g. how input is fed into the container command entrypoint, and how the resulting QNA document is returned from the program; one suggestion could be passing both through a volume mount).
Since we are talking about executables from external contributors, security of the project should be considered: the programs should be validated through GitHub Actions, but only after maintainers have sanity-checked that the contribution is safe; Dockerfiles should use "official" / "proven" base images.

Fix the name of the yaml generated by "Download Yaml"

The "Download Yaml" feature downloads the YAML and names it "knowledge_qna.yaml" or "skill_qna.yaml". Users need to rename these files to qna.yaml to submit the PR to the taxonomy. It would be better to name the downloaded file qna.yaml, so the user doesn't need to rename it before pushing the PR. Sometimes users forget to rename it, and the PR checks fail.

Add dev instructions on how to run the Markdown linter locally

I made changes to an MD file and found that the CI pipeline failed because of the Markdown linter. I didn't know how to run the linter locally on my machine, which would make it easier to correct these linting issues.

Acceptance Criteria:

Add instructions on which tool to use and how to install the Markdown linter locally, so devs can run checks before pushing remotely.

Streamline Document Review Process

Automatically pull the commit information for the knowledge document from taxonomy knowledge submission YAMLs, and paste a direct link to the user-created markdown as a comment on the PR.

Investigate passing documents to deepsearch

We are currently discussing how to pass the documents to deep-search for conversion. The current thinking is to provide both options, sending a file as binary data and passing a URL, but we need to check with Peter and co whether this can work.
/cc @nerdalert
