coronasafe / ayushma

Empowering Nurses with Multilingual ICU Protocols. Leveraging the rapid advancements in AI technology, created multilingual interfaces that assist nurses in rapidly upgrading their knowledge about ICU protocols.

Home Page: https://ayushma-api.ohc.network

License: MIT License

Languages: Python 94.61%, Shell 2.62%, Dockerfile 1.92%, Makefile 0.49%, HTML 0.26%, Procfile 0.10%
Topics: ai, gpt-4, rag

ayushma's People

Contributors

aeswibon, ashesh3, bodhish, dependabot[bot], gigincg, gokulramghv, ishanextreme, khavinshankar, mathew-alex, pranshu1902, rithviknishad, sainak, siddnikh, skks1212, suprabathk, thedevildude, vigneshhari


ayushma's Issues

Allow Admins to set the GPT model for Project

Currently, the model defaults to GPT-4.

Allow admins to use different models on different Projects.

The options should include all available models.

  • If the backend is set to use Azure models, make sure the options change accordingly.

Allow API key overrides to /chat and /converse endpoints

As Ayushma will be open to all, it is not viable for everyone to be able to use our OpenAI API keys. We should only allow a set number of users to use Ayushma's OpenAI key.

So, add a boolean field called allow_key (or any other name you think fits better), defaulting to false, to the user model. Allow the /chat and /converse endpoints to accept an open_ai_key parameter in the request headers, then check whether open_ai_key is present in the headers.

  • If it is, use that key.
  • If the key is not present in the request headers and allow_key is True for the requesting user, use the default OPEN_AI_KEY.
  • If the key is not present in the headers and allow_key is False, respond with an error message asking for an open_ai_key.
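The decision rules above can be sketched as a small helper; the function name, arguments, and the PermissionError are illustrative stand-ins for whatever the actual view code would use:

```python
def resolve_openai_key(header_key, user_allows_default, default_key):
    """Decide which OpenAI key to use for a /chat or /converse request.

    A key supplied in the request headers always wins; otherwise the
    server's default key is used only when the user's allow_key flag
    is True. Names here are assumptions, not the project's real code.
    """
    if header_key:
        return header_key
    if user_allows_default:
        return default_key
    raise PermissionError("Please provide an open_ai_key header")
```

A real implementation would live in the DRF view and return a 4xx response rather than raising, but the branching is the same.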

Include citations in all Ayushma responses.

Include citations in all responses. These will come from the relevant Documents associated with the metadata in the vectors.
Return these citations as JSON in the /converse API.

NOTE

Do not start work until #64 is merged, or create a single PR for both issues if you prefer.

Add support for Azure OpenAI endpoint

Currently we are using the official OpenAI API base, api.openai.com. However, we will also be deploying this application with the Azure OpenAI Service, so we need the options below to be configurable:

- openai.api_type
- openai.api_base # For Azure, it will be in the format: xxxxxx.openai.azure.com
- openai.api_version

Being able to set these in the base settings file (via the .env file) and then using them across the project should be enough. Users need not have access to these settings for now.
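A minimal sketch of that settings wiring, assuming the three environment variable names mirror the openai options above; the default values are assumptions, not the project's actual defaults:

```python
import os


def openai_settings(env=None):
    """Build the OpenAI configuration from environment variables.

    `env` defaults to os.environ; passing a dict makes the function
    easy to test. For Azure, OPENAI_API_BASE would be set to
    https://xxxxxx.openai.azure.com and OPENAI_API_TYPE to "azure".
    """
    env = env if env is not None else os.environ
    return {
        "api_type": env.get("OPENAI_API_TYPE", "open_ai"),
        "api_base": env.get("OPENAI_API_BASE", "https://api.openai.com/v1"),
        "api_version": env.get("OPENAI_API_VERSION", ""),
    }
```

In core/settings/base.py the three values would simply be assigned to module-level settings the rest of the project imports.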

Query embeddings from /chats/<chat>/converse api to get cosine similarity results from pinecone Index.

Now that upserting and getting embeddings from prompts are in place, let us send the embeddings to our Pinecone index (specified in the PINECONE_INDEX env variable) to get cosine similarity results. Once we have those, send the data to GPT via langchain (I have little understanding of Langchain at the moment, so I encourage you to consult the docs for best practices), and then return the result as the response.
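Conceptually, the Pinecone query is a top-k cosine-similarity search over stored vectors. A stdlib-only stand-in (not the Pinecone client API) that illustrates what the index returns before the matches are forwarded to GPT:

```python
import math


def top_k_cosine(query_vec, index, k=3):
    """Return the k stored items most similar to query_vec.

    `index` is a list of (id, vector) pairs standing in for the
    Pinecone index; the result mimics the ranked match list.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

In production the real Pinecone client performs this search server-side; this sketch only shows the ranking semantics.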

Add Documentation for all Django Variables

We currently have documentation for some variables, but it does not cover variables such as the Django environment, allowed_hosts, and the database URL, which are important for setting up the environment in production.

Add these variables to the documentation

Require a namespace when creating a chat

Currently, we require a namespace value in the incoming data at the /converse endpoint. Remove this requirement and add it to the /chats POST endpoint (you can update the serializer). Update the Chat model to store the namespace value. Then, in the /converse endpoint, fetch the namespace from the Chat object.

Add APIs for /register, /forgot-password, /reset-password, /verify-forgot-token

Register should accept username, full name, email, password

Forgot password should accept email. If email is correct, send an OTP and reset link to the user. Reset link would be something like
https://ayushma.ohc.network/reset-password?otp=[otp]&user_id=[user_id]
Create a new model ForgotToken for this that will contain user, expiry (should be 10 minutes from request), OTP
Auto expire previous OTPs if a new one is created

Reset link will send a request to /verify-forgot-token containing OTP and user_id that will check if the otp is correct and not expired. Only process if both OTP and user_id are present and correct. Return the user's email, username and name if correct.

The /reset-password API will have two use cases.

  1. If an authorization header is present, expect only a password and update the password for that user.
  2. If an authorization header is not present, fall back to accepting OTP and user_id, perform the same validations as /verify-forgot-token, and update the password.
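The /verify-forgot-token check described above boils down to matching user, matching OTP, and a 10-minute expiry window. A sketch, using a dict to stand in for the ForgotToken model instance (the field names are assumptions from the issue text):

```python
from datetime import datetime, timedelta

OTP_TTL = timedelta(minutes=10)  # expiry is 10 minutes from request


def verify_forgot_token(token, otp, user_id, now=None):
    """Validate an OTP the way /verify-forgot-token would.

    Returns True only when the user matches, the OTP matches, and
    the token has not passed its 10-minute expiry.
    """
    now = now or datetime.utcnow()
    return (
        token["user_id"] == user_id
        and token["otp"] == otp
        and now <= token["created_at"] + OTP_TTL
    )
```

The real endpoint would additionally return the user's email, username, and name on success, and auto-expire older OTPs when a new one is issued.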

Get in touch with @mathew-alex regarding setting up emails

Upgrade sending references

A document will have many paragraphs.
Create embeddings for each line.
Pinecone will return the top 100 matching lines.
Each line carries metadata identifying the paragraph it belongs to.
Group the matches by paragraph_id.
Take the top 5 paragraphs and their neighbouring paragraphs.
Pass this context to GPT.
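The group-by step above can be sketched as a small pure function; the tuple shape and `select_context` name are illustrative, and fetching neighbouring paragraphs is left out:

```python
from collections import defaultdict


def select_context(matches, top_n=5):
    """Group line-level matches by paragraph and pick the best paragraphs.

    `matches` is a list of (line_text, paragraph_id, score) tuples,
    like the top-100 lines returned by Pinecone. Paragraphs are ranked
    by the best score among their matched lines.
    """
    best = defaultdict(float)
    for _text, para_id, score in matches:
        best[para_id] = max(best[para_id], score)
    ranked = sorted(best, key=best.get, reverse=True)
    return ranked[:top_n]
```

The returned paragraph ids would then be expanded to include adjacent paragraphs before being passed to GPT.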

Random Test Failures Due to OpenAI Rate Limit

During our routine tests, we've been noticing that at least one question fails in each test run, reducing the overall average. The issue seems to be occurring randomly and is not reproducible in local environments, leading to the suspicion that it may be related to issues on deployment or OpenAI's side.

Details:

From analysis of the celeryworker.log and the information available in the admin panel, it seems that the failures might be due to hitting the rate limit set by OpenAI. The error message reads: "Rate limit reached for 10KTPM-200RPM in organization org-xxxxx tokens per min. Limit: 10000 / min"

Currently, langchain already has a retry mechanism in place that retries a failed request at 2^x-second intervals when it hits a rate limit. However, this doesn't seem to be sufficient to handle the current situation.

Proposed Solution:

Considering we don't need test results urgently, we could increase the maximum wait time for retries, say up to 3 minutes. This should provide a larger buffer to avoid hitting the rate limit and might potentially resolve the issue.

Next Steps:

  1. Investigate whether the current langchain configuration allows modifying the maximum wait time for retries.
  2. If possible, adjust the maximum wait time to 3 minutes and monitor whether this reduces or eliminates the random test failures.
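The proposed change amounts to keeping the 2^x backoff but raising its cap to 3 minutes. A sketch of the wait schedule (langchain exposes this via retry parameters; the function here is only illustrative, not langchain's API):

```python
def retry_wait(attempt, base=2.0, max_wait=180.0):
    """Exponential backoff wait in seconds, capped at max_wait.

    attempt=1 waits 2s, attempt=2 waits 4s, ... until the cap of
    180s (3 minutes) proposed above is reached.
    """
    return min(base ** attempt, max_wait)
```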

Implement a testing framework

We need to automate our testing, which is currently happening through Google Sheets.

An admin can create a TestSuite, and TestQuestions under it. The TestSuite can be configured with different temperature and topk values.

A TestQuestion will contain a question that will be asked to Ayushma, and an answer that will have a human entered answer for the question. We need to test if Ayushma's response is similar to the answer and how similar.

Once the admin triggers a test run, a new TestRun instance will be created linked to a Project and the TestSuite. The suite will run async through celery.
The test will run each associated TestQuestion and create a TestResult containing the question and human_answer from the TestQuestion (do not link the models, because the questions can change), the answer returned by Ayushma, the time taken, cosine_sim and bleu_score.

Once all questions have been answered, we need to calculate the cosine similarity and BLEU score for each result on a scale of 0 to 1. After this, the test is over and the admin can see the results.
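As an illustration of the cosine-similarity score on a 0-to-1 scale, here is a bag-of-words version; a real run would score sentence embeddings rather than raw token counts, so treat this only as a minimal stand-in:

```python
import math
from collections import Counter


def cosine_sim(a, b):
    """Bag-of-words cosine similarity between two answers, in [0, 1].

    Tokenises on whitespace after lowercasing; identical answers score
    1.0 and answers with no shared words score 0.0.
    """
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(wa[t] * wb[t] for t in wa)
    norm = math.sqrt(sum(v * v for v in wa.values())) * math.sqrt(sum(v * v for v in wb.values()))
    return dot / norm if norm else 0.0
```

BLEU would be computed with an off-the-shelf implementation (e.g. from nltk) rather than by hand.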

Now, update the user model to have an is_reviewer field. If a user's is_reviewer is true, they can access the TestRuns and their TestResults and add feedback to the results. They will be able to create a new Feedback for a TestResult by entering a rating (Excellent, Good, Satisfactory, Unsatisfactory, Wrong or Hallucinating) and a note.

Reviewers can only see their own feedback and cannot edit it later.
Only Admins can see the Feedback of all reviewers.

In the end, your new models should look like this (Models will be extending the base model class)

TestSuite

  • name
  • temperature
  • topk

TestQuestion

  • test_suite (fk)
  • question
  • human_answer

TestRun

  • test_suite (fk)
  • project
  • complete (default false)

TestResult

  • test_run (fk)
  • test_question (fk)
  • question
  • human_answer
  • answer
  • cosine_sim
  • bleu_score

Feedback

  • test_result (fk)
  • rating (integer choice field)
  • notes

cc. @bodhish

Use OpenAI Ada to convert input text to vector embeddings

Create a POST API endpoint called chat that accepts a text string, sends an API request to OpenAI Ada to convert it into vector embeddings, and returns the vector embeddings as the response (for now).

Take the OpenAI key from the environment via the name OPENAI_API_KEY. Edit core/settings/base.py to include the environment variable, defaulting to a blank string.

For reference: https://platform.openai.com/docs/guides/embeddings/what-are-embeddings

Upsertion endpoint

Currently we are upserting data through python manage.py upsert. Let us create an endpoint to upsert.

First, create a new Project model, that will contain title, description and user. Inherit from Base Model. Now create relevant serializers and viewset for CRUD operations.

Create a Document model. A Document belongs to a Project (many to one). It should contain title, description and file. Create relevant serializers and viewsets for CRUD. Make sure that the POST method accepts formdata. File cannot be updated, only title and description can.

The URL should look like /projects/<project_id>/documents/<document_id>

Now, once a file is uploaded, perform the same upsert logic as at present, but use the external_id of the Project as the namespace for Pinecone.

Create singular function to get embeddings from Ada.

Currently we fetch embeddings through a function in the /chat API, and separately during upsert. Replace the upsert logic so it also uses this function.
Update the function to split its input into chunks.
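The chunking step could look like the sketch below; the sizes are illustrative and "tokens" here are whitespace-split words rather than real tokenizer tokens, which the actual function would use:

```python
def chunk_text(text, max_tokens=500, overlap=50):
    """Split input text into overlapping word chunks before embedding.

    Consecutive chunks share `overlap` words so that sentences cut at
    a boundary still appear intact in one of the chunks.
    """
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = words[start:start + max_tokens]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + max_tokens >= len(words):
            break
    return chunks
```

Each chunk would then be passed through the shared embedding function individually.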

Set language to the chat message, not the chat object.

Currently we are keeping the same language for the entire chat object. Rather than that, we should send a language with the converse endpoints. Using the language provided, Ayushma will process the query and store the language in the ChatMessage object.
Also store the LLM's original English response alongside the translated one, and alter the serializers to return both.

Make a subsequent FE PR to show the English response instead of the translated one when toggled.

References get translated and don't work

  • Translate after extracting references
  • Stream after extracting references (if possible)
  • Generate audio after extracting references
  • Prevent duplication of references
  • Fix Reference URL in FE

FAQ upsert

Create a new upsert algorithm that will upsert questions and answers to the vector DB.

  • A question and its answer will be upserted as a single vector
  • The input file will be JSON
  • Use the top 5 matching answers
  • Create a new prompt that knows it is an FAQ document
  • Make the prompt more specific
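Combining a question and its answer into the single text that gets embedded as one vector could be as simple as the sketch below; the JSON field names "question" and "answer" are assumptions about the input file's shape:

```python
def faq_to_document(entry):
    """Render one FAQ JSON entry as a single text to embed as one vector.

    Keeping the Q:/A: structure in the embedded text lets the FAQ-aware
    prompt distinguish the question from the answer in retrieved matches.
    """
    return f"Q: {entry['question']}\nA: {entry['answer']}"
```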

Show references in test results

Test results do not show references. Update models to also contain the references, and update the UI to show the references too.

Add more metrics for evaluating test runs

The following metrics for comparing responses would be best:

  1. GloVe + cosine similarity
  2. BERTScore: compute token similarity using contextual embeddings
  3. GloVe + Word2Vec + BiLSTM: word embeddings are first made with GloVe and Word2Vec, two BiLSTM networks are used separately for sentence embedding, and these are passed through a classifier
  4. Word Mover's Distance
  5. Pretrained sentence encoders, such as the Google Universal Sentence Encoder
  6. Siamese Manhattan LSTM

cc. @bodhish

Ayushma: Add Feedback

Users should be able to provide feedback on the chat.
This should be handled by a Feedback model, which has

  • chat_message : fk to ChatMessage
  • user : fk to User
  • liked : boolean
  • message : (optional) text

Make relevant views and serializers. Feedback should be listed with /feedback url, with possible filters -

  • project_id
  • chat_id
  • chat_message_id
  • liked

Only admins should be able to list and retrieve all feedback; normal users can only list their own. Feedback cannot be deleted.
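The filter behaviour of the /feedback list endpoint can be sketched with plain dicts standing in for the queryset (in Django this would be a filterset or queryset .filter() call, not hand-rolled code):

```python
def filter_feedback(feedbacks, **filters):
    """Apply the /feedback list filters described above.

    Only the four documented filter fields are honoured; unset
    filters (None) are ignored, mirroring optional query params.
    """
    allowed = {"project_id", "chat_id", "chat_message_id", "liked"}
    active = {k: v for k, v in filters.items() if k in allowed and v is not None}
    return [f for f in feedbacks if all(f.get(k) == v for k, v in active.items())]
```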

Store the incoming text at /converse endpoint in the form of ChatMessage object

Store the incoming text at the /converse endpoint as a new ChatMessage object (the model is already defined). Then create a new serializer called ChatDetailSerializer, used only when retrieving a chat (through /chats/<external_id>), that adds a chats field referencing ChatMessageSerializer (which you also have to create), a model serializer for ChatMessage.

Now once this is done, update the /converse endpoint such that the previous messages also get chained through langchain.

Archiving Projects

Projects that have served their purpose and are no longer needed should not be deleted, but should instead be archivable. New Chats and Documents cannot be created on an archived project.
