cheshire-cat-ai / core

Production ready AI assistant framework
Home Page: https://cheshirecat.ai
License: GNU General Public License v3.0
The BE should expose the available embedder providers from the /settings/embedder endpoint. The FE will fetch the relevant information and allow the user to select and customise the embedder using a JSON form.
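A minimal sketch of what the settings payload could look like, using only the standard library. The provider names, field names, and response shape here are assumptions for illustration, not the project's actual configuration:

```python
# Hypothetical registry of embedder providers; each entry is a JSON schema
# the FE can render as a form. Names and fields are illustrative only.
EMBEDDER_SCHEMAS = {
    "EmbedderOpenAIConfig": {
        "title": "OpenAI Embedder",
        "type": "object",
        "properties": {
            "openai_api_key": {"type": "string"},
            "model": {"type": "string", "default": "text-embedding-ada-002"},
        },
        "required": ["openai_api_key"],
    },
    "EmbedderFakeConfig": {
        "title": "Fake Embedder (for tests)",
        "type": "object",
        "properties": {"size": {"type": "integer", "default": 128}},
    },
}

def get_embedder_settings():
    """Shape of a possible GET /settings/embedder response:
    one JSON schema per available provider, for the FE's JSON form."""
    return {
        "status": "success",
        "schemas": EMBEDDER_SCHEMAS,
        "allowed_configurations": sorted(EMBEDDER_SCHEMAS),
    }
```

In the real backend this would live behind a FastAPI route; the FE would pick one `allowed_configuration` and POST the filled-in form back.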
When the cat retrieves relevant episodic memories (things user said in the past), it is important to:
If there is an embedder change, the VectorStore will not be compatible because:
So whenever the embedder changes:
Or, if we want to preserve old memories, prepend the embedder name to the collection name.
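The "prepend embedder name" idea can be sketched in a few lines. The naming scheme below is an assumption, just to show the mechanism: each embedder gets its own collection, so switching embedders never mixes vectors of different dimensionality, and old memories survive the switch.

```python
# Hypothetical collection-naming helper: one collection per embedder, so an
# embedder change simply points the cat at a fresh (or previous) collection.
def collection_name(embedder_name: str, base: str = "episodic") -> str:
    safe = embedder_name.lower().replace(" ", "_")
    return f"{safe}_{base}"

name = collection_name("OpenAIEmbeddings")  # "openaiembeddings_episodic"
```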
Everything you say or upload to the cat is vectorized and stored in a vector db.
It should therefore be possible to upload and download documents in batch mode, for example from an ETL task.
Make ingestion run as a standalone library, so memories can be ingested from an external process.
It could be interesting to save the input/output not just in the database but also divided by user.
This way, the UI could show previous chats and per-user settings.
There is https://pypi.org/project/fastapi-users/ which already uses SQLAlchemy and adds the various endpoints.
When I start the container, it shows:
frontend | ➜ Local:   http://localhost:3000/
frontend | ➜ Network: http://172.19.0.3:3000/
But the IP address should be instead 192.168.1.4
Mac M1 - OSX 13.2.1
We want to enable Markdown support in our CheshireCat project. By design, most language models use Markdown in their responses, so we will be editing the MessageBox component using the Remark library located at https://github.com/remarkjs/remark. This will enable us to support markdown responses from the CheshireCat.
The gh-pages branch contains the sources for the mkdocs documentation.
GitHub Pages is already active and pointing to the site folder in the branch from this address
Can you configure a pipeline to automatically build these docs when there is a push?
Before docs, let's have a few pages for the project directly on github pages.
Sorry, I was trying to understand the project's objective, but I couldn't find any description in the README.
Just some thoughts on how the code could be reorganized:

- Rename the web folder to backend (this way the difference from the frontend folder is clear)
- looking_glass.py having a function called CheshireCat is not clear
- Clean main.py to just execute the backend (and initialize stuff), moving the endpoints to routes.py
- Add a version.txt, so the version is not hardcoded
- Organize the endpoints differently: a routes folder with 2 files (for now): settings.py and base.py (or a better name)
- Divide CheshireCat into 2 different files: one is the cat, the other is just the bootstrap (the db and so on)
Also, thinking about the plugin stuff: everything in the code that calls external frameworks/libraries (like langchain) should be handled by a dedicated class, so it is easier to extend or change. https://github.com/pieroit/cheshire-cat/blob/main/web/cat/looking_glass.py#L61
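A minimal sketch of that dedicated-class idea, assuming dependency injection. The class and method names are hypothetical; in the real code the concrete subclass would wrap a langchain LLM instead of the stand-in used here:

```python
# Hypothetical adapter: the rest of the codebase talks to LanguageModel,
# never to langchain directly, so swapping frameworks touches one class.
class LanguageModel:
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

class EchoModel(LanguageModel):
    # Stand-in backend for the sketch; a real one would wrap a langchain LLM.
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

class CheshireCat:
    def __init__(self, llm: LanguageModel):
        self.llm = llm  # injected, so plugins and tests can replace it

    def reply(self, user_message: str) -> str:
        return self.llm.complete(user_message)

cat = CheshireCat(EchoModel())
print(cat.reply("hi"))  # -> echo: hi
```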
Probably the files in the config folder should explain their purpose better. I think it would be better if they were in the db folder, as they are just the models for that data.
We want to display the reasoning behind a response from the CheshireCat, to provide users with greater transparency and insight into the decision-making process. To do this, we will leverage the Sidebar component to present the content of the reasoning object, which is already sent from the backend. The reasoning object is defined as follows:
{
  "input": "What is Python?",
  "episodic_memory": [
    {
      "page_content": "it is for fictional purposes ",
      "lookup_str": "",
      "metadata": {
        "source": "user",
        "when": 1680432386.7730486,
        "text": "it is for fictional purposes "
      },
      "lookup_index": 0,
      "score": 0.5044264793395996
    },
    {
      "page_content": "Write a 400 words post on how Ai is going to change the world",
      "lookup_str": "",
      "metadata": {
        "source": "user",
        "when": 1680432337.0415337,
        "text": "Write a 400 words post on how Ai is going to change the world"
      },
      "lookup_index": 0,
      "score": 0.5165414810180664
    },
    {
      "page_content": "write the introduction of a novel that talks about how the world has been taken over by the AI",
      "lookup_str": "",
      "metadata": {
        "source": "user",
        "when": 1680432429.13744,
        "text": "write the introduction of a novel that talks about how the world has been taken over by the AI"
      },
      "lookup_index": 0,
      "score": 0.5386247634887695
    },
    {
      "page_content": "nice, write a 670 words paragraph on who Ai will take over humanity",
      "lookup_str": "",
      "metadata": {
        "source": "user",
        "when": 1680432365.206487,
        "text": "nice, write a 670 words paragraph on who Ai will take over humanity"
      },
      "lookup_index": 0,
      "score": 0.5566583275794983
    },
    {
      "page_content": "I am the Cheshire Cat",
      "lookup_str": "",
      "metadata": {
        "who": "cheshire-cat",
        "when": 1679948291.703731,
        "text": "I am the Cheshire Cat"
      },
      "lookup_index": 0,
      "score": 0.564825177192688
    }
  ],
  "declarative_memory": [
    {
      "page_content": "I am the Cheshire Cat",
      "lookup_str": "",
      "metadata": {
        "who": "cheshire-cat",
        "when": 1679948292.8870578,
        "text": "I am the Cheshire Cat"
      },
      "lookup_index": 0,
      "score": 0.564825177192688
    }
  ],
  "chat_history": "",
  "output": "Python is a programming language used in various applications such as web development, data analysis, machine learning, and artificial intelligence.",
  "intermediate_steps": []
}
Let's postpone a full-fledged user management as proposed in #62 and go for a simple token auth:

- AUTH_TOKEN in the .env
- if no token is set, all endpoints are public
- otherwise, requests must carry Authorization: Bearer <auth_token> in the header

Currently, the user can only add textual content to the cat by manually typing it in, which can be tedious and time-consuming. Therefore, we need to add a new feature that allows the user to add large amounts of textual content to the cat through a file uploader.
To implement this feature, we need to create an HTTP POST request that sends the selected file to the REST endpoint that accepts files. We also need to wait for the API response to ensure that the file was uploaded successfully. Once the file is uploaded, we will be able to process the content and add it to the cat.
This feature will provide users with an easy and efficient way to add textual content to the cat, which will improve the overall user experience
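The client side of this can be sketched with nothing but the standard library: build the multipart/form-data body an HTTP POST would carry, then send it to the upload endpoint. The endpoint path and field name are assumptions here; in practice a frontend would use FormData (or `requests.post(..., files=...)` in Python) rather than hand-rolling the body:

```python
import uuid

def build_multipart(filename: str, content: bytes, field: str = "file"):
    """Build a multipart/form-data body by hand (stdlib only): the payload a
    POST to a hypothetical file-upload REST endpoint would carry."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: text/plain\r\n\r\n"
    ).encode() + content + f"\r\n--{boundary}--\r\n".encode()
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type

body, ctype = build_multipart("notes.txt", b"hello cat")
```

The FE would then wait for the API response to confirm the upload succeeded before telling the user the content was ingested.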
Qdrant is great but a little overkill for getting started with the Cat.
Systems like Django support any SQL db but start by simply shipping sqlite.
@umbertogriffo already introduced sqlite as a table db (merging soon!) and we'll do the same with the vector DB.
I checked out both FAISS and annoy and they do not support assigning metadata to vectors (which are essential to connect embeddings with symbolic stuff).
Langchain allows for a FAISS+Docstore combination but it has to be loaded and saved to disk manually, otherwise it only lives in memory. Search FAISS VectorStore here:
https://langchain.readthedocs.io/en/latest/reference/modules/vectorstore.html
Let's substitute Qdrant with FAISS while staying in the langchain constructs.
P.S.: compatibility of these pieces (LLM, Embedder, VectorStore) with langchain is a plus, because the APIs are already well designed and help develop a solid plugin system. Let's stick to them as much as possible ;)
P.P.S.: conversation started in #23
Currently, it is not possible to upload documents with the .md extension, even though they are supported by the backend.
https://lysvz.localtonet.com/
Something went wrong while connecting to the server. Please try again later
Huggingface is the most popular hub for LLMs. Given the open source philosophy of the project I think integration and collaboration with the HuggingFace community could be great
[Front-end only] Link the available documentation link as well as the GitHub profile
Cohere exposes LLMs for free through their API. I think this could be very beneficial, both for fast and free iteration during development and for users who cannot afford the OpenAI API.
Hi there!
I have enjoyed looking at this project and appreciate your effort in making it functional and easy to use. However, I have noticed that there is no way to set up a local environment using tools like pipenv, which can help manage dependencies outside the Docker container.
I think it would be beneficial to have this option available because it would make it easier for developers to work on the project without relying solely on the Docker container. Additionally, it would allow for greater flexibility regarding the tools and versions of packages that developers can use.
Furthermore, I believe introducing code auto-reformatting, a PEP 8 checker (flake8), and auto-sorted imports (isort) with pre-commit would be a good idea. This would help ensure that the codebase remains consistent and maintainable over time.
If you agree, I can introduce all of them in a PR.
Default prompts in chains and agents are in english.
OPTION 1: localization
There should be a language detector to classify user input and to load default prompts in the appropriate language.
(localized content should be organized in a similar manner as in CMSs like WordPress)
OPTION 2: wrapper
Run the language detector, translate user input in english using a translation model, run the pipelines/agents/chains in english and then translate back the final response back to user language.
Let's go with option 2, as LLMs are weak in languages other than English.
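Option 2 can be sketched as a thin pipeline around the English-only agent. Everything below is a stub: `detect_language` and `translate` stand in for real models (e.g. a HuggingFace translation model), and the tiny dictionary exists only so the sketch runs:

```python
# Sketch of OPTION 2: detect language, translate to English, run the agent,
# translate the reply back. All three helpers are hypothetical placeholders.
def detect_language(text: str) -> str:
    return "it" if "ciao" in text.lower() else "en"

def translate(text: str, source: str, target: str) -> str:
    demo = {("ciao", "it", "en"): "hello", ("hello", "en", "it"): "ciao"}
    return demo.get((text.lower(), source, target), text)

def run_agent_in_english(text: str) -> str:
    return "hello"  # stand-in for the English-only pipelines/agents/chains

def answer(user_input: str) -> str:
    lang = detect_language(user_input)
    english_in = user_input if lang == "en" else translate(user_input, lang, "en")
    english_out = run_agent_in_english(english_in)
    return english_out if lang == "en" else translate(english_out, "en", lang)
```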
Input bar for messages should autofocus as soon as a cat message is received
Hi, sorry if I am missing something; after the last pull I got this error.
The console throws this error:
web | ERROR: Exception in ASGI application
web | Traceback (most recent call last):
web | File "/usr/local/lib/python3.9/site-packages/starlette/datastructures.py", line 702, in __getattr__
web | return self._state[key]
web | KeyError: 'ccat'
web |
web | During handling of the above exception, another exception occurred:
web |
web | Traceback (most recent call last):
web | File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 238, in run_asgi
web | result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
web | File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
web | return await self.app(scope, receive, send)
web | File "/usr/local/lib/python3.9/site-packages/fastapi/applications.py", line 271, in __call__
web | await super().__call__(scope, receive, send)
web | File "/usr/local/lib/python3.9/site-packages/starlette/applications.py", line 118, in __call__
web | await self.middleware_stack(scope, receive, send)
web | File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 149, in __call__
web | await self.app(scope, receive, send)
web | File "/usr/local/lib/python3.9/site-packages/starlette/middleware/cors.py", line 76, in __call__
web | await self.app(scope, receive, send)
web | File "/usr/local/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
web | raise exc
web | File "/usr/local/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
web | await self.app(scope, receive, sender)
web | File "/usr/local/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
web | raise e
web | File "/usr/local/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
web | await self.app(scope, receive, send)
web | File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 706, in __call__
web | await route.handle(scope, receive, send)
web | File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 341, in handle
web | await self.app(scope, receive, send)
web | File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 82, in app
web | await func(session)
web | File "/app/./cat/routes/websocket.py", line 14, in websocket_endpoint
web | ccat = websocket.app.state.ccat
web | File "/usr/local/lib/python3.9/site-packages/starlette/datastructures.py", line 705, in __getattr__
web | raise AttributeError(message.format(self.__class__.__name__, key))
web | AttributeError: 'State' object has no attribute 'ccat'
And web console
Do you think it is feasible to define an "authoritativeness" attribute for uploaded documents, a score between 0 and 1 where 1 is the highest authoritativeness, and to steer the Cat's choices in formulating responses by prioritizing the most authoritative sources?
This would allow the Cat, given equally relevant available content, to choose the one with the higher score.
It could be useful to weigh reliable sources against less reliable ones.
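One simple way to do this is to blend the vector-similarity score with the stored authoritativeness at retrieval time. The sketch below is an assumption about how such a re-ranking could work; the 50/50 weighting and the metadata key are arbitrary choices for illustration:

```python
# Sketch: combine vector similarity with a per-document authoritativeness
# score in [0, 1] stored in metadata. The weighting here is arbitrary.
def rerank(hits, weight=0.5):
    """hits: list of (similarity, metadata) pairs, similarity in [0, 1]."""
    def combined(hit):
        similarity, metadata = hit
        authority = metadata.get("authoritativeness", 0.5)  # neutral default
        return (1 - weight) * similarity + weight * authority
    return sorted(hits, key=combined, reverse=True)

hits = [(0.80, {"authoritativeness": 0.1}), (0.75, {"authoritativeness": 0.9})]
top = rerank(hits)[0]  # the slightly less similar but far more reliable doc wins
```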
Currently, the CSS modules naming strategy used by VITE is not very flexible, which can make it difficult for external contributors and plugin creators to customise the CSS classes. This can be especially problematic when trying to apply custom styling or override existing styles
Given that we are using VITE as a bundler, we can take advantage of its built-in support for CSS modules and its various configuration options. Specifically, we can explore the css.modules option in VITE, which provides several options for customizing the CSS modules behavior, including the ability to specify custom naming conventions and post-processing steps.
For more information on how to configure CSS modules in VITE, see the official VITE documentation.
It's useful to have inside CheshireCat a way to summarize text/documents before saving embeddings.
We can add a new hook in the default plugin and make it available in CheshireCat class. Something like this maybe:
def load_plugins(self):
    ...
    self.embedder = ...
    self.summarizer = self.mad_hatter.execute_hook("get_language_summarizer", self)
    ...
We can maybe use Hugging Face for the default one? Let me know :)
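A hypothetical implementation of such a hook could return any `str -> str` callable. The naive word-truncating summarizer below is only a stand-in for a real model (e.g. a Hugging Face summarization pipeline), so the sketch stays self-contained:

```python
# Hypothetical plugin-side implementation of the proposed hook: it receives
# the cat instance and returns a summarizer callable.
def get_language_summarizer(cat):
    def summarize(text: str, max_words: int = 50) -> str:
        # Stand-in "summary": truncate; a real plugin would call a model here.
        words = text.split()
        return text if len(words) <= max_words else " ".join(words[:max_words]) + " ..."
    return summarize

summarizer = get_language_summarizer(cat=None)
short = summarizer("word " * 200)
```

The cat would then call `self.summarizer(document_text)` before computing and saving embeddings.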
At the moment the frontend codebase works with .env files, which is not suitable for the current monolithic architecture.
Hence, we must remove the .env files from the frontend repository and modify the codebase to reflect this modification.
Even though this approach is not optimal in the long run, it is necessary for the MVP version to keep the application as basic yet functional as possible.
The cat should be able to guide the user on how to extend and hack the system.
This info can be inserted in declarative memory and retrieved in conversation (HyDE should be enough)
Cat running in a Virtualbox VM ubuntu-20.04.6-live-server-amd64.iso, hosted by a Windows 10 Enterprise PC
Web interface starts on :3000
Message "Getting Ready" and then red banner telling "Something went wrong while connecting to the server. Please try again later"
The only warning message on the log window is
web | /usr/local/lib/python3.9/site-packages/langchain/llms/openai.py:608: UserWarning: You are trying to use a chat model. This way of initializing it is no longer supported. Instead, please use: `from langchain.chat_models import ChatOpenAI`
web | warnings.warn(
I'm not sure if it has anything to do with the problem.
Maybe it is something super simple, but the message is not very helpful. For example, I'm not sure which server is not reachable. A more detailed message, at least in the log, with server name/address and port, would help in troubleshooting.
Use langchain routines more deeply to keep the prompt at a limited length (CombineDocumentsChain etc.).
Summarization may also be appropriate when documents are uploaded.
We are piggybacking on a langchain adapter in order to support Cohere.
The default model, large, for text generation hallucinates:
Also, on a separate notebook:
The BE should expose the available language model providers from the /settings/llm endpoint. The FE will fetch the relevant information and allow the user to select and customise the language model using a JSON form. Once the user saves, the information should be sent to the BE.
So right now in the output in the shell we have:
Prompt after formatting:
You will be given a sentence.
If the sentence is a question, convert it to a plausible answer. If the sentence does not contain an question, repeat the sentence as is without adding anything to it.
Examples:
- what furniture there is in my room? --> In my room there is a bed, a guardrobe and a desk with my computer
- where did you go today --> today I was at school
- I like ice cream --> I like ice cream
- how old is Jack --> Jack is 20 years old
- Does pineapple belong on pizza? -->
This can be confusing; with a flag (in the .env file) it could be possible to print different content.
It could also create an error.log with the Python errors, and more.
/home/www/cheshire-cat/web/env/lib/python3.11/site-packages/langchain/llms/openai.py:608: UserWarning: You are trying to use a chat model. This way of initializing it is no longer supported. Instead, please use: `from langchain.chat_models import ChatOpenAI`
warnings.warn(
As another example:
This is a conversation between a human and an intelligent robot cat that passes the Turing test.
The cat is curious and talks like the Cheshire Cat from Alice's adventures in wonderland.
The cat replies are based on the Context provided below.
Context of things the Human said in the past:
- I am the Cheshire Cat
Context of documents containing relevant information:
- I am the Cheshire Cat
Conversation until now:
Human: What's up?
What would the AI reply? Answer concisely to the user needs as best you can, according to the provided recent conversation and relevant context.
If Context is not enough, you have access to the following tools:
> my_shoes: Retrieves information about shoes
> my_shoes_color: Retrieves color of shoes
To use a tool, please use the following format:
This kind of output should be shown in the UI and not in the server log.
As a user, I want to be able to input my own OpenAI API key at app startup once and for future sessions.
This will require creating:
Once the API key is saved, the backend should retrieve it from the database when needed so that the key is never exposed to the front end app
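The save-once-and-retrieve-server-side flow can be sketched with the stdlib sqlite3 module. The table and column names are hypothetical; the point is that the key is written once and only ever read on the backend:

```python
import sqlite3

# Sketch: the FE posts the key once, the BE stores it, and later reads it
# server-side when calling OpenAI, so the key is never sent back to the FE.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE settings (name TEXT PRIMARY KEY, value TEXT)")

def save_api_key(key: str):
    db.execute(
        "INSERT INTO settings VALUES ('openai_api_key', ?) "
        "ON CONFLICT(name) DO UPDATE SET value = excluded.value",
        (key,),
    )

def get_api_key() -> str:
    row = db.execute(
        "SELECT value FROM settings WHERE name = 'openai_api_key'"
    ).fetchone()
    return row[0] if row else ""

save_api_key("sk-test")
```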
The Cat features a langchain ConversationalAgent which covers most easy use cases:
At the moment this solution presents several limits:
Here is a roadmap to improve the Agent:
1 - have a more agile and resilient default agent (choosing among [these](https://python.langchain.com/docs/modules/agents.html))
2 - have a pluggable agent - inserting a few hooks into CustomAgent, making it multiprompt and working on dictionaries
3 - having a hook to totally override the cat Agent in case a dev is brave enough (same as we do with LLM and embedders)
@nicola-corbellini @sirius-0 let's tackle this
As a user, I would like to interact with the Cheshire Cat using voice.
This requires adding voice input capability to our web application.
The recorded audio will then be transcribed into text using the browser's Web Speech API.
The resulting text will be sent to the backend for processing.
To provide a smooth user experience, we aim to implement a WhatsApp-like UX interaction for the Cheshire Cat.
This will allow users to easily interact with the cat using their voice directly and see the corresponding text responses in real-time
Leave this to core contributors
The backend should provide a list of available plugins through the /plugins endpoint. This list will include the plugin's name, description, and a unique id. The unique id may simply be the name of the folder that the plugin is stored in. To allow end users to define plugin metadata, the suggested approach is to have a non-mandatory plugin.json file stored in each plugin's directory where the user can define both name and description (as well as future metadata such as the JSON schema of the configuration).
// plugin.json
{
  "name": "MyCustomPlugin",
  "description": "Makes the cat cool af"
}
If the plugin.json file is not defined, then the backend should default to values derived from the folder name.
A possible response from the /plugins endpoint will then be:
[
  {
    id: "cool-plugin",
    name: "MyCustomPlugin",
    description: "Makes the cat cool af"
  },
]
The front end should fetch the list of available plugins and display them under the /plugins route as a read-only list. Create a new pluginsSlice using redux and follow the defined best practice on how to handle async states. At the moment, no interaction is scheduled.
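The discovery logic described above can be sketched with the standard library: scan the plugin folders, read the optional plugin.json, and fall back to the folder name. The directory layout below is created only for the demo; in the real backend the plugins folder path would come from configuration:

```python
import json
import os
import tempfile

# Sketch of /plugins backend logic: optional plugin.json per folder,
# id always derived from the folder name.
def list_plugins(plugins_dir: str):
    plugins = []
    for folder in sorted(os.listdir(plugins_dir)):
        path = os.path.join(plugins_dir, folder)
        if not os.path.isdir(path):
            continue
        meta = {"id": folder, "name": folder, "description": ""}
        manifest = os.path.join(path, "plugin.json")
        if os.path.isfile(manifest):
            with open(manifest) as f:
                meta.update(json.load(f))
            meta["id"] = folder  # the id always stays the folder name
        plugins.append(meta)
    return plugins

# Tiny demo: one plugin with a manifest, one without.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "cool-plugin"))
with open(os.path.join(root, "cool-plugin", "plugin.json"), "w") as f:
    json.dump({"name": "MyCustomPlugin", "description": "Makes the cat cool af"}, f)
os.makedirs(os.path.join(root, "bare-plugin"))
plugins = list_plugins(root)
```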
Could it be useful to enable uploading CSV files?
Agent prompting should be customizable from API endpoints (and after that, from the user interface).
The first time you open the cat there is a set of possible prompts.
At later interactions it would be interesting to see frequent prompts (which are relevant to users)
Hi Piero, thanks for this project... it is wonderful.
Look, I have the following error in the UI.
But in the cmd terminal I see the correct answer.
The terminal throws the following error:
web | Traceback (most recent call last):
web | File "/app/./cat/main.py", line 57, in websocket_endpoint
web | cat_message = cheshire_cat(user_message)
web | File "/app/./cat/looking_glass.py", line 167, in __call__
web | episodic_memory_content = self.recall_memories_from_embedding(
web | File "/app/./cat/looking_glass.py", line 114, in recall_memories_from_embedding
web | memories = self.memory[collection].similarity_search_with_score_by_vector(
web | File "/usr/local/lib/python3.9/site-packages/langchain/vectorstores/faiss.py", line 151, in similarity_search_with_score_by_vector
web | scores, indices = self.index.search(np.array([embedding], dtype=np.float32), k)
web | File "/usr/local/lib/python3.9/site-packages/faiss/class_wrappers.py", line 329, in replacement_search
web | assert d == self.d
web | AssertionError
And GPT-4 says:
The error you are experiencing appears to be an AssertionError caused by a discrepancy in the dimension of the embedding vectors in the application you are using. Here's an explanation of the error in detail:
The solution to this issue will depend on the underlying cause of the discrepancy in the dimensions of the embedding vectors. Here are some ideas for troubleshooting the issue:
If after investigating these aspects you still cannot resolve the issue, consider reaching out to the developers of the application or the Faiss library for assistance or to report a possible bug.
After playing with the cat for a while, looks like the context memory gets full.
The frontend message is
"Something went wrong while sending your message. Please try refreshing the page"
but the backend log reports
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 12341 tokens. Please reduce the length of the messages.
I am not deep enough into the code yet to propose a pull request, but from the user perspective my suggestions are:
We moved to FAISS in #39 to make setup easier, but (my bad!) file-based vector stores do not allow prefiltering on metadata.
This means that the Cat cannot filter memories by metadata, which is key in many use cases.
With FAISS we are forced to get a lot of nearest neighbors in the hope they contain the correct metadata, then filter.
With Qdrant the neighbor search can be directly metadata-driven.
Let's bring back Qdrant (in its own container) as in earlier versions of the Cat.
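The FAISS workaround described above can be sketched without faiss itself: over-fetch nearest neighbors, then filter on metadata client-side. With Qdrant, the equivalent filter runs inside the search. The data and oversampling factor below are illustrative:

```python
# Over-fetch then filter: what we're forced to do with a file-based index.
def search_then_filter(neighbors, wanted_source, k=2, oversample=4):
    """neighbors: (score, metadata) pairs already sorted by similarity,
    as a nearest-neighbor index would return them."""
    candidates = neighbors[: k * oversample]  # over-fetch
    hits = [n for n in candidates if n[1].get("source") == wanted_source]
    return hits[:k]  # may return fewer than k -- the core weakness

neighbors = [
    (0.9, {"source": "user"}),
    (0.8, {"source": "document"}),
    (0.7, {"source": "user"}),
    (0.6, {"source": "user"}),
]
top = search_then_filter(neighbors, "user")
```

If the wanted metadata is rare, even a large oversample can miss it entirely, which is exactly why metadata-driven search inside the store is preferable.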
Anything you say or upload to the cat is vectorized and stored in a vector db.
It should be possible to read and delete memories via endpoints, since any vector has metadata on its source.