
robby-chatbot's Introduction

👋 Hi, I'm Yvann!

🎓 Student at 42 School, Angoulême | 18 Years Old

๐Ÿ˜ Interests:

  • Artificial Intelligence, especially Natural Language Processing and generative art/video generation 🤖

๐Ÿ“ My Latest Article on Medium

📫 Get In Touch:

robby-chatbot's People

Contributors

chinesewebman, gabacode, rsjain1978, sorydi3, yvann-ba


robby-chatbot's Issues

Data upload

Great project. However, I can't get the data uploader to work properly. No matter the size of the csv, if I ask for the row count it always returns 4.

For example, I uploaded the mtcars.csv file from here. The bot says it has 4 rows. So I then asked it what the average value of the cyl variable was, and the answer was 5.5. When I followed up with a request to tell me how it calculated that, the bot told me that it summed 4, 6, 4, and 8 and then divided by 4.
[Screenshot: 2023-04-25 at 4.57.11 PM]

I'm not sure how to go about correcting this. The file I'd like to use is only 13.2 MB.
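
A plausible explanation (an assumption on my part, not confirmed in this issue): the CSV chat answers from the top-k chunks returned by the vector store, and LangChain's default k for similarity search is 4, so the model only ever sees four rows regardless of the file size. A minimal sketch of raising k, assuming a FAISS index built over the CSV rows (names are illustrative):

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# csv_rows: one string per CSV row (replace with the app's actual loader output)
csv_rows = ["21 mpg, 6 cyl, ...", "22.8 mpg, 4 cyl, ..."]

vectors = FAISS.from_texts(csv_rows, OpenAIEmbeddings())
# The default retriever returns only 4 documents; raise k so more rows reach the model.
retriever = vectors.as_retriever(search_kwargs={"k": 20})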

Exceeds token limit

I tried a non-English PDF, and the tokens exceed the limit. I think it's probably because of how the sentence splitter works, since the sentence-ending character in that language is not '.'.
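
If the splitter only breaks on Western periods, one workaround (a sketch, assuming the app uses LangChain's text splitters) is a character-based splitter whose separators include the punctuation of the target language, so chunks stay under the token limit even when '.' never appears:

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Separators include full-width punctuation used in e.g. Chinese/Japanese text.
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=100,
    separators=["\n\n", "\n", "。", "！", "？", ". ", " ", ""],
)

pdf_text = "..."  # the text extracted from the PDF
chunks = splitter.split_text(pdf_text)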

Always limited to four rows

Even when trying with your sample on the hosted version, this only lets you query four rows... and only thinks there are four rows total. The same was true of my own sample CSV.

[Screenshot: 2023-04-22 173910]

Metadata-backed context

Is there a way to give more context about the document through metadata, i.e. information about what each column means?
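
One possible direction (a sketch, not something the app exposes today): supply a short column glossary and index it alongside the rows, so both the retriever and the LLM see what each column means:

from langchain.docstore.document import Document

# Hypothetical glossary supplied by the user alongside the CSV.
column_glossary = {
    "mpg": "fuel consumption in miles per (US) gallon",
    "cyl": "number of cylinders",
}
glossary_text = "Column definitions:\n" + "\n".join(
    f"- {name}: {meaning}" for name, meaning in column_glossary.items()
)

csv_rows = ["21 mpg, 6 cyl, ...", "22.8 mpg, 4 cyl, ..."]  # one string per CSV row

# Index the glossary as its own document and tag the row documents.
docs = [Document(page_content=glossary_text, metadata={"type": "schema"})]
docs += [Document(page_content=row, metadata={"type": "row"}) for row in csv_rows]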

Incorrect answer from Robby ;-)

Hello,

Based on the "fishfry-locations.csv" file, why does the chatbot return only 3 restaurants when I ask for the full list of restaurants?

If you look at the history of the discussion with the chatbot, you will see that the answers are not really consistent. Any idea what causes this behavior?

Best Regards,

[Screenshot: discussion]

Limit the answers to the file only.

Right now, even if I ask unrelated (off-topic) questions, I still get a reply to that question, since it comes from ChatGPT.

The expectation is to get a reply saying that the question is not related to the topic.

I've done some testing, adding the lines below to the QA_PROMPT (as used by mayooear/gpt4-pdf-chatbot-langchain):
"If you don't know the answer, just say you don't know. DO NOT try to make up an answer.
If the question is not related to the context, politely respond that you are tuned to only answer questions that are related to the context. "

However, the answer either changes drastically OR the model is no longer able to understand that it's taking its context from a file.

I'm not sure whether I'm doing it right, or whether there is instead a way to change how the semantic search is done.

Lastly, thank you so much for sharing this project. If there's any direction that I could look into, I will help experiment and contribute/share my findings.
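
For anyone experimenting with this: assuming the app builds a ConversationalRetrievalChain (as other issues here suggest), one direction is to pass the restrictive prompt through combine_docs_chain_kwargs so it applies to the answering step without replacing the rest of the chain. A sketch:

from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

qa_template = """Use only the following context to answer the question.
If you don't know the answer, just say you don't know. DO NOT try to make up an answer.
If the question is not related to the context, politely respond that you only answer
questions related to the uploaded file.

Context: {context}
Question: {question}
Helpful answer:"""

QA_PROMPT = PromptTemplate(template=qa_template, input_variables=["context", "question"])

chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(temperature=0),
    retriever=retriever,  # the app's existing vector-store retriever (assumed)
    combine_docs_chain_kwargs={"prompt": QA_PROMPT},
)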

Any ideas on how to add a button that uploads a specific file?

Hi. I was trying to add a button that uploads a specific file from a folder, but for some reason it does not work. I want the app to have two ways of uploading files: from a folder (a "Use Data from folder" button) and from the "drag and drop" box. The "drag and drop" option works fine (it was already there), but with the "Use Data from folder" button the chat appears once; when I type a question and click "send", the chat disappears for some reason.

Here is the code for the buttons in the main app:

use_example_file = st.sidebar.button("Use Data from folder")
uploaded_file = utils.handle_upload(["pdf", "txt", "csv"], use_example_file)

if uploaded_file:
    [no changes here]

And here is how I changed handle_upload a little:

@staticmethod
def handle_upload(file_types, use_example_file):
    """
    Handles and display uploaded_file
    :param file_types: List of accepted file types, e.g., ["csv", "pdf", "txt"]
    """
    
    if use_example_file is False:
        uploaded_file = st.sidebar.file_uploader("upload", type=file_types, label_visibility="collapsed")
    else: 
        # uploaded_file = use_example_file
        # use_example_file = st.sidebar.button(use_example_file)
        uploaded_file = open("example.csv", "rb")

    if uploaded_file is not None:

        def show_csv_file(uploaded_file):
            file_container = st.expander("Your CSV file :")
            uploaded_file.seek(0)
            shows = pd.read_csv(uploaded_file)
            file_container.write(shows)

        def show_pdf_file(uploaded_file):
            file_container = st.expander("Your PDF file :")
            with pdfplumber.open(uploaded_file) as pdf:
                pdf_text = ""
                for page in pdf.pages:
                    pdf_text += page.extract_text() + "\n\n"
            file_container.write(pdf_text)
        
        def show_txt_file(uploaded_file):
            file_container = st.expander("Your TXT file:")
            uploaded_file.seek(0)
            content = uploaded_file.read().decode("utf-8")
            file_container.write(content)
        
        def get_file_extension(file_name):
            return os.path.splitext(file_name)[1].lower()
        
        file_extension = get_file_extension(uploaded_file.name)

        # Show the contents of the file based on its extension
        #if file_extension == ".csv" :
        #    show_csv_file(uploaded_file)
        if file_extension== ".pdf" : 
            show_pdf_file(uploaded_file)
        elif file_extension== ".txt" : 
            show_txt_file(uploaded_file)

    else:
        st.session_state["reset_chat"] = True

    #print(uploaded_file)
    return uploaded_file
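
A likely cause (an assumption based on how Streamlit reruns the script on every interaction): st.sidebar.button returns True only during the run in which it is clicked, so when you hit "send" the app reruns, use_example_file is False again, and handle_upload falls back to the empty file_uploader, which resets the chat. A sketch of persisting the choice in session state instead:

import streamlit as st

# Remember the button click across reruns instead of relying on the
# button's momentary return value.
if "use_example_file" not in st.session_state:
    st.session_state["use_example_file"] = False

if st.sidebar.button("Use Data from folder"):
    st.session_state["use_example_file"] = True

# utils.handle_upload as defined above in this issue.
uploaded_file = utils.handle_upload(["pdf", "txt", "csv"], st.session_state["use_example_file"])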

Streamlit Cloud server

Hello @yvann-hub ,

Can you please write up the steps for running this on a Streamlit Cloud server?

It only works locally for me.

Many thanks!

TypeError: issubclass() arg 1 must be a class

This error shows up when I try to follow the steps:

TypeError: issubclass() arg 1 must be a class
Traceback:

File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
File "/Users/nbmhqa068/GIT/Robby-chatbot/src/pages/2_๐Ÿ“Š Robby-Sheet (beta).py", line 7, in <module>
    from modules.robby_sheet.table_tool import PandasAgent
File "/Users/nbmhqa068/GIT/Robby-chatbot/src/modules/robby_sheet/table_tool.py", line 6, in <module>
    from langchain.callbacks import get_openai_callback
File "/usr/local/lib/python3.9/site-packages/langchain/__init__.py", line 6, in <module>
    from langchain.agents import MRKLChain, ReActChain, SelfAskWithSearchChain
File "/usr/local/lib/python3.9/site-packages/langchain/agents/__init__.py", line 2, in <module>
    from langchain.agents.agent import (
File "/usr/local/lib/python3.9/site-packages/langchain/agents/agent.py", line 16, in <module>
    from langchain.agents.tools import InvalidTool
File "/usr/local/lib/python3.9/site-packages/langchain/agents/tools.py", line 8, in <module>
    from langchain.tools.base import BaseTool, Tool, tool
File "/usr/local/lib/python3.9/site-packages/langchain/tools/__init__.py", line 10, in <module>
    from langchain.tools.bing_search.tool import BingSearchResults, BingSearchRun
File "/usr/local/lib/python3.9/site-packages/langchain/tools/bing_search/__init__.py", line 3, in <module>
    from langchain.tools.bing_search.tool import BingSearchResults, BingSearchRun
File "/usr/local/lib/python3.9/site-packages/langchain/tools/bing_search/tool.py", line 10, in <module>
    from langchain.utilities.bing_search import BingSearchAPIWrapper
File "/usr/local/lib/python3.9/site-packages/langchain/utilities/__init__.py", line 3, in <module>
    from langchain.utilities.apify import ApifyWrapper
File "/usr/local/lib/python3.9/site-packages/langchain/utilities/apify.py", line 5, in <module>
    from langchain.document_loaders import ApifyDatasetLoader
File "/usr/local/lib/python3.9/site-packages/langchain/document_loaders/__init__.py", line 42, in <module>
    from langchain.document_loaders.github import GitHubIssuesLoader
File "/usr/local/lib/python3.9/site-packages/langchain/document_loaders/github.py", line 37, in <module>
    class GitHubIssuesLoader(BaseGitHubLoader):
File "pydantic/main.py", line 299, in pydantic.main.ModelMetaclass.__new__
File "pydantic/fields.py", line 411, in pydantic.fields.ModelField.infer
File "pydantic/fields.py", line 342, in pydantic.fields.ModelField.__init__
File "pydantic/fields.py", line 451, in pydantic.fields.ModelField.prepare
File "pydantic/fields.py", line 547, in pydantic.fields.ModelField._type_analysis
File "pydantic/fields.py", line 648, in pydantic.fields.ModelField._create_sub_type
File "pydantic/fields.py", line 342, in pydantic.fields.ModelField.__init__
File "pydantic/fields.py", line 451, in pydantic.fields.ModelField.prepare
File "pydantic/fields.py", line 550, in pydantic.fields.ModelField._type_analysis
File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/typing.py", line 851, in __subclasscheck__
    return issubclass(cls, self.__origin__)

Env:
python3.9

How to better query the system

Hello there,

First of all, let me say that the project is off to a great start and the UI looks really good.

I was hoping to get some guidance on how to improve the prompt engineering for my local CSV data in order to generate better insights. Specifically, I noticed that the current approach could be improved, and I was wondering if you could suggest some alternatives that might be more effective.

I would greatly appreciate any guidance you could provide on this matter. Thank you for your time and assistance.

[Screenshot]

Reply "Thanks and have a nice day!"

Hello,

It's not an issue, just a question.

I would like the chatbot to reply "Thanks and have a nice day!" when the user writes "Thanks for your help", "Thanks for the information", or simply "Thanks!". For that, I added some instructions to the qa_template variable; unfortunately, that only works if the user enters one of these 3 sentences at the beginning of the discussion.

Do you know how to solve that? What's the best way to achieve this?

Thanks!
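
One simple direction (a sketch, not the repo's actual approach): intercept these messages before they reach the retrieval chain, so the rule works at any point in the conversation rather than only on the first turn:

# Short-circuit "thanks" style messages instead of relying on the qa_template.
THANKS = ("thanks", "thank you", "thanks for your help", "thanks for the information")

def answer(user_input, chain, chat_history):
    cleaned = user_input.strip().lower().rstrip("!")
    if cleaned in THANKS:
        return "Thanks and have a nice day!"
    # chain: the app's existing ConversationalRetrievalChain (assumed name)
    result = chain({"question": user_input, "chat_history": chat_history})
    return result["answer"]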

Error : Environment variable 'OPEN_API_KEY'.

Hello,

Thanks for your tutorial, very interesting and helpful!

When I launch the script, after entering the API key and the file, I get this issue:

Uncaught app exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/Users/jerome/Downloads/tuto_chatbot_csv.py", line 28, in <module>
    embeddings = OpenAIEmbeddings()
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for OpenAIEmbeddings
__root__
  Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)

Any help would be very appreciated.

[Screenshot: chatbot-error]
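
For what it's worth, the validation error means OpenAIEmbeddings could not find the key at construction time, and the exact variable name matters (OPENAI_API_KEY, not OPEN_API_KEY). A minimal sketch of the two options the message mentions:

import os
from langchain.embeddings.openai import OpenAIEmbeddings

user_api_key = "sk-..."  # e.g. the key typed into the sidebar

# Option 1: export the environment variable before the object is created.
os.environ["OPENAI_API_KEY"] = user_api_key
embeddings = OpenAIEmbeddings()

# Option 2: pass the key explicitly as a named parameter.
embeddings = OpenAIEmbeddings(openai_api_key=user_api_key)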

Running into this error while starting the application locally

I just cloned the project, followed the README, and ran into this error. Can someone help me figure out this issue? I think there is some library incompatibility, or I am using too recent a version of Python or Streamlit.

python: 3.11.0
streamlit: 1.22.0
OS: windows 11

  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "D:\internal-projects\Robby-chatbot\.venv\Scripts\streamlit.exe\__main__.py", line 7, in <module>
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\streamlit\web\cli.py", line 201, in main_run
    bootstrap.load_config_options(flag_options=kwargs)
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\streamlit\web\bootstrap.py", line 342, in load_config_options
    config.get_config_options(force_reparse=True, options_from_flags=options_from_flags)
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\streamlit\config.py", line 1184, in get_config_options
    _update_config_with_toml(file_contents, filename)
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\streamlit\config.py", line 1062, in _update_config_with_toml
    parsed_config_file = toml.loads(raw_toml)
                         ^^^^^^^^^^^^^^^^^^^^
  File "D:\internal-projects\Robby-chatbot\.venv\Lib\site-packages\toml\decoder.py", line 433, in loads
    raise TomlDecodeError("Key group not on a line by itself.",
toml.decoder.TomlDecodeError: Key group not on a line by itself. (line 1 column 1 char 0)
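
The TomlDecodeError is raised while Streamlit parses one of its own config files (line 1, column 1), not by the app's code, so a malformed .streamlit/config.toml or secrets.toml is the first thing to check. A small diagnostic sketch; the paths below are the usual Streamlit locations and are an assumption, not taken from the repo:

import os
import toml  # the same parser Streamlit uses in the traceback above

candidates = [
    os.path.expanduser("~/.streamlit/config.toml"),
    os.path.join(os.getcwd(), ".streamlit", "config.toml"),
    os.path.join(os.getcwd(), ".streamlit", "secrets.toml"),
]

for path in candidates:
    if not os.path.exists(path):
        continue
    try:
        toml.load(path)
        print(f"OK: {path}")
    except toml.TomlDecodeError as exc:
        print(f"Parse error in {path}: {exc}")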

Python version & package dependencies

Great project.

I noticed that when I used a conda environment with Python 3.7.4, I got an error while installing the 'tiktoken' package. I think you need to explicitly specify a version that works with 3.7.4.

When I used conda with Python 3.8.x, I could install the packages correctly.

Raised a PR to update the readme file.

KeyError: 'langchain' (circular import error)

Hi. Could you help me, please? From time to time, I get the following error:

2023-06-12 02:27:55.993 Uncaught app exception
Traceback (most recent call last):
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "C:\Users\v-alakubov\OneDrive\Desktop\app_v2\src\pages\AskData.py", line 7, in <module>
    from modules.table_tool import PandasAgent
  File "C:\Users\v-alakubov\OneDrive\Desktop\Listens\app_v2.\src\modules\table_tool.py", line 6, in <module>
    from langchain.callbacks import get_openai_callback
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\__init__.py", line 6, in <module>
    from langchain.agents import MRKLChain, ReActChain, SelfAskWithSearchChain
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\agents\__init__.py", line 2, in <module>
    from langchain.agents.agent import (
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\agents\agent.py", line 16, in <module>
    from langchain.agents.tools import InvalidTool
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\agents\tools.py", line 8, in <module>
    from langchain.tools.base import BaseTool, Tool, tool
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\tools\__init__.py", line 46, in <module>
    from langchain.tools.powerbi.tool import (
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\tools\powerbi\tool.py", line 11, in <module>
    from langchain.chains.llm import LLMChain
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\chains\__init__.py", line 7, in <module>
    from langchain.chains.conversational_retrieval.base import (
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\chains\conversational_retrieval\base.py", line 22, in <module>
    from langchain.chains.question_answering import load_qa_chain
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\chains\question_answering\__init__.py", line 13, in <module>
    from langchain.chains.question_answering import (
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\chains\question_answering\map_reduce_prompt.py", line 2, in <module>
    from langchain.chains.prompt_selector import ConditionalPromptSelector, is_chat_model
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\chains\prompt_selector.py", line 7, in <module>
    from langchain.chat_models.base import BaseChatModel
ImportError: cannot import name 'BaseChatModel' from partially initialized module 'langchain.chat_models.base' (most likely due to a circular import) (C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\langchain\chat_models\base.py)
2023-06-12 02:27:56.013 Uncaught app exception
Traceback (most recent call last):
  File "C:\Users\v-alakubov\AppData\Local\anaconda3\envs\py311\Lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "C:\Users\v-alakubov\OneDrive\Desktop\Listens\app_v2\src\pages\AI-Chat.py", line 8, in <module>
    from modules.utils import Utilities
  File "C:\Users\v-alakubov\OneDrive\Desktop\Listens\app_v2.\src\modules\utils.py", line 6, in <module>
    from modules.chatbot import Chatbot
  File "C:\Users\v-alakubov\OneDrive\Desktop\Listens\app_v2.\src\modules\chatbot.py", line 3, in <module>
    from langchain.chat_models import AzureChatOpenAI
  File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1155, in _find_and_load_unlocked
KeyError: 'langchain'

Any ideas why?

How to show the chat before the query, not after (in 2_📊 Robby-Sheet (beta).py)?

Hi. Currently 2_📊 Robby-Sheet (beta).py has UI elements in the following order:

  1. Query
  2. Agent's thoughts
  3. Chat history
  4. Current dataframe

The problem with this structure is that if a user asks a lot of questions, the chat history becomes very long, and to see each new answer the user needs to scroll down and then go back up to the query. Is there a way to reorder it as:

  1. Chat history
  2. Query
  3. Agent's thoughts
  4. Current dataframe

I tried to do it on my own, but I'm running into a bug. When I submit my first query, no chat is shown. When I submit a second query, the chat appears with only the first query. If I submit a third query, it appears with the first and second queries, and so on. I'm not sure how to fix it. My code:

if not user_api_key:
    layout.show_api_key_missing()
else:
    st.session_state.setdefault("reset_chat", False)

    uploaded_file = utils.handle_upload(["csv", "xlsx"])

    if uploaded_file:
        sidebar.about_()
        
        uploaded_file_content = BytesIO(uploaded_file.getvalue())
        if uploaded_file.type == "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" or uploaded_file.type == "application/vnd.ms-excel":
            df = pd.read_excel(uploaded_file_content)
        else:
            df = pd.read_csv(uploaded_file_content)

        st.session_state.df = df

        if "chat_history" not in st.session_state:
            st.session_state["chat_history"] = []
        csv_agent = PandasAgent()

        form_submitted = False

        if st.session_state.df is not None:
            if form_submitted:
                result, captured_output = csv_agent.get_agent_response(df, query)
                cleaned_thoughts = csv_agent.process_agent_thoughts(captured_output)
                csv_agent.display_agent_thoughts(cleaned_thoughts)
                csv_agent.update_chat_history(query, result)
                csv_agent.display_chat_history()

        csv_agent.display_chat_history()

        with st.form(key="query"):
            query = st.text_input("", value="", type="default", 
                placeholder="e-g : How many rows ? "
                )
            submitted_query = st.form_submit_button("Submit")
            reset_chat_button = st.form_submit_button("Reset Chat")
            if reset_chat_button:
                st.session_state["chat_history"] = []
            if submitted_query:
                form_submitted = True

        if form_submitted:
            result, captured_output = csv_agent.get_agent_response(df, query)
            cleaned_thoughts = csv_agent.process_agent_thoughts(captured_output)
            csv_agent.display_agent_thoughts(cleaned_thoughts)
            csv_agent.update_chat_history(query, result)
            # csv_agent.display_chat_history()

        if st.session_state.df is not None:
            st.subheader("Current dataframe:")
            st.write(st.session_state.df)
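
One way to get the requested order (a sketch that reuses the names from the snippet above, i.e. csv_agent, df and st.session_state["chat_history"]): create a container above the form, process the submitted query in the same run, and only then fill the container, so the newest answer is already part of the history shown on top. Adjusting indentation, it would replace everything after csv_agent = PandasAgent():

# Placeholder rendered above the form; it is filled at the end of the run.
history_container = st.container()

with st.form(key="query"):
    query = st.text_input("", value="", placeholder="e.g.: How many rows?")
    submitted_query = st.form_submit_button("Submit")
    reset_chat_button = st.form_submit_button("Reset Chat")
    if reset_chat_button:
        st.session_state["chat_history"] = []

if submitted_query and query:
    result, captured_output = csv_agent.get_agent_response(df, query)
    cleaned_thoughts = csv_agent.process_agent_thoughts(captured_output)
    csv_agent.display_agent_thoughts(cleaned_thoughts)  # rendered below the form
    csv_agent.update_chat_history(query, result)

# Everything written inside this block appears in the container created first,
# i.e. above the query form, including the answer that was just produced.
with history_container:
    csv_agent.display_chat_history()

if st.session_state.df is not None:
    st.subheader("Current dataframe:")
    st.write(st.session_state.df)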

Maximum context length error

Hey,
Thanks for sharing your project.

I tried uploading a few CSV files, though with all of them I got the following error:
Error: This model's maximum context length is 4097 tokens. However, your messages resulted in 6317 tokens. Please reduce the length of the messages.

That's after uploading the file, and prompting "Hello".

Does your script chunk the uploaded csv file?

Thanks,
Yedidya
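
One way to check whether chunking (or the lack of it) is the problem is to count the tokens of whatever the app sends as context. A sketch using tiktoken, which is already among the project's dependencies according to another issue here (the chunks list is a placeholder):

import tiktoken

# cl100k_base is the encoding used by gpt-3.5-turbo and text-embedding-ada-002.
enc = tiktoken.get_encoding("cl100k_base")

def n_tokens(text):
    return len(enc.encode(text))

chunks = ["example chunk 1", "example chunk 2"]  # replace with the app's retrieved chunks

print("largest chunk:", max(n_tokens(c) for c in chunks), "tokens")
print("total context:", sum(n_tokens(c) for c in chunks), "tokens (limit is 4097 incl. the reply)")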

Main UI doesn't see the full CSV file

What is the difference in logic between the main chat textbox and the agent processing? I loaded a ten-line CSV. When I ask both, the main box says the file has four rows; the agent (correctly) says ten. The main UI textbox gets every query wrong. Why does it not see the full file contents?

Cannot use Azure OpenAI APIs with the chatbot

Awesome chatbot! I tried to add the Azure OpenAI APIs and made these changes in chatbot.py:

from langchain.chat_models import AzureChatOpenAI
llm = AzureChatOpenAI(model_name=self.model_name, temperature=self.temperature, deployment_name = 'My_Test')

And I also added this to 1_📄Robby-Chat.py:

    os.environ["OPENAI_API_KEY"] = user_api_key
    os.environ["OPENAI_API_TYPE"] = "azure"
    os.environ["OPENAI_API_BASE"] = 'https://my_test_chat.openai.azure.com/'
    os.environ["OPENAI_API_VERSION"] = "2023-05-15"

But I am getting an error message when I try to ask the bot:
Error: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.
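
For reference, that error usually means deployment_name does not match a deployment actually created in the Azure OpenAI resource (it must be the deployment's name from Azure OpenAI Studio, not the model's name). A sketch of a combination that typically works; all values are placeholders, not confirmed for this resource:

import os
from langchain.chat_models import AzureChatOpenAI

user_api_key = "..."  # the Azure OpenAI resource key, not an openai.com key

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://my_test_chat.openai.azure.com/"
os.environ["OPENAI_API_VERSION"] = "2023-05-15"
os.environ["OPENAI_API_KEY"] = user_api_key

llm = AzureChatOpenAI(
    deployment_name="My_Test",        # must equal the deployment name shown in Azure OpenAI Studio
    openai_api_version="2023-05-15",
    temperature=0,
)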
