daethyra / build-ragai Goto Github PK

AI-powered Python components and notebooks for leveraging Large Language Models from OpenAI and Hugging Face.

License: Other

Python 1.47% Jupyter Notebook 98.53%

ai artificial-intelligence artificial-intelligence-projects components framework huggingface jupyter-notebooks langchain langserve langsmith large-language-models machine-learning natural-language-processing openai prompt-examples prompt-template python rag retrieval-augmented-generation transformers

build-ragai's Introduction

Hi there 👋 I'm Daethyra (pronounced: duh-thear-uh)

A bit about me...

🏳️‍⚧️ Pronouns: she/her
🔭 I’m currently working on transitioning from cybersecurity and into software development.
🌱 I’m currently building FreeStream, a Streamlit Multi-Page App with various Chatbots for many use-cases.
👯 I love collaborating on open-source projects with a vision to make peoples' lives better.
🤔 More specifically, I want to build cool stuff for others that automates the mundane.
💬 Ask me about my favorite video game, or which games I'm playing recently.
⚡ Fun fact: I love Star Wars! Guess my favorite trilogy.

Tech Stack

Notable Projects

FreeStream

Description: A web application where you can access Claude Opus and GPT-4 for free, making use of the different chatbot architectures I've set up. The first example is focused on retrieval augmented generation and requires you to upload files for the AI to generate answers from. The second chatbot, so far, is a general-purpose chatbot, and the benefit to using FreeStream is that there's no chat-length limits and you can "drop-in" your choice of large language model from foundational model providers OpenAI, Anthropic, and Google.

Build-RAGAI

Description: A collection of Jupyter Notebooks and Python components to leverage LangChain, OpenAI, and Transformers for building generative AI applications, providing reusable code snippets, tutorials, and end-to-end examples.

build-ragai's People

Contributors

Stargazers

Watchers

Forkers

techfluent-au hafez-cs

build-ragai's Issues

Correct logical errors in all LLM-Utilikit modules

List:

Add notebook: privacy-rag-over-code

Subdir: "notebooks": unchecked

Must ensure all notebooks in langchain/ subdir actually work

Add ref and explanation of llm_utilikit directory

README.md is unfinished

Requires a reference and explanation for each sub directory of llm_utilikit

Add 'awesome' lists

Already collected, @Daethyra: look in personal "Building AI" list to find the right ones.

Integrate ABR repo

Move all code, abandon repository

Need README for docs/jupyter_notebooks

https://github.com/Daethyra/LLM-Utilikit/tree/v1.0.21/docs/jupyter_notebooks

Pyproject.toml missing requirements

Use PDM to add safetensors and
fk I forgot the other one

Suggestion for `query_local_docs.py` code refactorization

! Contains hallucinated code !

Minimum one instance:
-> from langchain.retrievers import VectorStoreRetriever

import os from typing import List, Tuple from langchain.document_loaders import PyPDFLoader from langchain.text_splitters import RecursiveCharacterTextSplitter from langchain.embeddings import OpenAIEmbeddings from langchain.vectorstores import Chroma from langchain.retrievers import VectorStoreRetriever from langchain.chat_models import ChatOpenAI from langchain.schema.output_parser import StrOutputParser from langchain.schema.runnable import RunnablePassthrough from langchain.prompts import ChatPromptTemplate class DocumentRetrievalChatbot: def __init__(self, pdf_directory: str, persist_directory: str = "./chroma_db"): self.pdf_directory = pdf_directory self.persist_directory = persist_directory self.db = self._initialize_chroma_db() self.retriever = VectorStoreRetriever(self.db) self.chat = self._initialize_chat_model() def _initialize_chroma_db(self): loader = PyPDFLoader(self.pdf_directory, recursive=True) documents = loader.load() text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000) docs = text_splitter.split_documents(documents) embedding_function = OpenAIEmbeddings() db = Chroma.from_documents(docs, embedding_function, persist_directory=self.persist_directory) return db def _initialize_chat_model(self): output_parser = StrOutputParser() template = ChatPromptTemplate() chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0) return chat def get_responses(self, query: str, top_k: int = 5) -> str: retrieved_docs = self.retriever.retrieve(query, top_k=top_k) responses = [] for doc in retrieved_docs: response = self.chat.generate_response(query, doc.page_content) responses.append(response.text) return " ".join(responses) def run_query_loop(self): while True: query = input("Enter your query (or 'q' to quit): ") if query.lower() == "q": break response = self.get_responses(query) print("Response:", response) if __name__ == "__main__": pdf_directory = "data/" bot = DocumentRetrievalChatbot(pdf_directory) bot.run_query_loop()

Sift through Gists

I collected my Gists that I believe would be helpful in tweaking the current file base of AA4LLM

Gists-LLM-Utilikit-Free_Code_Examples.zip

Copy/Credit this prompt CSV

I'll need to personally split commands based on role

https://github.com/f/awesome-chatgpt-prompts/blob/main/prompts.csv

Check my Gists for more useful notebooks

Assistant Architect Hallucination - Resolve: LangChain - Tracing

Collaboration preparation

Need

issue template
'collaborating' markdown doc

integrable_image_captioner.py

Leftover work: integrable_image_captioner.py:

Code Improvements:

Unused Imports:

Modules like json and dotenv are imported but not used. Consider removing them if they are not necessary.

Global Variable Usage:

The usage of config['ENDING_CAPTION'] in the main function seems incorrect. You might need to use ending_caption instead, as it is the variable holding the relevant environment variable value.

CSV File Handling:

In save_to_csv, you open csvfile as a new file every time. If csvfile is passed as an argument, it should be used instead of opening a new file. Also, consider parameterizing the write mode ('a' for appending) to allow for more flexibility.

Error Handling in Main:

While you have a try-except block, it might be useful to continue processing other images even if one fails. Consider moving the try-except inside the loop to handle errors on a per-image basis.

Entire subdir: unchecked | requires meticulous review

fine_tune_sequence_classification_model.py

AA4LLM Document Contents Review

Double check "continued education" directory for useful learning

define, "useful"

AA4LLM | Add Agents, Toolkits: In LangChain overview doc: describe `(["agents","tools",])`

ST Prompt 1 - User Role

{} = replace_me

"What is the idiomatic way to {do the thing you want to do}
in {language in question}?"

Idea for prompt: YouTube video summarization

Please watch this video and summarize its main points in bullet points. Use clear and concise language. Provide relevant examples and explanations from the video. Include the following information:

The topic and purpose of the video
The main arguments or claims made by the speaker
The evidence or support provided for the arguments or claims
The speaker’s tone and attitude towards the topic
The intended audience and message of the video

video = "https://youtu.be/2F9itktands?si=DXyOmSHePtEip_lO"

Review Blind Programming subdirectory

Review all Transformers code for merge-ability

I'm not sure what code should be carried over. A lot of the repo's contents are monolithic in nature, composing excessive amounts of functionalities in a few classes per file.
Since this repo is moving to a more simple, understandable structure, I will likely not make use of much of the codebase. Not only does most of it not work, but most of it is also poorly written with no clear vision in mind.

Needs directory organization

needs folder organization for
-- multi-shot examples
-- user examples
-- system examples

Pull prompt ideas from this article

Click here

Reviewed article and copied prompts on 11-28-23

`query_local_docs.py` does not return LLM Response | Add LangSmith tracing v2

return LLM query-response
Add LangSmith tracing v2 via .env

Release 1 requirements

Before release 1:

Run PDM init and ensure the pyprojecttoml looks good.
Take all of the prompts of all role types and move them into a single master sheet, still separate from the cheatsheet

Subdir: "codesnippets": unchecked

Brackets to control "role"-type

See the prompt quoted below:

[Note regarding future user inputs]:"""
I will use brackets, '[]' to specify either a literal command(like, [PROCEED]) OR a context-conveyor followed by a string(like, [User message]:"Sample text").
"""

[PROCEED]

Review all LangChain code for merge-ability

flake8 results

Traceback for "flake8 .\src\llm_utilikit\langchain":

.\src\llm_utilikit\langchain\codesnippets\bufferwindow_memory.py:1:1: F401 'langchain.memory.ConversationBufferWindowMemory' imported but unused
.\src\llm_utilikit\langchain\codesnippets\bufferwindow_memory.py:8:80: E501 line too long (98 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\bufferwindow_memory.py:11:80: E501 line too long (84 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\bufferwindow_memory.py:15:80: E501 line too long (95 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\bufferwindow_memory.py:21:14: F821 undefined name 'langchain'
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:3:1: F401 'langchain.prompts.chat.ChatPromptTemplate' imported but unused
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:3:1: F401 'langchain.prompts.chat.HumanMessagePromptTemplate' imported but unused
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:3:1: F401 'langchain.prompts.chat.SystemMessagePromptTemplate' imported but unused
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:24:80: E501 line too long (95 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:25:80: E501 line too long (108 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:26:80: E501 line too long (124 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:27:80: E501 line too long (93 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:33:80: E501 line too long (104 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:36:80: E501 line too long (98 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:37:80: E501 line too long (94 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:38:80: E501 line too long (83 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:40:80: E501 line too long (124 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:45:80: E501 line too long (89 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\chatopenai.py:46:80: E501 line too long (93 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\multi_queryvector_retrieval.py:5:80: E501 line too long (81 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\multi_queryvector_retrieval.py:8:80: E501 line too long (85 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\multi_queryvector_retrieval.py:10:80: E501 line too long (118 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\multi_queryvector_retrieval.py:11:80: E501 line too long (135 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\multi_queryvector_retrieval.py:14:80: E501 line too long (95 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\multi_queryvector_retrieval.py:17:80: E501 line too long (90 > 79 characters)
.\src\llm_utilikit\langchain\codesnippets\multi_queryvector_retrieval.py:40:51: F821 undefined name 'question'
.\src\llm_utilikit\langchain\end2end\chatbots\streamlit\st_chat.py:15:80: E501 line too long (87 > 79 characters)
.\src\llm_utilikit\langchain\end2end\chatbots\streamlit\st_with_memory.py:1:80: E501 line too long (82 > 79 characters)
.\src\llm_utilikit\langchain\end2end\chatbots\streamlit\st_with_memory.py:2:1: F811 redefinition of unused 'StreamlitChatMessageHistory' from line 1
.\src\llm_utilikit\langchain\end2end\chatbots\streamlit\st_with_memory.py:20:80: E501 line too long (86 > 79 characters)
.\src\llm_utilikit\langchain\end2end\rag\faiss_retriever.py:2:80: E501 line too long (99 > 79 characters)
.\src\llm_utilikit\langchain\end2end\rag\faiss_retriever.py:3:80: E501 line too long (102 > 79 characters)
.\src\llm_utilikit\langchain\end2end\rag\faiss_retriever.py:4:80: E501 line too long (100 > 79 characters)
.\src\llm_utilikit\langchain\end2end\rag\faiss_retriever.py:5:80: E501 line too long (93 > 79 characters)
.\src\llm_utilikit\langchain\end2end\rag\faiss_retriever.py:15:1: F401 'langchain.agents.agent_toolkits.create_conversational_retrieval_agent' imported but unused
.\src\llm_utilikit\langchain\end2end\rag\faiss_retriever.py:15:80: E501 line too long (81 > 79 characters)
.\src\llm_utilikit\langchain\end2end\rag\faiss_retriever.py:34:44: F821 undefined name 'memory_key'
.\src\llm_utilikit\langchain\end2end\rag\faiss_retriever.py:34:60: F821 undefined name 'llm'
.\src\llm_utilikit\langchain\end2end\rag\pinecone\application.py:9:80: E501 line too long (85 > 79 characters)
.\src\llm_utilikit\langchain\end2end\rag\pinecone\application.py:15:80: E501 line too long (88 > 79 characters)
.\src\llm_utilikit\langchain\end2end\rag\pinecone\application.py:22:80: E501 line too long (85 > 79 characters)
.\src\llm_utilikit\langchain\end2end\rag\pinecone\application.py:29:80: E501 line too long (83 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:16:80: E501 line too long (89 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:36:80: E501 line too long (133 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:37:80: E501 line too long (92 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:49:80: E501 line too long (86 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:56:31: F821 undefined name 'retry_if_value_error'
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:64:80: E501 line too long (85 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:67:80: E501 line too long (88 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:122:46: F821 undefined name 'hub'
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:122:80: E501 line too long (82 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:142:80: E501 line too long (149 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:151:36: F821 undefined name 'cosine_similarity'
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:161:80: E501 line too long (114 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\qa_local_docs.py:162:80: E501 line too long (85 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\run_qa_local_docs.py:33:80: E501 line too long (88 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\run_qa_local_docs.py:36:80: E501 line too long (110 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\run_qa_local_docs.py:44:64: W605 invalid escape sequence '\ '
.\src\llm_utilikit\langchain\rag-with-agents\directoryloader\run_qa_local_docs.py:44:65: W291 trailing whitespace
.\src\llm_utilikit\langchain\rag-with-agents\pdf_only\query_local_docs.py:34:80: E501 line too long (81 > 79 characters)
.\src\llm_utilikit\langchain\rag-with-agents\pdf_only\query_local_docs.py:56:80: E501 line too long (88 > 79 characters)

ChatGPT Context reset + contextual continuity

Prompt =

Request: Create a detailed and informative markdown guide that will serve as a reference for the next AI assistant. This guide should encapsulate the essence of our ongoing project and articulate specific steps taken and future actions required.

Purpose: To provide the next AI assistant with comprehensive context and clear understanding about the project related to web scraping LangChain documentation.

Requirements for the Guide:

Project Overview: Summarize the initial objective - downloading LangChain documentation and the evolution towards developing a web scraping script.
Developed Tools:
Describe the enhanced web scraping script, highlighting its purpose, key features, and user instructions.
Mention the JSON file template, its role in the project, and user responsibilities for its utilization.
Next Steps for the User: Outline specific actions the user must take, including populating the JSON template with URLs and running the script.
Ethical and Legal Considerations: Emphasize the importance of adhering to legal and ethical web scraping practices, including the compliance with robots.txt of target websites.
Monitoring and Troubleshooting: Suggest steps for monitoring the script's output and handling potential issues.
Format: Markdown, for readability and structured documentation.

Goal: To ensure seamless continuity and understanding for the next AI assistant, enabling them to provide effective and pertinent assistance to the user.

Always point to GPT-3.5-Turbo-1106

hard encode into code examples for reproducibility
consider updating versions every snapshot release of GPT-3.5-Turbo-{MMDD}

transcribe_microphone.py

I did not personally find other things worth checking than Assistant Architect did

transcribe_microphone.py

Initialization of ASR Pipeline:

Suggested, not necessary:
The use of the transformers library for the ASR pipeline seems appropriate. However, ensure that the specific model (openai/whisper-large-v2) and the parameters (chunk_length_s, return_timestamps) are supported by the library version you are using.

Audio Processing Logic:

The sliding window concept is a sensible approach for handling real-time audio data. However, there seems to be inconsistency in the way the sliding window is managed after transcription. After the first 30 seconds are transcribed, the remaining part of the window should be retained, not entirely reset.
The sliding window length check (if len(self.sliding_window) >= 16000 * self.asr_pipeline.task.config.chunk_size_ms / 1000:) appears to be incorrectly using chunk_size_ms. You should confirm the existence and correct usage of this attribute in the transformers documentation.

Error Handling:

The script performs error handling and logging for file operations and stream activities, which is good practice.
Consider adding more specific error handling around the ASR pipeline's processing, as this could fail or raise exceptions not currently caught.

Logging and File Writing:

The logic for handling log files could be improved. For example, the check for the log file's existence and writability is repetitive and could be simplified.
The method create_new_log_file does not handle potential exceptions that might occur during file operations.

Resource Management:

Ensure that resources like the PyAudio stream are appropriately closed or released in all scenarios, including exceptions.

run.py

Argument Parsing and Logging:

Argument parsing is correctly implemented.
The logging setup within a file context (with open(...)) is unnecessary, as logging.basicConfig handles file operations internally.

ASR Pipeline and Stream Checks:

The check asr_app.asr_pipeline.is_running() might not be valid. The pipeline object from transformers does not typically have an is_running method. Verify this based on the library's documentation.

Exception Handling:

Good use of try-except blocks to handle unexpected errors and keyboard interrupts.
Consider logging the exception details for better debugging.

Resource Management:

The script ensures that resources are closed in the finally block, which is a good practice.

System prompt

Double check both "prompt" documents for quality and clarity

transcribe_tasks.py

Abstract OpenAI utility

Outsource the API calls

Ensure logging, retries
Follow this todo list

'docs/' subdir needs README

Add descriptive Docstrings to all retained code

langchain, openai, transformers

RAG Prompt/Chain that should be added to AA4LLM's code examples

https://smith.langchain.com/hub/rlm/map-prompt?organizationId=0f7461cf-206f-5c85-aa8d-48c6c48bafc5

RAG Prompt/Chain that should be added to AA4LLM's code examples

Jina embeddings + vector store module

import os
from git import Repo
from langchain.document_loaders.generic import GenericLoader
from langchain.document_loaders.parsers import LanguageParser
from langchain.text_splitter import Language
from langchain.embeddings.jina import JinaEmbeddings
from langchain.vectorstores.chroma import Chroma

def clone_repository(repo_url, repo_path):
    """
    Clones a git repository to the specified path.
    """
    repo = Repo.clone_from(repo_url, to_path=repo_path)
    return repo

def load_code_files(repo_path):
    """
    Loads code files from the specified repository path using LanguageParser.
    """
    loader = GenericLoader.from_filesystem(
        repo_path,
        glob="**/*",
        suffixes=[".py"],
        parser=LanguageParser(language=Language.PYTHON),
    )
    documents = loader.load()
    return documents

def split_documents(documents):
    """
    Splits the documents into chunks using RecursiveCharacterTextSplitter.
    """
    splitter = RecursiveCharacterTextSplitter.from_language(
        language=Language.PYTHON, chunk_size=2000, chunk_overlap=200
    )
    chunks = splitter.split_documents(documents)
    return chunks

def embed_chunks(chunks):
    """
    Embeds the chunks using JinaEmbeddings.
    """
    embeddings = JinaEmbeddings()
    vectorstore = Chroma.from_documents(chunks, embeddings)
    return vectorstore

def save_vectorstore(vectorstore, chromadb_path):
    """
    Saves the vectorstore to ChromaDB.
    """
    vectorstore.save(chromadb_path)

def cleanup_repository(repo_path):
    """
    Cleans up the cloned repository.
    """
    repo = Repo(repo_path)
    repo.close()
    os.remove(repo_path)

def prepare_vector_db(repo_url, repo_path, chromadb_path):
    """
    Prepares a vector database for similarity searching for RAG over code.
    """
    # Clone the repository
    clone_repository(repo_url, repo_path)

    # Load the code files
    documents = load_code_files(repo_path)

    # Split the documents into chunks
    chunks = split_documents(documents)

    # Embed the chunks
    vectorstore = embed_chunks(chunks)

    # Save the vectorstore to ChromaDB
    save_vectorstore(vectorstore, chromadb_path)

    # Clean up the cloned repository
    cleanup_repository(repo_path)

Add Notebook: rag-in-langchain

Repurpose repo

New intention: specifically supply week documented custom wrapper classes in modules for building LLMs
-> LangChain Ecosystem updates (v0.1)

Read all JSON files via Glob

Updates to how output files are named cause errors to throw during the execution of `conv_html_to_markdown.py`.

Solution:

Import glob: At the top of conv_html_to_markdown.py, add import glob to use the glob module for file pattern matching.
Update load_json Function:

Rename it to load_json_files to reflect its new functionality.
Use glob.glob to find all files matching the output-*.json pattern.
Iterate over these files, load their contents, and aggregate the data.

import glob

def load_json_files(pattern):
    """
    Load data from multiple JSON files matching a pattern.

    Args:
        pattern (str): Glob pattern to match files.

    Returns:
        list: Aggregated data from all matched files.
    """
    aggregated_data = []
    for file_path in glob.glob(pattern):
        with open(file_path, "r", encoding="utf-8") as file:
            aggregated_data.extend(json.load(file))
    return aggregated_data

def main():
    # ... existing code ...
    try:
        # Load data from all output JSON files
        original_data = load_json_files("output-*.json")
        # ... rest of the existing code ...

Bing intro

Absolutely, here's a more developer-oriented introduction:

"Welcome to the LLM-Utilikit repository. This toolkit is a collection of prompts and components designed to streamline your work with Large Language Models (LLMs). It's built with developers in mind, providing pre-configured prompts and back-end modules to help you get up and running quickly with OpenAI, LangChain, Hugging Face, or Pinecone. The LLM-Utilikit is open-source, so feel free to contribute and help us improve it. Let's build something amazing together with the power of LLMs."

Subdir: "end2end": ✅

Subdir: "rag-with-agents": unchecked

query_local_docs.py
qa_local_docs.py / run_qa_local_docs.pt

Ensure code works: "OpenAI"

The OpenAI directory doesn't contain much code anymore. Only two files currently available.

daethyra / build-ragai Goto Github PK

build-ragai's Introduction

Hi there 👋 I'm Daethyra (pronounced: duh-thear-uh)

A bit about me...

Tech Stack

Notable Projects

FreeStream

Build-RAGAI

build-ragai's People

Contributors

Stargazers

Watchers

Forkers

build-ragai's Issues

! Contains hallucinated code !

Leftover work: integrable_image_captioner.py:

Code Improvements:

Unused Imports:

Global Variable Usage:

CSV File Handling:

Error Handling in Main:

transcribe_microphone.py

Initialization of ASR Pipeline:

Audio Processing Logic:

Error Handling:

Logging and File Writing:

Resource Management:

run.py

Argument Parsing and Logging:

ASR Pipeline and Stream Checks:

Exception Handling:

Resource Management:

Outsource the API calls

Updates to how output files are named cause errors to throw during the execution of conv_html_to_markdown.py.

Recommend Projects

Recommend Topics

Recommend Org

Updates to how output files are named cause errors to throw during the execution of `conv_html_to_markdown.py`.