Supabase example:
- https://www.youtube.com/watch?v=ibzlEQmgPPY&t=1s
- https://github.com/supabase-community/chatgpt-your-files
- https://supabase.com/docs/guides/ai (also based on pgvector and Postgres)
PgVector-Python:
- https://github.com/pgvector/pgvector-python
- https://github.com/supabase/vecs/blob/main/src/vecs/collection.py
- https://tembo.io/blog/vector-indexes-in-pgvector/ (pgvector indexes)
- ✅ Timescale: PostgreSQL as a Vector Database: Create, Store, and Query OpenAI Embeddings With pgvector
- https://github.com/timescale/vector-cookbook
- https://supabase.com/docs/guides/ai/vector-columns?database-method=sql
- https://www.crunchydata.com/blog/topic/ai
- https://github.com/CrunchyData/Postgres-AI-Tutorial
- https://www.crunchydata.com/blog/whats-postgres-got-to-do-with-ai
- https://www.crunchydata.com/blog/pgvector-performance-for-developers
- https://www.crunchydata.com/blog/scaling-vector-data-with-postgres
- ✅ psycopg (Psycopg 3) is the successor to psycopg2
- HNSW index:
- AWS RDS Postgres support:
- https://aws.amazon.com/about-aws/whats-new/2023/10/amazon-rds-postgresql-pgvector-hnsw-indexing/
- https://aws.amazon.com/about-aws/whats-new/2023/05/amazon-rds-postgresql-pgvector-ml-model-integration/
- ✅ PostgreSQL version 13.13 on Amazon RDS
- https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_UpgradeDBInstance.PostgreSQL.html
- ✅⭐ https://aws.amazon.com/blogs/database/accelerate-hnsw-indexing-and-searching-with-pgvector-on-amazon-aurora-postgresql-compatible-edition-and-amazon-rds-for-postgresql/
- https://aws.amazon.com/blogs/machine-learning/text-embedding-and-sentence-similarity-retrieval-at-scale-with-amazon-sagemaker-jumpstart/
OpenAI Cookbooks on Vector Databases:
Hybrid Search:
- ✅ https://weaviate.io/blog/hybrid-search-explained
- https://github.com/pgvector/pgvector#hybrid-search
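
Note to self on how hybrid search results get merged: one common approach is Reciprocal Rank Fusion (RRF), where each document scores `1 / (k + rank)` in every result list it appears in and the scores are summed. A minimal sketch in plain Python, assuming the keyword and vector queries have already run and returned ranked ids. The function name, the document ids, and `k=60` are illustrative assumptions, not taken from any library above.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a commonly used default; an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document ranked near the top of any list gets a larger share.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword (full-text) query and a vector query.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists (`doc_a`, `doc_c`) rise above those found by only one retriever, which is the point of fusing by rank instead of by raw score.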
ChromaDB:
- AWS deployment: https://docs.trychroma.com/deployment#simple-aws-deployment
- https://colab.research.google.com/drive/181Kummxd8yOyRqFu8I0aqjs2aqnOy4Fu?usp=sharing#scrollTo=6lfVmRQlepiI
Vector DB Comparison:
RAG:
- https://www.youtube.com/watch?v=wBhY-7B2jdY
- https://www.canva.com/design/DAFw0D8y038/5Yh9MA2XXd2Lfr2thcsuLA/edit
- ⭐ https://towardsdatascience.com/advanced-retrieval-augmented-generation-from-theory-to-llamaindex-implementation-4de1464a9930
- ⭐ https://howaibuildthis.substack.com/archive?sort=new
- ⭐ https://towardsdatascience.com/12-rag-pain-points-and-proposed-solutions-43709939a28c
- https://blog.llamaindex.ai/introducing-llamacloud-and-llamaparse-af8cedf9006b
- https://martinfowler.com/articles/engineering-practices-llm.html
- https://medium.com/towards-data-science/retrieval-augmented-generation-rag-from-theory-to-langchain-implementation-4e9bd5f6a4f2
- SQL Database: https://vanna.ai/docs/
Chunking (Token size):
- https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken
- https://python.langchain.com/docs/modules/data_connection/document_transformers/#text-splitters
- ✅ Pinecone: Chunking Strategies for LLM Applications
- https://python.langchain.com/docs/modules/data_connection/document_transformers/
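
Rough sketch of fixed-size chunking with overlap, for reference. It assumes the text is already tokenized; whitespace splitting stands in for a real tokenizer such as tiktoken, so the counts here are words, not model tokens. The function name and parameters are hypothetical.

```python
def chunk_tokens(tokens, chunk_size, overlap=0):
    """Split a token list into windows of chunk_size, sharing `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Stand-in tokenizer: whitespace split (an assumption, not a real model tokenizer).
text = "one two three four five six seven eight nine ten"
tokens = text.split()

chunks = chunk_tokens(tokens, chunk_size=4, overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

Swapping the whitespace split for a real tokenizer's encode/decode pair keeps the windowing logic unchanged; only the meaning of "token" differs.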
RAG Evaluation:
- https://denyslazarenko.github.io/2024/01/14/rag_pipeline.html
- https://towardsdatascience.com/evaluating-rag-applications-with-ragas-81d67b0ee31a
Pydantic for LLM: