
vectordb's Introduction


A 10x faster, cheaper, and better vector database

Documentation | Discord | Twitter | Blog | YouTube | Feedback


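Epsilla is an open-source vector database. Our focus is on making vector search scalable, high-performance, and cost-effective. EpsillaDB bridges the gap between information retrieval and memory retention in large language models.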

Quick Start using Docker

1. Run Backend in Docker

docker pull epsilla/vectordb
docker run --pull=always -d -p 8888:8888 -v /data:/data epsilla/vectordb
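Before moving on to the client, it can help to confirm that the container is actually listening on the published port. The following is a minimal sketch using only the Python standard library; the helper name `is_port_open` is hypothetical and not part of Epsilla, and the host and port simply mirror the `-p 8888:8888` mapping above. It only checks that a TCP connection can be opened, not that the database is fully initialized.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumes the container from the step above publishes port 8888 on localhost.
    print(is_port_open("localhost", 8888))
```

If this prints `False`, `docker ps` and `docker logs` on the container are the usual next things to look at.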

2. Interact with Python Client

pip install pyepsilla

from pyepsilla import vectordb

client = vectordb.Client(host='localhost', port='8888')
client.load_db(db_name="MyDB", db_path="/data/epsilla")
client.use_db(db_name="MyDB")

client.create_table(
    table_name="MyTable",
    table_fields=[
        {"name": "ID", "dataType": "INT", "primaryKey": True},
        {"name": "Doc", "dataType": "STRING"},
    ],
    indices=[
        {"name": "Index", "field": "Doc"},
    ]
)

client.insert(
    table_name="MyTable",
    records=[
        {"ID": 1, "Doc": "Jupiter is the largest planet in our solar system."},
        {"ID": 2, "Doc": "Cheetahs are the fastest land animals, reaching speeds over 60 mph."},
        {"ID": 3, "Doc": "Vincent van Gogh painted the famous work \"Starry Night.\""},
        {"ID": 4, "Doc": "The Amazon River is the longest river in the world."},
        {"ID": 5, "Doc": "The Moon completes one orbit around Earth every 27 days."},
    ],
)

client.query(
    table_name="MyTable",
    query_text="Celestial bodies and their characteristics",
    limit=2
)

# Result
# {
#     'message': 'Query search successfully.',
#     'result': [
#         {'Doc': 'Jupiter is the largest planet in our solar system.', 'ID': 1},
#         {'Doc': 'The Moon completes one orbit around Earth every 27 days.', 'ID': 5}
#     ],
#     'statusCode': 200
# }
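The response shown above is a plain dictionary, so pulling out the matched documents takes only a few lines. The sketch below works on a copy of that sample response rather than a live server; `extract_docs` is a hypothetical helper, not part of pyepsilla. Depending on the client version, `client.query` may return the status code and the response body together (for example as a tuple), so treat the exact return shape as an assumption and adapt accordingly.

```python
def extract_docs(response: dict) -> list:
    """Return the 'Doc' field of every hit, or raise if the query did not succeed."""
    if response.get("statusCode") != 200:
        raise RuntimeError(f"query failed: {response.get('message')}")
    return [row["Doc"] for row in response.get("result", [])]


# Sample response copied from the result shown above (not fetched from a server).
sample_response = {
    "message": "Query search successfully.",
    "result": [
        {"Doc": "Jupiter is the largest planet in our solar system.", "ID": 1},
        {"Doc": "The Moon completes one orbit around Earth every 27 days.", "ID": 5},
    ],
    "statusCode": 200,
}

docs = extract_docs(sample_response)
for doc in docs:
    print(doc)
```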

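Features:

  • High-performance, production-scale similarity search for embedding vectors.

  • A full-fledged database management system with the familiar concepts of databases, tables, and fields. A vector is just another field type.

  • Metadata filtering.

  • Hybrid search that fuses dense and sparse vectors.

  • Built-in embedding support for a natural-language-in, natural-language-out search experience.

  • Cloud-native architecture with compute-storage separation, serverless operation, and multi-tenancy.

  • Rich ecosystem integrations, including LangChain and LlamaIndex.

  • Python/JavaScript/Ruby clients and a REST API.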

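Epsilla's core is written in C++ and uses advanced parallel graph traversal techniques from academic research for vector indexing, achieving vector search 10 times faster than HNSW while maintaining precision levels above 99.9%.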
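A precision figure for an approximate index is usually read against exact brute-force search: of the true k nearest neighbors, what fraction did the index return? The sketch below illustrates that measurement in pure Python on toy data. It is not Epsilla's benchmark code, the function names are hypothetical, and the "approximate" result is hard-coded purely for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_top_k(vectors, query, k):
    """Brute-force ground truth: ids of the k vectors closest to the query."""
    ranked = sorted(vectors, key=lambda item_id: euclidean(vectors[item_id], query))
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the approximate result contains."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)


# Toy data: id -> vector.
vectors = {1: [0.0, 0.0], 2: [1.0, 0.0], 3: [0.0, 2.0], 4: [5.0, 5.0]}
query = [0.0, 0.0]

truth = exact_top_k(vectors, query, k=2)   # ids 1 and 2 are the closest
approx = [1, 3]                            # pretend output of an approximate index
recall = recall_at_k(approx, truth)
print(truth, recall)
```

In a real evaluation the same ratio would be averaged over many queries against a much larger dataset.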

Epsilla Cloud

Try our fully managed vector DBaaS at Epsilla Cloud

(Experimental) Use Epsilla as a Python library without starting a Docker image

1. Build the Epsilla Python bindings lib package

cd engine/scripts
# If on Ubuntu, run this first: bash setup-dev.sh
bash install_oatpp_modules.sh
cd ..
bash build.sh
ls -lh build/*.so

2. Run the test with the Python bindings libs "epsilla.so" and "libvectordb_dylib.so" from the "build" folder built in the previous step

cd engine
export PYTHONPATH=./build/
export DB_PATH=/tmp/db33
python3 test/bindings/python/test.py

Here is some sample code:

import epsilla

epsilla.load_db(db_name="db", db_path="/data/epsilla")
epsilla.use_db(db_name="db")

epsilla.create_table(
    table_name="MyTable",
    table_fields=[
        {"name": "ID", "dataType": "INT", "primaryKey": True},
        {"name": "Doc", "dataType": "STRING"},
        {"name": "EmbeddingEuclidean", "dataType": "VECTOR_FLOAT", "dimensions": 4, "metricType": "EUCLIDEAN"}
    ]
)

epsilla.insert(
    table_name="MyTable",
    records=[
        {"ID": 1, "Doc": "Berlin", "EmbeddingEuclidean": [0.05, 0.61, 0.76, 0.74]},
        {"ID": 2, "Doc": "London", "EmbeddingEuclidean": [0.19, 0.81, 0.75, 0.11]},
        {"ID": 3, "Doc": "Moscow", "EmbeddingEuclidean": [0.36, 0.55, 0.47, 0.94]}
    ]
)

(code, response) = epsilla.query(
    table_name="MyTable",
    query_field="EmbeddingEuclidean",
    response_fields=["ID", "Doc", "EmbeddingEuclidean"],
    query_vector=[0.35, 0.55, 0.47, 0.94],
    filter="ID < 6",
    limit=10,
    with_distance=True
)
print(code, response)
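To see what ordering to expect from the query above, the Euclidean ranking can be reproduced by hand. This is a sketch in plain Python (3.8+ for `math.dist`) that mirrors the three records, the `ID < 6` filter, and the query vector. It does not call Epsilla, and the distance values the engine reports with `with_distance=True` may be scaled differently (for example, squared), which is an assumption worth checking against actual output.

```python
import math

# The same records and query vector as in the sample above.
records = [
    {"ID": 1, "Doc": "Berlin", "EmbeddingEuclidean": [0.05, 0.61, 0.76, 0.74]},
    {"ID": 2, "Doc": "London", "EmbeddingEuclidean": [0.19, 0.81, 0.75, 0.11]},
    {"ID": 3, "Doc": "Moscow", "EmbeddingEuclidean": [0.36, 0.55, 0.47, 0.94]},
]
query_vector = [0.35, 0.55, 0.47, 0.94]

# Apply the metadata filter first, then rank by Euclidean distance to the query.
candidates = [r for r in records if r["ID"] < 6]
ranked = sorted(
    ((math.dist(r["EmbeddingEuclidean"], query_vector), r["Doc"]) for r in candidates),
    key=lambda pair: pair[0],
)

for distance, doc in ranked:
    print(f"{doc}: {distance:.4f}")
```

Moscow comes first because its vector differs from the query in only one component, by 0.01; Berlin and London follow in that order.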

vectordb's People

Contributors

eric-epsilla, richard-epsilla, ricki-epsilla, topkeyboard, tonyyanga, yuanzhongqiao, andriymulyar, juliuslipp, jonherke
