The vectordb from yuanzhongqiao

A 10x faster, cheaper, and better vector database

Documentation • Discord • Twitter • Blog • YouTube • Feedback

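Epsilla is an open-source vector database. Our focus is on making vector search scalable, high-performance, and cost-effective. EpsillaDB bridges the gap between information retrieval and memory retention in large language models.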

Quick Start with Docker

1. Run the backend in Docker

docker pull epsilla/vectordb
docker run --pull=always -d -p 8888:8888 -v /data:/data epsilla/vectordb
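
Before moving on to the client, it can help to confirm that the container is accepting connections on the published port. The following is a minimal sketch using only the Python standard library. The helper name `is_port_open` is hypothetical and not part of Epsilla; it only checks that a TCP connection succeeds, not that the database itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 8888 is the host port published by the `docker run` command above.
    print("backend reachable:", is_port_open("localhost", 8888))
```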

2. Interact with the Python client

pip install pyepsilla

from pyepsilla import vectordb

# Connect to the backend started in step 1
client = vectordb.Client(host='localhost', port='8888')
client.load_db(db_name="MyDB", db_path="/data/epsilla")
client.use_db(db_name="MyDB")

# Create a table with an index on the Doc field
client.create_table(
    table_name="MyTable",
    table_fields=[
        {"name": "ID", "dataType": "INT", "primaryKey": True},
        {"name": "Doc", "dataType": "STRING"},
    ],
    indices=[
        {"name": "Index", "field": "Doc"},
    ]
)

# Insert a few sample documents
client.insert(
    table_name="MyTable",
    records=[
        {"ID": 1, "Doc": "Jupiter is the largest planet in our solar system."},
        {"ID": 2, "Doc": "Cheetahs are the fastest land animals, reaching speeds over 60 mph."},
        {"ID": 3, "Doc": "Vincent van Gogh painted the famous work \"Starry Night.\""},
        {"ID": 4, "Doc": "The Amazon River is the longest river in the world."},
        {"ID": 5, "Doc": "The Moon completes one orbit around Earth every 27 days."},
    ],
)

# Search with a natural-language query
client.query(
    table_name="MyTable",
    query_text="Celestial bodies and their characteristics",
    limit=2
)
# Result
# {
#     'message': 'Query search successfully.',
#     'result': [
#         {'Doc': 'Jupiter is the largest planet in our solar system.', 'ID': 1},
#         {'Doc': 'The Moon completes one orbit around Earth every 27 days.', 'ID': 5}
#     ],
#     'statusCode': 200
# }
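
Once a response comes back, the matching documents usually need to be pulled out of it. The sketch below assumes the response is a dictionary shaped exactly like the sample result above; how `client.query` actually returns it (for example, together with a separate status code) may differ between client versions. The helper name `extract_docs` is hypothetical.

```python
from typing import Any, Dict, List


def extract_docs(response: Dict[str, Any]) -> List[str]:
    """Return the 'Doc' field of each hit, or an empty list on a non-200 status."""
    if response.get("statusCode") != 200:
        return []
    return [hit["Doc"] for hit in response.get("result", [])]


# The sample response shown above, copied verbatim.
sample = {
    "message": "Query search successfully.",
    "result": [
        {"Doc": "Jupiter is the largest planet in our solar system.", "ID": 1},
        {"Doc": "The Moon completes one orbit around Earth every 27 days.", "ID": 5},
    ],
    "statusCode": 200,
}

for doc in extract_docs(sample):
    print(doc)
```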

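Features:

High-performance, production-scale similarity search for embedding vectors.
Full-fledged database management system with the familiar concepts of databases, tables, and fields. A vector is just another field type.
Metadata filtering.
Hybrid search that fuses dense and sparse vectors.
Built-in embedding support for a natural-language-in, natural-language-out search experience.
Cloud-native architecture with compute-storage separation, serverless operation, and multi-tenancy.
Rich ecosystem integrations, including LangChain and LlamaIndex.
Python/JavaScript/Ruby clients and a REST API interface.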
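
To give a feel for what fusing dense and sparse results can look like, here is a minimal sketch of reciprocal rank fusion (RRF), a generic and widely used fusion technique. This README does not say which fusion method Epsilla uses, so this is an illustration of the idea only, not Epsilla's implementation. The two input rankings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[int]], k: int = 60) -> List[int]:
    """Fuse ranked ID lists; each list contributes 1 / (k + rank) per ID."""
    scores: Dict[int, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = [1, 5, 3]   # hypothetical ranking from a dense-vector search
sparse_hits = [5, 2, 1]  # hypothetical ranking from a sparse (keyword) search
print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever are still kept.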

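Epsilla's core is written in C++ and uses advanced parallel graph traversal techniques from academic research for vector indexing, achieving vector search 10 times faster than HNSW while maintaining precision above 99.9%.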
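
The README does not say how the accuracy figure was measured. One common way to quantify the accuracy of an approximate index is recall@k against an exact brute-force search, sketched below in plain Python. The data is random and the "approximate" result is a stand-in built by swapping one hit out of the exact answer; nothing here calls Epsilla or reproduces its benchmark.

```python
import math
import random
from typing import List, Sequence


def exact_top_k(query: Sequence[float], vectors: List[Sequence[float]], k: int) -> List[int]:
    """Brute-force nearest neighbours by Euclidean distance; returns indices."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the exact top-k that the approximate result also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


random.seed(0)
vectors = [[random.random() for _ in range(4)] for _ in range(1000)]
query = [0.5, 0.5, 0.5, 0.5]

truth = exact_top_k(query, vectors, k=10)
# Stand-in for an index result: the exact answer with one hit swapped out.
missing = next(i for i in range(len(vectors)) if i not in truth)
approx = truth[:9] + [missing]
print(f"recall@10 = {recall_at_k(approx, truth):.2f}")
```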

Epsilla Cloud

Try our fully managed vector DBaaS at Epsilla Cloud

(Experimental) Use Epsilla as a Python library without starting a Docker image

1. Build the Epsilla Python bindings lib package

cd engine/scripts
# If on Ubuntu, run this first: bash setup-dev.sh
bash install_oatpp_modules.sh
cd ..
bash build.sh
ls -lh build/*.so

2. Run the test with the Python bindings libs "epsilla.so" and "libvectordb_dylib.so" from the "build" folder created in the previous step

cd engine
export PYTHONPATH=./build/
export DB_PATH=/tmp/db33
python3 test/bindings/python/test.py

Here is some sample code:

import epsilla

epsilla.load_db(db_name="db", db_path="/data/epsilla")
epsilla.use_db(db_name="db")

# Create a table with a 4-dimensional vector field that uses Euclidean distance
epsilla.create_table(
    table_name="MyTable",
    table_fields=[
        {"name": "ID", "dataType": "INT", "primaryKey": True},
        {"name": "Doc", "dataType": "STRING"},
        {"name": "EmbeddingEuclidean", "dataType": "VECTOR_FLOAT", "dimensions": 4, "metricType": "EUCLIDEAN"}
    ]
)

epsilla.insert(
    table_name="MyTable",
    records=[
        {"ID": 1, "Doc": "Berlin", "EmbeddingEuclidean": [0.05, 0.61, 0.76, 0.74]},
        {"ID": 2, "Doc": "London", "EmbeddingEuclidean": [0.19, 0.81, 0.75, 0.11]},
        {"ID": 3, "Doc": "Moscow", "EmbeddingEuclidean": [0.36, 0.55, 0.47, 0.94]}
    ]
)

# Vector query with a metadata filter; distances are included in the response
(code, response) = epsilla.query(
    table_name="MyTable",
    query_field="EmbeddingEuclidean",
    response_fields=["ID", "Doc", "EmbeddingEuclidean"],
    query_vector=[0.35, 0.55, 0.47, 0.94],
    filter="ID < 6",
    limit=10,
    with_distance=True
)
print(code, response)
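
As a sanity check on the query above, the expected ordering can be worked out by hand. The sketch below recomputes it in plain Python with `math.dist`, without calling Epsilla. It assumes the EUCLIDEAN metric ranks by ordinary Euclidean distance; the distance values the engine reports may use a different scale (for example, squared distance), so only the ordering is meant to be compared.

```python
import math

# Records and query vector copied from the sample above.
records = [
    {"ID": 1, "Doc": "Berlin", "EmbeddingEuclidean": [0.05, 0.61, 0.76, 0.74]},
    {"ID": 2, "Doc": "London", "EmbeddingEuclidean": [0.19, 0.81, 0.75, 0.11]},
    {"ID": 3, "Doc": "Moscow", "EmbeddingEuclidean": [0.36, 0.55, 0.47, 0.94]},
]
query_vector = [0.35, 0.55, 0.47, 0.94]

# Apply the same predicate as filter="ID < 6", then rank by Euclidean distance.
candidates = [r for r in records if r["ID"] < 6]
ranked = sorted(candidates, key=lambda r: math.dist(query_vector, r["EmbeddingEuclidean"]))

for r in ranked:
    distance = math.dist(query_vector, r["EmbeddingEuclidean"])
    print(r["ID"], r["Doc"], round(distance, 4))
```

Moscow differs from the query vector in only one component, so it comes first, followed by Berlin and then London.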
