This project forked from lanterndata/lantern

PostgreSQL vector database extension for building AI applications

Home Page: https://lantern.dev

License: MIT License

Shell 10.39% Python 3.76% C 67.50% Makefile 0.08% CMake 8.76% PLpgSQL 9.41% Dockerfile 0.11%

LanternDB 🏮

LanternDB is a relational and vector database, packaged as a Postgres extension. It provides a new index type for vector columns, called hnsw, which speeds up ORDER BY queries on the table.

Quickstart

Note: Currently LanternDB depends on pgvector for the vector data type. You'll need to manually install pgvector before moving to the next step.

LanternDB builds and uses usearch for its single-header, state-of-the-art HNSW implementation.

To build and install LanternDB:

git clone --recursive https://github.com/lanterndata/lanterndb.git
cd lanterndb
mkdir build
cd build
cmake ..
make install
# optionally
# make test

If you have previously cloned LanternDB and would like to update it:

```bash
git pull
git submodule update
```

To install on M1 Macs, replace cmake .. in the steps above with cmake -DUSEARCH_NO_MARCH_NATIVE=ON .. to avoid building usearch with the unsupported -march=native flag.

Using LanternDB

Run the following to enable lanterndb:

CREATE EXTENSION lanterndb;

Then, you can create a table with a vector column and populate it with data.

CREATE TABLE small_world (
    id varchar(3),
    vector vector(3)
);

INSERT INTO small_world (id, vector) VALUES
('000', '[0,0,0]'),
('001', '[0,0,1]'),
('010', '[0,1,0]'),
('011', '[0,1,1]'),
('100', '[1,0,0]'),
('101', '[1,0,1]'),
('110', '[1,1,0]'),
('111', '[1,1,1]');

Then, create an hnsw index on the table.

-- create index with default parameters
CREATE INDEX ON small_world USING hnsw (vector);
-- create index with custom parameters
-- CREATE INDEX ON small_world USING hnsw (vector) WITH (M=2, ef_construction=10, ef=4);

Leverage the index in queries like:

SELECT id, ROUND( (vector <-> '[0,0,0]')::numeric, 2) as dist
FROM small_world
ORDER BY vector <-> '[0,0,0]' LIMIT 5;
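
For intuition about what this query computes, here is a minimal Python sketch that finds the exact answer by brute force. It assumes that `<->` is Euclidean (L2) distance, as it is in pgvector; `exact_nearest` is a hypothetical helper written for illustration and is not part of LanternDB. An HNSW index returns approximate results, so on larger tables the indexed query may differ from this exact ranking.

```python
import math

# Same rows as the small_world table above.
small_world = {
    "000": (0, 0, 0),
    "001": (0, 0, 1),
    "010": (0, 1, 0),
    "011": (0, 1, 1),
    "100": (1, 0, 0),
    "101": (1, 0, 1),
    "110": (1, 1, 0),
    "111": (1, 1, 1),
}


def exact_nearest(rows, query, limit):
    """Return the `limit` closest (id, rounded distance) pairs, closest first.

    Assumption: distance is Euclidean (L2). Ties are broken by id here;
    SQL makes no such promise unless the ORDER BY says so.
    """
    ranked = sorted((math.dist(vec, query), row_id) for row_id, vec in rows.items())
    return [(row_id, round(dist, 2)) for dist, row_id in ranked[:limit]]


print(exact_nearest(small_world, (0, 0, 0), 5))
# [('000', 0.0), ('001', 1.0), ('010', 1.0), ('100', 1.0), ('011', 1.41)]
```

Three rows are tied at distance 1.41 for the fifth slot, so the SQL query may return any one of '011', '101', or '110' in that position.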

A note on index construction parameters

The M, ef, and ef_construction parameters control the tradeoffs of the HNSW algorithm. In general, lower M and ef_construction speed up index creation at the cost of recall. Lower M and ef improve search speed and result in fewer shared buffer hits, also at the cost of recall. Tuning these parameters requires experimentation for your specific use case. An upcoming LanternDB release will include an optional auto-tuning index.
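
One common way to run such experiments is to measure recall: run the same query once through the index and once as an exact scan, then compare the returned ids. The sketch below shows only the comparison step. `recall_at_k` is a hypothetical helper, not part of LanternDB, and the id lists are made-up illustrative values rather than measured results.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k neighbors that the index also returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    if not exact_top:
        raise ValueError("exact_ids must not be empty")
    approx_top = set(approx_ids[:k])
    return len(exact_top & approx_top) / len(exact_top)


# Illustrative values only: ids from an exact scan versus ids from an indexed query.
exact = ["000", "001", "010", "100", "011"]
approx = ["000", "001", "100", "110", "011"]

print(recall_at_k(approx, exact, k=5))
# 0.8
```

Repeating this measurement over a sample of queries while varying M, ef, and ef_construction gives a rough picture of the recall side of the tradeoff for a given dataset.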

A note on performance

LanternDB's hnsw index offers search latency similar to pgvector's ivfflat, and is faster than ivfflat under certain construction parameters. It also enables higher search throughput on the same hardware, since the HNSW algorithm requires fewer distance comparisons than the IVF algorithm, leading to less CPU usage per search.

Roadmap
