VectorDB.js — Simple in-memory vector database for Node.js

VectorDB.js is a simple in-memory vector database for Node.js. It's an easy way to do text similarity.

  • Works 100% locally and in-memory by default
  • Uses hnswlib-node for simple vector search
  • Uses Embeddings.js for simple text embeddings
  • Supports OpenAI, Mistral and local embeddings
  • Caches embeddings
  • Automatically resizes the database as it grows
  • Stores objects alongside embeddings
  • MIT license

Install

Install VectorDB.js from NPM:

npm install @themaximalist/vectordb.js

For local embeddings, install the transformers library:

npm install @xenova/transformers

For remote embeddings like OpenAI and Mistral, add an API key to your environment.

export OPENAI_API_KEY=...
export MISTRAL_API_KEY=...

Usage

To find similar strings, add a few to the database, and then search.

import VectorDB from "@themaximalist/vectordb.js"
const db = new VectorDB();

await db.add("orange");
await db.add("blue");

const result = await db.search("light orange");
// [ { input: 'orange', distance: 0.3109036684036255 } ]
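
Conceptually, the search above ranks stored strings by the distance between their embedding vectors and the query's. The sketch below illustrates that idea with a brute-force loop. It assumes cosine distance and uses hand-made 3-dimensional vectors; `store`, `cosineDistance` and `bruteForceSearch` are hypothetical names, not part of VectorDB.js, which relies on hnswlib-node for the actual search.

```javascript
// Toy 3-dimensional "embeddings" (made up); real ones come from an embedding model.
const store = [
  { input: "orange", vector: [0.9, 0.5, 0.1] },
  { input: "blue", vector: [0.1, 0.2, 0.9] },
];

// Cosine distance: 0 when two vectors point the same way, larger when they diverge.
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored entry against the query and keep the closest `num`.
function bruteForceSearch(queryVector, num = 3) {
  return store
    .map(({ input, vector }) => ({
      input,
      distance: cosineDistance(queryVector, vector),
    }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, num);
}

// A made-up query vector standing in for the embedding of "light orange".
const nearest = bruteForceSearch([0.8, 0.6, 0.2], 1);
console.log(nearest[0].input); // "orange"
```

A brute-force scan like this compares the query against every entry, which is why a real vector index is preferable once the database grows.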

Embedding Models

By default VectorDB.js uses a local embeddings model.

To switch to another model like OpenAI, pass the service to the embeddings config.

import assert from "node:assert";
import VectorDB from "@themaximalist/vectordb.js";

// dimensions must match the size of the vectors the embedding model returns
const db = new VectorDB({
  dimensions: 1536,
  embeddings: {
    service: "openai"
  }
});

await db.add("orange");
await db.add("blue");
await db.add("green");
await db.add("purple");

// ask for up to 4 results back (default is 3)
const results = await db.search("light orange", 4);
assert(results.length === 4);
assert(results[0].input === "orange");

With Mistral Embeddings:

const db = new VectorDB({
  dimensions: 1024,
  embeddings: {
    service: "mistral"
  }
});

// ...

Being able to easily switch embeddings providers ensures you don't get locked in!

VectorDB.js was built on top of Embeddings.js, and passes the full embeddings config option to Embeddings.js.
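
Because `dimensions` changes with the provider, it can help to keep the pairing in one place. The sketch below is a hypothetical helper (`configFor` is not part of VectorDB.js) that uses only the values shown in this README: 384 for the default local model, 1536 for OpenAI and 1024 for Mistral. Other models from the same providers may use different sizes.

```javascript
// Dimensions taken from the examples in this README; other models may differ.
const DIMENSIONS_BY_SERVICE = new Map([
  ["openai", 1536],
  ["mistral", 1024],
]);

// README default, used with the local embeddings model.
const DEFAULT_DIMENSIONS = 384;

// Hypothetical helper: build a VectorDB config object for a given service.
function configFor(service) {
  if (service === undefined) {
    // No service given: fall back to the local default.
    return { dimensions: DEFAULT_DIMENSIONS };
  }
  if (!DIMENSIONS_BY_SERVICE.has(service)) {
    throw new Error(`Unknown dimensions for service "${service}"`);
  }
  return {
    dimensions: DIMENSIONS_BY_SERVICE.get(service),
    embeddings: { service },
  };
}

const openaiConfig = configFor("openai");
console.log(openaiConfig.dimensions); // 1536
```

The resulting object can be passed straight to the constructor, for example `new VectorDB(configFor("mistral"))`.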

Custom Objects

VectorDB.js can store any valid JavaScript object along with the embedding.

const db = new VectorDB();

await db.add("orange", "oranges");
await db.add("blue", ["sky", "water"]);
await db.add("green", { "grass": "lawn" });
await db.add("purple", { "flowers": 214 });

const results = await db.search("light green", 1);
assert(results[0].object.grass === "lawn");

This makes it easy to store metadata about the embedding, such as an object ID or a URL.

API

The VectorDB.js library offers a simple API for using vector databases. To get started, initialize the VectorDB class with a config object.

new VectorDB({
  dimensions: 384, // Default: 384. The dimensionality of the embeddings.
  size: 100,       // Default: 100. Initial size of the database; automatically resizes
  embeddings: {
    service: "openai" // Configuration for the embeddings service.
  }
});

Options

  • dimensions <int>: Size of the embeddings. Default is 384.
  • size <int>: Initial size of the database, will automatically resize. Default is 100.
  • embeddings <object>: Embeddings.js config options
    • service <string>: The service used to generate embeddings: transformer, openai or mistral

Methods

async add(input=<string>, obj=<object>)

Adds a new text string to the database, with an optional JavaScript object.

await vectordb.add("Hello World", { dbid: 1234 });

async search(input=<string>, num=<int>, threshold=<float>)

Searches the vector database for strings similar to input. Returns no more than num results, and only those whose distance is under threshold.

// 5 results closer than 0.5 distance
await vectordb.search("Hello", 5, 0.5);
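
The way `num` and `threshold` combine can be sketched in plain JavaScript. This is an illustration of the contract described above, not the library's implementation: `limitResults` is a hypothetical helper, the candidate distances are made up, and treating "under threshold" as a strict comparison is an assumption.

```javascript
// Hypothetical helper mirroring the documented search(input, num, threshold) contract.
function limitResults(candidates, num, threshold) {
  return candidates
    .filter((r) => r.distance < threshold) // keep only matches under the threshold
    .sort((a, b) => a.distance - b.distance) // closest first
    .slice(0, num); // never return more than num
}

// Made-up candidates for the query "Hello".
const candidates = [
  { input: "Hello there", distance: 0.21 },
  { input: "Goodbye", distance: 0.74 },
  { input: "Hello World", distance: 0.12 },
];

const limited = limitResults(candidates, 5, 0.5);
console.log(limited.map((r) => r.input)); // [ 'Hello World', 'Hello there' ]
```

Note that asking for 5 results still returns only 2 here, because the threshold removes the third candidate.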

resize(size=<number>)

Resizes the database to a specific size. Resizing is handled automatically, but the size can also be set explicitly.

vectordb.resize(size);

Response

VectorDB.js returns results from vectordb.search() as a simple array of objects that follow this format:

  • input <string>: Text string match
  • distance <float>: Similarity to search string, lower distance means more similar.
  • object <object>: Optional object returned if attached

[
  {
    input: "Red",
    distance: 0.54321,
    object: {
      dbid: 123
    }
  }
]
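
Since `object` is optional, code that consumes search results should not assume it is present. A minimal sketch, using sample data in the documented shape (the second entry and all values are made up for illustration):

```javascript
// Sample results in the documented response format.
const sampleResults = [
  { input: "Red", distance: 0.54321, object: { dbid: 123 } },
  { input: "Crimson", distance: 0.6 }, // no object was attached to this entry
];

// Collect the attached ids, skipping results that carry no object.
const ids = sampleResults
  .filter((r) => r.object !== undefined)
  .map((r) => r.object.dbid);

console.log(ids); // [ 123 ]
```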

Debug

VectorDB.js uses the debug npm module with the vectordb.js namespace.

View debug logs by setting the DEBUG environment variable.

> DEBUG="vectordb.js*" node src/run_vector_search.js
# debug logs

The VectorDB.js API aims to make it simple to do text similarity in Node.js—without getting locked into an expensive cloud provider or embedding model.

Deploy

VectorDB.js works great by itself, but it was built alongside Model Deployer and is designed to work with it.

Model Deployer is an easy way to deploy your LLM and Embedding models in production. You can monitor usage, rate limit users, generate API keys with specific settings and more.

It's especially helpful in offering options to your users. They can download and run models locally, they can use your API, or they can provide their own API key.

It works out of the box with VectorDB.js.

const db = new VectorDB({
  embeddings: {
    service: "modeldeployer",
    model: "api-key",
  }
});

await db.add("orange", "oranges");
await db.add("blue", ["sky", "water"]);
await db.add("green", { "grass": "lawn" });
await db.add("purple", { "flowers": 214 });

const results = await db.search("light green", 1);
assert(results[0].object.grass === "lawn");

Learn more about deploying embeddings with Model Deployer.

Projects

VectorDB.js is currently used in the following projects:

License

MIT

Author

Created by The Maximalist. See our open-source projects.

vectordb.js's Issues

issue 1st 🎉

I can't believe this repo still doesn't have an issue. You've made it great!

I wonder how I can get all the records from DB.

If I pass an empty string in order to get all documents from the db, it throws the error "Error: input must be a string".

The piece of code I execute: [image not preserved]

Save and Load from Disk

I'm very excited to use this in my project, but I have one question.

I am trying to implement long term memory in an AI chatbot, so I need the ability to save the embeddings to disk and then load them back later.

Is there a way I can do this right now without any code changes, or make it work with some light monkey patching?
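
One possible workaround that needs no changes to the library is to keep your own list of records and replay it through `add()` on startup. The sketch below assumes only the documented `add(input, obj)` method; `PersistentRecords` is a hypothetical wrapper, and replaying means every string is embedded again on load, which may be slow or costly with a remote provider.

```javascript
// Hypothetical wrapper: remembers every record so it can be saved and replayed.
class PersistentRecords {
  constructor(db) {
    this.db = db; // any object with an async add(input, obj) method
    this.records = [];
  }

  // Add to the database and remember the record for later serialization.
  async add(input, obj) {
    await this.db.add(input, obj);
    this.records.push({ input, obj });
  }

  // Serialize the remembered records; the string can be written to a file.
  serialize() {
    return JSON.stringify(this.records);
  }

  // Replay previously serialized records into the wrapped database.
  async restore(json) {
    for (const { input, obj } of JSON.parse(json)) {
      await this.add(input, obj);
    }
  }
}
```

The string returned by `serialize()` can be written with `fs.promises.writeFile` and read back with `fs.promises.readFile` before calling `restore()`. Attached objects must be JSON-serializable for this to round-trip.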
