citrus's People

Contributors

0xdebabrata, anthonyrka, sibidine, soufrabi

citrus's Issues

Add sqlite layer

Integrate SQLite to store plain text and other supplied metadata.
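A minimal sketch of what such a layer could look like, using Python's built-in sqlite3 module. The table name `documents`, its columns, and the helper names are assumptions made for illustration; they are not taken from the citrus codebase. Metadata is stored as a JSON string.

```python
import json
import sqlite3


def create_store(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        " id INTEGER PRIMARY KEY,"
        " text TEXT NOT NULL,"
        " metadata TEXT)"  # hypothetical schema: metadata kept as JSON
    )
    return conn


def add_document(conn, doc_id, text, metadata=None):
    """Store the plain text and metadata that belong to one vector ID."""
    conn.execute(
        "INSERT INTO documents (id, text, metadata) VALUES (?, ?, ?)",
        (doc_id, text, json.dumps(metadata or {})),
    )
    conn.commit()


def get_document(conn, doc_id):
    """Return (text, metadata) for an ID, or None if it is not stored."""
    row = conn.execute(
        "SELECT text, metadata FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        return None
    return row[0], json.loads(row[1])
```

Keeping the integer ID as the primary key means a label returned by an HNSW query can be used directly to look up the matching text.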

Support non-integer IDs

citrus only supports integer IDs at the moment because that's passed directly to HNSW. Add support for non-integer IDs by creating a map between each ID and an incrementing integer value which will serve as the HNSW label.
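The mapping described above could be sketched roughly as follows. `LabelMap` and its method names are hypothetical, and the sketch ignores persistence and deletion, which a real implementation would need to handle.

```python
class LabelMap:
    """Maps arbitrary hashable IDs to incrementing integer labels."""

    def __init__(self):
        self._to_label = {}  # external ID -> integer label
        self._to_id = []     # integer label -> external ID

    def label_for(self, external_id):
        """Return the label for an ID, assigning the next integer if new."""
        label = self._to_label.get(external_id)
        if label is None:
            label = len(self._to_id)
            self._to_label[external_id] = label
            self._to_id.append(external_id)
        return label

    def id_for(self, label):
        """Translate a label returned by the index back to the caller's ID."""
        return self._to_id[label]
```

On insert, the caller's IDs would be passed through `label_for` before reaching HNSW; on query, the labels in the result would be passed through `id_for`.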

Support multiple embedding sources

Currently citrus supports vector embeddings from OpenAI's text-embedding-ada-002 model.

It'd be great to add support for other embedding models from Cohere, Huggingface, etc. Create an EmbeddingFunction abstract class or something similar and implement OpenAI, Cohere, etc. embedding functions.
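One possible shape for the abstract class is sketched below. The call signature is an assumption, and `HashEmbedding` is a toy stand-in that makes no network calls; a real OpenAI or Cohere subclass would call the provider's API inside `__call__` instead.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import List


class EmbeddingFunction(ABC):
    """Common interface: a list of texts in, one vector per text out."""

    @abstractmethod
    def __call__(self, texts: List[str]) -> List[List[float]]:
        ...


class HashEmbedding(EmbeddingFunction):
    """Toy provider for illustration only; the vectors carry no meaning."""

    def __init__(self, dim: int = 8):
        # A SHA-256 digest has 32 bytes, so dim must not exceed 32.
        self.dim = dim

    def __call__(self, texts: List[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Scale the first `dim` bytes into the range [0, 1].
            vectors.append([b / 255.0 for b in digest[: self.dim]])
        return vectors
```

With this design, the rest of the code only depends on `EmbeddingFunction`, so adding a provider means adding one subclass.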

Create method to reload index from SQLite index_manager table

Allow reloading HNSW indices from disk using data stored in the index_manager table.

This would allow citrus to automatically load existing indices from disk during server restarts and take us one step closer to offering a hosted cloud solution.
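A rough sketch of such a reload step follows. The column names `name`, `dimensions`, and `path` are assumptions about the index_manager table, not its actual schema, and `load_index` is a hypothetical callback that turns a saved file into an in-memory index.

```python
import sqlite3
from typing import Any, Callable, Dict


def reload_indices(
    conn: sqlite3.Connection,
    load_index: Callable[[str, int], Any],
) -> Dict[str, Any]:
    """Rebuild the in-memory index registry from the index_manager table."""
    indices = {}
    rows = conn.execute("SELECT name, dimensions, path FROM index_manager")
    for name, dimensions, path in rows:
        # The callback hides how an index file is actually read from disk.
        indices[name] = load_index(path, dimensions)
    return indices
```

Calling this once at server startup would restore every index recorded in the table.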

Reload single index to memory

Currently, performing CRUD operations on an index that has been saved to disk but is not in memory requires loading all indices into memory.
This creates unnecessary latency and memory usage, since all indices are queried from SQLite before being loaded into memory.
Moreover, the get_all_indices method also returns a count of the elements in every index. This query runs very slowly for large indices.

Add a new method that reloads just the required index to memory.
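One way to approach this is to look up a single row by name and load that index on first use. In the sketch below, `IndexCache`, the `load_index` callback, and the `name`, `dimensions`, and `path` columns are all illustrative assumptions rather than existing citrus code.

```python
import sqlite3
from typing import Any, Callable, Dict


class IndexCache:
    """Loads an index into memory only when it is first requested."""

    def __init__(
        self,
        conn: sqlite3.Connection,
        load_index: Callable[[str, int], Any],
    ):
        self._conn = conn
        self._load_index = load_index
        self._loaded: Dict[str, Any] = {}

    def get(self, name: str) -> Any:
        if name not in self._loaded:
            # Fetch one row by name; no element counts, no other indices.
            row = self._conn.execute(
                "SELECT dimensions, path FROM index_manager WHERE name = ?",
                (name,),
            ).fetchone()
            if row is None:
                raise KeyError(f"unknown index: {name}")
            dimensions, path = row
            self._loaded[name] = self._load_index(path, dimensions)
        return self._loaded[name]
```

Because the query filters on the index name and skips the element count, its cost does not grow with the number or size of the other indices.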
