
hnswlib-rs's People

Contributors

bwsw, jean-pierreboth, niebon, pegesund, ruqqq

hnswlib-rs's Issues

Request to support wasm32 target

A few dependencies block the build for the wasm32-unknown-unknown target: mmap-rs and cpu-time.

I also tried removing them along with the code that depends on them (I don't currently need the IO feature), but then ran into other issues such as the one below:

Caused by: Failed while trying to install all canisters.
Caused by: Failed to install wasm module to canister 'elna_db_backend'.
Caused by: Failed during wasm installation call
Caused by: The replica returned a rejection error: reject code CanisterError, reject message Error from Canister bkyz2-fmaaa-aaaaa-qaaaq-cai: Canister's Wasm module is not valid: Wasm module has an invalid import section. Module imports function '__wbg_crypto_566d7465cdbb6b7a' from '__wbindgen_placeholder__' that is not exported by the runtime..
This is likely an error with the compiler/CDK toolchain being used to build the canister. Please report the error to IC devs on the forum: https://forum.dfinity.org and include which language/CDK was used to create the canister., error code None

The relevant part of the error above: Module imports function '__wbg_crypto_566d7465cdbb6b7a' from '__wbindgen_placeholder__' that is not exported by the runtime.

Please consider adding wasm32-unknown-unknown support.

delete and update

First of all, I appreciate this project. I would like to ask how to perform delete and update operations.

parallel_insert panic

I get this panic message: Panic occurred: PanicInfo { payload: Any { .. }, message: Some(assertion failed: c.dist_to_ref <= 0.), location: Location { file: "/home/xxx/.cargo/registry/src/mirrors.ustc.edu.cn-61ef6e0cd06fb9b8/hnsw_rs-0.1.19/src/hnsw.rs", line: 879, col: 13 }, can_unwind: true }

load_hnsw in HnswIo panics when loading a hnsw that contains no points

let space = Hnsw::<f64, DistL1>::new(
    max_nb_connection,
    max_elements,
    max_layer,
    ef_c,
    DistL1 {},
);
space.file_dump(&file_path);
let mut io = HnswIo::new(PathBuf::from("."), file_path);
io.set_options(ReloadOptions::new(true));
let mmap_space = io.load_hnsw::<f64, DistL1>().unwrap();

How to load an existing hnsw index bin file?

Hi,

I have an hnsw index bin file written by a separate Python process. I want to load it in Rust using this crate, but I cannot find any such functionality. Please help!

Change borrow to move semantics for HnswIo index loading

Hello. Currently, when you load an index from disk, the resulting Hnsw borrows structures from HnswIo. This makes it awkward to use: it requires either a singleton or, as I recently did, a dedicated thread or async task with a loop, so that the borrow checker can verify correctness. Both options are very constraining.

I propose changing this to "move" semantics, where Hnsw consumes HnswIo rather than borrowing from it. This would simplify usage significantly. I also do not understand the purpose of the borrowing design because, from my perspective, HnswIo is just a loader, not a holder of loaded data.

If the Hnsw index needs particular data, that data should be moved into it. We can use various approaches, such as:

enum Data {
   DataWithoutSideEffect(bla),
   DataWithSideEffect(blabla)
}
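
To make the proposal concrete, here is a minimal sketch of the "move" design. The types `Loader` and `Index` and the method `into_index` are hypothetical stand-ins invented for illustration; they are not the hnsw_rs API, and the real HnswIo may borrow for reasons this sketch does not model.

```rust
// Hypothetical types for illustration only; these are not the hnsw_rs API.

/// Stands in for a loader such as HnswIo: it owns whatever it read from disk.
pub struct Loader {
    points: Vec<Vec<f32>>,
}

/// Stands in for the index. It owns its data, so it carries no lifetime
/// parameter tied to the loader.
pub struct Index {
    points: Vec<Vec<f32>>,
}

impl Loader {
    /// Pretend to read points from disk; here they are passed in directly.
    pub fn new(points: Vec<Vec<f32>>) -> Self {
        Loader { points }
    }

    /// Takes `self` by value: the loader is consumed and its data moves
    /// into the returned index.
    pub fn into_index(self) -> Index {
        Index { points: self.points }
    }
}

impl Index {
    /// Number of points held by the index.
    pub fn len(&self) -> usize {
        self.points.len()
    }
}

/// The index can leave the scope in which the loader was created, which a
/// design where the index borrows from the loader would reject.
pub fn load() -> Index {
    let loader = Loader::new(vec![vec![0.0, 1.0], vec![2.0, 3.0]]);
    loader.into_index()
}
```

With this shape, a caller can return the index from a function or store it in a struct without keeping the loader alive.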

improvement: make HnswIo Sync so it can be shared between threads

Right now I get this error:

`std::cell::RefCell<usize>` cannot be shared between threads safely
within `(std::string::String, hnsw_rs::hnswio::HnswIo)`, the trait `std::marker::Sync` is not implemented for `std::cell::RefCell<usize>`
if you want to do aliasing and mutation between multiple threads, use `std::sync::RwLock` instead
required because it appears within the type `(String, HnswIo)`
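
One possible direction, sketched under assumptions: if the `RefCell<usize>` is only a counter, replacing it with an `AtomicUsize` makes the containing struct `Sync` without a lock. The `Loader` type and its `nb_loaded` field below are hypothetical; what the field actually tracks inside HnswIo is not stated in this issue. The compiler's own suggestion, `std::sync::RwLock`, is the alternative when the guarded value is more than a plain integer.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for a loader with an internal counter; not the hnsw_rs type.
pub struct Loader {
    // A RefCell<usize> here would make Loader !Sync. AtomicUsize is Sync.
    nb_loaded: AtomicUsize,
}

impl Loader {
    pub fn new() -> Self {
        Loader { nb_loaded: AtomicUsize::new(0) }
    }

    /// Increment the counter through a shared reference.
    pub fn record_load(&self) {
        self.nb_loaded.fetch_add(1, Ordering::Relaxed);
    }

    pub fn nb_loaded(&self) -> usize {
        self.nb_loaded.load(Ordering::Relaxed)
    }
}

/// Share one loader between `nb_threads` threads; each records one load.
pub fn count_from_threads(nb_threads: usize) -> usize {
    let loader = Arc::new(Loader::new());
    let handles: Vec<_> = (0..nb_threads)
        .map(|_| {
            let loader = Arc::clone(&loader);
            thread::spawn(move || loader.record_load())
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // All threads have been joined, so every increment is visible here.
    loader.nb_loaded()
}
```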

HNSW for biology

Dear Jean,

This is Jianshu, a bioinformatics PhD student at Georgia Tech. I am writing to ask whether you would be interested in a new collaboration: applying HNSW to genome classification, that is, finding the genomes in a large database that are closest to a query genome, so that environmental microbiologists can determine the taxonomy of the query. This would have a very big impact on the field and would definitely attract a lot of citations. I want to rely entirely on Rust for this project, without any other language, given its advantages over C++ and Python. I am not an expert in Rust but have been using it for 2 years. The biology and the taxonomy-related information will be my strong part. I also know all the classification software in this field and have been using and modifying it throughout my master's and half of my PhD. I am confident that HNSW in Rust will greatly improve the speed of genome classifiers. Hopefully we can come up with a paper in the end. My email is [email protected]

Let me know if you are interested and if anything I mentioned above is unclear.

Many thanks,

Jianshu

Integrate Graph Reordering into HNSW

Coleman et al. showed that reordering the nodes in every layer such that the neighbors of each node are laid out closer in memory improves query time performance by about 40%. The idea is that reordering provides a cache-efficient search mechanism that reduces the search overhead due to random accesses in HNSW.
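
As a rough illustration of what reordering means for one layer, the sketch below relabels nodes in breadth-first order so that neighbors tend to receive nearby ids. Breadth-first relabeling is a deliberately simple stand-in chosen for brevity; it is not claimed to be the ordering Coleman et al. evaluated, and the function names and the adjacency-list layout are assumptions.

```rust
use std::collections::VecDeque;

/// Returns `new_id[old_id]` from a breadth-first traversal started at `start`.
/// Assumes every neighbor id in `adj` is smaller than `adj.len()`.
pub fn bfs_order(adj: &[Vec<usize>], start: usize) -> Vec<usize> {
    let n = adj.len();
    let mut new_id = vec![usize::MAX; n];
    let mut next = 0;
    let mut queue = VecDeque::new();
    // Visit `start` first, then any node not reachable from it.
    for root in std::iter::once(start).chain(0..n) {
        if root >= n || new_id[root] != usize::MAX {
            continue;
        }
        new_id[root] = next;
        next += 1;
        queue.push_back(root);
        while let Some(u) = queue.pop_front() {
            for &v in &adj[u] {
                if new_id[v] == usize::MAX {
                    new_id[v] = next;
                    next += 1;
                    queue.push_back(v);
                }
            }
        }
    }
    new_id
}

/// Rebuild the adjacency list under the new labels.
pub fn relabel(adj: &[Vec<usize>], new_id: &[usize]) -> Vec<Vec<usize>> {
    let mut out = vec![Vec::new(); adj.len()];
    for (old, neighbors) in adj.iter().enumerate() {
        out[new_id[old]] = neighbors.iter().map(|&v| new_id[v]).collect();
    }
    out
}
```

The vectors attached to each node would have to be permuted with the same mapping for the memory layout to actually follow the new ids.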

They also showed that using hierarchy is not strictly necessary in certain settings. They replaced the hierarchy with "a process where we randomly sample 50 nodes and use the closest option as the initialization." They observed no statistically significant difference between the hierarchical search procedure and this random sampling process in terms of recall or query time over 10k items.
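
The sampling-based initialization quoted above can be sketched as follows. Everything here is an assumption made for illustration: the squared Euclidean distance, the `sample_entry_point` name, and the tiny `Lcg` generator, which only stands in for a proper random number crate so the example has no dependencies.

```rust
/// Squared Euclidean distance between two vectors of equal length.
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Tiny linear congruential generator, a stand-in for a real RNG crate.
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }

    /// Pseudo-random index in `0..n`; `n` must be non-zero.
    fn next_index(&mut self, n: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % n
    }
}

/// Sample `nb_samples` node ids (with replacement) and return the id of the
/// sampled point closest to `query`. Returns None when there are no points
/// or no samples.
pub fn sample_entry_point(
    points: &[Vec<f32>],
    query: &[f32],
    nb_samples: usize,
    rng: &mut Lcg,
) -> Option<usize> {
    if points.is_empty() {
        return None;
    }
    let mut best: Option<(usize, f32)> = None;
    for _ in 0..nb_samples {
        let id = rng.next_index(points.len());
        let d = dist2(&points[id], query);
        if best.map_or(true, |(_, best_d)| d < best_d) {
            best = Some((id, d));
        }
    }
    best.map(|(id, _)| id)
}
```

Calling it with `nb_samples` set to 50 matches the number quoted above; the returned id would then serve as the starting node for the search in the bottom layer.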

I can work on integrating these features.

Generating a C .h binding

I see that with Julia this isn't necessary, as the calls are all dynamic. However, I'm trying to integrate the library with Swift, and it requires a header file. To that end I attempted to use cbindgen to create one, but it fails, I think because macros are used to define some of the types:

WARN: Skip hnsw_rs::NB_LAYER_MAX - (not `pub`).
WARN: Skip hnsw_rs::M_MIN - (not `pub`).
WARN: Skip hnsw_rs::MAGICPOINT - (not `pub`).
WARN: Skip hnsw_rs::MAGICDESCR_1 - (not `pub`).
WARN: Skip hnsw_rs::MAGICDESCR_2 - (not `pub`).
WARN: Skip hnsw_rs::MAGICLAYER - (not `pub`).
WARN: Skip hnsw_rs::MAGICDATAP - (not `pub`).
WARN: Can't find HnswApif32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApif32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApiu16. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApif32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApif32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApif32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApif32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApii32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApii32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApiu32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApiu32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApiu32. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApiu16. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApiu16. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApiu16. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApiu8. This usually means that this type was incompatible or not found.
WARN: Can't find HnswApiu8. This usually means that this type was incompatible or not found.

What do you suggest?

SIMD-accelerated Levenshtein distance

Hello, are you interested in a fast edit distance implementation? I am working on a library here that has Levenshtein distance and Hamming distance functions for ASCII byte strings (u8). If you are interested, then I can work on an implementation.
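
For reference, a plain scalar version of the two distances on byte strings might look like the sketch below. It is not SIMD-accelerated and is not taken from the library mentioned above; it only shows the results a faster implementation would need to reproduce. The function names are assumptions.

```rust
/// Levenshtein (edit) distance between two byte strings, using the classic
/// two-row dynamic programming table.
pub fn levenshtein(a: &[u8], b: &[u8]) -> usize {
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut cur = vec![0; b.len() + 1];
    for (i, &ca) in a.iter().enumerate() {
        cur[0] = i + 1;
        for (j, &cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur[j + 1] = substitute.min(delete).min(insert);
        }
        std::mem::swap(&mut prev, &mut cur);
    }
    prev[b.len()]
}

/// Hamming distance between two byte strings of equal length.
/// Returns None when the lengths differ.
pub fn hamming(a: &[u8], b: &[u8]) -> Option<usize> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).filter(|(x, y)| x != y).count())
}
```

For example, `levenshtein(b"kitten", b"sitting")` returns 3 and `hamming(b"karolin", b"kathrin")` returns `Some(3)`.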
