bsuccinct-rs's People

Contributors

beling, kore-signet

bsuccinct-rs's Issues

License for bsuccinct-rs

Hi! Nice work on these crates. I'd like to use csf in my (commercial) project for building perfect hash tables in index files.

Would you be open to licensing this repo under an MIT license?

Building `fmph::keyset::DynamicKeySet` from a `rayon::iter::ParallelIterator` instead of `Iterator`

Hi,

I'm trying to build a `DynamicKeySet` (actually, a `CachedKeySet`) from a `rayon::iter::ParallelIterator` instead of an `Iterator`. I feel this makes sense given I have a bunch of iterators as my own input, and fmph internally uses them as well.

I attempted to do this by copying the `DynamicKeySet` structure and crudely replacing every `: Iterator` bound with `: ParallelIterator`. This works for the most part; however, I am stuck implementing `for_each_key` and `into_vec`, as their predicates lack the `Send` and `Sync` bounds necessary to call rayon's `.map()`.

And adding these bounds to the KeySet definition seems excessive, given that not all users need them.

Any tips?

Thanks!
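
To make the bound problem concrete, here is a minimal sketch using only the standard library. All names in it (`SeqKeySet`, `ParKeySet`, `SliceKeys`, `count_sequential`, `count_parallel`) are hypothetical stand-ins and are not the actual `fmph::keyset` API. It assumes one possible way around the issue: leave the sequential trait untouched and put the stricter bounds on a separate extension trait, so only parallel callers pay for them. Scoped threads stand in for rayon here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical, simplified stand-in for a sequential key-set trait.
/// The callback may mutate captured state, so `FnMut` is enough.
trait SeqKeySet<K> {
    fn for_each_key<F: FnMut(&K)>(&self, f: F);
}

/// Hypothetical parallel extension trait. The stricter bounds live here,
/// so users of `SeqKeySet` alone are not affected by them.
trait ParKeySet<K: Sync>: SeqKeySet<K> {
    fn par_for_each_key<F: Fn(&K) + Sync>(&self, f: F);
}

/// Hypothetical key set backed by a borrowed slice.
struct SliceKeys<'a, K>(&'a [K]);

impl<'a, K> SeqKeySet<K> for SliceKeys<'a, K> {
    fn for_each_key<F: FnMut(&K)>(&self, mut f: F) {
        for k in self.0 {
            f(k);
        }
    }
}

impl<'a, K: Sync> ParKeySet<K> for SliceKeys<'a, K> {
    fn par_for_each_key<F: Fn(&K) + Sync>(&self, f: F) {
        // Share the callback by reference; this is why `F: Sync` is required.
        let f = &f;
        // Split the keys into at most four chunks, one scoped thread each.
        let chunk_len = ((self.0.len() + 3) / 4).max(1);
        thread::scope(|s| {
            for chunk in self.0.chunks(chunk_len) {
                s.spawn(move || {
                    for k in chunk {
                        f(k);
                    }
                });
            }
        });
    }
}

/// Sequential use: the closure mutates a plain local counter.
fn count_sequential(keys: &[u64]) -> usize {
    let mut n = 0;
    SliceKeys(keys).for_each_key(|_| n += 1);
    n
}

/// Parallel use: shared state must be thread-safe, hence the atomic counter.
fn count_parallel(keys: &[u64]) -> usize {
    let n = AtomicUsize::new(0);
    SliceKeys(keys).par_for_each_key(|_| {
        n.fetch_add(1, Ordering::Relaxed);
    });
    n.into_inner()
}
```

The closure passed to `for_each_key` in `count_sequential` would not compile if passed to `par_for_each_key`, because it mutates a captured local and so is only `FnMut`. That is the same mismatch described above: rayon's `.map()` requires its closure to be `Fn + Send + Sync`.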

Performance with large datasets

We're trying to build an MPH for a dataset of 25B strings in the format swh:1:cnt:8cff7880b408c38231b723b4c568367912a33cb2 (about 1TB of data) from the Software Heritage graph. Previously we were using GOVMinimalPerfectHashFunction from Sux4J, which would construct the map in about 5h. However, after 15h we had to kill the ph process as it was still unfinished. Can you guide us to the best parameters or way of calling the lib to build such an MPH? Or is such a long construction time typical?
