fastbloomfilter

Simple and fast pythonic bloomfilter

From Wikipedia: "A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed (though this can be addressed with a "counting" filter); the more elements that are added to the set, the larger the probability of false positives."
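
To make the definition concrete, here is a minimal, library-independent sketch of a Bloom filter. It is not the fastbloomfilter implementation: the class name `TinyBloom`, its parameters, and the salted SHA-256 hashing scheme are assumptions chosen only for illustration.

```python
import hashlib


class TinyBloom:
    """Illustrative Bloom filter (hypothetical; not fastbloomfilter's code)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing the data with the hash index.
        data = item.encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Set every bit the item maps to.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def query(self, item):
        # "Possibly in set" only if every mapped bit is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


tb = TinyBloom()
tb.add("1")
print(tb.query("1"))  # True: an added item is always found (no false negatives)
```

A query for an item that was never added usually returns False, but it can return True if other items happened to set the same bits. That is the false-positive case the quote describes.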

This filter supports:

- Saving and reloading with pickle
- Stats
- Entropy analysis
- Internal and external hashing of data
- Raw filter merging

Installing:

sudo pip install fastbloomfilter

External creation of the bloom filter file:

python mkbloom.py /tmp/filter.blf

Importing:

>>> from fastBloomFilter import bloom
>>> bf = bloom.BloomFilter(array_size=1024**3)

Or

>>> from fastBloomFilter import bloom
>>> bf = bloom.BloomFilter(filename='/tmp/filter.blf')

Adding data to it:

>>> bf.add('30000')
>>> bf.add('1230213')
>>> bf.add('1')

Printing stats:

>>> bf.stat()

Or:

>>> bf.info()
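
The output of `stat()` and `info()` is not reproduced here. As a rough, library-independent sketch of the figures that Bloom filter statistics commonly include, the helpers below compute the fill ratio of a bit array and the classic false-positive estimate. The function names are hypothetical and are not part of fastbloomfilter.

```python
import math


def fill_ratio(bits):
    """Fraction of set bits in a bytes-like bit array."""
    set_bits = sum(bin(byte).count("1") for byte in bits)
    return set_bits / (len(bits) * 8)


def expected_false_positive_rate(num_bits, num_hashes, num_items):
    """Standard approximation (1 - e^(-k*n/m))^k for m bits, k hashes, n items."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


print(fill_ratio(bytearray(b"\xff\x00")))          # 0.5: 8 of 16 bits are set
print(expected_false_positive_rate(1024, 3, 0))    # 0.0: an empty filter never matches
```

The estimate grows with the number of inserted items, which matches the statement that adding more elements raises the probability of false positives.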

Querying data:

>>> print(bf.query('1'))
True
>>> print(bf.query('1230213'))
True
>>> print(bf.query('12'))
False
>>> print(bf['1'])
True

Querying data and at the same time adding it:

>>> print(bf.update('1'))
False
# False means the object was not in the filter and has been added.
>>> print(bf.update('1'))
True
# True means the object was already present, so nothing new was added.
>>> print(bf.update('2'))
False
>>> print(bf.update('2'))
True
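
The query-and-add behaviour shown above can be sketched in a few lines. This is an illustration under assumptions, not the library's code: `ToyFilter`, its bit-field layout, and its hashing scheme are hypothetical.

```python
import hashlib


class ToyFilter:
    """Hypothetical filter that only demonstrates check-then-add semantics."""

    def __init__(self, num_bits=4096, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit field

    def _mask(self, item):
        # Build a mask with one bit set per hash function.
        data = item.encode("utf-8")
        mask = 0
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            mask |= 1 << (int.from_bytes(digest[:8], "big") % self.num_bits)
        return mask

    def update(self, item):
        mask = self._mask(item)
        seen = (self.bits & mask) == mask  # were all bits already set?
        self.bits |= mask                  # add the item either way
        return seen


tf = ToyFilter()
print(tf.update("1"))  # False: the filter was empty, so "1" is new and gets added
print(tf.update("1"))  # True: all of its bits are now set
```

Doing the check and the insert in one pass means the bit positions are computed only once per item.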

Merging two filters:

Create first filter:

>>> from fastBloomFilter import bloom
>>> bf1 = bloom.BloomFilter(array_size=1024**3)
>>> bf1.add("1")

Create second filter:

>>> from fastBloomFilter import bloom
>>> bf2 = bloom.BloomFilter(array_size=1024**3)
>>> bf2.add("2")

Merge the two filters into a third filter:

>>> bf3 = bf1 + bf2

Check the elements in the third filter:

>>> print(bf3["1"])
True
>>> print(bf3["2"])
True
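
For Bloom filters that share the same size and hash functions, merging is conventionally a bitwise OR of the two bit arrays. The sketch below shows that idea only. Whether `bf1 + bf2` is implemented this way in fastbloomfilter is an assumption, and all names here are hypothetical.

```python
import hashlib

NUM_BITS = 4096   # both filters must use the same size
NUM_HASHES = 3    # and the same hash functions


def positions(item):
    # Bit positions for an item, derived from salted SHA-256 digests.
    data = item.encode("utf-8")
    for i in range(NUM_HASHES):
        digest = hashlib.sha256(bytes([i]) + data).digest()
        yield int.from_bytes(digest[:8], "big") % NUM_BITS


def add(bits, item):
    for pos in positions(item):
        bits[pos // 8] |= 1 << (pos % 8)


def contains(bits, item):
    return all(bits[pos // 8] & (1 << (pos % 8)) for pos in positions(item))


def merge(a, b):
    # The union of two compatible filters is the bytewise OR of their arrays.
    if len(a) != len(b):
        raise ValueError("filters must have the same size to be merged")
    return bytearray(x | y for x, y in zip(a, b))


f1 = bytearray(NUM_BITS // 8)
f2 = bytearray(NUM_BITS // 8)
add(f1, "1")
add(f2, "2")
f3 = merge(f1, f2)
print(contains(f3, "1"), contains(f3, "2"))  # True True
```

The merged filter answers queries as if every item from both inputs had been added to a single filter, so its false-positive rate is at least as high as that of either input.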

Contributing

Contributions are welcome!

Criteria:
    - They should not include hidden folders or files of any IDE environment.
    - They should not delete big portions of the project.
    - They should not include files that have nothing to do with the project.
    - They should not change the API. (API changes should be proposed through Issues as enhancements.)
    - They should not include any obfuscated code.
    - They should not include binaries.
    - They should be submitted as small PRs for a faster review process.
    - They should include a small test case.
    - Any contribution not honoring these criteria will be rejected until it does.
