have-i-been-bloomed's Introduction

Have I Been Bloomed?

A simple Bloom filter and server that lets you check user passwords against the Have I Been Pwned 2.0 password database.

The Bloom filter has a size of 1.7 GB with a false positive rate of 1e-6 (i.e. one in one million). You can either directly check the Bloom filter from your code via the Golang or Python libraries, or run a hibb server to check hashed or plaintext passwords.
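
As a rough plausibility check on those numbers, the textbook Bloom filter sizing formulas can be evaluated directly. The sketch below is not taken from the project's code: the function name `bloomParams` is hypothetical, and the entry count of about 501.6 million for Pwned Passwords 2.0 is an assumption. Under those assumptions the result comes out at roughly 1.7 GiB with about 20 hash functions.

```go
package main

import (
	"fmt"
	"math"
)

// bloomParams returns the optimal number of bits m and hash functions k for
// a Bloom filter holding n items at false positive rate p, using the
// textbook formulas m = -n*ln(p)/(ln 2)^2 and k = (m/n)*ln 2.
func bloomParams(n uint64, p float64) (mBits uint64, k int) {
	m := math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2))
	k = int(math.Round(m / float64(n) * math.Ln2))
	return uint64(m), k
}

func main() {
	// Assumption: roughly 501.6 million entries in Pwned Passwords 2.0.
	const n = 501_636_842
	mBits, k := bloomParams(n, 1e-6)
	fmt.Printf("bits: %d, hash functions: %d, size: %.2f GiB\n",
		mBits, k, float64(mBits)/8/(1<<30))
}
```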

Installation

To generate the Bloom filter and build/install the server, simply run the Makefile:

make

This will download the password database, unzip it, convert it to a Bloom filter and build the Golang server. You will need about 10.5 GB of space during the creation of the filter (1.7 GB for the filter alone and 8.8 GB for the 7z password file, which you can delete after creating the filter).

Testing

To test your setup with a smaller filter, you can run

make test

This will build a small test filter with only the first 100 entries from the HIBP database. Then, you can run

make run-test

to run a hibb server with the small test filter. Running

curl -i "http://localhost:8000/check-sha1?B0399D2029F64D445BD131FFAA399A42D2F8E7DC"

should then return a 200 status code.

Server Usage

After installation, the hibb server can be started as follows:

hibb

You may also specify a different file location for the Bloom filter using the -f flag, as well as a different bind address (default: 0.0.0.0:8000) using the -b flag.

The server needs several seconds to load the Bloom filter into memory. As soon as it is up, you can query plaintext passwords (not recommended) or UPPERCASE SHA-1 values (preferred) via the /check and /check-sha1 endpoints, respectively. Simply pass the value in the query string:

http://localhost:8000/check?admin
# the query below should return 200 with the test filter
http://localhost:8000/check-sha1?B0399D2029F64D445BD131FFAA399A42D2F8E7DC

If the value is in the filter, the server returns a 200 status code, otherwise a 418 (I'm a teapot). The 418 is used so that a miss is distinguishable from a 404 that you might receive for other reasons (e.g. misconfigured servers).
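
A client for the /check-sha1 endpoint could look like the following minimal sketch. It relies only on the behaviour described above (uppercase SHA-1 in the query string, 200 for a hit, 418 for a miss). The helper names `sha1Upper`, `interpretStatus` and `isPwned` are hypothetical and not part of the project, and the base URL assumes a server on the default port.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"net/http"
	"strings"
)

// sha1Upper returns the uppercase hex SHA-1 digest that /check-sha1 expects.
func sha1Upper(password string) string {
	sum := sha1.Sum([]byte(password))
	return strings.ToUpper(hex.EncodeToString(sum[:]))
}

// interpretStatus maps the server's status codes to a result:
// 200 means the value is in the filter, 418 means it is not.
// Anything else (e.g. a 404 from a misconfigured proxy) is an error.
func interpretStatus(code int) (bool, error) {
	switch code {
	case http.StatusOK:
		return true, nil
	case http.StatusTeapot:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status %d", code)
	}
}

// isPwned queries a hibb server; baseURL is e.g. "http://localhost:8000".
func isPwned(baseURL, password string) (bool, error) {
	resp, err := http.Get(baseURL + "/check-sha1?" + sha1Upper(password))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return interpretStatus(resp.StatusCode)
}

func main() {
	found, err := isPwned("http://localhost:8000", "admin")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("in filter:", found)
}
```

Hashing on the client keeps the plaintext password out of the request, which is why the sketch never calls the /check endpoint.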

CLI Usage

You can use the bloom command line tool to check SHA-1 values directly against the filter:

echo "admin" | tr -d "\n" | sha1sum - | tr 'a-z' 'A-Z' | awk '{print $1}' | bloom check pwned-passwords-2.0.bloom

Or interactively:

bloom -i check pwned-passwords-2.0.bloom
Interactive mode: Enter a blank line [by pressing ENTER] to exit.
B0399D2029F64D445BD131FFAA399A42D2F8E7DC
>B0399D2029F64D445BD131FFAA399A42D2F8E7DC

Performance

On a Thinkpad 460p, the Golang server manages to process 17,000 requests per second while the same machine also generates and processes the requests via ab (Apache Bench). Performance on a "real" server should be even better. The server requires about 1.7 GB of memory (i.e. the size of the Bloom filter).

have-i-been-bloomed's People

Contributors

adewes

have-i-been-bloomed's Issues

License issues

Hello Andres,

I like your project and would like to use it. However, I saw that you did not add a license. This means that I cannot use the program. If you want other developers to use it you should add a license. Here is a good guide for this:
https://choosealicense.com/
The MIT license would allow me to use and customize your code.

Best regards
Reinhard

Hibb fails when executed: Invalid version bit (should be 1)

Hi! Thanks for this great piece of work - wouldn't be comfortable sending my users' passwords to anyone's API - not even as hashed to Troy's API. So I am trying to contribute by putting this to Docker format and sharing it via Docker hub. But I have noticed two things, one of which I could resolve on my own but the second one persists. When I try to run "hibb" the following error is produced (Docker image is based on Ubuntu 18.04):
`
/root/go/bin/hibb

Loading Bloom filter from pwned-passwords-2.0.bloom...
Invalid version bit (should be 1)
`
Can't really figure this one out because I am not that familiar with Go - could you help? This might not be a bug at this point, but still... Below is what I could find from the cache. Thanks!

BR, Jari

--- SNIP ---

// Read loads a filter from a reader object.
func (s *BloomFilter) Read(input io.Reader) error {
        bs8 := make([]byte, 8)

        if _, err := io.ReadFull(input, bs8); err != nil {
                return err
        }

        flags := binary.LittleEndian.Uint64(bs8)

        if flags&0xFF != 1 {
                return fmt.Errorf("Invalid version bit (should be 1)")
        }
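
To see what the loader is rejecting, the header check from the quoted snippet can be reproduced in isolation. The following is a minimal sketch that assumes nothing beyond what the snippet shows: the first 8 bytes are read as a little-endian uint64 and the lowest byte must equal 1. The function names `versionOf` and `readVersion` are hypothetical, and the sample bytes in `main` are made-up stand-ins for a filter file.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// versionOf decodes an 8-byte header as a little-endian uint64 and returns
// its lowest byte, which the quoted Read method requires to be 1.
func versionOf(header []byte) (byte, error) {
	if len(header) < 8 {
		return 0, fmt.Errorf("header too short: %d bytes", len(header))
	}
	flags := binary.LittleEndian.Uint64(header[:8])
	return byte(flags & 0xFF), nil
}

// readVersion reads the first 8 bytes from r and returns the version byte.
func readVersion(r io.Reader) (byte, error) {
	header := make([]byte, 8)
	if _, err := io.ReadFull(r, header); err != nil {
		return 0, err
	}
	return versionOf(header)
}

func main() {
	// Hypothetical in-memory stand-ins for the start of a filter file.
	good := bytes.NewReader([]byte{1, 0, 0, 0, 0, 0, 0, 0})
	bad := bytes.NewReader([]byte{0x50, 0, 0, 0, 0, 0, 0, 0})
	for _, r := range []io.Reader{good, bad} {
		v, err := readVersion(r)
		fmt.Printf("version byte: %d, error: %v\n", v, err)
	}
}
```

Passing an opened filter file instead of the in-memory readers would show which byte the real file starts with, which may help tell a damaged or incompletely generated filter apart from a genuine version mismatch.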

Make hibb version independent

Problem: hibb uses version 2 of the password file although version 6 is already available. The hard-coded 2.0 references make updates difficult.

Solution proposal:

  • introduce one variable that contains the (fixed) download URL, so that it is easy to change in one place. wget should use it.
  • rename the downloaded file to a version-independent name, e.g. pwned-passwords.txt.7z, using "mv" or "wget -O new_file_name http://your_url.com"
  • replace references containing "pwned-passwords-2.0" with "pwned-passwords"

Hibb server should return JSON response instead of HTTP status code

200 OK should be reserved for connection handling, and the use of status 418 should be removed. This would make it easier for frontend developers to use the API. Please make the server return an application/json response instead of signalling the result via the HTTP status code.

Whether the password is found or not, the message body should be {"found": true | false} with an HTTP 200 status code and MIME type application/json.
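
The proposal could be sketched as the handler below. It is illustrative only and is not the server's current behaviour: `checkResult`, `foundJSON` and `jsonCheckHandler` are hypothetical names, and the `lookup` function is a stand-in for the real Bloom filter check. The demo in `main` uses an in-memory recorder rather than a listening server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// checkResult is the response body proposed above.
type checkResult struct {
	Found bool `json:"found"`
}

// foundJSON encodes the lookup result as {"found":true} or {"found":false}.
func foundJSON(found bool) []byte {
	body, _ := json.Marshal(checkResult{Found: found}) // cannot fail for this type
	return body
}

// jsonCheckHandler wraps a lookup function (a hypothetical stand-in for the
// Bloom filter check) and always answers 200 with a JSON body.
func jsonCheckHandler(lookup func(sha1Hex string) bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write(foundJSON(lookup(r.URL.RawQuery)))
	}
}

func main() {
	// Toy lookup: a single known hash instead of the real filter.
	known := map[string]bool{"B0399D2029F64D445BD131FFAA399A42D2F8E7DC": true}
	handler := jsonCheckHandler(func(h string) bool { return known[h] })

	req := httptest.NewRequest("GET", "/check-sha1?B0399D2029F64D445BD131FFAA399A42D2F8E7DC", nil)
	rec := httptest.NewRecorder()
	handler(rec, req)
	fmt.Println(rec.Code, rec.Header().Get("Content-Type"), rec.Body.String())
}
```

One trade-off of this design is that clients must parse the body to learn the result, whereas the status-code approach lets a simple HEAD-style check suffice.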
