Giter Club home page Giter Club logo

bromberg_sl2's Introduction

Bromberg-Shpilrain-Vdovina SL₂ Homomorphic Hashing

This is an implementation of the Tillich-Zémor-style hash function presented in the paper "Navigating in the Cayley Graph of SL₂(𝔽ₚ)" by Bromberg, Shpilrain, and Vdovina.

Warning

This module is not produced by cryptography experts, but by some random guy. Furthermore, the algorithm was published in 2017, and is itself not at all battle-tested. Only use this library if you either (a) know what you're doing and have read and understood our code, and/or (b) are building something that does not rely heavily on the cryptographic properties of the hash function.

If you are a cryptography expert, we welcome any bug reports or pull requests! We also welcome them if you're not a cryptography expert; this library is quite simple, and should be easy to grok over a coffee with a copy of the paper linked above in hand.

What is this library for?

This library implements a putatively-strong hash function H with the useful property that it gives a monoid homomorphism. This means there is a cheap operation * such that given strings s1 and s2, H(s1 ++ s2) = H(s1) * H(s2).

This property is especially useful for applications where some very long string may be constructed via many different routes, but you'd nonetheless like to be able to quickly rule out unequal strings.

It also allows you to hash parts of your data as you acquire them, and then merge them later in whatever order is convenient. This allows for very flexible hashing schemes.

H has some other cool properties, and is in some limited but potentially-useful sense "provably secure". See Bromberg et al. for details.

How to use this library

This library provides the means to construct HashMatrixes, using hash(), which takes a slice of bytes. These hashes can be compared, or serialized to hex strings using to_hex.

use bromberg_sl2::*;
assert_eq!(
  hash("hello, world! It's fun to hash stuff!".as_ref()).to_hex(),
  "01c5cf590d32654c87228c0d66441b200aec1439e54e724f05cd3c6c260634e565594b61988933e826e9705de22884ce007df0f733a371516ddd4ac9237f7a46");

Hashes may also be composed, using the * operator:

use bromberg_sl2::*;
assert_eq!(
  hash("hello, ".as_ref()) * hash("world!".as_ref()),
  hash("hello, world!".as_ref())
);

Technical Details

We use the A(2) and B(2) matrices as generators, and p = 2^127 - 1 as our prime order, for fast modular arithmetic.

We have not yet attempted to seriously optimize this library, and performance is a secondary goal. As of right now our procedure is about 1/3 as fast as SHA3-512.

We needed an architecture-agnostic cryptographic hash procedure with a monoid homomorphism respecting string concatenation, written in a low-level language. While there are a few implementations of related algorithms, e.g. the venerable but broken Tillich-Zémor hash, from "Hashing with SL₂" , none of them fulfill these desiderata.

bromberg_sl2's People

Contributors

basile-henry avatar benwr avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

bromberg_sl2's Issues

Is this alive?

I really need such a hashing function in my project.
Is this safe to use? Can I help somehow in order to make it safe?

Provide FFI bindings?

Not sure how difficult this is to do as an afterthought, though my guess is that it isn't too hard.

Property tests

Compare my results here with results acquired via bignum libraries.

Larger matrix lookup table?

Right now we have a lookup table where we store the hash matrix for each byte. We could also store a hash matrix for each byte pair, which might ~double throughput (at best; in practice you'll have a lot more cache misses). The downside is that you'd need to store 2^16 * 2^6 bytes = 2^20 * 2^2 bytes = 4 MiB. This is just small enough to fit in L3, so the cache miss rate might not be so so bad. My guess is that it's worth attempting and benchmarking.

Another downside is that we want to use this library in a web context, and 4MiB is a lot of weight to add to your binary. You could have two different compilation modes, one with the table calculated on startup (or using a smaller table) and one with the table built in, for cases like this.

no_std support

I don't think we use anything that should preclude no_std support, though I could be wrong. My impression is that it's good practice to provide no_std whenever possible.

Probably everything in here should be constant-time

Right now there are some branches in the modular arithmetic; these give us ~2x speedup I think, but IIUC in general it's bad to have variable running times in crypto libraries. Probably should just kill that stuff.

If I wanted to be super careful about this I would probably also kill the 2-byte storage table? I don't really know. There are two potential problems with it:

  1. Two different tables (2-byte and 1-byte) might make the running time more dependent on parity than it should be. I think this is basically a non-problem, though, because the running time is unavoidably going to vary with the input length.
  2. Bigger tables have more cache misses, and that means more inferable state. I don't know how much this matters in practice but I'd guess it's not totally trivial for legit crypto libraries.

Sadly if I do both of these things I'll wipe out lots and lots of performance gains, which I actually do care about pretty significantly. My current state is something like "just, like, don't use this to hash data that's supposed to be secret, I guess?"

merge const and non-const fns where possible

Right now there's a whole tree of const and non-const fns, because at one point I was trying to use some non-const stuff inside add() or something. But it turns out that was slower than just using the const version. So now there's a bunch of duplicated stuff where it's not necessary. I think the only things that do need to be split are mod_p and matmul.

Maybe try out AFL

I've always wanted to use AFL to find bugs; maybe it will be helpful for this

Consider using somebody else's U256 code

My U256 code is the first thing I thought of doing. i.e. It uses the elementary school multiplication algorithm. Which probably means that it's nowhere near as fast as it could be. Maybe I should look for someone else's implementation.

API cleanup

I think the current API is kind of gross; probably a lot simpler to just have a fn hash(&[u8]) -> HashMatrix

Sanity randomness checks?

I'm not really sure how this should work, but I want some kind of guarantee that the implementation here is actually producing hashes that are random-looking for large-enough inputs. I think maybe some simple statistical checks would help?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.