dgryski / go-farm
go-farm: a pure-Go farmhash implementation
License: MIT License
When I pass the same byte array to fingerprint64 in Java and in Go, I get different hashes.
This Java code results in the hash 0a77beeabd04e23c:
import com.google.common.hash.HashCode;
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hashing;

public class Fingerprint64Example {
    public static void main(String[] args) {
        byte[] bytes = {102, 111, 111, 102, 105, 108, 101, 110, 97, 109, 101, 61, 0, 116, 104, 101, 47, 102, 105, 108, 101, 44, 0, 59, 0};
        HashFunction hasher = Hashing.farmHashFingerprint64();
        HashCode hashCode = hasher.hashBytes(bytes);
        System.out.println(hashCode);
    }
}
This Go code results in the hash 3ce204bdeabe770a:
package main

import (
	"fmt"

	farm "github.com/dgryski/go-farm"
)

func main() {
	bytes := []byte{102, 111, 111, 102, 105, 108, 101, 110, 97, 109, 101, 61, 0, 116, 104, 101, 47, 102, 105, 108, 101, 44, 0, 59, 0}
	fmt.Printf("%x", farm.Fingerprint64(bytes))
}
Is this expected? Is there something I've overlooked?
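The two outputs contain the same eight bytes in opposite order. That is consistent with Guava's HashCode.toString() printing the hash bytes in little-endian order, while Go's %x prints the uint64 most-significant byte first. A minimal sketch of the conversion, assuming that explanation holds (guavaStyleHex is a hypothetical helper name, and the constant is simply the Go output quoted above):

```go
package main

import (
	"fmt"
	"math/bits"
)

// guavaStyleHex formats h with its bytes in little-endian order, two hex
// digits per byte. This is an assumption about how the Java output was
// produced, not part of go-farm.
func guavaStyleHex(h uint64) string {
	return fmt.Sprintf("%016x", bits.ReverseBytes64(h))
}

func main() {
	// Value printed by the Go program in this issue.
	const goHash uint64 = 0x3ce204bdeabe770a
	fmt.Println(guavaStyleHex(goHash)) // prints 0a77beeabd04e23c
}
```

If this is what is happening, the two implementations agree on the hash value and differ only in how it is rendered as text.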
The function hash32Len0to4 is returning different values than the reference implementation.
I can see the following difference:
https://github.com/google/farmhash/blob/master/src/farmhash.cc#L1055
https://github.com/dgryski/go-farm/blob/master/farmhashcc.go#L30
In the original implementation, v is a signed char, whereas here it is a uint32.
The function seems to exploit the overflow behaviour, but I am not sure how to reproduce this correctly in Go.
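One way the signed char behaviour could be reproduced in Go is to convert each byte through int8 before widening it, so that bytes of 0x80 and above are sign-extended rather than zero-extended. The sketch below only illustrates that conversion; signExtend and zeroExtend are hypothetical helper names, not functions from go-farm or farmhash:

```go
package main

import "fmt"

// signExtend mimics storing a byte in a C `signed char` and then promoting
// it to a 32-bit unsigned value: bytes >= 0x80 become large values.
func signExtend(b byte) uint32 {
	return uint32(int8(b))
}

// zeroExtend is the plain conversion, which never sets the upper bits.
func zeroExtend(b byte) uint32 {
	return uint32(b)
}

func main() {
	for _, b := range []byte{0x41, 0x7f, 0x80, 0xff} {
		fmt.Printf("%#x: sign-extended %#x, zero-extended %#x\n",
			b, signExtend(b), zeroExtend(b))
	}
}
```

The two conversions agree for bytes below 0x80, so a mismatch of this kind would only show up on inputs containing non-ASCII bytes.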
This repo is missing License information.
It is a dependency of https://github.com/dgraph-io/badger which is Apache Licensed and which I would like to use. It would be great if you could add the Apache 2.0 license, or another open source license.
The pure-C++ version of Hash64 appears to now be based on the farmhashxo set of functions rather than the farmhashna set. This change probably happened with 1.1 and we didn't notice because we were only checking for compatibility with the fingerprint code.
uber/cadence uses Fingerprint32 extensively, as did ringpop-go. It seems likely, then, that they would benefit from a faster assembly version.
Hey there!
I belong to an open source security research community, and a member (@akincibor) has found an issue, but doesn’t know the best way to disclose it.
If not a hassle, might you kindly add a SECURITY.md file with an email, or another contact method? GitHub recommends this best practice to ensure security issues are responsibly disclosed, and it would serve as a simple instruction for security researchers in the future.
Thank you for your consideration, and I look forward to hearing from you!
(cc @huntr-helper)
# github.com/dgryski/go-farm
../../dgryski/go-farm/fp_amd64.s:455: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:457: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:462: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:464: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:502: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:504: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:509: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:511: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:516: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:518: unrecognized instruction "IMUL3L"
../../dgryski/go-farm/fp_amd64.s:557: unrecognized instruction "IMUL3L"
asm: too many errors
See https://travis-ci.org/dgraph-io/badger/jobs/520608511 for more information
While running a bank test on Badger, I narrowed a failure down to the farm.Fingerprint method, which is used for conflict detection among keys. That produced this PR: dgraph-io/badger#593
A long-running test (over 4h at this point) is starting to suggest that this is indeed the fix.
Way to test:
badger bank test --dir=<some-dir> --dur=4h
(with and without this PR)
The txn conflict detection code is here: https://github.com/dgraph-io/badger/blob/master/transaction.go#L147-L171
Notes from observation:
Comparing @ts=437427 with @ts=437428
Index: 92881. Account [{Id:92881 Bal:120}] -> [{Id:92881 Bal:115}]
Index: 94432. Account [{Id:94432 Bal:115}] -> [{Id:94432 Bal:120}]
2018/09/26 09:49:12 Balance did NOT match up. Expected: 10000000. Received: 9999995
Index: 57525. Account [{Id:57525 Bal:115}] -> [{Id:57525 Bal:120}]
Index: 94432. Account [{Id:94432 Bal:120}] -> [{Id:94432 Bal:110}]
So, both read $115. @ts=437428, acct 94432 went from 115 -> 120.
@ts=437443, acct 94432 went from the read 115 -> 110 (but the actual was 120).
This should have been a txn conflict. It seems like farm.Fingerprint might have
generated a non-deterministic hash.
If helpful, I could produce a main program that load-tests the farm.Fingerprint method directly over a million keys (similar to what the Badger bank test is doing).
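A rough sketch of what such a program might look like: hash every key once to get a baseline, then re-hash the same keys from several goroutines and count any results that differ. To keep it self-contained, it uses FNV-1a from the standard library as a stand-in. The intent is that farm.Fingerprint64 would be passed in its place. The names countMismatches and makeKeys, the key format, and the worker count are all assumptions made for illustration:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// hashFunc is the shape of the function under test.
type hashFunc func([]byte) uint64

// fnv64a is a stdlib stand-in for the hash being checked.
func fnv64a(b []byte) uint64 {
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

// makeKeys builds n distinct keys (hypothetical key format).
func makeKeys(n int) [][]byte {
	keys := make([][]byte, n)
	for i := range keys {
		keys[i] = []byte(fmt.Sprintf("account-%d", i))
	}
	return keys
}

// countMismatches computes a baseline hash per key, then re-hashes all keys
// concurrently and returns how many results disagree with the baseline.
func countMismatches(hash hashFunc, keys [][]byte, workers int) int {
	baseline := make([]uint64, len(keys))
	for i, k := range keys {
		baseline[i] = hash(k)
	}
	var (
		mu         sync.Mutex
		wg         sync.WaitGroup
		mismatches int
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			local := 0
			for i, k := range keys {
				if hash(k) != baseline[i] {
					local++
				}
			}
			mu.Lock()
			mismatches += local
			mu.Unlock()
		}()
	}
	wg.Wait()
	return mismatches
}

func main() {
	keys := makeKeys(1000000)
	fmt.Println("mismatches:", countMismatches(fnv64a, keys, 8))
}
```

A non-zero count would point at the hash function itself. A zero count would suggest looking elsewhere, for example at how the keys are built before they are hashed.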
vendor/github.com/dgryski/go-farm/fp64_stub.go:7:6: Fingerprint64 redeclared in
previous declaration at vendor/github.com/dgryski/go-farm/fp64_generic.g
vendor/github.com/dgryski/go-farm/fp64_stub.go:7:6: missing function body