
gohs's Introduction


Golang binding for Intel's HyperScan regex matching library: hyperscan.io

Hyperscan

Hyperscan is a software regular expression matching engine designed with high performance and flexibility in mind. It is implemented as a library that exposes a straightforward C API.

Build

gohs does not enable the latest API of Hyperscan v5.4 by default. To use it, pass the build tag hyperscan_v54.

go get -u -tags hyperscan_v54 github.com/flier/gohs/hyperscan

gohs uses the Hyperscan v5 API by default. You can also build for Hyperscan v4 with the hyperscan_v4 tag.

go get -u -tags hyperscan_v4 github.com/flier/gohs/hyperscan

Chimera

Chimera is a software regular expression matching engine that is a hybrid of Hyperscan and PCRE. The design goals of Chimera are to fully support PCRE syntax as well as to take advantage of the high performance nature of Hyperscan.

Build

It is recommended to compile and link Chimera using static libraries.

$ mkdir build && cd build
$ cmake .. -G Ninja -DBUILD_STATIC_LIBS=on
$ ninja && ninja install
$ go get -u -tags chimera github.com/flier/gohs/hyperscan

Note

You need to download the PCRE library source code to build Chimera; see Chimera Requirements for more details.

License

This project is licensed under either of Apache License (LICENSE-APACHE) or MIT license (LICENSE-MIT) at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

gohs's People

Contributors

aashah, arjenlentz, flier, jmdacruz, liaowei, ross-spencer, starius, tachiniererin


gohs's Issues

gohs in multicore

I ran these benchmarks on a multicore machine (32 cores) with different values of GOMAXPROCS (1, 4, 32), and the results are roughly the same. Is this expected?

As mentioned in the link, performance should increase linearly with the number of cores.

GOMAXPROCS:1

goarch: amd64
pkg: github.com/flier/gohs/bench/go
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
BenchmarkHyperscanBlockScan/Easy0/16             1715931               670.0 ns/op        23.88 MB/s
BenchmarkHyperscanBlockScan/Easy0/32             1748773               686.7 ns/op        46.60 MB/s
BenchmarkHyperscanBlockScan/Easy0/1K             1536878               780.2 ns/op      1312.45 MB/s
BenchmarkHyperscanBlockScan/Easy0/32K             672889              1762 ns/op        18592.67 MB/s
BenchmarkHyperscanBlockScan/Easy0/1M               32414             36524 ns/op        28709.33 MB/s
BenchmarkHyperscanBlockScan/Easy0/32M                704           1745541 ns/op        19222.94 MB/s
BenchmarkHyperscanBlockScan/Easy0i/16            2034398               591.5 ns/op        27.05 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32            1956104               610.5 ns/op        52.41 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1K            1651857               725.8 ns/op      1410.95 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32K            659960              1839 ns/op        17818.02 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1M              30812             38900 ns/op        26955.82 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32M               648           1845531 ns/op        18181.45 MB/s
BenchmarkHyperscanBlockScan/Easy1/16             2009086               592.2 ns/op        27.02 MB/s
BenchmarkHyperscanBlockScan/Easy1/32             1935408               612.1 ns/op        52.28 MB/s
BenchmarkHyperscanBlockScan/Easy1/1K             1572076               766.6 ns/op      1335.74 MB/s
BenchmarkHyperscanBlockScan/Easy1/32K             394917              3078 ns/op        10645.49 MB/s
BenchmarkHyperscanBlockScan/Easy1/1M               16137             74424 ns/op        14089.14 MB/s
BenchmarkHyperscanBlockScan/Easy1/32M                435           2711804 ns/op        12373.47 MB/s
BenchmarkHyperscanBlockScan/Medium/16            2006866               593.9 ns/op        26.94 MB/s
BenchmarkHyperscanBlockScan/Medium/32            1921970               613.1 ns/op        52.19 MB/s
BenchmarkHyperscanBlockScan/Medium/1K            1637011               719.2 ns/op      1423.86 MB/s
BenchmarkHyperscanBlockScan/Medium/32K            682360              1727 ns/op        18976.42 MB/s
BenchmarkHyperscanBlockScan/Medium/1M              34454             34887 ns/op        30056.37 MB/s
BenchmarkHyperscanBlockScan/Medium/32M               667           1721256 ns/op        19494.16 MB/s
BenchmarkHyperscanBlockScan/Hard/16              1996935               596.3 ns/op        26.83 MB/s
BenchmarkHyperscanBlockScan/Hard/32              1935126               612.3 ns/op        52.26 MB/s
BenchmarkHyperscanBlockScan/Hard/1K              1682648               706.6 ns/op      1449.26 MB/s
BenchmarkHyperscanBlockScan/Hard/32K              717792              1721 ns/op        19035.08 MB/s
BenchmarkHyperscanBlockScan/Hard/1M                34600             34801 ns/op        30130.25 MB/s
BenchmarkHyperscanBlockScan/Hard/32M                 697           1733341 ns/op        19358.24 MB/s
BenchmarkHyperscanBlockScan/Hard1/16             1874395               631.0 ns/op        25.36 MB/s
BenchmarkHyperscanBlockScan/Hard1/32             1902772               623.8 ns/op        51.30 MB/s
BenchmarkHyperscanBlockScan/Hard1/1K             1542068               764.3 ns/op      1339.83 MB/s
BenchmarkHyperscanBlockScan/Hard1/32K             258709              4637 ns/op        7067.18 MB/s
BenchmarkHyperscanBlockScan/Hard1/1M                9738            135416 ns/op        7743.36 MB/s
BenchmarkHyperscanBlockScan/Hard1/32M                274           4452196 ns/op        7536.60 MB/s
PASS
ok      github.com/flier/gohs/bench/go  59.318s

GOMAXPROCS:4

goarch: amd64
pkg: github.com/flier/gohs/bench/go
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
BenchmarkHyperscanBlockScan/Easy0/16-4           1970245               607.7 ns/op        26.33 MB/s
BenchmarkHyperscanBlockScan/Easy0/32-4           1855035               643.5 ns/op        49.72 MB/s
BenchmarkHyperscanBlockScan/Easy0/1K-4           1683074               716.0 ns/op      1430.26 MB/s
BenchmarkHyperscanBlockScan/Easy0/32K-4           648412              1718 ns/op        19070.18 MB/s
BenchmarkHyperscanBlockScan/Easy0/1M-4             35168             34605 ns/op        30301.05 MB/s
BenchmarkHyperscanBlockScan/Easy0/32M-4              913           1334215 ns/op        25149.19 MB/s
BenchmarkHyperscanBlockScan/Easy0i/16-4          1962379               587.5 ns/op        27.23 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32-4          1946095               618.0 ns/op        51.78 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1K-4          1699917               714.9 ns/op      1432.33 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32K-4          654991              1799 ns/op        18218.12 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1M-4            31868             37621 ns/op        27872.22 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32M-4             838           1468084 ns/op        22855.94 MB/s
BenchmarkHyperscanBlockScan/Easy1/16-4           1961354               587.8 ns/op        27.22 MB/s
BenchmarkHyperscanBlockScan/Easy1/32-4           1940809               630.1 ns/op        50.79 MB/s
BenchmarkHyperscanBlockScan/Easy1/1K-4           1581129               739.8 ns/op      1384.22 MB/s
BenchmarkHyperscanBlockScan/Easy1/32K-4           403366              3033 ns/op        10802.96 MB/s
BenchmarkHyperscanBlockScan/Easy1/1M-4             16539             72278 ns/op        14507.62 MB/s
BenchmarkHyperscanBlockScan/Easy1/32M-4              474           2531663 ns/op        13253.91 MB/s
BenchmarkHyperscanBlockScan/Medium/16-4          1997407               600.3 ns/op        26.66 MB/s
BenchmarkHyperscanBlockScan/Medium/32-4          1938888               618.4 ns/op        51.74 MB/s
BenchmarkHyperscanBlockScan/Medium/1K-4          1650643               712.5 ns/op      1437.15 MB/s
BenchmarkHyperscanBlockScan/Medium/32K-4          704354              1702 ns/op        19252.22 MB/s
BenchmarkHyperscanBlockScan/Medium/1M-4            35781             33807 ns/op        31016.92 MB/s
BenchmarkHyperscanBlockScan/Medium/32M-4             939           1298811 ns/op        25834.74 MB/s
BenchmarkHyperscanBlockScan/Hard/16-4            2023584               580.0 ns/op        27.58 MB/s
BenchmarkHyperscanBlockScan/Hard/32-4            1868596               631.5 ns/op        50.67 MB/s
BenchmarkHyperscanBlockScan/Hard/1K-4            1669840               692.5 ns/op      1478.64 MB/s
BenchmarkHyperscanBlockScan/Hard/32K-4            698586              1719 ns/op        19064.54 MB/s
BenchmarkHyperscanBlockScan/Hard/1M-4              35702             33578 ns/op        31227.85 MB/s
BenchmarkHyperscanBlockScan/Hard/32M-4               861           1297823 ns/op        25854.39 MB/s
BenchmarkHyperscanBlockScan/Hard1/16-4           1867569               620.4 ns/op        25.79 MB/s
BenchmarkHyperscanBlockScan/Hard1/32-4           1887868               627.3 ns/op        51.01 MB/s
BenchmarkHyperscanBlockScan/Hard1/1K-4           1579549               752.1 ns/op      1361.48 MB/s
BenchmarkHyperscanBlockScan/Hard1/32K-4           261825              4515 ns/op        7257.00 MB/s
BenchmarkHyperscanBlockScan/Hard1/1M-4              9482            126858 ns/op        8265.73 MB/s
BenchmarkHyperscanBlockScan/Hard1/32M-4              292           4136386 ns/op        8112.02 MB/s
PASS
ok      github.com/flier/gohs/bench/go  58.567s

GOMAXPROCS: 32

goarch: amd64
pkg: github.com/flier/gohs/bench/go
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
BenchmarkHyperscanBlockScan/Easy0/16-32                  1956862               614.6 ns/op        26.03 MB/s
BenchmarkHyperscanBlockScan/Easy0/32-32                  1858440               647.1 ns/op        49.45 MB/s
BenchmarkHyperscanBlockScan/Easy0/1K-32                  1661392               724.8 ns/op      1412.78 MB/s
BenchmarkHyperscanBlockScan/Easy0/32K-32                  648123              1725 ns/op        18999.23 MB/s
BenchmarkHyperscanBlockScan/Easy0/1M-32                    34376             34736 ns/op        30186.62 MB/s
BenchmarkHyperscanBlockScan/Easy0/32M-32                     900           1360821 ns/op        24657.50 MB/s
BenchmarkHyperscanBlockScan/Easy0i/16-32                 2070141               575.2 ns/op        27.82 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32-32                 1941291               611.2 ns/op        52.35 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1K-32                 1693740               704.5 ns/op      1453.57 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32K-32                 637003              1807 ns/op        18133.63 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1M-32                   31851             37592 ns/op        27893.45 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32M-32                    867           1559943 ns/op        21510.04 MB/s
BenchmarkHyperscanBlockScan/Easy1/16-32                  2054518               577.5 ns/op        27.71 MB/s
BenchmarkHyperscanBlockScan/Easy1/32-32                  1923642               616.4 ns/op        51.91 MB/s
BenchmarkHyperscanBlockScan/Easy1/1K-32                  1586978               741.1 ns/op      1381.67 MB/s
BenchmarkHyperscanBlockScan/Easy1/32K-32                  397279              2978 ns/op        11003.24 MB/s
BenchmarkHyperscanBlockScan/Easy1/1M-32                    16591             72500 ns/op        14463.10 MB/s
BenchmarkHyperscanBlockScan/Easy1/32M-32                     495           2421523 ns/op        13856.75 MB/s
BenchmarkHyperscanBlockScan/Medium/16-32                 2026708               593.6 ns/op        26.95 MB/s
BenchmarkHyperscanBlockScan/Medium/32-32                 1905799               614.2 ns/op        52.10 MB/s
BenchmarkHyperscanBlockScan/Medium/1K-32                 1653423               712.5 ns/op      1437.25 MB/s
BenchmarkHyperscanBlockScan/Medium/32K-32                 675596              1691 ns/op        19373.48 MB/s
BenchmarkHyperscanBlockScan/Medium/1M-32                   34756             33595 ns/op        31211.97 MB/s
BenchmarkHyperscanBlockScan/Medium/32M-32                    924           1302569 ns/op        25760.19 MB/s
BenchmarkHyperscanBlockScan/Hard/16-32                   1949880               584.0 ns/op        27.40 MB/s
BenchmarkHyperscanBlockScan/Hard/32-32                   1889216               618.6 ns/op        51.73 MB/s
BenchmarkHyperscanBlockScan/Hard/1K-32                   1655174               702.3 ns/op      1458.03 MB/s
BenchmarkHyperscanBlockScan/Hard/32K-32                   669544              1711 ns/op        19150.83 MB/s
BenchmarkHyperscanBlockScan/Hard/1M-32                     35607             33587 ns/op        31219.47 MB/s
BenchmarkHyperscanBlockScan/Hard/32M-32                      860           1366813 ns/op        24549.40 MB/s
BenchmarkHyperscanBlockScan/Hard1/16-32                  1902019               625.6 ns/op        25.57 MB/s
BenchmarkHyperscanBlockScan/Hard1/32-32                  1895744               625.7 ns/op        51.14 MB/s
BenchmarkHyperscanBlockScan/Hard1/1K-32                  1573185               755.7 ns/op      1355.00 MB/s
BenchmarkHyperscanBlockScan/Hard1/32K-32                  260739              4520 ns/op        7249.41 MB/s
BenchmarkHyperscanBlockScan/Hard1/1M-32                     9322            126155 ns/op        8311.83 MB/s
BenchmarkHyperscanBlockScan/Hard1/32M-32                     274           4130479 ns/op        8123.62 MB/s
PASS
ok      github.com/flier/gohs/bench/go  58.398s

Machine details:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  2
Core(s) per socket:  16
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               106
Model name:          Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Stepping:            6
CPU MHz:             2801.963
CPU max MHz:         2800.0000
CPU min MHz:         800.0000
BogoMIPS:            5586.87
Virtualization:      VT-x
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           48K
L1i cache:           32K
L2 cache:            1280K
L3 cache:            49152K
NUMA node0 CPU(s):   0-31
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid aperfmperf pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid arch_capabilities

Scratch space and goroutines

Quick question about scratch space and goroutines: according to the Hyperscan documentation (https://intel.github.io/hyperscan/dev-reference/runtime.html#scratch-space), a scratch space should be cloned if it is going to be used by multiple threads. How does this map to goroutines? Is there a way to ensure that the binding clones the scratch space for each underlying thread? The only alternative I see is for the caller to create a fixed-size buffered channel holding the scratch clones, and have goroutines use the channel to synchronize.
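
The channel-based alternative described above could look roughly like this. It is only a sketch: the scratch type and its clone method are hypothetical stand-ins for the real scratch object (the C API call for cloning is hs_clone_scratch), and the pool size of 2 is arbitrary.

```go
package main

import (
	"fmt"
	"sync"
)

// scratch is a hypothetical stand-in for a Hyperscan scratch space.
type scratch struct{ id int }

// clone mimics cloning a prototype scratch; the real call would allocate a copy.
func (s *scratch) clone(id int) *scratch { return &scratch{id: id} }

// scratchPool hands out scratch clones through a buffered channel.
type scratchPool struct{ free chan *scratch }

func newScratchPool(proto *scratch, size int) *scratchPool {
	p := &scratchPool{free: make(chan *scratch, size)}
	for i := 0; i < size; i++ {
		p.free <- proto.clone(i + 1)
	}
	return p
}

// with borrows a scratch, runs fn, and always returns the scratch to the pool.
func (p *scratchPool) with(fn func(*scratch)) {
	s := <-p.free // blocks while every clone is in use
	defer func() { p.free <- s }()
	fn(s)
}

func main() {
	p := newScratchPool(&scratch{}, 2)

	var mu sync.Mutex
	inUse, peak := 0, 0

	var wg sync.WaitGroup
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			p.with(func(s *scratch) {
				mu.Lock()
				inUse++
				if inUse > peak {
					peak = inUse
				}
				mu.Unlock()
				// A scan using s would happen here.
				mu.Lock()
				inUse--
				mu.Unlock()
			})
		}()
	}
	wg.Wait()
	fmt.Println(peak <= 2, len(p.free)) // prints: true 2
}
```

Because a clone is held for the whole duration of fn, no two goroutines ever scan with the same scratch, regardless of which OS thread runs them.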

Build issue with hyperscan 5.1.0-1 and gohs v1.1.1

I'm seeing the following error when using gohs v1.1.1, hyperscan 5.1.0-1 and the base golang:1.16.6 docker image:

/go/pkg/mod/github.com/flier/gohs@v1.1.1/hyperscan/internal_v5.go:41:9: could not determine kind of name for C.hs_compile_lit
/go/pkg/mod/github.com/flier/gohs@v1.1.1/hyperscan/internal_v5.go:88:9: could not determine kind of name for C.hs_compile_lit_multi

How to reproduce?

Files

  • Dockerfile
FROM golang:1.16.6
RUN apt -yq update && apt -yq install libhyperscan-dev=5.1.0-1 libhyperscan5=5.1.0-1

ADD . /src
WORKDIR /src
RUN go build ./...
  • go.mod (run go mod tidy to generate go.sum)
module example.com

go 1.16

require github.com/flier/gohs v1.1.1
  • main.go
package main

import (
	"github.com/flier/gohs/hyperscan"
)

func main() {
	// simple app to trigger build with gohs and hyperscan
	hyperscan.NewLiteral("something", hyperscan.Caseless)
}

Running

Do:

docker build .

CGO hyperscan crash

Hi @flier,

Many thanks for your work on this wrapper, it has been tremendously helpful as it does all the heavy lifting.

I wrote a tiny convenience wrapper around it, which keeps crashing a few frames deep into hyperscan. I was hoping to get another pair of eyes on it.

This stack trace comes from running this test program.

The wrapper packs your library into a class that contains a pattern database and a scratch, both protected by a write lock during vectored match calls. I cannot reproduce the crash with just one instance of this class; it always happens with multiple instances that are called concurrently.

I spent some time ensuring that pointers crossed the CGO boundary in a sane way and that seems to be the case. I then suspected a bug in hyperscan itself, but failed to reproduce it using a similar C++ wrapper, which is not completely surprising given that there is no GC or M:N goroutine:thread mapping involved. Reasoning further, binding goroutines to dedicated threads in the Go test program seems to help only fractionally: crashes are less frequent but still occur. At the moment, it would seem that the data pointer is jumbled before crossing into C.

Running with GOMAXPROCS=1 still crashes. Running with GOGC=off seems to help, but I cannot leave the program running for too long without running out of memory.

I have been able to reproduce the crash on FreeBSD and macOS (don't have a Linux system at hand) and hyperscan versions 4.5.2, 4.6 and 4.7.

I would appreciate any thoughts on this, thanks!

Multiple definition errors when using gohs

go version 1.11.4
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root/work"
GOPROXY=""
GORACE=""
GOROOT="/root/go"
GOTMPDIR=""
GOTOOLDIR="/root/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="/usr/local/gcc-5.3.0/bin/gcc"
CXX="/usr/local/gcc-5.3.0/bin/g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build725949828=/tmp/go-build -gno-record-gcc-switches"

ERROR INFO
/root/go/pkg/tool/linux_amd64/link: running /usr/local/gcc-5.3.0/bin/gcc failed: exit status 1
/lib/../lib64/libpthread.a(libpthread.o): In function __libc_sigaction': (.text+0x8590): multiple definition of __libc_sigaction'
/lib/../lib64/libc.a(sigaction.o):(.text+0x20): first defined here
/lib/../lib64/libpthread.a(libpthread.o): In function __connect_nocancel': (.text+0x7969): multiple definition of __connect_nocancel'
/lib/../lib64/libc.a(connect.o):(.text+0x9): first defined here
/lib/../lib64/libpthread.a(libpthread.o): In function __libc_fcntl': (.text+0x77e0): multiple definition of __libc_fcntl'
/lib/../lib64/libc.a(fcntl.o):(.text+0xa0): first defined here
/lib/../lib64/libpthread.a(libpthread.o): In function _IO_funlockfile': (.text+0x8530): multiple definition of _IO_funlockfile'
/lib/../lib64/libc.a(funlockfile.o):(.text+0x0): first defined here
/lib/../lib64/libpthread.a(libpthread.o): In function __libc_nanosleep': (.text+0x7dc0): multiple definition of __libc_nanosleep'
/lib/../lib64/libc.a(nanosleep.o):(.text+0x0): first defined here
/lib/../lib64/libpthread.a(libpthread.o): In function __read': (.text+0x7680): multiple definition of __libc_read'
/lib/../lib64/libc.a(read.o):(.text+0x0): first defined here
/lib/../lib64/libpthread.a(libpthread.o): In function __open_nocancel': (.text+0x7e29): multiple definition of __open_nocancel'
/lib/../lib64/libc.a(open.o):(.text+0x9): first defined here
/lib/../lib64/libpthread.a(libpthread.o): In function __lseek_nocancel': (.text+0x7d09): multiple definition of __lseek_nocancel'
/lib/../lib64/libc.a(llseek.o):(.text+0x9): first defined here
/lib/../lib64/libpthread.a(libpthread.o): In function __read_nocancel': (.text+0x7689): multiple definition of __read_nocancel'
/lib/../lib64/libc.a(read.o):(.text+0x9): first defined here
/lib/../lib64/libpthread.a(libpthread.o): In function `send':

I use hyperscan in two different packages of my program. If I use hyperscan in only one package, everything works.

"from" attribute is always zero

Hi,

Hope you are all well!

The eventHandler is not returning the "From" value when I try to match multiple regexes.

Is there a specific parameter I need to set to have it reported?

./test.sh | jq .
{
  "Errno": 0,
  "Msg": "",
  "Data": [
    {
      "Id": 15,
      "From": 0,
      "To": 49,
      "Flags": 0,
      "Context": "[email protected] 1HB5XMLmzFVj8ALj6mfBsbifRoD4miY36v https://twitter.com/x0rxkov random test sentence https://twitter.com/twitter\n\nhttps://github.com/x0rzkov\nhttps://github.com/x0rzkov/cars-dataset\n\n[email protected]:fastai/fastai.git\n\n\n",
      "RegexLinev": {
        "Expr": "[13][a-km-zA-HJ-NP-Z1-9]{25,34}",
        "Data": " BtcAddressPattern     "
      }
    },
    {
      "Id": 28,
      "From": 0,
      "To": 246,
      "Flags": 0,
      "Context": "[email protected] 1HB5XMLmzFVj8ALj6mfBsbifRoD4miY36v https://twitter.com/x0rxkov random test sentence https://twitter.com/twitter\n\nhttps://github.com/x0rzkov\nhttps://github.com/x0rzkov/cars-dataset\n\n[email protected]:fastai/fastai.git\n\n\n",
      "RegexLinev": {
        "Expr": "((git|ssh|http(s)?)|(git@[\\w\\.]+))(:(\\/\\/)?)([\\w\\.@\\:/\\-~]+)(\\.git)(\\/)?",
        "Data": " GitRepoPattern        "
      }
    }
  ]
}

Cheers,
X

bug: blockMatcher/streamMatcher.Match returns false for exact matches

Reproduce steps

blockMatcher:

bdb, _ := NewBlockDatabase(NewPattern(`\d+`, SomLeftMost))

fmt.Println(bdb.Match([]byte("123"))) // should be true, got false

streamMatcher:

sdb, _ := NewStreamDatabase(NewPattern(`\d+`, SomLeftMost))

r := strings.NewReader("123")
fmt.Println(sdb.Match(r)) // should be true, got false

Cause?

The bug should be caused by this part (same thing for streamMatcher) in runtime.go:

func (m *blockMatcher) Handle(id uint, from, to uint64, flags uint, context interface{}) error {
	err := m.matchRecorder.Handle(id, from, to, flags, context)

	if err != nil {
		return err
	}

	if m.n < 0 {
		return nil
	}

	if m.n < len(m.matched) {
		m.matched = m.matched[:m.n]

		return errTooManyMatches
	}

	return nil
}

...

func (m *blockMatcher) Match(data []byte) bool {
	m.n = 1

	err := m.scan(data)

	return err != nil && err.(HsError) == ErrScanTerminated
}

When the data exactly matches the pattern, m.matched won't ever be longer than m.n, thus never telling hyperscan to terminate. That means, the err != nil condition in Match is going to be false.

Is there a reason for the whole error check? Because it seems like directly returning len(m.matched) > 0 is already enough.
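
The behaviour described above can be reproduced without Hyperscan. The sketch below is a toy model, not the gohs code: the matcher type, its scan method (which reports runs of ASCII digits) and errTerminated are hypothetical stand-ins. The handler keeps the original termination rule, and match returns len(m.matched) > 0 as suggested.

```go
package main

import (
	"errors"
	"fmt"
)

// errTerminated is a stand-in for the "scan terminated" error.
var errTerminated = errors.New("scan terminated")

// matcher is a toy model of blockMatcher.
type matcher struct {
	n       int      // wanted number of matches; negative means unlimited
	matched [][2]int // recorded (from, to) offsets
}

// handle mirrors the original logic: it terminates only once MORE than n
// matches have been recorded.
func (m *matcher) handle(from, to int) error {
	m.matched = append(m.matched, [2]int{from, to})
	if m.n < 0 {
		return nil
	}
	if m.n < len(m.matched) {
		m.matched = m.matched[:m.n]
		return errTerminated
	}
	return nil
}

// scan is a toy scanner that reports every maximal run of ASCII digits.
func (m *matcher) scan(data []byte) error {
	start := -1
	for i := 0; i <= len(data); i++ {
		isDigit := i < len(data) && data[i] >= '0' && data[i] <= '9'
		switch {
		case isDigit && start < 0:
			start = i
		case !isDigit && start >= 0:
			if err := m.handle(start, i); err != nil {
				return err
			}
			start = -1
		}
	}
	return nil
}

// match reports whether anything matched, independent of early termination.
func (m *matcher) match(data []byte) bool {
	m.n = 1
	m.matched = m.matched[:0]
	_ = m.scan(data) // a real implementation may want to surface other errors
	return len(m.matched) > 0
}

func main() {
	m := &matcher{}
	fmt.Println(m.match([]byte("123"))) // true: one match, scan never terminated
	fmt.Println(m.match([]byte("abc"))) // false: no match
	fmt.Println(m.match([]byte("1 2"))) // true: second match terminated the scan
}
```

With the input "123" the scan finishes without ever returning errTerminated, which is exactly the case where the err != nil check reports false.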

go get uses gcc not g++

Creating an empty dummy.cxx file in the hyperscan subdirectory will cause go to use g++ so that linking will work.

Compile time with gohs is too long

<\s*?\s*[\s\S]{0,2000}((($_get|$_cookie|$_post|$_request|$_files|$_global)[([\s\S]{1,20})])|(\b(eval|assert|popen|proc_open|shell_exec|system|passthru|call_user_func|phpinfo|md5)\s*([\x{20}-\x{73}]+))|\b(echo|print)[\s(]+[\w-_]+([\x{20}-\x{73}]+))[\s\S]{0,1000}?\s*>
Compiling this pattern with gohs takes about 34s, but the equivalent C code takes only about 1us.
Hyperscan version: 5.0.0

there is no way to reset m.handler

func (m *blockMatcher) scan(data []byte) error {
	if err := m.scanner.Scan(data, m.scratch, m.handler.Handle, nil); err != nil {
		return err
	}

	return nil
}

m.handler.matched keeps growing

Building simplegrep failed

I have installed Hyperscan and bin/unit-hyperscan runs successfully, but go build fails for my project.
My project looks like this; hy_scan.go is copied from simplegrep:
image
This is the build result:
image
image

Failed to build with Go 1.8

Go 1.8:

gopath/src/github.com/flier/gohs/hyperscan/internal.go:904: cannot use unsafe.Pointer(db)
 (type unsafe.Pointer) as type *C.struct_hs_database in argument to func literal

Go 1.7.4 works well.

This issue can be related to this part of the release notes:

If cgo is used to call a C function passing a pointer to a C union, and if the C union can contain any pointer values, and if cgo pointer checking is enabled (as it is by default), the union value is now checked for Go pointers.

I am crosscompiling from Linux to Windows.

How to write libhs.pc on Windows

Hello, I want to use gohs on Windows, but I do not know how to write libhs.pc. All the examples I can find online are for Linux, so I cannot use them. Can you provide a sample libhs.pc for Windows? Thank you.

How to use gohs on Windows (developing with GoLand)?

I tried to compile Hyperscan with Visual Studio 2022 based on the docs and successfully ran bin/unit-hyperscan. The test passed, but GOHS still doesn't seem to be able to find the relevant file, and I tried to search globally and didn't find the file mentioned in the error.

The error is as follows:
# pkg-config --cflags -- libhs
Package libhs was not found in the pkg-config search path.
Perhaps you should add the directory containing `libhs.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libhs', required by 'virtual:world', not found
pkg-config: exit status 1

Undefined C symbols when system has libhyperscan-4.7.0

Ubuntu has upgraded libhyperscan-dev to version 4.7.0-1 and now gohs/hyperscan doesn't build:

go get -v github.com/flier/gohs/hyperscan
github.com/flier/gohs/hyperscan
# github.com/flier/gohs/hyperscan
../../../../../flier/gohs/hyperscan/internal.go:119:32: could not determine kind of name for C.HS_FLAG_COMBINATION
../../../../../flier/gohs/hyperscan/internal.go:120:32: could not determine kind of name for C.HS_FLAG_QUIET

I updated my copy by commenting out lines 119, 120, 133 and 134, and now I can build the library, but I am not sure whether that is the right approach.

Incorrect match result on high load

We are using Hyperscan on a high-load system (up to 50k rps).

The version of the gohs lib is 1.1.0
OS Ubuntu 20.04
libhyperscan5 5.2.1-1build1

For matching the patterns we use a vectored database. On a local PC without any load (unit tests), everything works as expected.
In the production environment, however, we see two problems: an exact-match pattern returns false for a string that is certainly correct (double-checked with a plain regexp match), and for some reason we get a true match for a string that is not in any of our patterns.

The code is something like this:

patterns := make([]*hyperscan.Pattern, len(patternStrings))
for i, s := range patternStrings {
	patterns[i] = hyperscan.NewPattern("^somePatternString$", hyperscan.Utf8Mode|hyperscan.SingleMatch)
	patterns[i].Id = i + 1
}
database, err := hyperscan.NewVectoredDatabase(patterns...)
if err != nil {
    return nil, err
}

scratch, err := hyperscan.NewScratch(database)
if err != nil {
    database.Close()
    return nil, err
}

And the scan is like

result := false
err := database.Scan([][]byte{[]byte("somePatternString")}, scratch, func(id uint, from, to uint64, flags uint, context interface{}) error {
	result = true
	return nil
})

And I'm getting false results here (the match handler callback isn't called).
I'm also getting a true result for a completely different input string.
What is more confusing, I'm getting the true result for only one other string. For example, if I have string1, string2, string4, and string5, and the exact-match pattern is created for string1, I get a true match for string2 and never for any other string.

cgo call C.hs_scan performance is much lower than hyperscan in C

I wanted to measure how long hs_scan takes, and found a big gap between the cgo path and plain C.
My test code is based on the simplegrep example. It uses one rule (pattern "o{2,}" against the string "food") and runs the scan 200,000 times. These are my results:

func                                total time    time per call
func (bs *blockScanner) Scan(...)   1028389 us    5141 ns
hs_scan                             10633 us      53 ns

The Go Scan function takes around 100 times longer per call. Is there a mistake in my test?

1. gohs test code
https://github.com/flier/gohs/blob/master/examples/simplegrep/main.go

package main_test

import (
    _ "bytes"
    "flag"
    "fmt"
    "github.com/flier/gohs/hyperscan"
    "os"
    "testing"
    "time"
)

var (
    flagNoColor    = flag.Bool("C", false, "Disable colorized output.")
    flagByteOffset = flag.Bool("b", false, "The offset in bytes of a matched pattern is displayed in front of the respective matched line")
)

var theme = func(s string) string { return s }

func highlight(s string) string {
    return "\033[35m" + s + "\033[0m"
}

func eventHandler(id uint, from, to uint64, flags uint, context interface{}) error {
    return nil
}

func TestGoHs(t *testing.T) {
    expr := fmt.Sprintf("o{2,}")
    pattern := hyperscan.NewPattern(expr, hyperscan.Caseless|hyperscan.SingleMatch|hyperscan.AllowEmpty)

    database, err := hyperscan.NewBlockDatabase(pattern)
    if err != nil {
        fmt.Printf("ERROR: Unable to compile pattern \"%s\" : %s\n", pattern.String(), err.Error())
        os.Exit(-1)
    }
    defer database.Close()

    scratch, err := hyperscan.NewScratch(database)
    if err != nil {
        fmt.Fprint(os.Stderr, "ERROR: Unable to allocate scratch space. Exiting.\n")
        os.Exit(-1)
    }

    defer scratch.Free()
    inputData := "food"
    fmt.Printf("Scanning %d bytes with Hyperscan\n", len(inputData))

    if err := database.Scan([]byte(inputData), scratch, eventHandler, inputData); err != nil {
        fmt.Printf("ERROR: Unable to scan input buffer. Exiting.\n")
        os.Exit(-1)
    }
    // measure elapsed time
    t1 := time.Now()
    for i := 0; i < 200000; i++ {
        database.Scan([]byte(inputData), scratch, eventHandler, inputData)
    }
    elapsed := time.Since(t1)
    fmt.Println("App elapsed: ", elapsed)
}

2. c code
https://github.com/intel/hyperscan/blob/master/examples/simplegrep.c

#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

#include <hs.h>

static int eventHandler(unsigned int id, unsigned long long from,
                        unsigned long long to, unsigned int flags, void *ctx) {
    //printf("Match for pattern");
    return 0;
}

int main(int argc, char *argv[]) {

    char *pattern = "o{2,}";
    hs_database_t *database;
    hs_compile_error_t *compile_err;
    if (hs_compile(pattern, HS_FLAG_DOTALL, HS_MODE_BLOCK, NULL, &database,
                   &compile_err) != HS_SUCCESS) {
        printf("ERROR: Unable to compile pattern \"%s\": %s\n",
                pattern, compile_err->message);
        hs_free_compile_error(compile_err);
        return -1;
    }

    /* Next, we read the input data file into a buffer. */
    unsigned int length = 4;
    char *inputData = "food";

    hs_scratch_t *scratch = NULL;
    if (hs_alloc_scratch(database, &scratch) != HS_SUCCESS) {
        printf("ERROR: Unable to allocate scratch space. Exiting.\n");
        hs_free_database(database);
        return -1;
    }

    printf("Scanning %u bytes with Hyperscan\n", length);

    if (hs_scan(database, inputData, length, 0, scratch, eventHandler,
                pattern) != HS_SUCCESS) {
        printf("ERROR: Unable to scan input buffer. Exiting.\n");
        hs_free_scratch(scratch);
        hs_free_database(database);
        return -1;
    }

    struct timeval start,end;
    gettimeofday( &start, NULL );
    //printf("start %lu s, %lu ns\n", time_start.tv_sec, time_start.tv_nsec);

    for (int i = 1; i <= 200000; i++) {
        hs_scan(database, inputData, length, 0, scratch, eventHandler,pattern);
    }

    gettimeofday( &end, NULL );
    long timeuse = (end.tv_usec - start.tv_usec) + 1000000 * (end.tv_sec-start.tv_sec);
    //printf("end %lu s, %lu ns\n", time_end.tv_sec, time_end.tv_nsec);

    printf("total time run %ld us \n", timeuse);
    printf("per time run %ld ns \n", timeuse*1000/200000);

    hs_free_scratch(scratch);
    hs_free_database(database);
    return 0;
}

Package libhs was not found in the pkg-config search path.

pkg-config --cflags libhs --static

Package libhs was not found in the pkg-config search path.
Perhaps you should add the directory containing `libhs.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libhs' found
pkg-config: exit status 1

Add support for Logical Combinations

First off, thanks for making a great library.

The official Hyperscan documentation describes logical combinations of patterns. This functionality doesn't seem to be supported by gohs at the moment. Are there any plans to add it?

I think the change should be pretty simple:

diff --git a/hyperscan/internal.go b/hyperscan/internal.go                                                            
index de4a68c..1c42531 100644
--- a/hyperscan/internal.go
+++ b/hyperscan/internal.go
@@ -116,6 +116,8 @@ const (
        UnicodeProperty CompileFlag = C.HS_FLAG_UCP          // Enable Unicode property support for this expression.
        PrefilterMode   CompileFlag = C.HS_FLAG_PREFILTER    // Enable prefiltering mode for this expression.
        SomLeftMost     CompileFlag = C.HS_FLAG_SOM_LEFTMOST // Enable leftmost start of match reporting.
+       Combination     CompileFlag = C.HS_FLAG_COMBINATION  // Enable logical combination of patterns
+       Quiet           CompileFlag = C.HS_FLAG_QUIET        // Enable quiet at matching
 )
 
 var compileFlags = map[rune]CompileFlag{
@@ -128,6 +130,8 @@ var compileFlags = map[rune]CompileFlag{
        'p': UnicodeProperty,
        'f': PrefilterMode,
        'l': SomLeftMost,
+       'C': Combination,
+       'Q': Quiet,
 }

(flags as defined here).

example for the following regexes

Hi,

Hope you are all well!

I was wondering if you could provide an example of how to match these patterns with gohs.

	bitcoinPatternRegexp, err := regexp.Compile(`[13][a-km-zA-HJ-NP-Z0-9]{26,33}$`)
	if err != nil {
		log.Fatal(err)
	}

	emailPatternRegexp, err := regexp.Compile(`([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$`)
	if err != nil {
		log.Fatal(err)
	}

	// (?:https?://)?(?:www)?(\S*?\.onion)\b
	onionPatternRegexp, err := regexp.Compile(`(?:https?\:\/\/)?[\w\-\.]+\.onion`)
	if err != nil {
		log.Fatal(err)
	}

	twitterPatternRegexp, err := regexp.Compile(`(https?\:)?(//)(www[\.])?(twitter.com/)([a-zA-Z0-9_]{1,15})[\/]?`)
	if err != nil {
		log.Fatal(err)
	}

We are working on an open source Tor crawler and are starting to have lots of regexes to implement. We could not figure out how to do it with gohs, sorry for that.

Can you give us a snippet?

Thanks in advance for any insights or inputs on that topic.

Cheers,
X

runtime error with go 1.14.2

I'm trying to update my code to use go 1.14.2. When I run my tests for code using gohs I'm getting a runtime error:

Any idea what could cause this?

runtime: pointer 0xc000433860 to unallocated span span.base()=0xc00042c000 span.limit=0xc000434000 span.state=0
fatal error: found bad pointer in Go heap (incorrect use of unsafe or cgo?)

goroutine 76 [running, locked to thread]:
runtime.throw(0x20b2be3, 0x3e)
	/usr/local/go/src/runtime/panic.go:1116 +0x72 fp=0xc00090d3a0 sp=0xc00090d370 pc=0x44cb52
runtime.badPointer(0x7fccb647c2b0, 0xc000433860, 0x0, 0x0)
	/usr/local/go/src/runtime/mbitmap.go:380 +0x230 fp=0xc00090d3e8 sp=0xc00090d3a0 pc=0x42cae0
runtime.findObject(0xc000433860, 0x0, 0x0, 0x1000, 0xc000040300, 0x14)
	/usr/local/go/src/runtime/mbitmap.go:416 +0x9b fp=0xc00090d420 sp=0xc00090d3e8 pc=0x42cb8b
runtime.checkptrBase(0xc000433860, 0xc0003fdf94)
	/usr/local/go/src/runtime/checkptr.go:68 +0x4f fp=0xc00090d460 sp=0xc00090d420 pc=0x41e07f
runtime.checkptrAlignment(0xc000433860, 0x1ee6bc0, 0x1)
	/usr/local/go/src/runtime/checkptr.go:19 +0x75 fp=0xc00090d490 sp=0xc00090d460 pc=0x41ded5
github.com/flier/gohs/hyperscan.hsMatchEventCallback(0x1, 0x0, 0x8, 0x3ff1f7c100000000, 0xc000433860, 0x32668b4)
	/swarm/gopath/pkg/mod/github.com/flier/[email protected]/hyperscan/internal.go:910 +0x52 fp=0xc00090d4e8 sp=0xc00090d490 pc=0x19a22a2
github.com/flier/gohs/hyperscan._cgoexpwrap_16a4143cf05d_hsMatchEventCallback(0x1, 0x0, 0x8, 0x7fcc00000000, 0xc000433860, 0x7fcceaa51534)
	_cgo_gotypes.go:946 +0x5d fp=0xc00090d530 sp=0xc00090d4e8 pc=0x199f50d
runtime.call64(0x0, 0x7ffc0bf6b970, 0x7ffc0bf6ba00, 0x30)
	/usr/local/go/src/runtime/asm_amd64.s:540 +0x3b fp=0xc00090d580 sp=0xc00090d530 pc=0x47fc0b
runtime.cgocallbackg1(0x0)
	/usr/local/go/src/runtime/cgocall.go:332 +0x1ac fp=0xc00090d618 sp=0xc00090d580 pc=0x41a77c
runtime.cgocallbackg(0x0)
	/usr/local/go/src/runtime/cgocall.go:207 +0xc1 fp=0xc00090d680 sp=0xc00090d618 pc=0x41a531
runtime.cgocallback_gofunc(0x41a3de, 0x1ce4eb0, 0xc00090d710, 0xc00090d700)
	/usr/local/go/src/runtime/asm_amd64.s:793 +0x9b fp=0xc00090d6a0 sp=0xc00090d680 pc=0x48115b
runtime.asmcgocall(0x1ce4eb0, 0xc00090d710)
	/usr/local/go/src/runtime/asm_amd64.s:640 +0x42 fp=0xc00090d6a8 sp=0xc00090d6a0 pc=0x480ff2
runtime.cgocall(0x1ce4eb0, 0xc00090d710, 0xc0003fdfe8)
	/usr/local/go/src/runtime/cgocall.go:143 +0x9e fp=0xc00090d6e0 sp=0xc00090d6a8 pc=0x41a3de
github.com/flier/gohs/hyperscan._Cfunc_hs_scan_vector_cgo(0x6107dc0, 0xc0003fdfe8, 0xc00042a000, 0x1, 0x6153dc0, 0xc000433860, 0x0)
	_cgo_gotypes.go:662 +0x6b fp=0xc00090d710 sp=0xc00090d6e0 pc=0x199e96b
github.com/flier/gohs/hyperscan.hsScanVector.func1(0x6107dc0, 0xc000433800, 0xc00042a000, 0x1, 0x1, 0xc000739160, 0x1, 0x1, 0x0, 0x6153dc0, ...)
	/swarm/gopath/pkg/mod/github.com/flier/[email protected]/hyperscan/internal.go:959 +0x1a4 fp=0xc00090d770 sp=0xc00090d710 pc=0x19a4934
github.com/flier/gohs/hyperscan.hsScanVector(0x6107dc0, 0xc000739160, 0x1, 0x1, 0x0, 0x6153dc0, 0xc0003df0f0, 0x0, 0x0, 0x10000c000433940, ...)
	/swarm/gopath/pkg/mod/github.com/flier/[email protected]/hyperscan/internal.go:959 +0x24b fp=0xc00090d888 sp=0xc00090d770 pc=0x19a273b
github.com/flier/gohs/hyperscan.(*vectoredScanner).Scan(0xc000633c48, 0xc000739160, 0x1, 0x1, 0xc000633c68, 0xc0003df0f0, 0x0, 0x0, 0x0, 0x0)
	/swarm/gopath/pkg/mod/github.com/flier/[email protected]/hyperscan/runtime.go:276 +0x133 fp=0xc00090d960 sp=0xc00090d888 pc=0x1999013
github.com/flier/gohs/hyperscan.(*vectoredDatabase).Scan(0xc000633c58, 0xc000739160, 0x1, 0x1, 0xc000633c68, 0xc0003df0f0, 0x0, 0x0, 0xc000433ac0, 0x7ae51a)

Missing mapping for using ExprExt in hs_compile_ext_multi()

According to https://intel.github.io/hyperscan/dev-reference/compilation.html#extended-parameters, hs_compile_ext_multi supports an additional parameter hs_expr_ext_t, which adds some additional flags to the expression.

The struct has been mapped as ExprExt in this library, and the internal implementation already used this struct. However db.Compile doesn't support passing ExprExt as an argument. If this method could be exposed in this wrapper it would be greatly appreciated.

Leaky hyperscan.Database

This can easily be triggered when the pattern database needs updating throughout the lifetime of the program. At a quick glance it would seem that the Close() call does not resolve to the appropriate private concrete type.

package main

import (
	"fmt"
    "time"
    "net/http"
    _ "net/http/pprof"

	"github.com/flier/gohs/hyperscan"
)

func main() {
    var patterns = generatePatterns(1000)
    var database hyperscan.VectoredDatabase
    var firstRun = true 

    go func() {
        http.ListenAndServe("localhost:6060", nil)
    }()

    var compiledPatterns, err = compilePatterns(patterns)
    if err != nil {
        fmt.Printf("%s\n", err.Error())
        return
    }

    for {
        // Close the previous database before rebuilding (skipped on the first pass).
        if !firstRun {
            database.Close()
        }
        firstRun = false

        database, err = buildDatabase(compiledPatterns)
        if err != nil {
            fmt.Printf("%s\n", err.Error())
            return
        }

        time.Sleep(100 * time.Millisecond)
    }
}

func buildDatabase(patterns []*hyperscan.Pattern) (hyperscan.VectoredDatabase, error) {
	var builder = &hyperscan.DatabaseBuilder{
		Patterns: patterns,
		Mode:     hyperscan.VectoredMode,
		Platform: hyperscan.PopulatePlatform(),
	}
	var db, err = builder.Build()
	if err != nil {
		return nil, fmt.Errorf("error updating pattern database: %s", err.Error())
	}

	return db.(hyperscan.VectoredDatabase), nil
}

func compilePatterns(patterns []string) ([]*hyperscan.Pattern, error) {
	var compiledPatterns = make([]*hyperscan.Pattern, len(patterns))
	
	for idx, pattern := range patterns {
		var compiledPattern, compileErr = hyperscan.ParsePattern(pattern)
		switch compileErr {
		case nil:
			compiledPattern.Id = idx
			compiledPatterns[idx] = compiledPattern
		default:
			return nil, compileErr
		}
	}

	return compiledPatterns, nil
}

func generatePatterns(howMany int) []string {
    var ret []string

    for i := 0; i < howMany; i++ {
        ret = append(ret, fmt.Sprintf("/mypattern%d/i", i))
    }

    return ret
}

How to use it by right method?

I need to match one string against multiple regexes. I wrote the code below and got a Database, but what should I do next? I can't find the right method to call on Database. Did I miss something?

package main

import "github.com/flier/gohs/hyperscan"

func main(){
    regs := []hyperscan.Expression{
        hyperscan.Expression("..."),
        hyperscan.Expression("..."),
    }
    db := new(hyperscan.DatabaseBuilder)
    db.AddExpressions(regs...).Build()
    // what shall I do next? is there some method like db.Match("target string") or something?
}

Could not match empty buffer even on `HS_FLAG_ALLOWEMPTY` enforced

Hyperscan allows HS_FLAG_ALLOWEMPTY. As the doc says:
(see https://intel.github.io/hyperscan/dev-reference/api_files.html)

HS_FLAG_ALLOWEMPTY
Compile flag: Allow expressions that can match against empty buffers.

This flag instructs the compiler to allow expressions that can match against empty buffers, such as .?, .*, (a|). Since Hyperscan can return every possible match for an expression, such expressions generally execute very slowly; the default behaviour is to return an error when an attempt to compile one is made. Using this flag will force the compiler to allow such an expression.

The Hyperscan tests show that Hyperscan supports this behavior.

But because of this code:

func hsScan(db hsDatabase, data []byte, flags ScanFlag, scratch hsScratch, onEvent hsMatchEventHandler, context interface{}) error {
	if len(data) == 0 {
		return HsError(C.HS_INVALID)
	}

the flag will not work in the gohs project.

Get regexp index from match

Hello there. Thank you for the nice library. I'm playing with your examples now, but I cannot find how to get the index of the matched regex from the database. For my use case it would be perfect: I'm loading a huge table of regular expressions into a block database and only want the index of the matched regexp, so that I can return a URL by index from a second table.

cgo preprocessing failed with Hyperscan library on Linux amd64

I'm encountering a build error with the github.com/flier/gohs package on Linux amd64 when using Go with cgo and the Hyperscan library. Despite properly configuring the PKG_CONFIG_PATH, CGO_CFLAGS, and CGO_LDFLAGS environment variables, I'm still getting a cgo preprocessing error during compilation.

Error message:

/go/pkg/mod/github.com/flier/gohs@v1.2.1/internal/hs/allocator.go:42:8: could not import C (cgo preprocessing failed) (compile)

Steps to reproduce:

  1. Install the Hyperscan library and its development package.
  2. Set the PKG_CONFIG_PATH, CGO_CFLAGS, and CGO_LDFLAGS environment variables based on the installed Hyperscan library.
  3. Run rake lint on a project that depends on the github.com/flier/gohs package.

Environment:

  • OS: Ubuntu 18.04
  • Go version: go version go1.19.6 linux/amd64
  • Hyperscan version: 4.7.0

Troubleshooting steps taken:

  1. Verified the installation of Hyperscan and its development package.
  2. Checked the paths to header and library files using pkg-config.
  3. Set the CGO_CFLAGS and CGO_LDFLAGS environment variables based on pkg-config output.
  4. Cleaned the Go cache using go clean -modcache.

Despite following these steps, the issue persists. Any help or guidance would be greatly appreciated.

link with libchimera fail, need help

Hello, flier

Something goes wrong when I try to build with Chimera. I would be very happy if you could help me.

here is my environment:

# Boost: 1.81.0
# Hyperscan: 5.4.1
# PCRE: 8.45

go get -u -tags hyperscan_v54 github.com/flier/gohs/hyperscan
ln -s /home/ubuntu/boost/boost_1_81_0/boost /home/ubuntu/hyperscan/hyperscan/include/boost
tar xf pcrev8.45.tar.gz -C /home/ubuntu/hyperscan/hyperscan/pcre --strip-components=1
./bin/unit-chimera
[----------] Global test environment tear-down
[==========] 262 tests from 6 test cases ran. (8298 ms total)
[  PASSED  ] 262 tests.

When I run the go build command, I get the errors below:

PKG_CONFIG_PATH=/usr/local/lib/pkgconfig go build -tags chimera main.go
# command-line-arguments
/usr/lib/go/pkg/tool/linux_amd64/link: running g++ failed: exit status 1
/usr/bin/ld: /usr/local/lib/libchimera.a(ch_compile.cpp.o): in function `ch::ch_compile_multi_int(char const* const*, unsigned int const*, unsigned int const*, unsigned int, unsigned int, unsigned long, unsigned long, hs_platform_info const*, ch_database**)':
ch_compile.cpp:(.text+0x534): undefined reference to `ue2::mmbit_size(unsigned int)'
/usr/bin/ld: /usr/local/lib/libchimera.a(ch_compile.cpp.o): in function `ch::ch_compile_multi_int(char const* const*, unsigned int const*, unsigned int const*, unsigned int, unsigned int, unsigned long, unsigned long, hs_platform_info const*, ch_database**) [clone .cold]':
ch_compile.cpp:(.text.unlikely+0xd5): undefined reference to `ue2::CompileError::CompileError(unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0xdc): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0xe3): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x133): undefined reference to `ue2::CompileError::CompileError(unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x13a): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x141): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x19e): undefined reference to `ue2::CompileError::CompileError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x1b7): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x1be): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x1f1): undefined reference to `ue2::CompileError::CompileError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x20a): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x211): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x26f): undefined reference to `ue2::CompileError::CompileError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x288): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x28f): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x2c2): undefined reference to `ue2::CompileError::CompileError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x2db): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x2e2): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x354): undefined reference to `ue2::CompileError::CompileError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x36d): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x374): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x3a7): undefined reference to `ue2::CompileError::CompileError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x3c0): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x3c7): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x404): undefined reference to `ue2::CompileError::CompileError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x41d): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x424): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x4ae): undefined reference to `ue2::CompileError::CompileError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x4c7): undefined reference to `ue2::CompileError::~CompileError()'
/usr/bin/ld: ch_compile.cpp:(.text.unlikely+0x4ce): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: /usr/local/lib/libchimera.a(ch_compile.cpp.o):(.data.rel.local.DW.ref._ZTIN3ue212CompileErrorE[DW.ref._ZTIN3ue212CompileErrorE]+0x0): undefined reference to `typeinfo for ue2::CompileError'
/usr/bin/ld: /usr/local/lib/libchimera.a(ch_runtime.c.o): in function `multiCallback':
ch_runtime.c:(.text+0x6cc): undefined reference to `mmbit_maxlevel_direct_lut'
/usr/bin/ld: ch_runtime.c:(.text+0x6dc): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0x740): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0xa14): undefined reference to `mmbit_keyshift_lut'
/usr/bin/ld: ch_runtime.c:(.text+0xa25): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0xab2): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0xac4): undefined reference to `mmbit_maxlevel_direct_lut'
/usr/bin/ld: ch_runtime.c:(.text+0xb2a): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0xb62): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0xba3): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0xbdd): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0xc13): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: /usr/local/lib/libchimera.a(ch_runtime.c.o):ch_runtime.c:(.text+0xc44): more undefined references to `mmbit_root_offset_from_level' follow
/usr/bin/ld: /usr/local/lib/libchimera.a(ch_runtime.c.o): in function `multiCallback':
ch_runtime.c:(.text+0xc7c): undefined reference to `mmbit_keyshift_lut'
/usr/bin/ld: ch_runtime.c:(.text+0xc87): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0x1068): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0x10e6): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0x1138): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0x118a): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: /usr/local/lib/libchimera.a(ch_runtime.c.o):ch_runtime.c:(.text+0x11dc): more undefined references to `mmbit_root_offset_from_level' follow
/usr/bin/ld: /usr/local/lib/libchimera.a(ch_runtime.c.o): in function `ch_scan':
ch_runtime.c:(.text+0x1635): undefined reference to `mmbit_maxlevel_direct_lut'
/usr/bin/ld: ch_runtime.c:(.text+0x195b): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0x19ae): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0x1a02): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0x1a56): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: ch_runtime.c:(.text+0x1aaa): undefined reference to `mmbit_root_offset_from_level'
/usr/bin/ld: /usr/local/lib/libchimera.a(ch_runtime.c.o):ch_runtime.c:(.text+0x1af2): more undefined references to `mmbit_root_offset_from_level' follow
collect2: error: ld returned 1 exit status

I also tried the command below, and it failed too. It seems like it's not about the build flags?

PKG_CONFIG_PATH=/usr/local/lib/pkgconfig go build -tags chimera,hyperscan_v54  main.go
