owulveryck / onnx-go

onnx-go gives the ability to import a pre-trained neural network within Go without being linked to a framework or library.

Home Page: https://blog.owulveryck.info/2019/04/03/from-a-project-to-a-product-the-state-of-onnx-go.html

License: MIT License


onnx-go's Introduction


This is a Go Interface to Open Neural Network Exchange (ONNX).

Overview

onnx-go contains primitives to decode an ONNX binary model into a computation backend and use it like any other library in your Go code. For more information about ONNX, please visit onnx.ai.

The implementation of the ONNX spec is partial on the import side and non-existent for the export.

Vision statement

For the Go developer who needs to add a machine learning capability to his/her code, onnx-go is a package that facilitates the use of neural network models (software 2.0). Unlike other computation libraries, this package does not require special skills in data science.

Warning: the API is experimental and may change.

Disclaimer

This is a new version of the API.
The tweaked version of Gorgonia has been removed; the package is now compatible with the master branch of Gorgonia.
Some operators are not yet available though.

A utility has been added to run models from the zoo;
check the `examples` subdirectory.

Install

Install it via go get

go get github.com/owulveryck/onnx-go

onnx-go is compatible with go modules.

Example

These examples assume that you have a pre-trained model.onnx file available. You can download pre-trained models from the ONNX model zoo.

Very simple example

This example does nothing but decode the graph into a simple backend. You can then do whatever you want with the generated graph.

	// Create a backend receiver
	backend := simple.NewSimpleGraph()
	// Create a model and set the execution backend
	model := onnx.NewModel(backend)

	// Read the onnx model
	b, err := ioutil.ReadFile("model.onnx")
	if err != nil {
		log.Fatal(err)
	}
	// Decode it into the model
	err = model.UnmarshalBinary(b)
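
Once decoded, you can walk the resulting graph through gonum's graph interfaces, which every backend has to expose (see the Backend interface below). A minimal sketch, assuming the backend built above:

	// Walk the decoded graph with gonum's node iterator.
	it := backend.Nodes()
	for it.Next() {
		fmt.Println(it.Node().ID())
	}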

Simple example to run a pre-trained model

This example uses Gorgonia as a backend.

import "github.com/owulveryck/onnx-go/backend/x/gorgonnx"

At the present time, Gorgonia does not implement all of the ONNX operators. Therefore, most of the models from the model zoo will not work. Coverage will improve little by little as more operators are added to the backend.

You can find a list of tested examples and a coverage here.

func Example_gorgonia() {
	// Create a backend receiver
	backend := gorgonnx.NewGraph()
	// Create a model and set the execution backend
	model := onnx.NewModel(backend)

	// Read the onnx model
	b, err := ioutil.ReadFile("model.onnx")
	if err != nil {
		log.Fatal(err)
	}
	// Decode it into the model
	err = model.UnmarshalBinary(b)
	if err != nil {
		log.Fatal(err)
	}
	// Set the first input; the number of inputs depends on the model
	model.SetInput(0, input)
	err = backend.Run()
	if err != nil {
		log.Fatal(err)
	}
	// Check the error and get the output
	output, err := model.GetOutputTensors()
	if err != nil {
		log.Fatal(err)
	}
	// Write the first output to stdout
	fmt.Println(output[0])
}
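
The example references an input tensor that is not defined above. A minimal way to build one with Gorgonia's tensor package, assuming a 1x1x28x28 float32 input such as the one expected by the MNIST model:

import "gorgonia.org/tensor"

	input := tensor.New(tensor.WithShape(1, 1, 28, 28), tensor.Of(tensor.Float32))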

Model zoo

In the examples subdirectory, you will find a utility to run a model from the zoo, as well as a sample utility to analyze a picture with Tiny YOLO v2.

Internal

ONNX protobuf definition

The protobuf definition of ONNX is compiled into Go with the classic protoc tool. The definition can be found in the internal directory. The definition is not exposed, to avoid adding external dependencies to this repo. Indeed, the pb code may change to use a more efficient compiler such as gogo protobuf, and this change should be transparent to the users of this package.

Execution backend

In order to execute the neural network, you need a backend able to execute a computation graph (for more information on computation graphs, please read this blog post).

This picture represents the mechanism:

Schema

onnx-go does not provide any executable backend, but, for reference, a simple backend that builds an information graph is provided as an example (see the simple subpackage). Gorgonia is the main target backend of onnx-go.

Backend implementation

A backend is basically a weighted directed graph that can apply an Operation to its nodes. It should fulfill this interface:

type Backend interface {
	OperationCarrier
	graph.DirectedWeightedBuilder
}
type OperationCarrier interface {
	// ApplyOperation applies an operation on the graph nodes.
	// graph.Node is variadic because it allows handling multiple outputs;
	// for example, a split operation returns n nodes...
	ApplyOperation(Operation, ...graph.Node) error
}
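
As an illustration only (this is not the actual simple or gorgonnx code), a backend could satisfy this interface by embedding gonum's simple.WeightedDirectedGraph, which already implements graph.DirectedWeightedBuilder, and recording the operations. A complete backend would also need its nodes to implement the DataCarrier interface described below:

import (
	"github.com/owulveryck/onnx-go"
	"gonum.org/v1/gonum/graph"
	"gonum.org/v1/gonum/graph/simple"
)

// miniBackend records which operation is attached to which node.
type miniBackend struct {
	*simple.WeightedDirectedGraph
	ops map[int64]onnx.Operation
}

func newMiniBackend() *miniBackend {
	return &miniBackend{
		WeightedDirectedGraph: simple.NewWeightedDirectedGraph(0, 0),
		ops:                   make(map[int64]onnx.Operation),
	}
}

// ApplyOperation only records the operation; a real backend would build the
// corresponding computation here.
func (b *miniBackend) ApplyOperation(o onnx.Operation, ns ...graph.Node) error {
	for _, n := range ns {
		b.ops[n.ID()] = o
	}
	return nil
}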

An Operation is represented by its name and a map of attributes. For example, the Conv operator, as described in the ONNX spec, would be represented like this:

convOperator := Operation{
		Name: "Conv",
		Attributes: map[string]interface{}{
			"auto_pad":  "NOTSET",
			"dilations": []int64{1, 1},
			"group":     1,
			"pads":      []int64{1, 1},
			"strides":   []int64{1, 1},
		},
	}
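
Within ApplyOperation, a backend typically switches on the operation name and type-asserts the attributes it needs from the generic map. A fragment-style sketch (op is the Operation argument; the error message is illustrative):

	if op.Name == "Conv" {
		pads, ok := op.Attributes["pads"].([]int64)
		if !ok {
			return errors.New("conv: missing or malformed pads attribute")
		}
		// use pads to configure the convolution...
		_ = pads
	}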

Besides operators, a node can carry a value. Values are described as tensor.Tensor. To carry data, a node of the graph should fulfill this interface:

type DataCarrier interface {
	SetTensor(t tensor.Tensor) error
	GetTensor() tensor.Tensor
}
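
A minimal node type satisfying both gonum's graph.Node and DataCarrier could look like this (an illustrative sketch, not the actual implementation):

import "gorgonia.org/tensor"

// node is an illustrative graph node carrying a tensor.
type node struct {
	id int64
	t  tensor.Tensor
}

func (n *node) ID() int64                       { return n.id }
func (n *node) SetTensor(t tensor.Tensor) error { n.t = t; return nil }
func (n *node) GetTensor() tensor.Tensor        { return n.t }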

Backend testing

onnx-go provides some utilities to test a backend. Visit the testbackend package for more info.

Contributing

Contributions are welcome. A contribution guide will eventually be written. Meanwhile, you can raise an issue or send a PR. You can also contact me via Twitter or on the gophers' Slack (I am @owulveryck on both).

This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

Author

Olivier Wulveryck

License

MIT.

onnx-go's People

Contributors

an0rak-dev, arriven, bezineb5, blackrez, chaoyueziji, coip, diegobernardes, fodil-a, iver, linkerlin, loeffel-io, mattn, owulveryck, po3rin, rlespinasse, testwill, thazelart, vovapi


onnx-go's Issues

Implement operator `LinearClassifier` for backend `Gorgonia`

Why is this operator needed?

I get this error when trying to run a backend from a model created using the scikit-learn Python module (sklearn.linear_model.LogisticRegression).

onnx: operator LinearClassifier not implemented ()

Implementation

Link to existing material on the backend

N/A

Expected problems?

N/A

Tests

go test -run=ONNX/TestOperator

Gorgonia as a clean backend

For the POC, Gorgonia was heavily tweaked to fit gonum's GraphBuilder interface.

The tweaked version has been vendored as an example. This leads to a maintenance problem:
it is nearly impossible to add new features to the backend.

The Gorgonia team is working on a significant refactor of the API.
Meanwhile, I am considering rewriting the Gorgonia backend against the current API.

My idea is to use something similar to what I did with gorgonnx some time ago.

We can create a simple GorgoniaGraph in the backend fulfilling the interfaces:

  • gonum.WeightedGraphBuilder
  • OperationCarrier
  • and issuing nodes compatible with TensorCarrier

and then a method

func (g *GorgoniaGraph) ToExprGraph() (*gorgonia.ExprGraph, error)

That would turn the graph into a Gorgonia ExprGraph by walking the graph starting from the input and creating the nodes accordingly.

Example: assume an equation y = a * x + b, which would produce this graph:

y -> add
add -> mul
add -> b
mul -> a
mul -> x

properly encoded into a Weighted Graph.

The ToExprGraph would then walk the graph (still have to figure out how to efficiently walk the graph) and create:

g := gorgonia.NewGraph()
a := gorgonia.NodeFromAny(g, aT) // aT is already a tensor.Tensor
x := gorgonia.NodeFromAny(g, xT) // ...
// ...
m := gorgonia.Must(gorgonia.Mul(a, x))
add := gorgonia.Must(gorgonia.Add(m, b))

Do you get the drill?

What do you think?

[backend] [gorgonnx] init function of the operator is not used

The operator interface of the gorgonnx backend mentions an init function:

type operator interface {
        // apply analyzes the graph to find the children of the node,
        // then extracts their gorgonia.Node references
        // and assigns the result of the operation to the node n
        apply(g *Graph, n *Node) error
        // init initializes the operator with the name and attributes carried by the onnx.Operation
        init(o onnx.Operation) error
}

This function is not used at the present time; this can lead to a panic (test_structure.go:89: runtime error: invalid memory address or nil pointer dereference) and unexpected behavior.
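
A sketch of where such an init call could live, assuming a hypothetical constructor registry (the actual gorgonnx operators registry may be organized differently):

// hypothetical constructor registry, for illustration only
var constructors = map[string]func() operator{
	"Conv": func() operator { return &conv{} },
}

func newOperator(o onnx.Operation) (operator, error) {
	c, ok := constructors[o.Name]
	if !ok {
		return nil, &onnx.ErrNotImplemented{}
	}
	op := c()
	// calling init here, before apply is ever invoked, would avoid the
	// nil-pointer dereference described above when attributes are needed later
	if err := op.init(o); err != nil {
		return nil, err
	}
	return op, nil
}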

Implement operator Identity for backend Gorgonia/Gorgonnx

Why is this operator needed?

To run the fizzbuzz as generated by this code

Implementation

Link to existing material on the backend

N/A

Expected problems?

N/A

Tests

go test -run=ONNX/Identity

go test -run=ONNX/Identity -v
=== RUN   TestONNX
=== RUN   TestONNX/TestIdentity
--- PASS: TestONNX (0.01s)
    --- SKIP: TestONNX/TestIdentity (0.00s)
        test_structure.go:118: onnx: operator Identity not implemented ()
PASS
ok      github.com/owulveryck/onnx-go/backend/x/gorgonnx        0.040s

Gorgonia's evaluation of the MNIST model does not give expected result

This is related to the issue #2 I have with gorgonnx (the previous test implementation of onnx-to-gorgonia).

The problem is exactly the same with the new version of the unmarshaler (from the directed-graph branch).

To investigate, I will check every operator to see where the bug is hidden.

To do so, I have created a test file here.
This file contains the evaluated input and output of all the nodes that compose the MNIST model (from the ONNX model zoo).

The next task is to evaluate all the tests to see if the results are ok.

To help, each test function generates a "numpy"-compatible tensor for input and output.
For simple operators, that should be enough to run them within Python and compute the expected result.

Any help welcome.

HOWTO:

  • go-get this repository
  • checkout the directed-graph branch
  • cd into examples/gorgonia
  • go run mnist.go runs the (unsuccessful) test (Gorgonia has been vendored in this directory)
  • go test generates a numpy subdirectory with the test files.
  • find which operator is not ok

Remark: I did not export the attributes of the Convolution operator yet, but you can find their values in the internal/examples/mnist directory

Tests fail in alpine

Trying to test onnx-go with Alpine Go 1.11 (in a container), and this happens:

/go/src/github.com/owulveryck/onnx-go # ./go.test.sh 
# runtime/race
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::GetArgv()':
gotsan.cc:(.text+0x4183): undefined reference to `__libc_stack_end'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::ReExec()':
gotsan.cc:(.text+0x9797): undefined reference to `__libc_stack_end'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::InternalAlloc(unsigned long, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*, unsigned long)':
gotsan.cc:(.text+0xaac1): undefined reference to `__libc_malloc'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::InternalRealloc(void*, unsigned long, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*)':
gotsan.cc:(.text+0xca20): undefined reference to `__libc_realloc'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::InternalFree(void*, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*)':
gotsan.cc:(.text+0x66e8): undefined reference to `__libc_free'
collect2: error: ld returned 1 exit status
FAIL	github.com/owulveryck/onnx-go [build failed]

related to golang/go#14481

Support go modules

Is your feature request related to a problem? Please describe.

onnx-go should be compatible with Go modules. Currently, building with go.mod breaks (see #75).

Describe the solution you'd like

Note that although multi-module repositories (the topic of the discussion, and an "advanced" setup) are tricky, single-module repositories (the normal, mainstream case) should be quite straightforward and easy to maintain.

Essentially, all that needs to be done is:

go mod init
go mod tidy
And then during every PR, you use go mod tidy to ensure you're capturing any newly-added dependencies. (CI can check for it like so: spf13/viper#706.)

Happy to provide additional guidance if you'd like to discuss any part of adopting Go modules! :)

Additional context

To reproduce the issue.

main.go

package main

import (
        "fmt"

        "github.com/owulveryck/onnx-go/backend/x/gorgonnx"
)

func main() {
        backend := gorgonnx.NewGraph()
        fmt.Println(backend)
}
go mod init myproject.org/package
go build
go: finding github.com/owulveryck/onnx-go/backend/x/gorgonnx latest
go: finding github.com/owulveryck/onnx-go/backend/x latest
go: finding github.com/owulveryck/onnx-go/backend latest
go: finding github.com/owulveryck/onnx-go latest
go: finding github.com/golang/protobuf/proto latest
go: finding github.com/gogo/protobuf/proto latest
go: finding gonum.org/v1/gonum/graph/traverse latest
go: finding gonum.org/v1/gonum/graph/simple latest
go: finding gonum.org/v1/gonum/graph latest
go: finding gonum.org/v1/gonum latest
go: finding github.com/leesper/go_rng latest
go: finding github.com/awalterschulze/gographviz latest
# gorgonia.org/gorgonia
../go/pkg/mod/gorgonia.org/[email protected]/graph.go:569:2: cannot use e (type edge) as type graph.Edge in return argument:
	edge does not implement graph.Edge (missing ReversedEdge method)
../go/pkg/mod/gorgonia.org/[email protected]/node.go:437:16: n.shape.CalcStrides undefined (type tensor.Shape has no field or method CalcStrides)
../go/pkg/mod/gorgonia.org/[email protected]/node.go:756:15: cannot use e (type edge) as type graph.Edge in argument to n.g.SetEdge:
	edge does not implement graph.Edge (missing ReversedEdge method)
../go/pkg/mod/gorgonia.org/[email protected]/utils.go:147:28: undefined: tensor.InfChecker
../go/pkg/mod/gorgonia.org/[email protected]/utils.go:190:28: undefined: tensor.NaNChecker

Add BatchNormalization operator for the Gorgonia Backend

Is your feature request related to a problem? Please describe.

Batch Normalization is a very common operator when using a neural network for image classification.
Most pre-trained models use it.

The implementation has been initiated in the branch batchnorm

Describe the solution you'd like
The BatchNorm operator should be able to pass the tests described in the ONNX test backend:

go test -run=ONNX/TestBatchnorm

Warning: batchnorm's scale and bias may not have the same shape as the input; therefore, the operator should use "transparent" broadcasting.

Describe alternatives you've considered
N/A

Additional context

Make onnx-go compatible with TinyGo

This issue will track the work done to make the package compatible with TinyGo.

The computation backend is, for now, outside the scope of this experiment.

Once the package works with TinyGo, we may write a very simple backend to use it, for example, in WebAssembly.

Gorgonia backend does not handle model with several root nodes

Is your feature request related to a problem? Please describe.

I'm trying to run SSD using onnx-go, but an error occurred.

https://gist.github.com/mattn/f5aa1b96753c76075f2666235919a8f3

2019/05/09 11:09:55 onnx: operator  not implemented ()

Describe the solution you'd like

As far as I can tell from the code in backend/x/gorgonnx, the error seems to occur here:

if len(g.roots) != 1 {
	return &onnx.ErrNotImplemented{}
}

Describe alternatives you've considered

Sorry, I can't figure out what should be done here.

Implement operator Softmax for backend Gorgonia/Gorgonnx

Why is this operator needed?

This operator is needed at least to run the Inception v1 model.

Implementation

Link to existing material on the backend

Expected problems?

  • Two versions of the operator exist in Gorgonia; we should decide whether we need the stable or the non-stable version.
  • The Softmax operator of ONNX carries one attribute (the axis for the softmax); this attribute does not exist in Gorgonia, so the full implementation of the operator may require tweaking Gorgonia (see the sketch below).
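
One way the axis attribute could be emulated with existing Gorgonia primitives, following the ONNX definition that coerces the input to a 2D matrix at the given axis, is sketched below (this is not the actual gorgonnx code):

import (
	"gorgonia.org/gorgonia"
	"gorgonia.org/tensor"
)

// softmaxAxis flattens the input to 2D at axis, applies Gorgonia's softmax,
// and restores the original shape.
func softmaxAxis(n *gorgonia.Node, axis int) (*gorgonia.Node, error) {
	shape := n.Shape()
	outer, inner := 1, 1
	for i, d := range shape {
		if i < axis {
			outer *= d
		} else {
			inner *= d
		}
	}
	flat, err := gorgonia.Reshape(n, tensor.Shape{outer, inner})
	if err != nil {
		return nil, err
	}
	sm, err := gorgonia.SoftMax(flat)
	if err != nil {
		return nil, err
	}
	// reshape back so the output shape matches the input shape
	return gorgonia.Reshape(sm, shape)
}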

Tests

go test -run=ONNX/TestSoftmax

Is it possible to make onnx-go run distributedly?

Is your feature request related to a problem? Please describe.
I am a graduate student who needs a graduation project. I am thinking about whether it is necessary and feasible to make the program run distributedly.

Describe the solution you'd like
Possibly with the help of github.com/chrislusf/gleam ?

Gorgonnx backend panic if a node is used as input of several other nodes

I am trying to run the resnetv1 model with the model zoo executor:

✗  cd $GOPATH/src/github.com/owulveryck/onnx-go/examples/model_zoo_executor
✗  MODELDIR=./models/resnet18v1 go test

This leads to a panic:

        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x70 pc=0x1622aed]

goroutine 5 [running]:
testing.tRunner.func1(0xc000220e00)
        /Users/olivier.wulveryck/go/src/testing/testing.go:830 +0x392
panic(0x1900400, 0x20ca9f0)
        /Users/olivier.wulveryck/go/src/runtime/panic.go:522 +0x1b5
gorgonia.org/gorgonia.ApplyOp(0x1ad1320, 0xc0000a4180, 0xc000010058, 0x1, 0x1, 0x0, 0x0, 0x0)
        /Users/olivier.wulveryck/GOPROJECTS/src/gorgonia.org/gorgonia/op.go:173 +0x6d
gorgonia.org/gorgonia.Im2Col(0x0, 0xc000027900, 0x2, 0x2, 0xc000027910, 0x2, 0x2, 0xc000027920, 0x2, 0x2, ...)
        /Users/olivier.wulveryck/GOPROJECTS/src/gorgonia.org/gorgonia/nn.go:218 +0x425
gorgonia.org/gorgonia.Conv2d(0x0, 0xc0055388f0, 0xc000027900, 0x2, 0x2, 0xc000027910, 0x2, 0x2, 0xc000027920, 0x2, ...)
        /Users/olivier.wulveryck/GOPROJECTS/src/gorgonia.org/gorgonia/nn.go:259 +0x33f
gorgonia.org/gorgonia/ops/nn.Conv2d(...)
        /Users/olivier.wulveryck/GOPROJECTS/src/gorgonia.org/gorgonia/ops/nn/api_nocuda.go:11
github.com/owulveryck/onnx-go/backend/x/gorgonnx.(*conv).apply(0xc0001aec00, 0xc0001c0570, 0xc0000c8480, 0xc0001a33b0, 0x0)
        /Users/olivier.wulveryck/GOPROJECTS/src/github.com/owulveryck/onnx-go/backend/x/gorgonnx/conv.go:69 +0x51c
github.com/owulveryck/onnx-go/backend/x/gorgonnx.(*Graph).applyOperation(0xc0001c0570, 0xc0000c8480, 0x6c, 0xc006b3d1a8)
        /Users/olivier.wulveryck/GOPROJECTS/src/github.com/owulveryck/onnx-go/backend/x/gorgonnx/graph_walk.go:71 +0x1a1
github.com/owulveryck/onnx-go/backend/x/gorgonnx.(*Graph).walk(0xc0001c0570, 0x67, 0x1ac9b20, 0xc00014cf40)
        /Users/olivier.wulveryck/GOPROJECTS/src/github.com/owulveryck/onnx-go/backend/x/gorgonnx/graph_walk.go:42 +0x23e
github.com/owulveryck/onnx-go/backend/x/gorgonnx.(*Graph).PopulateExprgraph(0xc0001c0570, 0xc0d7adf4f4, 0xc000070000)
        /Users/olivier.wulveryck/GOPROJECTS/src/github.com/owulveryck/onnx-go/backend/x/gorgonnx/graph.go:75 +0x319
github.com/owulveryck/onnx-go/backend/x/gorgonnx.(*Graph).Run(0xc0001c0570, 0xc000048f70, 0x104ec38)
        /Users/olivier.wulveryck/GOPROJECTS/src/github.com/owulveryck/onnx-go/backend/x/gorgonnx/graph.go:33 +0x211
github.com/owulveryck/onnx-go/examples/model_zoo_executor.testRun(0x1ad3960, 0xc000220e00, 0x1ad2c20, 0xc0001c0570)
        /Users/olivier.wulveryck/GOPROJECTS/src/github.com/owulveryck/onnx-go/examples/model_zoo_executor/main_test.go:70 +0x35
github.com/owulveryck/onnx-go/examples/model_zoo_executor.TestModel.func3(0xc000220e00)
        /Users/olivier.wulveryck/GOPROJECTS/src/github.com/owulveryck/onnx-go/examples/model_zoo_executor/main_test.go:86 +0x4c
testing.tRunner(0xc000220e00, 0xc0001c2ed0)
        /Users/olivier.wulveryck/go/src/testing/testing.go:865 +0xc0
created by testing.(*T).Run
        /Users/olivier.wulveryck/go/src/testing/testing.go:916 +0x35a
exit status 2
FAIL    github.com/owulveryck/onnx-go/examples/model_zoo_executor       0.223s

The convolution operator has a nil node.

A very quick investigation shows that the input of the convolution is the output of the maxpool operator, and this output feeds two other nodes.
I think that this case is not properly handled by gorgonnx:

Screenshot 2019-06-27 at 16 45 48

Use semver-compatible tags for releases

Hi @owulveryck, thank you for the project, it looks awesome.

A minor note from the go mod specification: in order to correctly use this module through go.mod, you have to tag releases in semver format, so instead of v0.3, use v0.3.0.

For example, the pre-release of tiny-YOLO-v2 could be tagged v0.3.0-rc.1.

Again, thank you a lot!

WeightedDirectedGraph doesn't support multiple edges from one node to another

Context
I have a graph which contains the following "Sub" operation:
image
=> It subtracts a value from itself.
(Why is it like that? It's the conversion of TensorFlow's Maximum/Minimum in ONNX 7, which is translated like that in order to support broadcast; however, I would expect the behaviour to be the same on a square X*X operation.)

Problem
When running the model, I get the following error:
bad arity for operation (have 1, want 2)

Cause
When looking at the implementation of WeightedDirectedGraph.SetWeightedEdge, it seems that it is not possible to have two identical edges from one node to another.

Solution
It's not clear how to fix that without big changes. However, it means that this can happen in other cases and prevent the correct execution of many ONNX models.

Any idea on a fix?

IR used by onnx-go is several versions behind onnx

The current version of IR supported by onnx-go is 3, as defined here:
https://github.com/owulveryck/onnx-go/blob/master/internal/onnx/ir/onnx.proto3.pb.go#L50

But onnx is at version 6, as defined here:
https://github.com/onnx/onnx/blob/master/onnx/onnx.proto#L93

This is causing an issue when trying to import a model saved using the sklearn Python module. I get the error:

Unknown input type: UNDEFINED

Changing the opset to make it backward compatible, as explained here does not seem to help:
https://github.com/onnx/onnx/blob/master/docs/Versioning.md#released-versions

Are there any plans to upgrade the IR version supported by onnx-go?

unit testing, simple backend and Operation

I want to write unit tests for onnx-go. Something like:

onnx-go/decoder_test.go

Lines 12 to 31 in 47dce1f

type testGraph struct {
	onnx     *pb.ModelProto
	expected *simple.Graph
	err      error
}

var tests = []testGraph{}

func TestDecodeProto(t *testing.T) {
	m := NewModel(simple.NewSimpleGraph())
	for _, test := range tests {
		err := m.decodeProto(test.onnx)
		assert.Equal(t, test.err, err)
		graphEqual(t, test.expected, m.backend)
	}
}

func graphEqual(t *testing.T, src, dst onnx.Backend) {
	// TODO compare the graphs
}

But this fails because of an import cycle:

# github.com/owulveryck/onnx-go
import cycle not allowed in the test
package github.com/owulveryck/onnx-go (test)
        imports github.com/owulveryck/onnx-go
FAIL    github.com/owulveryck/onnx-go [setup failed]

This is because of the Operation definition:

onnx-go/backend.go

Lines 13 to 17 in 47dce1f

// Operation defined by its name and its attribute
type Operation struct {
	Name       string
	Attributes map[string]interface{}
}

which is used in the simple backend implementation:

func (g *Graph) ApplyOperation(o onnx.Operation, n graph.Node) error {

I see three possibilities to fix this:

  • create another package to hold the Operation definition (the best option, but it's a package with a single definition);
  • copy the simple graph internally to the root inside a simple_backend_test.go file, but that duplicates the code;
  • use onnx_test, but I'd need to export the decodeProto method, and I do not want to expose it.

WDYT? Any other idea?

cc @blackrez

Add automatic operator coverage generation in the CI

onnx-go contains a package used to generate a report of the supported operators (see testreport's godoc for more info).

The report for the gorgonia/gorgonnx backend can be generated by running
ONNX_COVERAGE=/tmp/report.md go test from the backend/x/gorgonnx subdirectory.

The maintainers usually generate a coverage file on each PR (https://github.com/owulveryck/onnx-go/blob/master/backend/x/gorgonnx/ONNX_COVERAGE.md) but, as it is a manual operation, we cannot rely on it.

It would be good to have this file generated by the CI when a PR against master is done.

Broadcasting is consuming a lot of memory in Gorgonnx/Gorgonia

Bench

I've created this simple benchmark with the MNIST model to analyze the behavior of the code:

package onnx_test

import (
        "testing"

        "github.com/owulveryck/onnx-go"
        "github.com/owulveryck/onnx-go/backend/x/gorgonnx"
        "github.com/owulveryck/onnx-go/internal/examples/mnist"
        "gorgonia.org/tensor"
)

func BenchmarkUnmarshalBinary(b *testing.B) {
        input := tensor.New(tensor.WithShape(1, 1, 28, 28), tensor.Of(tensor.Float32))
        for n := 0; n < b.N; n++ {
                // Create a backend receiver
                backend := gorgonnx.NewGraph()
                // Create a model and set the execution backend
                model := onnx.NewModel(backend)

                // Decode it into the model
                err := model.UnmarshalBinary(mnist.GetMnist())
                if err != nil {
                        b.Fatal(err)
                }
                // Set the first input, the number depends of the model
                model.SetInput(0, input)
                err = backend.Run()
                if err != nil {
                        b.Fatal(err)
                }
        }
}

Running this with go test -bench=. -benchmem -memprofile memprofile.out -cpuprofile profile.out -benchtime=10s generates two files to decode with the go profiler.

CPU

The result for the CPU is displayed here:
mnist cpu flamegraph

There are possible enhancements, but nothing obvious.

Memory

The result for the memory usage is more interesting. It shows that the repeatOp of Gorgonia is using a lot of memory. The repeatOp is the foundation of broadcasting.

Screenshot 2019-05-28 at 11 15 46

This op seems to copy a lot of data:


gorgonia.Tensor

The analysis points out that this function from the tensor package is involved in the extra memory consumption:

https://github.com/gorgonia/tensor/blob/8eeece33868236224d51e7362e36a68642870bd2/array.go#L34-L51

Especially this call to val.Interface()

	return array{
		Header: hdr,
		t:      t,
		v:      val.Interface(),
	}

According to the comment, this field is not even required by the array:

// array is the underlying generic array.
type array struct {
	storage.Header             // the header - the Go representation (a slice)
	t              Dtype       // the element type
	v              interface{} // an additional reference to the underlying slice. This is not strictly necessary, but does improve upon anything that calls .Data()
}

On top of that, the reflect package from the stdlib has a TODO about something to enhance in the packEface function (packEface converts v to the empty interface):

		if v.flag&flagAddr != 0 {
			// TODO: pass safe boolean from valueInterface so
			// we don't need to copy if safe==true?
			c := unsafe_New(t)
			typedmemmove(t, c, ptr)
			ptr = c
		}

The safe flag is true when calling Interface() function:

// Interface returns v's current value as an interface{}.
// It is equivalent to:
//	var i interface{} = (v's underlying value)
// It panics if the Value was obtained by accessing
// unexported struct fields.
func (v Value) Interface() (i interface{}) {
	return valueInterface(v, true)
}

This suggests that avoiding the copy would significantly improve performance.

cc @chewxy

Benchmark other onnx runtimes

Is your feature request related to a problem? Please describe.
Would be nice to know the runtime performance of onnx-go compared with the other available runtimes. Here is a list of some runtimes: https://onnx.ai/supported-tools

Describe the solution you'd like
It doesn't need to be anything fancy; maybe a wiki page with some models compared across the other implementations.

For some solutions, mainly at the edge, performance is really a key factor; this could be a real advantage for onnx-go because of Go's speed and concurrency. It will also highlight the parts where we need to dedicate more effort to make it faster.

sporadic failures of the tests within the gorgonnx package

this code:

package gorgonnx

import (
        "testing"

        "github.com/owulveryck/onnx-go/backend/testbackend"
        _ "github.com/owulveryck/onnx-go/backend/testbackend/onnx"
)

// TestONNX runs the onnx backend tests against all registered operators
func TestONNX(t *testing.T) {
        for optype := range operators {
                for _, tc := range testbackend.GetOpTypeTests(optype) {
                        tc := tc // capture range variable
                        t.Run(tc().GetInfo(), tc().RunTest(NewGraph(), true))
                }
        }
}

executed with go test -race within the gorgonnx directory of the gorgonia branch leads to a race condition (commit 3bb7f2b).

Without the true parameter of the RunTest method, no race is detected, but a weird behavior happens: now and then, one of the tests randomly fails. I suspect this is related to the race condition.

This causes the CI to fail most of the time (which is a good thing though).

[BUG] An initializer can be declared without previous reference in input/output/value

Trying to execute the YOLO v3 model raises this error:

model_zoo_executor git:(master) ✗  MODELDIR=models/yolov3 go test -failfast | more
--- FAIL: TestModel (1.37s)
    --- FAIL: TestModel/Unmarshal (1.37s)
        main_test.go:56: invalid model: initializer has not been defined in input, output or value
FAIL
exit status 1
FAIL    github.com/owulveryck/onnx-go/examples/model_zoo_executor       2.306s

This is triggered by

onnx-go/decoder.go

Lines 137 to 145 in 1681b26

for _, tensorProto := range model.Graph.GetInitializer() {
	name := tensorProto.GetName()
	if name == "" {
		return errors.New("initializer should have a name")
	}
	n, ok := m.dbByName[name]
	if !ok {
		return errors.New("invalid model: initializer has not been defined in input, output or value")
	}

This error is triggered if the initializer (referenced by its name) has not been defined previously.

However, a recent change in the IR doc of onnx now specifies:

When an initializer has the same name as a graph input, it specifies a default value for that input. When an initializer has a name different from all graph inputs, it specifies a constant value.

Hence, the onnx-go parser must be fixed to create a new node in the graph if the initializer has not been defined before.
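
A sketch of the proposed behavior (names and types are illustrative, not the actual decoder internals): reuse the node when the initializer name matches a known input, and create a new node otherwise:

import (
	"github.com/owulveryck/onnx-go"
	"gonum.org/v1/gonum/graph"
)

// addInitializerNode returns the node an initializer should be attached to:
// an existing graph input when the name is known (default value), or a new
// node carrying a constant otherwise.
func addInitializerNode(backend onnx.Backend, dbByName map[string]graph.Node, name string) graph.Node {
	if n, ok := dbByName[name]; ok {
		return n
	}
	n := backend.NewNode()
	backend.AddNode(n)
	dbByName[name] = n
	return n
}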

Note: for now, onnx-go only handles two types of nodes:

  • Operator
  • tensor
    We may use a tensor to represent the initializer, but it will silently discard the fact that this node is constant. I propose to apply the YAGNI principle and postpone the decision to mark the node as a constant.

Fix build at Raspberry PI

Is your feature request related to a problem? Please describe.
I'm having some errors during the build of the YOLO example on a Raspberry Pi.

Describe the solution you'd like
The solution is quite easy, and most of the problems are related to Gorgonia. I've already opened an issue: gorgonia/gorgonia#311

The only thing that I had to do in onnx-go was to update this dependency: replace github.com/chewxy/math32 => github.com/chewxy/math32 v1.0.1.

Performance
I didn't do any profiling yet, but the performance is really slow on the Pi. Is it possible to use the Pi's GPU to speed things up? On my Mac, the YOLO example takes something like 2s, and on the Pi 50s.

I did some tests with some other projects like https://github.com/shizukachan/darknet-nnpack and got something like 500ms, 100x faster.

Sporadic error in the reshape operation

From time to time, the CI fails with an error:

--- FAIL: TestReshape_Scalar (0.00s)
##[error]    reshape_test.go:43: Cannot reshape, bad output shape []float32{0, 1, 2, 3, 10000, 10001, 10002, 10003}
FAIL
FAIL	github.com/owulveryck/onnx-go/backend/x/gorgonnx	0.108s

This is to be investigated.

Poor performances on ARM64

Is your feature request related to a problem? Please describe.
Onnx-go is quite slow on ARM64.

Describe the solution you'd like
Add hardware acceleration to the different libraries onnx-go depends on (Gorgonia and gonum).

Describe alternatives you've considered
Use OpenBLAS.

Additional context
I added the profile files from a run on AWS A1.

profile001

Implement operator Gemm for backend Gorgonia/Gorgonnx

Why is this operator needed?

To run inception v1 (at least)

Implementation

Link to existing material on the backend

Expected problems?

Mul is highly overloaded in Gorgonia. The behavior of the operator may not match the expectations of Gemm; it may end up as a kind of reimplementation based on HadamardProd, Mul and a couple of switch cases.

Tests

go test -run=ONNX/TestGemm

Installation error

When trying to install this package in a Go-modules-enabled project (golang:latest image), I got the following error:

go: extracting github.com/tensorflow/tensorflow v1.13.1
# github.com/owulveryck/onnx-go/internal/pb-onnx
/go/pkg/mod/github.com/owulveryck/[email protected]/internal/pb-onnx/onnx.proto3.pb.go:22:11: undefined: proto.ProtoPackageIsVersion3
# gorgonia.org/gorgonia
/go/pkg/mod/gorgonia.org/[email protected]/graph.go:569:2: cannot use e (type edge) as type graph.Edge in return argument:
        edge does not implement graph.Edge (missing ReversedEdge method)
/go/pkg/mod/gorgonia.org/[email protected]/node.go:437:16: n.shape.CalcStrides undefined (type tensor.Shape has no field or method CalcStrides)
/go/pkg/mod/gorgonia.org/[email protected]/node.go:756:15: cannot use e (type edge) as type graph.Edge in argument to n.g.SetEdge:
        edge does not implement graph.Edge (missing ReversedEdge method)
/go/pkg/mod/gorgonia.org/[email protected]/utils.go:147:28: undefined: tensor.InfChecker
/go/pkg/mod/gorgonia.org/[email protected]/utils.go:190:28: undefined: tensor.NaNChecker

Add broadcast with a scalar to the Gorgonia backend (gorgonnx)

Is your feature request related to a problem? Please describe.
When running the Emotion FER+ model with the Gorgonnx backend, the decoder fails with:

onnx: operator  not implemented (broadcast not yet implemented for shape (1, 1, 64, 64), ())

Describe the solution you'd like
This is due to a binary operation that tries to perform a broadcast with a scalar.
The broadcast.go file must be adapted to handle tensors with len(t.Shape()) == 0.

Describe alternatives you've considered
N/A

Additional context
To reproduce the test, do:

$ curl -s https://onnxzoo.blob.core.windows.net/models/opset_8/emotion_ferplus/emotion_ferplus.tar.gz | tar -C /tmp -xzf -
$ export MODELDIR=/tmp/emotion_ferplus/
$ go run examples/model_zoo_executor/main.go -model $MODELDIR/model.onnx -input $MODELDIR/test_data_set_0/input_0.pb -output $MODELDIR/test_data_set_0/output_0.pb

Softmax is broken in Gorgonnx / Bad output shape

A new test has been implemented by commit 41722dc.

The softmax operator does not pass this test; therefore it is buggy:

➜ gorgonnx git:(new-tests) ✗  go test -run ONNX/TestSoftmax
--- FAIL: TestONNX (0.01s)
    --- FAIL: TestONNX/TestSoftmaxAxis1 (0.00s)
        test_structure.go:136: the two tensors doesn't have the same dimension, expected (3, 4, 5), got (3, 20)
    --- FAIL: TestONNX/TestSoftmaxAxis2 (0.00s)
        test_structure.go:136: the two tensors doesn't have the same dimension, expected (3, 4, 5), got (12, 5)
    --- FAIL: TestONNX/TestSoftmaxDefaultAxis (0.00s)
        test_structure.go:136: the two tensors doesn't have the same dimension, expected (3, 4, 5), got (3, 20)
    --- FAIL: TestONNX/TestSoftmaxAxis0 (0.00s)
        test_structure.go:136: the two tensors doesn't have the same dimension, expected (3, 4, 5), got (1, 60)
FAIL
exit status 1
FAIL    github.com/owulveryck/onnx-go/backend/x/gorgonnx        0.036s

Work in progress

About

This is just the interface between the ONNX structures/files and the Go ecosystem.
There is also an ongoing effort to implement a backend in Gorgonia, which is a computation library for Go.

So far, the TensorProto import is partially implemented in the tensor lib, but I am waiting for more progress before asking for a PR and a merge to the master branch.

Regarding the Model and the Graph structures, I have started a POC which is quick'n'dirty for now (if you are interested, the code is here). My goal is to be able to run the MNIST example. So far I have generated an ExprGraph and I can run it, but the result is wrong. I am doing some bug hunting.

Next

Once the POC is working, I will open some PRs and start a complete integration process of ONNX into Gorgonia (it may need some tooling and enhanced testing).
Meanwhile, if you have any idea for enhancing the onnx-go repo, please feel free to open an issue or a PR.

cc @jspisak @prasanthpul @lupesko @bddppq

Go modules are not up to date

Is your feature request related to a problem? Please describe.
When I run ./go.test.sh locally, the go.sum and go.mod files are updated because of the missing package sanity-io/litter added by PR #132.

Describe the solution you'd like
Update the go.sum and go.mod files.

Evaluate the opportunity to replace gonum.DirectedWeightedBuilder in the top level Model

The Model is an encapsulation of gonum.DirectedWeightedBuilder's interface.

This gives great flexibility for implementing a backend.
For example, you can create a very simple backend (see the example in "simple") simply to display the graph when you do not actually need to compute the result.
For example, I am using this to create a dot representation of the onnx graph.

If the backend needs more features, it can fulfill this interface, and the corresponding method is applied at runtime:

type OperationCarrier interface {
        ApplyOperation(Operation, graph.Node) error
}

The problem is that a type assertion is made at runtime, and this process is slow.
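
For illustration, the runtime check presumably looks like this (a hedged sketch, not the actual decoder code):

	// backend is the embedded gonum.DirectedWeightedBuilder
	if oc, ok := backend.(OperationCarrier); ok {
		if err := oc.ApplyOperation(op, n); err != nil {
			return err
		}
	}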

Proposal: I am considering replacing the embedded gonum.DirectedWeightedBuilder with a Graph interface that would look like:

type Graph interface {
       gonum.DirectedWeightedBuilder
       OperationCarrier
}

This would break the API, which is, for now, not a problem.
What do you think?

Handle operators with no input (eg Constant)

Is your feature request related to a problem? Please describe.
Issue #116 raises an error because some operators, such as Constant, do not have any input.

Describe the solution you'd like
While decoding the graph in the core package, if an operator does not have any input, create a dummy node that will hold the output of the current operator (same type, same shape).
It is the responsibility of the backend to reuse the data of the node to spare some memory.

Describe alternatives you've considered
For the constant operator, the operator could be turned into an initializer; but the problem would remain for other operators such as RandomUniform and RandomNormal

Additional context
cf Issue 2274 in the core onnx project.

Add describe info

Is your feature request related to a problem? Please describe.
N/A

Describe the solution you'd like
It could help to have functions to describe the expected shapes of the input and output.
For example, in the case of image classification, the input's shape is related to the size of the picture. This would make it possible to pre-process the picture without hardcoding its size.

Getting the same info for the output can help to decide (in the case of a classification algorithm).
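
A hypothetical API sketch of what such describe functions could look like (this interface does not exist in onnx-go):

import "gorgonia.org/tensor"

// Descriptor is a hypothetical interface exposing the expected I/O shapes.
type Descriptor interface {
	// InputShapes returns the expected shape of each input tensor.
	InputShapes() []tensor.Shape
	// OutputShapes returns the shape of each output tensor.
	OutputShapes() []tensor.Shape
}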

Describe alternatives you've considered
Finding them manually.

Additional context
N/A

Analyse (and enhance) performances on multiple predictions

I have a demo model with 39 inputs. It takes 0.5s to predict 10,000 samples using Keras; with onnx-go it takes 5s.

	for i := 0; i < 10000; i++ {
		model.SetInput(0, input)
		err = backend.Run()
		model.GetOutputTensors()
	}

Am I making a mistake here?

[TinyYolo v2] Bug in maxpool?

This commit allows the Tiny YOLO v2 model to be compiled and executed with Gorgonia.

Sadly the execution does not give the expected result:

➜  model_zoo_executor git:(tiny-yolov2) ✗  export MODELDIR=~/Documents/tiny_yolov2
➜  model_zoo_executor git:(tiny-yolov2) ✗ go run main.go -model $MODELDIR/model.onnx -input $MODELDIR/test_data_set_0/input_0.pb -output $MODELDIR/test_data_set_0/output_0.pb

        Error Trace:    main.go:72
                                                proc.go:200
                                                asm_amd64.s:1337
        Error:          Max difference between -0.17929432 and 0.056231752 allowed is 0.005, but difference was -0.23552606999874115
        Messages:       the two tensors should be equal.
exit status 1

According to this blog post the architecture should be:

Layer         kernel  stride  output shape
---------------------------------------------
Input                          (416, 416, 3)
Convolution    3×3      1      (416, 416, 16)
MaxPooling     2×2      2      (208, 208, 16)
Convolution    3×3      1      (208, 208, 32)
MaxPooling     2×2      2      (104, 104, 32)
Convolution    3×3      1      (104, 104, 64)
MaxPooling     2×2      2      (52, 52, 64)
Convolution    3×3      1      (52, 52, 128)
MaxPooling     2×2      2      (26, 26, 128)
Convolution    3×3      1      (26, 26, 256)
MaxPooling     2×2      2      (13, 13, 256)
Convolution    3×3      1      (13, 13, 512)
MaxPooling     2×2      1      (13, 13, 512)
Convolution    3×3      1      (13, 13, 1024)
Convolution    3×3      1      (13, 13, 1024)
Convolution    1×1      1      (13, 13, 125)
---------------------------------------------

After setting some logs, the architecture of the decoded network is:

+Convolution             (3, 3)          [1 1]           (1, 16, 416, 416)
+MaxPooling              (2, 2)          [2 2]           (1, 16, 208, 208)
+Convolution             (3, 3)          [1 1]           (1, 32, 208, 208)
+MaxPooling              (2, 2)          [2 2]           (1, 32, 104, 104)
+Convolution             (3, 3)          [1 1]           (1, 64, 104, 104)
+MaxPooling              (2, 2)          [2 2]           (1, 64, 52, 52)
+Convolution             (3, 3)          [1 1]           (1, 128, 52, 52)
+MaxPooling              (2, 2)          [2 2]           (1, 128, 26, 26)
+Convolution             (3, 3)          [1 1]           (1, 256, 26, 26)
+MaxPooling              (2, 2)          [2 2]           (1, 256, 13, 13)
+Convolution             (3, 3)          [1 1]           (1, 512, 13, 13)
-MaxPooling              (2, 2)          [1 1]           (1, 512, 14, 14)
-Convolution             (3, 3)          [1 1]           (1, 1024, 14, 14)
-Convolution             (3, 3)          [1 1]           (1, 1024, 14, 14)
-Convolution             (1, 1)          [1 1]           (1, 125, 14, 14)

The last layer using the Maxpool operator does not give the correct output size.
The padding used is computed from the auto_pad argument but seems ok (padding is [1,1]).

It requires more investigation; maybe a bug in Gorgonia.

Note : the computation is slow, but Make it work, then Make it fast

cc @chewxy

Implement SAME_UPPER auto padding for Maxpool in Gorgonnx

The auto_pad parameter is deprecated but still used by some models.

This is mandatory for running Tiny YOLO v2;
a naive implementation has been made but does not work (see issue #73).

The goal of this issue is to track the implementation of the SAME_UPPER auto-padding.

This issue can be closed when these tests pass:

go test -run="ONNX/TestMaxpool.*Upper" -v
=== RUN   TestONNX
=== RUN   TestONNX/TestMaxpool2dPrecomputedSameUpper
=== RUN   TestONNX/TestMaxpool2dSameUpper
--- FAIL: TestONNX (0.01s)
    --- FAIL: TestONNX/TestMaxpool2dPrecomputedSameUpper (0.00s)
        test_structure.go:78:
                Error Trace:    test_structure.go:135
                Error:          Max difference between 1 and 7 allowed is 1e-06, but difference was -6
                Messages:       the two tensors should be equal.
    --- FAIL: TestONNX/TestMaxpool2dSameUpper (0.00s)
        test_structure.go:78:
                Error Trace:    test_structure.go:135
                Error:          Max difference between 1.7640524 and 0.978738 allowed is 1e-06, but difference was 0.7853143811225891
                Messages:       the two tensors should be equal.
FAIL
exit status 1
FAIL    github.com/owulveryck/onnx-go/backend/x/gorgonnx        0.028s

WIP in branch maxpool-sameupper

How to regenerate the onnx tests?

Is your feature request related to a problem? Please describe.
I've updated the test template and now I need to regenerate the tests.

Describe the solution you'd like
It would be nice to have a command like make gen-tests or something like go generate ./... that could run those commands.
