Yes, great to have this discussion here, thanks for opening.
I agree completely about the black-boxy-ness of `@net`, and I'd like to rework the docs to present things in a simpler way. Really, the core feature is being able to write an annotated definition like `@net f(xs) = xs .* xs` and have it run on a backend like TensorFlow. This process is necessarily somewhat magical, but it has a single, conceptually simple purpose, while the rest of Flux (training etc.) should be very boring Julia code.
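To make the "magic" less abstract, here is a minimal toy sketch of the tracing idea: overload broadcasting on a symbolic node type so that running an ordinary Julia definition records a graph a backend could then compile. This is not Flux's actual implementation; the `Node` type and `:mul`/`:input` tags are invented for illustration.

```julia
# Hypothetical sketch only — Flux's real @net machinery works differently.
struct Node
    op::Symbol
    args::Tuple
end

# Broadcasted multiply on symbolic inputs builds a graph node
# instead of computing a value.
Base.broadcasted(::typeof(*), a::Node, b::Node) = Node(:mul, (a, b))

f(xs) = xs .* xs            # the user writes plain Julia

xs = Node(:input, ())
graph = f(xs)               # "running" f records a :mul node over xs
```

Running `f` on a `Node` yields `Node(:mul, (xs, xs))` rather than a number, which is the kind of dataflow graph a TensorFlow- or MXNet-style backend consumes.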
Currently `@net type ...` adds some orthogonal sugar on top of that. Some of it is pretty superfluous, like `<: Flux.Model`. The right approach is probably to explain the basics without it, then show where it adds extra convenience.
Re backends: the code-eval stuff is really only a lazy-loading technique; if we simply included the backend files it would behave the same, but it would force a dependency on both backend libraries, which is unreasonable for the user. But it's not a great long-term solution; better might be to have backends be separate packages that are explicitly loaded, which would obviate all the eval stuff.
That aside, `compile(::MXNet, x)` is the same as `compilemx(x)` (which is equivalent to the current situation), unless you want to implement `compile(::Any, x::Foo)`. I can't think of a use case for this and suspect the lazy loading is really the core issue here, but I'm happy to be corrected. Generally, if folks want to implement other backends or anything related, I'm more than happy to provide examples or fix things up as needed.
> is there an easy way to reuse outputs more than once outside of the `@net` syntax?

Can you elaborate on this question?
from flux.jl.
> The right approach is probably to explain the basics without it, then show where it adds extra convenience.
I think that would help clarify things a lot. If I have time I'll try to work through some of the examples in the docs without `@net` to help compare the two options.
> But it's not a great long-term solution; better might be to have backends be separate packages that are explicitly loaded, which would obviate all the eval stuff.
Yeah, I've also had issues with wanting lazy loading in Julia. I know packages like Extern.jl aim to address this in a more general way by placing your code in a macro block without changing it, but it's essentially doing the same thing.
> ... `compile(::MXNet, x)` is the same as `compilemx(x)` (which is equivalent to the current situation), unless you want to implement `compile(::Any, x::Foo)` ...
I mostly just thought that `compile(::Flux.Backend, ::Flux.Model)` provides a cleaner API, so folks can write code that compiles models independently of which backend they're using. For example, if I have a project that compiles a bunch of different models, it would be nice if switching backends were as simple as creating a different `Flux.Backend` at the beginning of my program; all my functions that compile models could just take a `Flux.Backend` and call `compile(backend, model)` without caring which backend it is, rather than needing to change every `compilemx(model)` call to `compiletf(model)`.
> is there an easy way to reuse outputs more than once outside of the `@net` syntax?
>
> Can you elaborate on this question?
Sorry, I was just referring to the docs, where it says:

> For simple networks `Chain` is completely fine, although the `@net` version is more powerful as we can (for example) reuse the output `l1` more than once.

but it doesn't really explain how `@net` can reuse the output. I think this is another case where showing how you might achieve similar behaviour without using `@net` would help clarify what `@net` is doing and better demonstrate why it's so useful.
> I mostly just thought that `compile(::Flux.Backend, ::Flux.Model)` provides a cleaner API ...
This is an interesting use case I hadn't thought about as much. Still, though, I think it works out the same either way; you can think of the `compiletf` function itself as representing the backend, but you just have to use call in place of `compile`:

```julia
global backend = compiletf
# later on
backend(mymodel)
```
One thing I could see coming up, though, is that we might want to provide a vocabulary over backends beyond `compile`. For example, you could imagine hypothetical `initialise(::MXNet)` functions. I can't think of any immediate use case for this, but it'd be a good reason to represent backends explicitly.
> but it doesn't really explain how `@net` can reuse the output.
Ah ok, so this is just unclear writing on my part. `Chain` is function composition, like `|>` or `∘`; instead of writing `h(x) = g(f(x))` you can write `h = g ∘ f`. This is convenient but limited; you can't define this version of `h` in the same way:

```julia
function h(x)
    temp = f(x)
    return temp, g(temp)
end
```
This is all I meant by reusing outputs: storing and reusing the output of `f` more than once, rather than piping it straight into `g`. It's possible to create more function combinators that can express this kind of logic, but it's much more intuitive to just write it out with normal Julia syntax; that's the power of `@net`.
I've just reworked the docs along these lines, so hopefully it should be easier to follow. I also added some internal docs, and you can check out how I've made the test code generic across backends here.
I'll close this for now; hopefully that clears things up, but please do let me know if anything else needs clarifying, and any other feedback is welcome.