
Comments (4)

MikeInnes commented on May 18, 2024

Yes, it's great to have this discussion here; thanks for opening it.

I agree completely about the black-boxy-ness of @net, and I'd like to rework the docs to present things in a simpler way. Really, the core feature is being able to write an annotated definition like @net f(xs) = xs .* xs and have it run on a backend like TensorFlow. This process is necessarily somewhat magical, but it has a single, conceptually simple purpose, while the rest of Flux (training etc) should be very boring Julia code.
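For concreteness, that workflow might look something like this (a minimal sketch; compilemx is the name used later in this thread, not necessarily the final API):

@net f(xs) = xs .* xs     # a backend-agnostic, annotated definition

fmx = compilemx(f)        # hand the model to the MXNet backend
fmx([1, 2, 3])            # runs under MXNet; same result as plain f([1, 2, 3])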

Currently @net type ... adds some orthogonal sugar on top of that. Some of it is pretty superfluous, like <: Flux.Model. The right approach is probably to explain the basics without it, then show where it adds extra convenience.
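For reference, the sugar being described looks roughly like the docs' Affine example (a sketch from memory; the expansion in the comments is approximate, not the exact generated code):

@net type Affine
  W
  b
  x -> x*W .+ b
end

# roughly: a plain struct carrying W and b, made callable as
#   (a::Affine)(x) = x*a.W .+ a.b
# plus the (largely superfluous) Affine <: Flux.Model supertype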

Re: backends. The code eval stuff is really only a lazy-loading technique; if we simply included the backend files, it would behave the same but force a dependency on both backend libraries, which is unreasonable for the user. It's not a great long-term solution, though; better might be to have backends be separate packages that are explicitly loaded, which would obviate all the eval stuff.
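The lazy-loading technique in question amounts to something like the following (an illustrative sketch; the guard and file path are made up, not Flux's actual code):

const tf_loaded = Ref(false)

function loadtf()
  # evaluate the backend's code only on first use, so users who never
  # touch TensorFlow never take on the dependency
  tf_loaded[] && return
  include("backend/tensorflow.jl")
  tf_loaded[] = true
end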

That aside, compile(::MXNet, x) is the same as compilemx(x) (which is equivalent to the current situation), unless you want to implement compile(::Any, x::Foo). I can't think of a use case for this and suspect the lazy loading is really the core issue here, but I'm happy to be corrected. Generally, if folks want to implement other backends or anything related, I'm more than happy to provide examples or fix things up as needed.

is there an easy way to reuse outputs more than once outside of the @net syntax?

Can you elaborate on this question?


rofinn commented on May 18, 2024

The right approach is probably to explain the basics without it, then show where it adds extra convenience.

I think that would help clarify things a lot. If I have time, I'll work through some of the examples in the docs without @net to help compare the two approaches.

But it's not a great long-term solution; better might be to have backends be separate packages that are explicitly loaded, which would obviate all the eval stuff.

Yeah, I've also run into wanting lazy loading in Julia. I know packages like Extern.jl aim to address this in a more general way, by wrapping your code in a macro block without otherwise changing it, but it's essentially doing the same thing.

... compile(::MXNet, x) is the same as compilemx(x) (which is equivalent to the current situation), unless you want to implement compile(::Any, x::Foo)...

I mostly just thought that compile(::Flux.Backend, ::Flux.Model) provides a cleaner API, so code that compiles models can stay independent of which backend it's using. For example, if I have a project that compiles a bunch of different models, it would be nice if switching backends were as simple as constructing a different Flux.Backend at the start of the program; every function that compiles a model could then take a Flux.Backend and call compile(backend, model), rather than my having to change every compilemx(model) call to compiletf(model).
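In code, the proposal amounts to something like this (a sketch; the concrete type names are invented for illustration):

abstract type Backend end            # written `abstract Backend` in the Julia of the time
struct MXNetBackend <: Backend end
struct TFBackend    <: Backend end

compile(::MXNetBackend, model) = compilemx(model)
compile(::TFBackend,    model) = compiletf(model)

backend = MXNetBackend()             # the only line that changes per backend
nets = map(m -> compile(backend, m), models)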

is there an easy way to reuse outputs more than once outside of the @net syntax?

Can you elaborate on this question?

Sorry, I was just referring to the docs where it says

For simple networks Chain is completely fine, although the @net version is more powerful as we can (for example) reuse the output l1 more than once.

but it doesn't really explain how @net can reuse the output. I think this is another case where showing how you might achieve similar behaviour without using @net would help clarify what @net is doing and better demonstrate why it's so useful.


MikeInnes commented on May 18, 2024

I mostly just thought that compile(::Flux.Backend, ::Flux.Model) provides a cleaner API ...

This is an interesting use case I hadn't thought about as much. Still, I think it works out the same either way; you can think of the compiletf function itself as representing the backend, and simply call it where you would have called compile:

global backend = compiletf  # the compile function itself stands in for the backend
# later on, anywhere in the program:
backend(mymodel)            # equivalent to compiletf(mymodel)

One thing I could see coming up, though, is that we might want to provide a vocabulary over backends beyond compile. For example, you could imagine hypothetical initialise(::MXNet) functions. I can't think of any immediate use case for this, but it'd be a good reason to represent backends explicitly.
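Purely hypothetically, extending the dispatch sketch above, that might look like (names invented for the sake of example):

initialise(::MXNetBackend) = nothing   # e.g. pick a device, seed an RNG
initialise(::TFBackend)    = nothing   # backend-specific setup goes here

# with bare functions like compiletf there is no value to dispatch on,
# so there's nowhere natural to hang a second verb like initialise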

but it doesn't really explain how @net can reuse the output.

Ah ok, so this is just unclear writing on my part. Chain is function composition, like |> or ∘; instead of writing h(x) = g(f(x)) you can write h = g ∘ f. This is convenient, but limited; you can't define this version of h in the same way:

function h(x)
  temp = f(x)           # store f's output...
  return temp, g(temp)  # ...and use it twice
end

This is all I meant by reusing outputs; storing and reusing the output of f more than once, rather than piping it straight into g.

It's possible to create more function combinators that can express this kind of logic, but it's much more intuitive to just write it out in normal Julia syntax; that's the power of @net.
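Concretely, the docs' l1 example might read like this under @net (a sketch; the weights W1, b1, W2, b2 are assumed to be defined elsewhere):

@net function mymodel(x)
  l1 = σ(W1 * x .+ b1)         # name the first layer's output...
  l2 = softmax(W2 * l1 .+ b2)  # ...use it once here...
  l1, l2                       # ...and again here, which Chain alone can't express
end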


MikeInnes commented on May 18, 2024

I've just reworked the docs along these lines, so hopefully they're easier to follow. I also added some internal docs, and you can check out how I've made the test code generic across backends here.

I'll close this for now; hopefully that clears some things up, but please do let me know if anything else needs clarifying, and any other feedback is welcome.

