Yes, great to have this discussion here, thanks for opening.
I agree completely about the black-boxy-ness of `@net`, and I'd like to rework the docs to present things in a simpler way. Really, the core feature is being able to write an annotated definition like `@net f(xs) = xs .* xs` and have it run on a backend like TensorFlow. This process is necessarily somewhat magical, but it has a single, conceptually simple purpose, while the rest of Flux (training etc.) should be very boring Julia code.
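To make the "magic" less abstract, here is a minimal toy sketch of the tracing idea: overload broadcasting on a symbolic node type so that running an ordinary Julia definition records a graph a backend could then compile. This is not Flux's actual implementation; the `Node` type and `:mul`/`:input` tags are invented for illustration.

```julia
# Hypothetical sketch only — Flux's real @net machinery works differently.
struct Node
    op::Symbol
    args::Tuple
end

# Broadcasted multiply on symbolic inputs builds a graph node
# instead of computing a value.
Base.broadcasted(::typeof(*), a::Node, b::Node) = Node(:mul, (a, b))

f(xs) = xs .* xs            # the user writes plain Julia

xs = Node(:input, ())
graph = f(xs)               # "running" f records a :mul node over xs
```

Running `f` on a `Node` yields `Node(:mul, (xs, xs))` rather than a number, which is the kind of dataflow graph a TensorFlow- or MXNet-style backend consumes.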
Currently `@net type ...` adds some orthogonal sugar on top of that. Some of it is pretty superfluous, like `<: Flux.Model`. The right approach is probably to explain the basics without it, then show where it adds extra convenience.
Re backends: the code-eval stuff is really only a lazy-loading technique; if we simply included the backend files it would behave the same, but it would force a dependency on both backend libraries, which is unreasonable for the user. But it's not a great long-term solution; better might be to have backends be separate packages that are explicitly loaded, which would obviate all the eval stuff.
That aside, `compile(::MXNet, x)` is the same as `compilemx(x)` (which is equivalent to the current situation), unless you want to implement `compile(::Any, x::Foo)`. I can't think of a use case for this and suspect the lazy loading is really the core issue here, but I'm happy to be corrected. Generally, if folks want to implement other backends or anything related, I'm more than happy to provide examples or fix things up as needed.
> is there an easy way to reuse outputs more than once outside of the `@net` syntax?

Can you elaborate on this question?
from flux.jl.
> The right approach is probably to explain the basics without it, then show where it adds extra convenience.
I think that would help clarify things a lot. If I have time I'll try to work through some of the examples in the docs without `@net` to help compare the two options.
> But it's not a great long-term solution; better might be to have backends be separate packages that are explicitly loaded, which would obviate all the eval stuff.
Yeah, I've also had issues with wanting lazy loading in Julia. I know packages like Extern.jl aim to address this in a more general way by placing your code in a macro block without changing it, but it's essentially doing the same thing.
> ... `compile(::MXNet, x)` is the same as `compilemx(x)` (which is equivalent to the current situation), unless you want to implement `compile(::Any, x::Foo)` ...
I mostly just thought that `compile(::Flux.Backend, ::Flux.Model)` provides a cleaner API, so folks can write code that compiles models independently of which backend they're using. For example, if I have a project that compiles a bunch of different models, it would be nice if switching backends were as simple as creating a different `Flux.Backend` at the beginning of my program; all my functions that compile models could just take a `Flux.Backend` and call `compile(backend, model)` without caring which backend it is, rather than needing to change every `compilemx(model)` call to `compiletf(model)`.
> is there an easy way to reuse outputs more than once outside of the `@net` syntax?
>
> Can you elaborate on this question?
Sorry, I was just referring to the docs, where it says:

> For simple networks `Chain` is completely fine, although the `@net` version is more powerful as we can (for example) reuse the output `l1` more than once.

but it doesn't really explain how `@net` can reuse the output. I think this is another case where showing how you might achieve similar behaviour without using `@net` would help clarify what `@net` is doing and better demonstrate why it's so useful.
> I mostly just thought that `compile(::Flux.Backend, ::Flux.Model)` provides a cleaner API ...
This is an interesting use case I hadn't thought about as much. Still, though, I think it works out the same either way; you can think of the `compiletf` function itself as representing the backend, but you just have to use call in place of `compile`:

```julia
global backend = compiletf
# later on
backend(mymodel)
```
One thing I could see coming up, though, is that we might want to provide a vocabulary over backends beyond `compile`. For example, you could imagine hypothetical `initialise(::MXNet)` functions. I can't think of any immediate use case for this, but it'd be a good reason to represent backends explicitly.
> but it doesn't really explain how `@net` can reuse the output.
Ah ok, so this is just unclear writing on my part. `Chain` is function composition, like `|>` or `∘`; instead of writing `h(x) = g(f(x))` you can write `h = g ∘ f`. This is convenient but limited; you can't define this version of `h` in the same way:

```julia
function h(x)
    temp = f(x)
    return temp, g(temp)
end
```
This is all I meant by reusing outputs: storing and reusing the output of `f` more than once, rather than piping it straight into `g`. It's possible to create more function combinators that can express this kind of logic, but it's much more intuitive to just write it out with normal Julia syntax; that's the power of `@net`.
I've just reworked the docs along these lines, so hopefully it should be easier to follow. I also added some internal docs, and you can check out how I've made the test code generic across backends here.
I'll close this for now; hopefully that clears things up, but please do let me know if anything else needs clarifying, and any other feedback is welcome.