juliadiff / diffrules.jl
A simple shared suite of common derivative definitions.
License: Other
Note that differentiation rules are purely symbolic, so no type annotations should be used.
Should some of the differentiation rules, like (+)(a, b) -> (1, 1), use the one function instead? Some types might deliberately not implement implicit conversion from integers.
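A minimal sketch of the difference, with the suggested rule shown as a hypothetical comment (the exact rule text is illustrative, not the package's current definition):

```julia
# `one` returns the multiplicative identity in the argument's own type,
# while the literal 1 is always an Int:
a = one(2.0)      # Float64 one
b = one(2.0f0)    # Float32 one

# Hypothetical sketch of the suggested style for the addition rule:
#   @define_diffrule Base.:+(a, b) = :(one($a)), :(one($b))
# instead of returning the literal tuple (1, 1).
```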
Pkg.add("DiffRules") does not work to install the package; Julia returns ERROR: unknown package DiffRules.
Distributions.jl is failing CI on Julia v0.7, which seems to be caused by DiffRules:
https://travis-ci.org/JuliaStats/Distributions.jl/jobs/303650559#L1610
Maybe just tag a new release?
One could simply revert #30 to fix this, but if one wants common subexpression elimination (CSE) then it can be calculated from first principles. I started a discussion which has numerical examples.
I'm willing to define some rules for SpecialFunctions; however, some of the derivatives require functions that are not implemented in SpecialFunctions but in, say, GSL.jl. Is there a style convention of only allowing rules that are closed under differentiation within the same package? I suppose the worst that can happen is that a user gets an undefined-variable error when using a rule that requires GSL.jl without having loaded it first.
I ran into an AD problem in Flux.Tracker. Basically, one cannot backprop through x^a
when x
is negative, regardless of whether a is even or odd.
To be more specific, this works:
Tracker.gradient((x) -> x^2, 2.0) # => (4.0 (tracked),)
but this throws a domain error:
Tracker.gradient((x) -> x^2, -2.0)
ERROR: DomainError with -2.0:
log will only return a complex result if called with a complex argument. Try log(Complex(x)).
Stacktrace:
[1] throw_complex_domainerror(::Float64, ::Symbol) at ./math.jl:31
[2] log(::Float64) at ./special/log.jl:285
[3] _forward at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/scalar.jl:55 [inlined]
[4] #track#1 at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/Tracker.jl:50 [inlined]
[5] track at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/Tracker.jl:50 [inlined]
[6] log at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/scalar.jl:57 [inlined]
[7] #218 at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/scalar.jl:66 [inlined]
[8] back_(::Flux.Tracker.Grads, ::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##218#219")){Flux.Tracker.TrackedReal{Float64},Int64},Tuple{Flux.Tracker.Tracked{Float64},Nothing}}, ::Int64) at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:103
[9] back(::Flux.Tracker.Grads, ::Flux.Tracker.Tracked{Float64}, ::Int64) at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:118
[10] #6 at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:131 [inlined]
[11] (::getfield(Flux.Tracker, Symbol("##9#11")){getfield(Flux.Tracker, Symbol("##6#7")){Flux.Tracker.Params,Flux.Tracker.TrackedReal{Float64}}})(::Int64) at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:140
[12] gradient(::Function, ::Float64) at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:152
[13] top-level scope at none:0
I'm posting this issue here because I think it is related to https://github.com/JuliaDiff/DiffRules.jl/blob/master/src/rules.jl#L83. When x
is negative, log(x)
will of course throw an error. Is this behaviour expected?
BTW, ForwardDiff doesn't seem to have this problem.
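A minimal sketch of why the log appears, paraphrasing the scalar rule for x^y (the exact rule text in rules.jl may differ): the partial with respect to the exponent involves log(x), which throws for negative x even when only the x-partial is wanted.

```julia
# The two partials of x^y (illustrative paraphrase of the DiffRules rule):
dx(x, y) = y * x^(y - 1)    # ∂/∂x: well defined for negative x, integer y
dy(x, y) = x^y * log(x)     # ∂/∂y: log(x) throws a DomainError for x < 0

dx(-2.0, 2)    # the gradient of x^2 at -2.0, i.e. 2x = -4.0
# dy(-2.0, 2)  # would throw: DomainError with log(-2.0)
```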
Many "introspection" functions in DiffRules, like DiffRules.diffrules, look at the global variable DEFINED_DIFFRULES to collect information about which diff rules are defined. I think this has some issues:
- Rules that packages define with @define_diffrule will not show up in DEFINED_DIFFRULES. That is because side effects like adding to a global variable are not replayed if they happen while a package is being precompiled; you would have to do the registration in __init__ for it to be visible.
- DiffRules.hasdiffrule(:Base, :sin, 1) identifies the defining module by symbol. This feels to me like it should take a module as its first argument.
I encountered this when trying to turn e.g. LogExpFunctions into an extension (which makes it a separate module). I'm trying to think of ways to improve this that are backwards compatible.
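The precompilation point can be illustrated with a self-contained sketch; the module, registry, and register! helper below are hypothetical stand-ins for a downstream package and DEFINED_DIFFRULES:

```julia
# Side effects run at top level during precompilation are baked into the
# compiled image and not replayed at load time, so mutating another module's
# global (like DEFINED_DIFFRULES) from there is lost.  Doing the registration
# in __init__ replays it every time the package loads.
module MyRules                              # hypothetical downstream package
const REGISTRY = Dict{Symbol,Function}()    # stand-in for DEFINED_DIFFRULES
register!(f) = (REGISTRY[nameof(f)] = f)

myfun(x) = 2x

function __init__()
    register!(myfun)   # safe: runs at load time, not at precompile time
end
end
```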
https://docs.julialang.org/en/v1.0/base/math/#Base.log-Tuple{Number,Number}
We can call Base.log(2, 4) and get 2.0 as the answer.
There seems to be an extra method from another library here. At least on 0.7 this results in the methods for log1p
getting overwritten. Would it be ok to either delete this line or, if it's needed for 0.6, move it to within the conditional?
It would be great to have a diffrule for fma
. I would make a PR but I don't know how to define ternary rules.
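Since fma(x, y, z) = x*y + z, the three partials are (y, x, 1). Assuming @define_diffrule takes one quoted expression per argument (the rule line below is a hypothetical sketch, not a tested definition), a ternary rule might look like:

```julia
# Hypothetical sketch of a ternary rule:
#   @define_diffrule Base.fma(x, y, z) = :($y), :($x), :(one($z))
#
# Checking the x-partial with a finite difference: it should be y = 3.0.
h = 1e-8
approx_dx = (fma(2.0 + h, 3.0, 4.0) - fma(2.0, 3.0, 4.0)) / h
```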
Hi, I am trying to add
@define_diffrule GSL.sf_fermi_dirac_half(x) = :(GSL.sf_fermi_dirac_mhalf($x))
as a diffrule. To make this work I need to import GSL into ForwardDiff, creating a dependency on GSL that should probably be avoided. It does seem to work well, though.
Is there another way to proceed here?
https://juliadiff.org/DiffRules.jl/stable/ shows v0.0.10, but the latest version is v1.15.1.
Noting that stuff like
y = exp(x)
∂/∂x exp(x) = y
happens, it might make sense to augment the set of scalar diffrules with extra code that optionally lets a user provide a symbol for the output of the call. This is potentially useful for both forward- and reverse-mode. That said, I can imagine that many instances of this kind of thing could be optimised away by a compiler pass, so maybe this isn't such a valuable thing to provide?
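A minimal sketch of the observation: the derivative of exp equals its primal output, so a rule that can name the output avoids recomputing it.

```julia
# For exp, the derivative is the primal output itself:
x = 1.5
y = exp(x)   # primal
dy = y       # ∂exp(x)/∂x == exp(x) == y; no second exp call needed
```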
I was wondering whether a rule for ifelse(condition, exp1, exp2)
can be created? Thanks a lot!
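A hedged sketch of what such a rule could return, following the :NaN convention used elsewhere in DiffRules for arguments without a meaningful derivative (here, the condition); the rule line is hypothetical:

```julia
# Hypothetical sketch:
#   @define_diffrule Base.ifelse(c, x, y) =
#       :NaN, :(ifelse($c, one($x), zero($x))), :(ifelse($c, zero($y), one($y)))
#
# The two branch partials, written as plain functions:
d_ifelse_x(c, x, y) = ifelse(c, one(x), zero(x))   # ∂/∂x: 1 if branch taken
d_ifelse_y(c, x, y) = ifelse(c, zero(y), one(y))   # ∂/∂y: 1 if branch taken
```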
Should we enforce that the derivatives of max(x, y)
and min(x, y)
are undefined when x == y
?
(continued from invenia/Nabla.jl#81, cc @willtebbutt)
This means that we would need a slightly more general interface for linear algebra, and would certainly require different forward- and reverse- mode expressions, than is currently provided by DiffRules.
Agreed, DiffRules only properly handles scalar kernels now. To support linear algebra, we need to add a notion of tensor/scalar, allowing in-place methods, marking adjoint variables, etc. to DiffRules.
Regarding the "where possible" statement above, there are certainly operations (such as the Cholesky factorisation) for which symbolic expressions might be a bit unwieldy (see here). Now that I think about this a bit more, I'm not sure whether these larger implementations are going to be problematic or not.
Anyway, my point is that given symbolic expressions for the linear algebra operations I agree that it's reasonable to hope that compiler optimisations can eliminate redundant code when compiling custom kernel implementations, and that this is a significantly better idea than hand-coding lots of optimisations. (The issue I linked in my previous comment is a good example of this. I would definitely prefer to be able to just forget about this). However, I would contend that you simply can't handle linear algebra properly without a number of hand-coded symbolic expressions for the forward- and reverse-mode sensitivities because they aren't written in Julia. If at some point in the future we have native Julia implementation of (for example) LAPACK, then it would be a really good idea to try and produce an AD tool which is able to produce reasonably-well optimised kernels for each operation. To the best of my knowledge, we shouldn't expect this to happen any time soon (and almost certainly never for BLAS), so a symbolic version of the current implementation of DiffLinearAlgebra will be necessary for Capstan to be able to differentiate arbitrary Julia code even reasonably efficiently.
I think there might've been a misunderstanding with my previous post 😛 I definitely am not arguing that we should express e.g. complex LAPACK kernels symbolically, and I didn't mean to imply that DiffRules/DiffLinearAlgebra were directly competing approaches. On the contrary, I think they're quite complementary - if DiffLinearAlgebra didn't exist, I eventually would need to make a "DiffKernels.jl" anyway. DiffRules is useful for mapping primal functions to derivative functions, and is thus useful when generating e.g. instruction primitives/computation graphs within downstream tools (i.e. it solves the problem "what kernels should I call and how should I call them?"). DiffLinearAlgebra (as it stands) is useful for providing implementations of these kernels (i.e. it solves the problem "how do I execute the kernels that I'm calling?"). They're both necessary components of the AD ecosystem.
As for deciding what computations should be primitives, I think we're already on the same page; a computation should be defined as a primitive if either/both of the following applies:
Hi,
This is a cross-post with JuliaDiff/ForwardDiff.jl#604.
The rule for ldexp(x, y) always returns a Float64, regardless of the type of the first argument x.
This is because the rule is given by
@define_diffrule Base.ldexp(x, y) = :( exp2($y) ), :NaN
and since y is an Int, exp2 returns a Float64 by default. @mcabbott pointed out that if we change the rule to
@define_diffrule Base.ldexp(x, y) = :( oftype(float($x), exp2($y)) ), :NaN
then we should maintain the Float32 type throughout the computation.
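The promotion is easy to demonstrate in isolation:

```julia
# exp2 of an Int gives a Float64, so the unpatched rule loses the Float32
# type of the first argument; oftype(float(x), ...) restores it.
x, y = 1.0f0, 3
a = exp2(y)                     # Float64, even though x is Float32
b = oftype(float(x), exp2(y))   # Float32, matching x
```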
The SpecialFunctions package offers the incomplete beta function as beta_inc. For reasons I don't really understand, this function returns a tuple. The derivative can be written in terms of SpecialFunctions.beta, so I don't think it would be too difficult. At the moment, with DiffRules 1.0.2, ForwardDiff.derivative(a -> SpecialFunctions.beta_inc(a, 1.0, 0.3), 1.0)
causes a stack overflow, so it seems like defining a rule for this would have some value.
I'm happy to contribute a rule here, but I thought I'd just touch base about the best way to handle the tuple return type. Perhaps it would be best to just locally define a beta_inc1(a,b,x) = beta_inc(a,b,x)[1]
and implement the derivative manually with DiffRules.@define_diffrule
rather than trying to deal with the tuple return type. Interested to hear thoughts.
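A hedged sketch of the wrapper idea: hide the tuple return behind a scalar function and attach a rule to the wrapper. The names beta_inc1 and the rule line are hypothetical; the x-partial of the regularized incomplete beta I_x(a, b) is x^(a-1) * (1-x)^(b-1) / B(a, b). To keep the snippet self-contained, a stand-in beta for integer arguments is used below instead of SpecialFunctions.beta.

```julia
# Hypothetical wrapper and rule:
#   beta_inc1(a, b, x) = SpecialFunctions.beta_inc(a, b, x)[1]
#   DiffRules.@define_diffrule SpecialFunctions.beta_inc1(a, b, x) = ...
#
# The x-partial, using B(a, b) = (a-1)!(b-1)!/(a+b-1)! for integer a, b:
mybeta(a::Int, b::Int) = factorial(a - 1) * factorial(b - 1) / factorial(a + b - 1)
dIx(a, b, x) = x^(a - 1) * (1 - x)^(b - 1) / mybeta(a, b)

dIx(1, 1, 0.3)   # I_x(1, 1) = x, so this partial is 1.0
```

The a- and b-partials are harder (they involve derivatives of the beta function itself), which is why starting from the x-partial alone may already be useful.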
(edited because I mis-wrote something earlier about the derivatives being computed with the beta distribution pdf)
I tried to fix JuliaDiff/ForwardDiff.jl#261 with
@define_diffrule Base.Math.JuliaLibm.log1p(x) = :(inv($x + 1))
but for some reason the macro does not work, can you please take a look at it?
JuliaStats/LogExpFunctions.jl#49
I could try to make a PR. Is there an example of a similar function?
I think that eg
@define_diffrule SpecialFunctions.lgamma(x) = :( digamma($x) )
should be
@define_diffrule SpecialFunctions.lgamma(x) = :( SpecialFunctions.digamma($x) )
since the RHS is not necessarily evaluated in the module of the LHS. Is this correct?
If someone could clarify, I would be happy to make a PR.