
DiffRules.jl's Issues

no type annotations

Note that differentiation rules are purely symbolic, so no type annotations should be used.

Should some of the differentiation rules like (+)(a,b) -> (1, 1) use the one(a) function instead? Some types might deliberately not implement implicit conversion from integers.
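A sketch of the suggestion (the `@define_diffrule` form shown in the comment is an assumption about how the current addition rule would be rewritten, not the actual source):

```julia
# The addition rule currently expands to the literal tuple (1, 1).
# Using `one` instead lets each argument's type supply its own
# multiplicative identity:
# @define_diffrule Base.:+(a, b) = :(one($a)), :(one($b))

one(2.0)    # Float64
one(1//2)   # Rational{Int}
```

With `one`, types that do not support implicit conversion from integer literals would still get a correctly-typed derivative.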

Cannot install DiffRules

Cannot use Pkg.add("DiffRules") to install the package, Julia returns ERROR: unknown package DiffRules.

closed under packages?

I'm willing to define some rules for SpecialFunctions; however, some of the derivatives require functions not implemented in SpecialFunctions but in, say, GSL.jl. Is there any style convention for only allowing rules that are closed under differentiation within the same package? I suppose the worst that can happen is that a user gets an UndefVarError if they use a rule that requires GSL.jl without having loaded it first.

Cannot backprop `x^a` when `x` is negative

I ran into an AD problem in Flux.Tracker. Basically one cannot backprop x^a when x is negative, regardless of whether a is even or odd.

To be more specific, this works

Tracker.gradient((x) -> x^2, 2.0) # => (4.0 (tracked),)

but this gives domain error

Tracker.gradient((x) -> x^2, -2.0)

ERROR: DomainError with -2.0:
log will only return a complex result if called with a complex argument. Try log(Complex(x)).
Stacktrace:
 [1] throw_complex_domainerror(::Float64, ::Symbol) at ./math.jl:31
 [2] log(::Float64) at ./special/log.jl:285
 [3] _forward at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/scalar.jl:55 [inlined]
 [4] #track#1 at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/Tracker.jl:50 [inlined]
 [5] track at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/Tracker.jl:50 [inlined]
 [6] log at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/scalar.jl:57 [inlined]
 [7] #218 at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/scalar.jl:66 [inlined]
 [8] back_(::Flux.Tracker.Grads, ::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##218#219")){Flux.Tracker.TrackedReal{Float64},Int64},Tuple{Flux.Tracker.Tracked{Float64},Nothing}}, ::Int64) at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:103
 [9] back(::Flux.Tracker.Grads, ::Flux.Tracker.Tracked{Float64}, ::Int64) at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:118
 [10] #6 at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:131 [inlined]
 [11] (::getfield(Flux.Tracker, Symbol("##9#11")){getfield(Flux.Tracker, Symbol("##6#7")){Flux.Tracker.Params,Flux.Tracker.TrackedReal{Float64}}})(::Int64) at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:140
 [12] gradient(::Function, ::Float64) at /Users/kai/.julia/packages/Flux/UHjNa/src/tracker/back.jl:152
 [13] top-level scope at none:0


I'm posting this issue here because I think this is related to https://github.com/JuliaDiff/DiffRules.jl/blob/master/src/rules.jl#L83. When x is negative, log(x) will of course throw an error. Is this behaviour expected?

BTW ForwardDiff doesn't seem to have this problem.
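For context, the two-argument power rule linked above is of the following form (a sketch, not the exact source). The second partial calls `log(x)` unconditionally, which is where the `DomainError` comes from:

```julia
dpow_x(x, y) = y * x^(y - 1)    # ∂/∂x of x^y: fine for negative x with integer y
dpow_y(x, y) = x^y * log(x)     # ∂/∂y of x^y: log(x) throws DomainError for x < 0

dpow_x(-2.0, 2)                 # -4.0, which is all Tracker needs here
# dpow_y(-2.0, 2)               # DomainError, even though this partial is
#                               # irrelevant when the exponent is a constant
```

An AD backend can avoid the error by only evaluating the partial it actually needs (which may be why ForwardDiff is unaffected).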

Problems with the `DEFINED_DIFFRULES` implementation

Many "introspection" functions in DiffRules like DiffRules.diffrules look at the global variable DEFINED_DIFFRULES to collect information about what diff rules are defined. I think this has some issues:

  • Only diff rules defined inside DiffRules.jl itself end up in DEFINED_DIFFRULES. Packages that use @define_diffrule will not be in that list, because side effects such as pushing to a global variable during precompilation are not persisted; the registration would have to happen in __init__ to be visible.
  • Accesses are made with a symbol as the first argument, for example DiffRules.hasdiffrule(:Base, :sin, 1). This feels to me like it should take a module as the first argument instead.

I encountered this when trying to turn e.g. the LogExpFunctions support into a package extension (which makes it a separate module). I'm trying to think of ways to improve this that are backwards compatible.
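A minimal sketch of the precompile caveat described in the first bullet (`MyRules` and `f` are hypothetical names): registration that mutates another module's global table has to run at load time, i.e. in `__init__`:

```julia
module MyRules          # hypothetical downstream package

# At precompile time, a top-level call like
#     @define_diffrule MyRules.f(x) = :(cos($x))
# would push into DiffRules' global DEFINED_DIFFRULES, but that mutation
# of another module's state is discarded when the cache image is written.

function __init__()
    # __init__ runs every time the package is loaded, so a registration
    # performed here is actually visible to DiffRules.diffrules():
    # @eval @define_diffrule MyRules.f(x) = :(cos($x))
end

f(x) = sin(x)

end

MyRules.f(0.0)
```

This is why introspection via the global variable undercounts rules defined outside DiffRules.jl itself.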

Version info is confusing

The last released version of DiffRules.jl has version number 1.0.2 (according to the GitHub releases page).

The master branch of DiffRules.jl has version number 1.2.1 in the Project.toml file.

Are there registered versions of DiffRules beyond those shown in this GitHub repository?

`log1p` getting overwritten

There seems to be an extra method from another library here. At least on 0.7, this results in the methods for log1p getting overwritten. Would it be OK to either delete this line or, if it's needed for 0.6, move it inside the conditional?

diffrule for fma

It would be great to have a diffrule for fma. I would make a PR but I don't know how to define ternary rules.
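Since `fma(x, y, z) = x*y + z`, the three partials are `(y, x, 1)`. A sketch of what a ternary rule might look like, assuming the macro accepts three arguments the same way it accepts two:

```julia
# Sketch of a possible rule (not in DiffRules; the three-argument macro
# form is an assumption extrapolated from the binary rules):
# @define_diffrule Base.fma(x, y, z) = :($y), :($x), :(one($z))

# The partial derivatives themselves:
dfma(x, y, z) = (y, x, one(z))

dfma(2.0, 3.0, 5.0)   # (3.0, 2.0, 1.0)
```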

rule for GSL functions

Hi, I am trying to add

@define_diffrule GSL.sf_fermi_dirac_half(x)          = :(GSL.sf_fermi_dirac_mhalf($x))

as a diffrule. To make this work I need to import GSL into ForwardDiff, creating a dependency on GSL that should probably be avoided. It does seem to work well, though.

Is there another way to proceed here ?

Scalar DiffRules which use the output

Noting that stuff like

y = exp(x)
∂/∂x exp(x) = y

happens, it might make sense to augment the set of scalar diffrules with extra code that optionally allows a user to provide a symbol for the output of the call. This is potentially useful for both forward-mode and reverse-mode. That said, I can imagine that a lot of instances of this kind of thing could be optimised away by a compiler pass, so maybe this isn't such a valuable thing to provide?
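A sketch of the saving (the keyword form of the macro below is purely hypothetical syntax, shown only to illustrate the idea):

```julia
# Today the derivative expression re-states the primal:
# @define_diffrule Base.exp(x) = :( exp($x) )
# With an output symbol, the rule could be just :($y).

# In plain code, the win is computing exp once:
function exp_and_deriv(x)
    y = exp(x)
    return y, y        # value and derivative share the same computation
end
```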

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

Lack rule for `ifelse`

I was wondering whether a rule for ifelse(condition, exp1, exp2) can be created? Thanks a lot!
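A sketch of what such a rule could look like (hypothetical; the condition itself is non-differentiable, so its slot would presumably be `:NaN`, matching how DiffRules marks other non-differentiable arguments):

```julia
# @define_diffrule Base.ifelse(c, x, y) =
#     :NaN, :(ifelse($c, one($x), zero($x))), :(ifelse($c, zero($y), one($y)))

# The partials with respect to the two branches:
difelse(c, x, y) = (ifelse(c, one(x), zero(x)), ifelse(c, zero(y), one(y)))

difelse(true, 2.0, 3.0)    # (1.0, 0.0): only the taken branch propagates
```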

Supporting Linear Algebraic Primitives

(continued from invenia/Nabla.jl#81, cc @willtebbutt)

This means that we would need a slightly more general interface for linear algebra, and would certainly require different forward- and reverse- mode expressions, than is currently provided by DiffRules.

Agreed, DiffRules only properly handles scalar kernels now. To support linear algebra, we need to add a notion of tensor/scalar, allowing in-place methods, marking adjoint variables, etc. to DiffRules.

Regarding the where possible statement above, there are certainly operations (such as the Cholesky factorisation) for which symbolic expressions might be a bit unwieldy (see here). Now that I think about this a bit more, I'm not sure whether these larger implementations are going to be problematic or not.

Anyway, my point is that given symbolic expressions for the linear algebra operations I agree that it's reasonable to hope that compiler optimisations can eliminate redundant code when compiling custom kernel implementations, and that this is a significantly better idea than hand-coding lots of optimisations. (The issue I linked in my previous comment is a good example of this. I would definitely prefer to be able to just forget about this.) However, I would contend that you simply can't handle linear algebra properly without a number of hand-coded symbolic expressions for the forward- and reverse-mode sensitivities, because the underlying operations aren't written in Julia. If at some point in the future we have a native Julia implementation of (for example) LAPACK, then it would be a really good idea to try to produce an AD tool that can generate reasonably well-optimised kernels for each operation. To the best of my knowledge, we shouldn't expect this to happen any time soon (and almost certainly never for BLAS), so a symbolic version of the current implementation of DiffLinearAlgebra will be necessary for Capstan to be able to differentiate arbitrary Julia code even reasonably efficiently.

I think there might've been a misunderstanding with my previous post 😛 I definitely am not arguing that we should express e.g. complex LAPACK kernels symbolically, and I didn't mean to imply that DiffRules/DiffLinearAlgebra were directly competing approaches. On the contrary, I think they're quite complementary - if DiffLinearAlgebra didn't exist, I eventually would need to make a "DiffKernels.jl" anyway. DiffRules is useful for mapping primal functions to derivative functions, and is thus useful when generating e.g. instruction primitives/computation graphs within downstream tools (i.e. it solves the problem "what kernels should I call and how should I call them?"). DiffLinearAlgebra (as it stands) is useful for providing implementations of these kernels (i.e. it solves the problem "how do I execute the kernels that I'm calling?"). They're both necessary components of the AD ecosystem.

As for deciding what computations should be primitives, I think we're already on the same page; a computation should be defined as a primitive if either/both of the following applies:

  1. it is difficult to express the computation as a composition of existing primitives
  2. a hand-optimized kernel for the computation is sufficiently more performant than the equivalent composition of existing primitives, even after taking into account potential compiler-level optimizations.

Type instability in ldexp with Float32 arguments

Hi,

This is a cross-post with JuliaDiff/ForwardDiff.jl#604.
The rule for ldexp(x,y) always returns a Float64 regardless of the type of the first argument x.
This is because the rule is given by

@define_diffrule Base.ldexp(x, y)  = :( exp2($y) ), :NaN

and since y is an Int, exp2 by default returns a Float64. @mcabbott pointed out that if we change the rule to

@define_diffrule Base.ldexp(x, y)  = :( oftype(float($x), exp2($y)) ), :NaN

then we should maintain the Float32 type throughout the computation.
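The core of the proposed fix, checked in isolation:

```julia
exp2(3)                        # Int exponent: result promotes to Float64
oftype(float(1.5f0), exp2(3))  # wrapping with oftype keeps the result Float32
```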

Rule for `SpecialFunctions.beta_inc`

The SpecialFunctions package offers the incomplete beta function as beta_inc. For reasons I don't really understand, this function returns a tuple of $I_x(a,b)$ and $1-I_x(a,b)$, so there are perhaps some design questions, but the derivative is computable in closed form, and there are already rules for SpecialFunctions.beta, so I don't think it would be too difficult. At the moment with DiffRules 1.0.2, ForwardDiff.derivative(a->SpecialFunctions.beta_inc(a, 1.0, 0.3), 1.0) causes a stack overflow, so it seems like defining a rule for this would have some value.

I'm happy to contribute a rule here, but I thought I'd just touch base about the best way to handle the tuple return type. Perhaps it would be best to just locally define a beta_inc1(a,b,x) = beta_inc(a,b,x)[1] and implement the derivative manually with DiffRules.@define_diffrule rather than trying to deal with the tuple return type. Interested to hear thoughts.

(edited because I mis-wrote something earlier about the derivatives being computed with the beta distribution pdf)
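A sketch of the wrapper idea. `beta_inc1` is a hypothetical local name; only the partial with respect to x is shown, since that one has the simple closed form ∂/∂x I_x(a,b) = x^(a-1) (1-x)^(b-1) / B(a,b), and the `:NaN` slots mark the a and b partials as not provided:

```julia
# Requires SpecialFunctions and DiffRules; shown as comments since this
# is only a sketch of the proposed rule, not tested against the packages:
# using SpecialFunctions, DiffRules
# beta_inc1(a, b, x) = beta_inc(a, b, x)[1]
# DiffRules.@define_diffrule Main.beta_inc1(a, b, x) =
#     :NaN, :NaN, :( $x^($a - 1) * (1 - $x)^($b - 1) / beta($a, $b) )

# Sanity check of the closed form at b = 1, where I_x(a, 1) = x^a
# and B(a, 1) = 1/a, so ∂/∂x reduces to a * x^(a - 1):
dIx(a, b, x, B) = x^(a - 1) * (1 - x)^(b - 1) / B
dIx(2.0, 1.0, 0.3, 0.5)   # ≈ 0.6, matching 2 * 0.3
```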

Release a major?

Updating SpecialFunctions requires another big bump... this gets annoying. @jrevels @oxinabox how about just making a 1.0 here for this? I don't think it'll change all that much.

clarify @define_diffrule namespaces

I think that eg

@define_diffrule SpecialFunctions.lgamma(x) = :( digamma($x) )

should be

@define_diffrule SpecialFunctions.lgamma(x) = :( SpecialFunctions.digamma($x) )

since the RHS is not necessarily evaluated in the module of the LHS. Is this correct?

Just please clarify, I would be happy to make a PR.

(cf JuliaStats/StatsFuns.jl#49)
