
Comments (10)

jrevels avatar jrevels commented on May 16, 2024

If memory serves, a previous version of ForwardDiff supported functions that looked like this:

Unless I slipped up somewhere, ForwardDiff v0.1.0 should strictly support all the features of the previous version, plus a few more. There's even a deprecation wrapper that allows all of the old methods to still be used.

How were you doing this with the old version?

Is there any way to take the Jacobian of a vector-valued function written using the result-placement style?

FAD techniques mainly rely on sneakily passing overloaded types into your function, so if you use a naively typed Vector for xout (e.g. Vector{Float64}), InexactErrors are going to get thrown when your function tries to store instances of ForwardDiff's types in xout.
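Here's a minimal sketch of that failure mode, using a hypothetical stand-in dual type (`MyDual` is not ForwardDiff's actual overloaded number type, and the constructor syntax is modern Julia rather than the v0.4-era syntax used elsewhere in this thread):

```julia
# MyDual stands in for an overloaded AD number type carrying a value
# and a derivative component.
struct MyDual <: Real
    val::Float64
    der::Float64
end

xout = zeros(Float64, 3)          # concretely typed output buffer
# xout[1] = MyDual(1.0, 1.0)      # errors: a MyDual can't be stored as a Float64

xout_ok = Vector{MyDual}(undef, 3)  # a buffer typed for the dual numbers works
xout_ok[1] = MyDual(1.0, 1.0)
```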

So, you can accomplish this by passing in an xout that is typed appropriately:

using ForwardDiff

function sphere2cart!(xin::Vector, xout::Vector)
    rho, theta, phi = xin
    rho_sin_phi = rho * sin(phi)
    xout[1] = rho_sin_phi * cos(theta)
    xout[2] = rho_sin_phi * sin(theta)
    xout[3] = rho * cos(phi)
    return xout # added this line so that it's truly a Vector --> Vector function
end

# The output vector needs to be able to store ForwardDiff's special
# overloaded number type
const my_xout = Vector{ForwardDiff.GradNumTup{3,Float64}}(3)

sphere2cart(xin::Vector) = sphere2cart!(xin, my_xout)

j = ForwardDiff.jacobian(sphere2cart)

This is definitely a hack, though, as it relies on knowledge of ForwardDiff's implementation that I would never expect end-users to be aware of.

More importantly, I benchmarked my above "hack" against the non-mutating method (the sphere2cart you defined above) and I didn't see a significant runtime performance increase, only a slight drop in memory usage from skipping the allocation of a result vector. In more complex functions (esp. ones with higher input dimensions), I would guess that any benefit from the slight memory decrease will get dwarfed by the general memory usage of differentiating the function.

from forwarddiff.jl.

jrevels avatar jrevels commented on May 16, 2024

More importantly, I benchmarked my above "hack" against the non-mutating method (the sphere2cart you defined above) and I didn't see a significant runtime performance increase, only a slight drop in memory usage from skipping the allocation of a result vector.

For clarity, I was comparing taking the Jacobian of each, not the performance of normal execution of each (obviously the mutating method will be more performant if we're just talking about the functions themselves).


jrevels avatar jrevels commented on May 16, 2024

If memory serves, a previous version of ForwardDiff supported functions that looked like this:

Unless I slipped up somewhere, ForwardDiff v0.1.0 should strictly support all the features of the previous version, plus a few more. There's even a deprecation wrapper that allows all of the old methods to still be used.

How were you doing this with the old version?

Well damn, you're right - you can find a reference to this feature here.


jrevels avatar jrevels commented on May 16, 2024

Adding back in support for this wouldn't be very hard, I think; here's the code in the old version that implements this.

@mlubin Do you have any thoughts on whether we should work to support this in the new version? I could change the current jacobian method to perform the "hack" I worked out above:

function jacobian{A}(f, ::Type{A}=Void;
                     mutates::Bool=false,
                     chunk_size::Int=default_chunk_size,
                     cache::ForwardDiffCache=ForwardDiffCache(),
                     output_length::Int=0)
    if output_length > 0 # if output_length > 0, assume f is of the form f!(y, x) where y is the output
        newf(x) = f(get_output!(cache, output_length, Val{chunk_size}, eltype(x)), x)
    else
        newf = f
    end

    if mutates
        function j!(output::Matrix, x::Vector)
            return ForwardDiff.jacobian!(output, newf, x, A;
                                         chunk_size=chunk_size,
                                         cache=cache)
        end
        return j!
    else
        function j(x::Vector)
            return ForwardDiff.jacobian(newf, x, A;
                                        chunk_size=chunk_size,
                                        cache=cache)
        end
        return j
    end
end

Then all that would need to be implemented is get_output! for the cache, which should be really easy to write. I'm just hesitant to add it as a feature.
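If that route were taken, usage might look like the following sketch (hypothetical: the output_length keyword and the f!(y, x) argument order are assumptions taken from the snippet above, not a shipped API):

```julia
using ForwardDiff

# Mutating form with the f!(y, x) argument order assumed by the sketch above.
function sphere2cart!(xout::Vector, xin::Vector)
    rho, theta, phi = xin
    rho_sin_phi = rho * sin(phi)
    xout[1] = rho_sin_phi * cos(theta)
    xout[2] = rho_sin_phi * sin(theta)
    xout[3] = rho * cos(phi)
    return xout
end

# The cache would hand the function an appropriately typed output buffer,
# so the caller never touches ForwardDiff's internal number types.
j = ForwardDiff.jacobian(sphere2cart!; output_length=3)
J = j([1.0, pi/4, pi/4])   # 3x3 Jacobian
```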


dbeach24 avatar dbeach24 commented on May 16, 2024

Thanks for the very thorough follow-up! I'm glad I'm not misremembering that an early version supported mutator functions. (And thanks for the better issue name, too!)

Yes, my goal is to write a single version of the function which is both fast (i.e. uses result placement for evaluation of the function itself), and which also is supported by ForwardDiff. As a bonus, it would be nice if the function definition was generic and did not rely on any specific types defined in ForwardDiff. (I had believed that using the unspecialized Vector type was sufficient for this.)

The problem with definitions like this is the use of global values for the result:

# The output vector needs to be able to store ForwardDiff's special
# overloaded number type
const my_xout = Vector{ForwardDiff.GradNumTup{3,Float64}}(3)

sphere2cart(xin::Vector) = sphere2cart!(xin, my_xout)

j = ForwardDiff.jacobian(sphere2cart)

If the caller fails to immediately copy the result, new evaluations will overwrite the value of earlier ones. Moreover, if & when Julia gains multi-threading support, global result storage would violate thread-safety.
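A small sketch of that aliasing hazard with a shared global buffer (generic Julia, not ForwardDiff-specific):

```julia
buf = zeros(3)
f!(x) = (buf .= x; buf)   # returns the shared global buffer

a = f!([1.0, 2.0, 3.0])
b = f!([4.0, 5.0, 6.0])
# a === b: both names alias `buf`, so the first result now reads [4.0, 5.0, 6.0]
```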

For clarity, I was comparing taking the Jacobian of each, not the performance of normal execution of each (obviously the mutating method will be more performant if we're just talking about the functions themselves).

It appears that you're making the argument that the complexity of computing the Jacobian via dual numbers significantly outweighs the overhead of dynamic allocation, such that returning a newly allocated vector is not a concern. I have no doubt that this could be true for high-dimensional vector-valued functions. However, my interest is in simultaneously computing the value and Jacobian of a small dynamic state function where N=3..6. At these sizes, is it still the case that dynamic allocation costs are dwarfed by the computational complexity of the Jacobian itself? You might be right; I'm just not sure.

Sorry if the tone sounds at all argumentative here... Thanks again for the help!


jrevels avatar jrevels commented on May 16, 2024

Yes, my goal is to write a single version of the function which is both fast (i.e. uses result placement for evaluation of the function itself), and which also is supported by ForwardDiff.

A reasonable goal, I think!

As a bonus, it would be nice if the function definition was generic and did not rely on any specific types defined in ForwardDiff. (I had believed that using the unspecialized Vector type was sufficient for this.)

It is, generally, and the type annotations in the example you gave are totally fine (though if you had typed them too specifically, e.g. Vector{Float64}, it wouldn't have worked - probably something we should note in the documentation). Mutation of an input vector within the function is also totally allowed; the only tricky thing here was that a vector that ForwardDiff didn't "know" to overload was getting mutated.

If the caller fails to immediately copy the result, new evaluations will overwrite the value of earlier ones.

The usefulness of having a mutating function in the first place, though, is that you can reuse the same output vector for multiple evaluations. Being able to overwrite values once you're done with the result, instead of having to allocate new space for them, is precisely the reason why you'd use a mutating function instead of a non-mutating one.
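For example, reusing one preallocated buffer across many calls (using the sphere2cart! from earlier in the thread, which takes (xin, xout)):

```julia
out = zeros(3)
for i in 1:1000
    # Overwrites `out` in place: no per-call allocation of a result vector.
    sphere2cart!(rand(3), out)
    # ...consume `out` before the next iteration overwrites it...
end
```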

Moreover, if & when Julia gains multi-threading support, global result storage would violate thread-safety.

ForwardDiff.jl isn't yet designed with thread-safety in mind - there are a host of other issues that would have to be resolved in Base before we would even know just where to begin. ForwardDiff's internal caching layer is also not "global" state (though the user can optionally pass in their own cache, so they could make it so). Restrictions could be added in the future to make it thread-safe.

However, my interests are in being able to simultaneously compute the value and jacobian of a small dynamic state function where N=3..6. At these sizes, is it still the case that dynamic allocation costs are dwarfed by the computational complexity of the jacobian, itself?

I suspect that the lower the input dimension, the more the answer to that question will rely on the specific function being evaluated. Anyway, you should check out using AllResults if you want to grab the value and the Jacobian simultaneously.

Sorry if the tone sounds at all argumentative here...

It doesn't - on the contrary, I'm really glad you brought this up! I tried not to lose any features from previous versions during the v0.1.0 refactoring, but it appears this particular feature slipped through the cracks.

I believe I can re-implement this functionality pretty easily in the current version of ForwardDiff (my comment above basically does it, I just need to add some support in the caching layer), so I'm going to submit a PR soon for it. Stay tuned.


dbeach24 avatar dbeach24 commented on May 16, 2024

Thanks. Regarding this remark:

The usefulness of having a mutating function in the first place, though, is that you can reuse the same output vector for multiple evaluations. Being able to overwrite values once you're done with the result, instead of having to allocate new space for them, is precisely the reason why you'd use a mutating function instead of a non-mutating one.

Yes, I agree. With a mutator/placement approach the caller gets total control regarding if & when former results are overwritten with new data. With a global return vector, this is not the case.

Anyway, you should check out using AllResults if you want to grab the value and the Jacobian simultaneously.

Yeah, I saw this in the docs and it looks like just the thing!

I believe I can re-implement this functionality pretty easily in the current version of ForwardDiff (my comment above basically does it, I just need to add some support in the caching layer), so I'm going to submit a PR soon for it. Stay tuned.

I am eager to try whatever solution you come up with. Thanks again.


dbeach24 avatar dbeach24 commented on May 16, 2024

Oh wow -- Just realized who I've been chatting with. It was great having beers with you at JuliaCon. Hope to see you again next year!


mlubin avatar mlubin commented on May 16, 2024

This is an important feature to support, didn't realize it disappeared.


jrevels avatar jrevels commented on May 16, 2024

@dbeach24 I thought it was you, but I wasn't absolutely sure - it's obvious now that you have a profile pic! Ditto on the greatness of JuliaCon beers.

