OperatorLearning.jl

A package that provides layers for learning (nonlinear) operators in order to solve parametric PDEs.

For now, this package contains the Fourier Neural Operator originally proposed by Li et al. [1] as well as the DeepONet conceived by Lu et al. [2].

I decided to implement these methods in Julia because coding up a layer in PyTorch is rather cumbersome in comparison, and Julia as a whole runs at comparable or faster speed than Python.

The implementation of the layers is influenced heavily by the basic layers provided in the Flux.jl package.

Installation

Simply install by running in a REPL:

pkg> add OperatorLearning
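
Alternatively, from within a script or notebook:

using Pkg
Pkg.add("OperatorLearning")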

Usage/Examples

Fourier Layer

The basic workflow is more or less in line with the layer architectures that Flux provides, i.e. you construct individual layers, chain them if desired and pass the inputs as arguments to the layers.

The Fourier Layer performs a linear transform as well as a convolution (i.e. a linear transform in Fourier space), adds the two paths and passes the sum through the activation function. Additionally, higher Fourier modes are filtered out in the convolution path; you can specify the number of modes to be kept.
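
A minimal, illustrative sketch of what a single layer computes for a 1D input (this is not the package's actual implementation; the weights W, R, b and the helper name fourier_forward are made up for illustration):

using FFTW

# Linear path W*x plus spectral convolution path with higher modes zeroed out
function fourier_forward(x::AbstractVector, W::AbstractMatrix, R::AbstractVector,
                         b::AbstractVector, modes::Int, σ = identity)
    x̂ = fft(x)                    # transform the input to Fourier space
    x̂[modes+1:end] .= 0           # keep only the first `modes` Fourier modes
    conv = real(ifft(R .* x̂))     # convolution: pointwise multiply, transform back
    return σ.(W * x .+ conv .+ b) # add both paths, apply the activation
end

# Example call on a grid of 64 points, keeping 16 modes
y = fourier_forward(rand(64), randn(64, 64), randn(64), zeros(64), 16, tanh)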

The syntax for a single Fourier Layer is:

using OperatorLearning
using Flux

# Input = 101, Output = 101, Grid points = 100, Fourier modes = 16
# Activation: sigmoid (you need to import Flux in your Script to access the activations)
model = FourierLayer(101, 101, 100, 16, σ)

# Same as above, but perform strict convolution in Fourier Space
model = FourierLayer(101, 101, 100, 16, σ; bias_fourier=false)

To see a full implementation, check the Burgers equation example at examples/burgers_FNO.jl. Compared to the original PyTorch implementation by Li et al., this Julia version clocks in about 20-25% faster when running on an NVIDIA RTX A5000 GPU.

If you'd like to replicate the example, you need to get the dataset for learning the Burgers equation. You can get it here or alternatively use the provided scripts.

DeepONet

The DeepONet constructor sets up two separate Flux Chain structs (the branch and trunk nets) and combines their two outputs into one via an einsum/dot product.

You can either set up a "vanilla" DeepONet via the constructor function, which creates Dense layers for you, or, if you feel fancy, pass two Chains directly so you can use other architectures such as CNNs or RNNs as well. The former takes two tuples that describe each architecture, e.g. (32,64,72) sets up a DNN with 32 neurons in the first layer, 64 in the second and 72 in the last.

using OperatorLearning
using Flux

# Create a DeepONet with branch 32 -> 64 -> 72 and sigmoid activation
# and trunk 24 -> 64 -> 72 and tanh activation without biases
model = DeepONet((32,64,72), (24,64,72), σ, tanh; init_branch=Flux.glorot_normal, bias_trunk=false)

# Alternatively, set up your own nets altogether and pass them to DeepONet
branch = Chain(Dense(2, 128), Dense(128, 64), Dense(64, 72))
trunk = Chain(Dense(1, 24), Dense(24, 72))
model = DeepONet(branch,trunk)
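
Internally, branch and trunk map their inputs into a shared latent space, and the two outputs are contracted via a dot product over that latent dimension. A minimal sketch of the contraction for a single sample, reusing the branch and trunk chains from above (illustrative only; the package handles batching and the contraction internally):

# Branch encodes the sampled input function, trunk encodes the evaluation point;
# both end in the same latent dimension (72 here), so a dot product gives G(u)(y).
u_sampled = rand(Float32, 2)   # hypothetical sensor values of the input function
y_point   = rand(Float32, 1)   # hypothetical evaluation location
prediction = sum(branch(u_sampled) .* trunk(y_point))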

For usage, check the Burgers equation example at examples/burgers_DeepONet.jl.

License

MIT

ToDos

  • 1D Fourier Layer
  • 2D / 3D Fourier Layer
  • DeepONet
  • Physics informed Loss

Contributing

Contributions are always welcome! Please submit a PR if you'd like to participate in the project.

References

[1] Z. Li et al., "Fourier Neural Operator for Parametric Partial Differential Equations", arXiv:2010.08895 [cs, math], May 2021.

[2] L. Lu, P. Jin, and G. E. Karniadakis, "DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators", arXiv:1910.03193 [cs, stat], Apr. 2020.

Issues

Implement DeepONet

Apart from the Fourier Neural Operator, the other prominent architecture for operator approximation is DeepONet. This should be implemented in a similar fashion as well.

Something like:

model = DeepONet(in_branch, in_trunk, out, latentspace, activation; biases...)

Unlike FourierLayer, it won't make sense to distinguish between a single layer and the entire architecture, so this should be worked on after #22 is resolved.

Use `fft` and `rfft` as fit

#31 introduces the Fourier Layer for higher dimensions. For now, rfft has been replaced by fft since it's simpler to implement.

However, rfft can save memory and computation time considerably (~2x), so it should be picked up again.

I think #28 plays a role here as well. If the input is complex, then rfft doesn't make sense. But since we're working with @generated anyway, maybe we can include a check that decides whether to use the regular or the real FFT depending on the input.
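
One possible way to handle this (a sketch, not what the package currently does) would be to dispatch on the element type of the input:

using FFTW

# Real inputs can use the cheaper rfft; complex inputs need the full fft.
spectral_transform(x::AbstractArray{<:Real})    = rfft(x)
spectral_transform(x::AbstractArray{<:Complex}) = fft(x)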

Precaching and FFT planning

Similar to the FastDense optimizations in SciML/DiffEqFlux.jl#671, this library can definitely benefit from having pre-cached versions of the operations since the neural networks are generally small. In addition, the plan_fft part could be cached and reused for subsequent calls. Given the amount of reuse, direct control of the planning could be helpful:

The flags argument is a bitwise-or of FFTW planner flags, defaulting to FFTW.ESTIMATE. e.g. passing FFTW.MEASURE or FFTW.PATIENT will instead spend several seconds (or more) benchmarking different possible FFT algorithms and picking the fastest one; see the FFTW manual for more information on planner flags. The optional timelimit argument specifies a rough upper bound on the allowed planning time, in seconds. Passing FFTW.MEASURE or FFTW.PATIENT may cause the input array A to be overwritten with zeros during plan creation.

Note that the precaching only removes allocations in cases with a single forward before reverse. A separate pointer bumping method would be necessary to precache a whole batch of test inputs, if multiple batches are used in one loss equation.
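
As a sketch of what cached planning could look like (illustrative, with a made-up input size):

using FFTW

x = rand(Float64, 1024)                                 # stand-in for a fixed-size layer input
p = plan_rfft(x; flags = FFTW.MEASURE, timelimit = 10)  # benchmark once, pick a fast algorithm
x̂ = p * x                                               # reuse the cached plan on every call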

Proper handling of batches

As of now, the constructor for the Fourier Layer specifically requests the sample size of the problem:

https://github.com/pzimbrod/NeuralOperator.jl/blob/e3facfa214a74d82db4b9d4c5c35dc8d10eeaa7b/src/FourierLayer.jl#L58-L86

This is highly undesirable since it introduces a huge number of training parameters and slows down training considerably. It also limits the applicability of the trained model.

A rewrite is needed to overcome this. For now, the following is a workaround to do the matrix multiplication properly:

https://github.com/pzimbrod/NeuralOperator.jl/blob/e3facfa214a74d82db4b9d4c5c35dc8d10eeaa7b/src/FourierLayer.jl#L100-L101
https://github.com/pzimbrod/NeuralOperator.jl/blob/e3facfa214a74d82db4b9d4c5c35dc8d10eeaa7b/src/FourierLayer.jl#L109-L111

Create an FNO wrapper

In order to align FourierLayer with the yet-to-be-implemented DeepONet, it would be nice to have a Fourier Neural Operator (FNO) constructor that creates the entire architecture.

Something like:

model = FNO(in, out, latentspace, layercount, grid, modes, activation)

Instead of having to make the entire chain yourself:

layer = FourierLayer(latentspace, latentspace, grid, modes, activation)

model = Chain(
    Dense(in, latentspace, activation),
    layer,
    layer,
    [...],
    Dense(latentspace, out, activation)
)

This doesn't add functionality, but helps unify the API of the package. FourierLayer should still stay accessible as is, since it offers a lot more control per layer than constructing the entire architecture with the same global arguments.

`Zygote` won't work with FFT Plans

When taking the Jacobian of a sample FFT pipeline, Zygote complains about missing fields in the corresponding FFT plan:

using Zygote, FFTW

n = rand(100);
f = plan_rfft(n);
fi = plan_irfft(rfft(n), length(n));
fi * (f * n);

jacobian(x -> fi * (f * x), n)
ERROR: type ScaledPlan has no field region
Stacktrace:
  [1] getproperty(x::AbstractFFTs.ScaledPlan{ComplexF64, FFTW.rFFTWPlan{ComplexF64, 1, false, 1, UnitRange{Int64}}, Float64}, f::Symbol)
    @ Base ./Base.jl:33
  [2] (::Zygote.var"#931#932"{AbstractFFTs.ScaledPlan{ComplexF64, FFTW.rFFTWPlan{ComplexF64, 1, false, 1, UnitRange{Int64}}, Float64}, Vector{ComplexF64}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/lib/array.jl:788
  [3] (::Zygote.var"#3479#back#933"{Zygote.var"#931#932"{AbstractFFTs.ScaledPlan{ComplexF64, FFTW.rFFTWPlan{ComplexF64, 1, false, 1, UnitRange{Int64}}, Float64}, Vector{ComplexF64}}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/ZygoteRules/OjfTt/src/adjoint.jl:59
  [4] Pullback
    @ ./REPL[6]:1 [inlined]
  [5] (::typeof((#1)))(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface2.jl:0
  [6] (::Zygote.var"#209#210"{Tuple{Tuple{Nothing}}, typeof((#1))})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/lib/lib.jl:203
  [7] (::Zygote.var"#1746#back#211"{Zygote.var"#209#210"{Tuple{Tuple{Nothing}}, typeof((#1))}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/ZygoteRules/OjfTt/src/adjoint.jl:59
  [8] Pullback
    @ ./operators.jl:938 [inlined]
  [9] (::typeof((ComposedFunction{typeof(Zygote._jvec), var"#1#2"}(Zygote._jvec, var"#1#2"()))))(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface2.jl:0
 [10] (::Zygote.var"#46#47"{typeof((ComposedFunction{typeof(Zygote._jvec), var"#1#2"}(Zygote._jvec, var"#1#2"())))})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface.jl:41
 [11] withjacobian(f::Function, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/lib/grad.jl:162
 [12] jacobian(f::Function, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/lib/grad.jl:140
 [13] top-level scope
    @ REPL[6]:1

However, when calling rfft/irfft directly instead of using the pre-computed plans, the error vanishes:

jacobian(x -> irfft(rfft(x), length(x)), n)
([1.0 0.0 … 0.0 0.0; 0.0 1.0 … 0.0 6.187926798518962e-19; … ; 0.0 0.0 … 1.0 0.0; 0.0 5.204170427930421e-18 … 0.0 1.0],)

This is an issue since we're doing the FFT and its inverse many times, so the speed gain from using pre-planned FFTs should be quite considerable.

Abysmal memory allocation during the backward pass

When running burgers.jl with commit c62d8de, Julia allocates massive amounts of memory, to the point that the process gets killed on a 128 GB RAM machine.

This behavior is abnormal; most likely, an excessive number of copies takes place somewhere. It is reproducible both on CPU and GPU.
