OperatorLearning.jl

A package that provides layers for learning (nonlinear) operators in order to solve parametric PDEs.

For now, this package contains the Fourier Neural Operator originally proposed by Li et al. [1] as well as the DeepONet conceived by Lu et al. [2].

I decided to implement these methods in Julia because coding up a layer in PyTorch is rather cumbersome in comparison, and Julia as a whole runs at comparable or faster speed than Python.

The implementation of the layers is influenced heavily by the basic layers provided in the Flux.jl package.

Installation

Simply install by running in a REPL:

pkg> add OperatorLearning
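
Alternatively, from within a script or notebook:

using Pkg
Pkg.add("OperatorLearning")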

Usage/Examples

Fourier Layer

The basic workflow is more or less in line with the layer architectures that Flux provides, i.e. you construct individual layers, chain them if desired and pass the inputs as arguments to the layers.

The Fourier Layer performs a linear transform as well as a convolution (i.e. a linear transform in Fourier space), adds the two paths and passes the sum through the activation function. Additionally, higher Fourier modes are filtered out in the convolution path; you can specify the number of modes to be kept.
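
A minimal, illustrative sketch of what a single layer computes for a 1D input (this is not the package's actual implementation; the weights W, R, b and the helper name fourier_forward are made up for illustration):

using FFTW

# Linear path W*x plus spectral convolution path with higher modes zeroed out
function fourier_forward(x::AbstractVector, W::AbstractMatrix, R::AbstractVector,
                         b::AbstractVector, modes::Int, σ = identity)
    x̂ = fft(x)                    # transform the input to Fourier space
    x̂[modes+1:end] .= 0           # keep only the first `modes` Fourier modes
    conv = real(ifft(R .* x̂))     # convolution: pointwise multiply, transform back
    return σ.(W * x .+ conv .+ b) # add both paths, apply the activation
end

# Example call on a grid of 64 points, keeping 16 modes
y = fourier_forward(rand(64), randn(64, 64), randn(64), zeros(64), 16, tanh)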

The syntax for a single Fourier Layer is:

using OperatorLearning
using Flux

# Input = 101, Output = 101, Grid points = 100, Fourier modes = 16
# Activation: sigmoid (you need to import Flux in your Script to access the activations)
model = FourierLayer(101, 101, 100, 16, σ)

# Same as above, but perform strict convolution in Fourier Space
model = FourierLayer(101, 101, 100, 16, σ; bias_fourier=false)

To see a full implementation, check the Burgers equation example at examples/burgers_FNO.jl. Compared to the original PyTorch implementation by Li et al., this Julia version clocks in about 20-25% faster when running on an NVIDIA RTX A5000 GPU.

If you'd like to replicate the example, you need to get the dataset for learning the Burgers equation. You can get it here or alternatively use the provided scripts.

DeepONet

The DeepONet constructor sets up two separate Flux Chain structs (the branch and trunk nets) and combines their two outputs into one via an einsum/dot product.

You can either set up a "vanilla" DeepONet via the constructor function, which creates Dense layers for you, or, if you feel fancy, pass two Chains directly so you can use other architectures such as CNNs or RNNs as well. The former takes two tuples that describe each architecture, e.g. (32,64,72) sets up a DNN with 32 neurons in the first layer, 64 in the second and 72 in the last.

using OperatorLearning
using Flux

# Create a DeepONet with branch 32 -> 64 -> 72 and sigmoid activation
# and trunk 24 -> 64 -> 72 and tanh activation without biases
model = DeepONet((32,64,72), (24,64,72), σ, tanh; init_branch=Flux.glorot_normal, bias_trunk=false)

# Alternatively, set up your own nets altogether and pass them to DeepONet
branch = Chain(Dense(2, 128), Dense(128, 64), Dense(64, 72))
trunk = Chain(Dense(1, 24), Dense(24, 72))
model = DeepONet(branch,trunk)
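
Internally, branch and trunk map their inputs into a shared latent space, and the two outputs are contracted via a dot product over that latent dimension. A minimal sketch of the contraction for a single sample, reusing the branch and trunk chains from above (illustrative only; the package handles batching and the contraction internally):

# Branch encodes the sampled input function, trunk encodes the evaluation point;
# both end in the same latent dimension (72 here), so a dot product gives G(u)(y).
u_sampled = rand(Float32, 2)   # hypothetical sensor values of the input function
y_point   = rand(Float32, 1)   # hypothetical evaluation location
prediction = sum(branch(u_sampled) .* trunk(y_point))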

For usage, check the Burgers equation example at examples/burgers_DeepONet.jl.

License

MIT

ToDos

  • 1D Fourier Layer
  • 2D / 3D Fourier Layer
  • DeepONet
  • Physics informed Loss

Contributing

Contributions are always welcome! Please submit a PR if you'd like to participate in the project.

References

[1] Z. Li et al., "Fourier Neural Operator for Parametric Partial Differential Equations", arXiv:2010.08895 [cs, math], May 2021.

[2] L. Lu, P. Jin, and G. E. Karniadakis, "DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators", arXiv:1910.03193 [cs, stat], Apr. 2020.

Issues

Implement DeepONet

Apart from the Fourier Neural Operator, the other prominent architecture for operator approximation is DeepONet. This should be implemented in a similar fashion as well.

Something like:

model = DeepONet(in_branch, in_trunk, out, latentspace, activation; biases...)

Unlike FourierLayer, it won't make sense to distinguish between a single layer and the entire architecture, so this should be worked on after #22 is resolved.

Use `fft` and `rfft` as fit

#31 introduces the Fourier Layer for higher dimensions. For now, rfft has been replaced by fft since it's simpler to implement.

However, rfft can save memory and computation time considerably (~2x), so it should be picked up again.

I think #28 plays a role here as well. If the input is complex, then rfft doesn't make sense. But since we're working with @generated anyway, maybe we can include a check that decides whether to use the regular or the real FFT depending on the input.
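
One possible way to handle this (a sketch, not what the package currently does) would be to dispatch on the element type of the input:

using FFTW

# Real inputs can use the cheaper rfft; complex inputs need the full fft.
spectral_transform(x::AbstractArray{<:Real})    = rfft(x)
spectral_transform(x::AbstractArray{<:Complex}) = fft(x)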

Precaching and FFT planning

Similar to the FastDense optimizations in SciML/DiffEqFlux.jl#671, this library can definitely benefit from having pre-cached versions of the operations since the neural networks are generally small. In addition, the plan_fft part could be cached and reused for subsequent calls. Given the amount of reuse, direct control of the planning could be helpful:

The flags argument is a bitwise-or of FFTW planner flags, defaulting to FFTW.ESTIMATE. e.g. passing FFTW.MEASURE or FFTW.PATIENT will instead spend several seconds (or more) benchmarking different possible FFT algorithms and picking the fastest one; see the FFTW manual for more information on planner flags. The optional timelimit argument specifies a rough upper bound on the allowed planning time, in seconds. Passing FFTW.MEASURE or FFTW.PATIENT may cause the input array A to be overwritten with zeros during plan creation.

Note that the precaching only removes allocations in cases with a single forward before reverse. A separate pointer bumping method would be necessary to precache a whole batch of test inputs, if multiple batches are used in one loss equation.
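
As a sketch of what cached planning could look like (illustrative, with a made-up input size):

using FFTW

x = rand(Float64, 1024)                                 # stand-in for a fixed-size layer input
p = plan_rfft(x; flags = FFTW.MEASURE, timelimit = 10)  # benchmark once, pick a fast algorithm
x̂ = p * x                                               # reuse the cached plan on every call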

Proper handling of batches

As of now, the constructor for the Fourier Layer specifically requests the sample size of the problem:

https://github.com/pzimbrod/NeuralOperator.jl/blob/e3facfa214a74d82db4b9d4c5c35dc8d10eeaa7b/src/FourierLayer.jl#L58-L86

This is highly undesirable since it introduces a huge number of training parameters and slows down training considerably. It also limits the applicability of the trained model.

A rewrite is needed to overcome this. For now, the following is a workaround to do the matrix multiplication properly:

https://github.com/pzimbrod/NeuralOperator.jl/blob/e3facfa214a74d82db4b9d4c5c35dc8d10eeaa7b/src/FourierLayer.jl#L100-L101
https://github.com/pzimbrod/NeuralOperator.jl/blob/e3facfa214a74d82db4b9d4c5c35dc8d10eeaa7b/src/FourierLayer.jl#L109-L111

Create an FNO wrapper

In order to align FourierLayer with the yet-to-be-implemented DeepONet, it would be nice to have a Fourier Neural Operator (FNO) constructor that creates the entire architecture.

Something like:

model = FNO(in, out, latentspace, layercount, grid, modes, activation)

Instead of having to make the entire chain yourself:

layer = FourierLayer(latentspace, latentspace, grid, modes, activation)

model = Chain(
    Dense(in, latentspace, activation),
    layer,
    layer,
    [...],
    Dense(latentspace, out, activation)
)

This doesn't add functionality, but helps unify the API of the package. FourierLayer should still stay accessible as is, since it offers a lot more control per layer than constructing the entire architecture with the same global arguments.

`Zygote` won't work with FFT Plans

When taking the Jacobian of a sample FFT pipeline, Zygote complains about missing fields in the corresponding FFT plan:

using Zygote, FFTW

n = rand(100);
f = plan_rfft(n);
fi = plan_irfft(rfft(n), length(n));
fi * (f * n);

jacobian(x -> fi * (f * x), n)
ERROR: type ScaledPlan has no field region
Stacktrace:
  [1] getproperty(x::AbstractFFTs.ScaledPlan{ComplexF64, FFTW.rFFTWPlan{ComplexF64, 1, false, 1, UnitRange{Int64}}, Float64}, f::Symbol)
    @ Base ./Base.jl:33
  [2] (::Zygote.var"#931#932"{AbstractFFTs.ScaledPlan{ComplexF64, FFTW.rFFTWPlan{ComplexF64, 1, false, 1, UnitRange{Int64}}, Float64}, Vector{ComplexF64}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/lib/array.jl:788
  [3] (::Zygote.var"#3479#back#933"{Zygote.var"#931#932"{AbstractFFTs.ScaledPlan{ComplexF64, FFTW.rFFTWPlan{ComplexF64, 1, false, 1, UnitRange{Int64}}, Float64}, Vector{ComplexF64}}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/ZygoteRules/OjfTt/src/adjoint.jl:59
  [4] Pullback
    @ ./REPL[6]:1 [inlined]
  [5] (::typeof((#1)))(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface2.jl:0
  [6] (::Zygote.var"#209#210"{Tuple{Tuple{Nothing}}, typeof((#1))})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/lib/lib.jl:203
  [7] (::Zygote.var"#1746#back#211"{Zygote.var"#209#210"{Tuple{Tuple{Nothing}}, typeof((#1))}})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/ZygoteRules/OjfTt/src/adjoint.jl:59
  [8] Pullback
    @ ./operators.jl:938 [inlined]
  [9] (::typeof((ComposedFunction{typeof(Zygote._jvec), var"#1#2"}(Zygote._jvec, var"#1#2"()))))(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface2.jl:0
 [10] (::Zygote.var"#46#47"{typeof((ComposedFunction{typeof(Zygote._jvec), var"#1#2"}(Zygote._jvec, var"#1#2"())))})(Δ::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface.jl:41
 [11] withjacobian(f::Function, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/lib/grad.jl:162
 [12] jacobian(f::Function, args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/TaBlo/src/lib/grad.jl:140
 [13] top-level scope
    @ REPL[6]:1

However, when calling rfft/irfft directly instead of using the pre-computed plans, the error vanishes:

jacobian(x -> irfft(rfft(x), length(x)), n)
([1.0 0.0 … 0.0 0.0; 0.0 1.0 … 0.0 6.187926798518962e-19; … ; 0.0 0.0 … 1.0 0.0; 0.0 5.204170427930421e-18 … 0.0 1.0],)

This is an issue since we're doing the FFT and its inverse many times, so the speed gain from using pre-planned FFTs should be quite considerable.

Abysmal memory allocation during the backward pass

When running burgers.jl with commit c62d8de, Julia allocates massive amounts of memory, to the point that the process gets killed on a 128 GB RAM machine.

This behavior is abnormal; most likely, an excessive number of copies takes place somewhere. It is reproducible both on CPU and GPU.
