gridap / GridapPETSc.jl
Provides PETSc solvers to the Gridap project
License: Other
I think it would be a good idea to rename
GridapPETSc.init!() -> GridapPETSc.Init()
GridapPETSc.finalize!() -> GridapPETSc.Finalize()
in order to be consistent with
MPI.Init()
MPI.Finalize()
Last time, I suggested using the versions with !. But now, looking at the example in the README file, I see clearly that we want to be consistent with MPI. Sorry for the inconvenience...
Can you rename the functions? Thanks!
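For reference, a minimal sketch of what a driver would look like after the proposed rename (the Init/Finalize names are the suggestion above, not an existing GridapPETSc API):

using MPI
using GridapPETSc

MPI.Init()
GridapPETSc.Init()      # proposed name for GridapPETSc.init!()

# ... build and solve problems with PETSc-backed solvers ...

GridapPETSc.Finalize()  # proposed name for GridapPETSc.finalize!()
MPI.Finalize()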
At present, the implementation of PETScNonlinearSolver is not fully optimized. These are the main problems.
Thanks for the great library!
Would it be possible to add support for complex numbers?
The current implementation does not allow using pc_fieldsplit preconditioners with non-strided block matrices, i.e. ConsecutiveMultiFieldStyle. I believe that to enable this feature we should have an implementation of the function PCFieldSplitSetIS and the associated index set struct IS.
Am I missing something? If not, how should we proceed?
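For illustration, here is a minimal sketch of the kind of low-level binding that would be needed. The IS wrapper, the "libpetsc" library name and the integer error type are assumptions made to keep the example self-contained, not GridapPETSc's actual wrapping conventions; only the C signature of PCFieldSplitSetIS is taken from PETSc.

const PetscErrorCode = Cint

struct IS                        # opaque PETSc index set handle
  ptr::Ptr{Cvoid}
end

# C signature: PetscErrorCode PCFieldSplitSetIS(PC pc, const char splitname[], IS is)
function PCFieldSplitSetIS(pc::Ptr{Cvoid}, splitname::AbstractString, is::IS)
  ccall((:PCFieldSplitSetIS, "libpetsc"), PetscErrorCode,
        (Ptr{Cvoid}, Cstring, Ptr{Cvoid}),
        pc, splitname, is.ptr)
end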
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
I think it is highly recommended to include tests in GitHub Actions with the 64-bit integer compilation of PETSc. AFAIK, the PETSc_jll artifacts only provide the 32-bit integer compilation.
Hi, I am running macOS Monterey, version 12.5.1, and I am facing an issue when I try to run code using GridapPETSc. Below is the error from the stack trace:
PETSc Error --- Application was linked against both OpenMPI and MPICH based MPI libraries
and will not run correctly
After I got this error, I uninstalled MPICH from my machine and currently only have Open MPI, but it still reports being linked against multiple MPI implementations. Any help in this regard will be very much appreciated...
Hi @amartinhuertas , @fverdugo . I'm getting an assertion error here:
GridapPETSc.jl/src/PETScNonlinearSolvers.jl
Line 161 in 119a9d0
PETScNonlinearSolver
in a transient problem. I get the error at the second time step, which makes me think that indeed op and cache.op are not the same. Should that be the case? If yes, what would be the best solution to this issue?
@fverdugo ... test/sequential/ElasticityDriver.jl fails with PETSc compiled in DEBUG mode on my machine.
The error is:
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Invalid argument
[0]PETSC ERROR: Vector 0 must have 2-norm of 1.0, it is 27.
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.15.4, Sep 01, 2021
[0]PETSC ERROR: GridapPETSc on a x86_64 named sistemas-ThinkPad-X1-Carbon-6th by amartin Sat Nov 13 16:14:08 2021
[0]PETSC ERROR: Configure options --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 -with-blaslapack-dir=/opt/intel/compilers_and_libraries_2020.0.166/linux/mkl --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --with-debugging --with-x=0 --with-shared-libraries=1 --with-mpi=1 --with-64-bit-indices
[0]PETSC ERROR: #1 MatNullSpaceCreate() at /home/amartin/software_installers/petsc-3.15.4-build-dbg-gnu9-mpi/src/mat/interface/matnull.c:265
ElasticityDriver: Error During Test at /home/amartin/git-repos/GridapPETSc.jl/test/sequential/runtests.jl:19
Got exception outside of a @test
LoadError: Petsc returned with error code: 62
Stacktrace:
[1] macro expansion
@ ~/git-repos/GridapPETSc.jl/src/Config.jl:88 [inlined]
[2] (::Main.GridapPETScSequentialTests.ElasticityDriver.var"#1#9")()
@ Main.GridapPETScSequentialTests.ElasticityDriver ~/git-repos/GridapPETSc.jl/test/sequential/ElasticityDriver.jl:79
[3] with(f::Main.GridapPETScSequentialTests.ElasticityDriver.var"#1#9"; kwargs::Base.Iterators.Pairs{Symbol, Vector{SubString{String}}, Tuple{Symbol}, NamedTuple{(:args,), Tuple{Vector{SubString{String}}}}})
@ GridapPETSc ~/git-repos/GridapPETSc.jl/src/Environment.jl:38
[4] top-level scope
@ ~/git-repos/GridapPETSc.jl/test/sequential/ElasticityDriver.jl:50
[5] include(mod::Module, _path::String)
I tried to normalize the rigid body motion vectors. The error above disappears, but we get a new one, namely:
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Invalid argument
[0]PETSC ERROR: Vector 0 must have 2-norm of 1.0, it is 27.
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.15.4, Sep 01, 2021
[0]PETSC ERROR: GridapPETSc on a x86_64 named sistemas-ThinkPad-X1-Carbon-6th by amartin Sat Nov 13 16:14:08 2021
[0]PETSC ERROR: Configure options --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 -with-blaslapack-dir=/opt/intel/compilers_and_libraries_2020.0.166/linux/mkl --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --with-debugging --with-x=0 --with-shared-libraries=1 --with-mpi=1 --with-64-bit-indices
[0]PETSC ERROR: #1 MatNullSpaceCreate() at /home/amartin/software_installers/petsc-3.15.4-build-dbg-gnu9-mpi/src/mat/interface/matnull.c:265
ElasticityDriver: Error During Test at /home/amartin/git-repos/GridapPETSc.jl/test/sequential/runtests.jl:19
Got exception outside of a @test
LoadError: Petsc returned with error code: 62
Stacktrace:
[1] macro expansion
@ ~/git-repos/GridapPETSc.jl/src/Config.jl:88 [inlined]
[2] (::Main.GridapPETScSequentialTests.ElasticityDriver.var"#1#9")()
@ Main.GridapPETScSequentialTests.ElasticityDriver ~/git-repos/GridapPETSc.jl/test/sequential/ElasticityDriver.jl:79
[3] with(f::Main.GridapPETScSequentialTests.ElasticityDriver.var"#1#9"; kwargs::Base.Iterators.Pairs{Symbol, Vector{SubString{String}}, Tuple{Symbol}, NamedTuple{(:args,), Tuple{Vector{SubString{String}}}}})
@ GridapPETSc ~/git-repos/GridapPETSc.jl/src/Environment.jl:38
[4] top-level scope
@ ~/git-repos/GridapPETSc.jl/test/sequential/ElasticityDriver.jl:50
[5] include(mod::Module, _path::String)
@ Base ./Base.jl:386
[6] include
@ ~/git-repos/GridapPETSc.jl/test/sequential/runtests.jl:1 [inlined]
Does it ring a bell? The rigid body motions seem to be well defined in the driver (3 rotations + 3 translations).
At present we are not able to reproduce this error in GitHub Actions, as we are using PETSc_jll, which ships PETSc compiled in release mode.
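For what it is worth, here is a minimal sketch (a hypothetical helper, not the driver's actual code) of how the six rigid body motion vectors could be orthonormalized before being passed to MatNullSpaceCreate, which in debug builds checks for unit 2-norm and mutual orthogonality:

using LinearAlgebra

function orthonormalize_rbms!(rbms::Vector{<:AbstractVector})
  for i in eachindex(rbms)
    for j in 1:i-1
      rbms[i] .-= dot(rbms[j], rbms[i]) .* rbms[j]  # Gram-Schmidt projection
    end
    rbms[i] ./= norm(rbms[i])                       # enforce unit 2-norm
  end
  rbms
end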
In GridapPETSc v0.3.0 the API no longer allows extracting the number of iterations at which the ksp converged.
Old API:
function PETSc_get_number_of_iterations(ps::PETScSolver)
  KSPGetIterationNumber!(ps.ksp)
end
The ksp is no longer an attribute of PETScSolver, but of PETScSolverNS. How can the user extract the number of iterations now?
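As a starting point for discussion, something along these lines might work. This is a hedged sketch only: it assumes the numerical setup object keeps its ksp handle in a Ref-like field and that a KSPGetIterationNumber wrapper (mirroring the PETSc C function of that name) is available in GridapPETSc.PETSC; neither is a documented API.

function get_number_of_iterations(ns)
  niters = Ref{GridapPETSc.PETSC.PetscInt}()
  # Underlying PETSc call: KSPGetIterationNumber(ksp, &its)
  GridapPETSc.PETSC.KSPGetIterationNumber(ns.ksp[], niters)
  Int(niters[])
end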
When installing GridapPETSc with external libraries, i.e. with non-empty JULIA_MPI_BINARY and JULIA_PETSC_LIBRARY, executing the whole set of tests fails even if separate execution of serial and parallel tests succeeds.
In other words,
julia --project=. --color=yes --check-bounds=yes test/sequential/runtests.jl
julia --project=. --color=yes --check-bounds=yes test/mpi/runtests.jl
are both successful but
julia --project=. --color=yes --check-bounds=yes test/runtests.jl
fails to execute the parallel tests with the message
...
PartitionedArrays: Error During Test at /home/jprincipe/Prog/GridapPETSc.jl/test/mpi/runtests.jl:3
Got exception outside of a @test
LoadError: failed process: Process(`mpiexec -n 3 --allow-run-as-root --oversubscribe /home/jprincipe/System/bin/julia-1.7.2/bin/julia -Cnative -J/home/jprincipe/System/bin/julia-1.7.2/lib/julia/sys.so --check-bounds=yes -g1 --color=yes --project=/home/jprincipe/Prog/GridapPETSc.jl/test/mpi/../.. /home/jprincipe/Prog/GridapPETSc.jl/test/mpi/PartitionedArraysTests.jl`, ProcessExited(1)) [1]
...
The execution of the whole set of tests is successful when the default installation is used, that is, with empty JULIA_MPI_BINARY and JULIA_PETSC_LIBRARY.
@amartinhuertas , @fverdugo any thoughts?
- PetscNonlinearSolver. Issue: #45 @amartinhuertas #53
- copy!. Issue: #50 @amartinhuertas #53
It would be nice to have all functionality working in native sequential mode (just as before), as well as in distributed sequential and distributed MPI mode.
During the recent upgrade to PartitionedArrays.jl v0.3, I've had to limit the compatibility of the Julia binaries to 0.13 and 0.15. With any binary versions above that, errors occur (see for instance).
I am not currently 100% sure of what is causing the issue, but it is probably related to changes to the PETSc library. Hopefully it only involves a change in the tests, but it could be deeper. We should have a look at what is causing the issue and restore compatibility.
I have the feeling that the definition of these two functions:
GridapPETSc.jl/src/PETScArrays.jl
Line 121 in 3ba0de4
GridapPETSc.jl/src/PETScArrays.jl
Line 136 in 3ba0de4
is inconsistent with the Julia definition of copy!
:
help?> copy!
search: copy! copyto! circcopy! unsafe_copyto! copy copysign deepcopy
copy!(dst, src) -> dst
In-place copy of src into dst, discarding any pre-existing elements in dst.
If dst and src are of the same type, dst == src should hold after the call.
If dst and src are multidimensional arrays, they must have equal axes. See
also copyto!.
In particular, note the sentence "If dst and src are multidimensional arrays, they must have equal axes." As far as I can tell, this means that length(dst) == length(src) should hold, and I think that this is not true in the functions linked above. In other words, both vectors need to be understood either as global vectors or as local ones.
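For reference, the Base contract in a nutshell (plain Julia, no PETSc involved):

dst = zeros(4)
src = [1.0, 2.0, 3.0, 4.0]
copy!(dst, src)
@assert dst == src   # after the call, dst matches src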
@amartinhuertas Do you remember why you needed to change these functions?
Dear PETSc users,
What is the main reason underlying PetscDestroy subroutines having global collective semantics? Is this actually true for all PETSc objects? Can this be relaxed/deactivated by, e.g., compilation macros/configuration options?
We are leveraging PETSc from Julia in a parallel distributed memory context (several MPI tasks running the Julia REPL each). Julia uses Garbage Collection (GC), and we would like to destroy the PETSc objects automatically when the GC decides so along the simulation. In this context, we cannot guarantee deterministic destruction on all MPI tasks as the GC decisions are local to each task, no global semantics guaranteed.
Thanks in advance!
Best regards,
Alberto.
Ahh, this makes perfect sense.
The code for PetscObjectRegisterDestroy() and the actual destruction (called in PetscFinalize()) is very simple and can be found in src/sys/objects/destroy.c: PetscObjectRegisterDestroy(), PetscObjectRegisterDestroyAll().
You could easily maintain a new array like PetscObjectRegisterGCDestroy_Objects[], add objects with PetscObjectRegisterGCDestroy(), and then destroy them with PetscObjectRegisterDestroyGCAll(). The only tricky part is that, in the context of your Julia MPI, you have to make sure that PetscObjectRegisterDestroyGCAll() is called collectively over all the MPI ranks that have registered objects to destroy (that is, it has to be called where all the ranks have made the same progress on MPI communication), generally PETSC_COMM_ALL. We would be happy to incorporate such a system into the PETSc source with a merge request.
Barry
I think that it is not just MPI_Comm_free that is potentially problematic.
Here are some additional areas off the top of my head:
- PetscSF with -sf_type window. Destroy (when the refcount drops to zero) calls MPI_Win_free (which is collective over comm)
- Deallocation of MUMPS objects is tremendously collective.
In general, the solution of just punting MPI_Comm_free to PetscFinalize (or some user-defined time) is, I think, insufficient, since it requires us to audit the collectiveness of all XXX_Destroy functions (including in third-party packages).
Barry's suggestion of resurrecting objects in finalisation using PetscObjectRegisterDestroy and then collectively clearing that array periodically is pretty close to the proposal that we cooked up I think.
Jack can correct any missteps I make in explanation, but perhaps this is helpful for Alberto:
- Each PETSc communicator gets two new attributes: "creation_index" [an int64] and "resurrected_objects" [a set-like thing].
- PetscHeaderCreate grabs the next creation_index out of the input communicator and stashes it on the object. Since object creation is collective, this is guaranteed to agree on any given communicator across processes.
- When the Python garbage collector tries to destroy PETSc objects, we resurrect the C object in finalisation and stash it in "resurrected_objects" on the communicator.
- Periodically (as a result of user intervention in the first instance), we do garbage collection collectively on these resurrected objects by performing a set intersection of the creation_indices across the communicator's processes, and then calling XXXDestroy in order on the sorted_by_creation_index set intersection.
I think that most of this infrastructure is agnostic of the managed language, so Jack was doing the implementation in PETSc (rather than petsc4py).
This wasn't a perfect solution (I recall that we could still cook up situations in which objects would not be collected), but it did seem to (in theory) solve any potential deadlock issues.
Lawrence
Hi Everyone,
I cannot fault Lawrence's explanation, that is precisely what I'm implementing. The only difference is I was adding most of the logic for the "resurrected objects map" to petsc4py rather than PETSc. Given that this solution is truly Python agnostic, I will move what I have written to C and merely add the interface to the functionality to petsc4py.
Indeed, this works out better for me as I was not enjoying writing all the code in Cython! I'll post an update once there is a working prototype in my PETSc fork, and the code is ready for testing.
Cheers,
Jack
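To make the pattern under discussion concrete on the Julia side, here is a minimal sketch with hypothetical names (not GridapPETSc's or PETSc's actual implementation; the cross-rank intersection of creation indices described above is omitted for brevity): finalizers only enqueue handles, and a separate routine, called collectively on all ranks, performs the actual destruction.

# Finalizers run at unpredictable, rank-local times, so they must never call a
# collective destroy routine directly; they only record the handle.
const _pending_destroy = Ptr{Cvoid}[]

register_pending_destroy!(handle::Ptr{Cvoid}) = push!(_pending_destroy, handle)

# Must be called collectively on every rank of the communicator, at points where
# all ranks have made the same progress (e.g. user-chosen synchronization points).
function collective_destroy!(destroy_fn::Function)
  for handle in _pending_destroy
    destroy_fn(handle)   # e.g. a wrapped VecDestroy/MatDestroy/SNESDestroy call
  end
  empty!(_pending_destroy)
end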
I think it would be convenient to store the MPI.Comm object inside the PETScVector and PETScMatrix. It would be required e.g. to check whether comm == MPI.COMM_SELF and select a suitable finalization strategy. Once this is done, we can also remove MPI.Comm from PETScLinearSolver.
cc PR #75
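A hedged sketch of the idea, with simplified, illustrative field layouts (not the actual struct definitions in GridapPETSc):

using MPI

mutable struct PETScVector
  vec::Ref{Ptr{Cvoid}}   # underlying PETSc Vec handle (placeholder type here)
  comm::MPI.Comm         # proposed new field
end

function finalization_strategy(v::PETScVector)
  if v.comm == MPI.COMM_SELF
    :destroy_immediately     # serial object: safe to destroy from the finalizer
  else
    :defer_to_collective_gc  # parallel object: destruction must be collective
  end
end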
Hello all,
@amartinhuertas and I have been working on a Geometric MultiGrid (GMG) solver/preconditioner for Gridap. To solve the coarsest-level linear system in parallel, we decided to use MUMPS through a PETScLinearSolver from GridapPETSc.
While profiling the resulting algorithm for a Poisson problem, where we use a preconditioned Conjugate Gradient solver with our GMG preconditioner, we find that a surprisingly high amount of time is spent in the method gridap_petsc_gc(), which is called from within solve!() at every iteration of the conjugate gradient.
Due to this, we think it could be advantageous to introduce a new flag/kwarg/setting that deactivates all these automatic calls to the GC that are scattered around the code. The idea is to give an advanced user/developer the ability to manually call the GC optimally depending on the application.
Thanks!
cc @fverdugo
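A minimal sketch of the kind of switch we have in mind (the names are hypothetical, not an existing GridapPETSc API):

# Global toggle consulted before each automatic collective GC call.
const AUTOMATIC_PETSC_GC = Ref(true)

# Enable/disable the automatic gridap_petsc_gc() calls made inside solve!().
set_automatic_petsc_gc!(flag::Bool) = (AUTOMATIC_PETSC_GC[] = flag)

# Call sites inside the library would then become:
function maybe_gridap_petsc_gc()
  AUTOMATIC_PETSC_GC[] && gridap_petsc_gc()
  nothing
end

An advanced user would call set_automatic_petsc_gc!(false) once and then invoke gridap_petsc_gc() manually at points where all ranks are known to be synchronized.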
- convert API to transform from PartitionedArrays to PETSc and vice versa. Done in PR #38
- GridapPETSc.with(...) and PArrays.prun(). See also #46 (comment)
Sorry for these very naive questions; I don't know the inner workings of Gridap, so they may be inappropriate.
For all transient problems in GridapODEs, the stage operator (linear or non-linear) is recreated at each time step. This causes issues in GridapPETSc when using a non-linear solver, as described in Issue #81.
GridapPETSc.jl/src/PETScNonlinearSolvers.jl
Line 161 in 35b678d
The hack of commenting out this line or setting cache.op = op does not work for all problems. @JordiManyer and I found the only solution was to set sysslvrcache = nothing prior to solving the NonlinearStageOperator within ode_march!.
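For reference, a hedged sketch of the workaround (illustrative only; the names and the solve! convention are paraphrased from how we understand the GridapODEs internals, not an exact patch):

# Inside the time-marching loop, just before solving the NonlinearStageOperator:
sysslvrcache = nothing                                       # discard the cached SNES state
sysslvrcache = solve!(usF, sysslvr, stageop, sysslvrcache)   # forces a fresh numerical setup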
This is not a long-term solution, as destroying sysslvrcache means the residual and jacobian are allocated every time step in
GridapPETSc.jl/src/PETScNonlinearSolvers.jl
Line 168 in 35b678d
An alternative is to use the non-linear solvers within GridapSolvers. Additionally, we tested using a PETSc linear solver within a GridapSolvers nonlinear solver and this recovered the correct behaviour. This isolates the problem to the PETSc nonlinear solvers, in particular:
GridapPETSc.jl/src/PETScNonlinearSolvers.jl
Line 156 in 35b678d
We believe this function should be changed to pass a new jacobian and residual function rather than cached objects for transient problems requiring a non-linear solver. @JordiManyer will try to create a reproducible example for testing.