trovemaster / trove

Theoretical ROVibrational Energies: A variational program for accurate nuclear motion calculations

License: GNU General Public License v3.0

Fortran 98.27% Makefile 0.11% C 1.45% Shell 0.05% Python 0.12%

trove's People

rse-cambridge ahmed-f-alrefaie saturn-hex9 ijk34 thomasmellor ageorgou warshon

trove's Issues

Host documentation on Read the Docs

Probably by converting the manual sources.

Euler symmetry not implemented with MPI

Currently, the only parts of the Hamiltonian contraction (and I assume some other steps) which can be run using MPI are those corresponding to k-based symmetry, the symmetry used in the H2CO benchmark, for example. The CH4 benchmark fails due to its use of the Euler symmetry, which is not currently supported in the MPI implementation.

Specifically, in the Hamiltonian contraction stage (around line 7740 of perturbation.f90), the function used when Euler symmetry is enabled, symm_mat_element_vector, is not parallelised with MPI, while the k-based symmetry version, symm_mat_element_vector_k, is. Comparing the two functions suggests that adding MPI support to symm_mat_element_vector would take a few hours' work. The only other function I can see in perturbation.f90 that is called only when rotsym euler is present is PTrot_contracted_matelem_class.

Dependency issues in updated MPI makefile

The makefile of the updated MPI version (currently in merge-develop-mpi) has a couple of small issues:

Some redundant dependencies, such as for moltype.o
mpi_aux.o is not listed as a direct dependency of tran.o and perturbation.o, which makes a serial build fail (although building in multiple jobs seems to work)

Simplify input file structure

Especially for users who are unfamiliar. Some ideas:

Clarify ascii vs non-ascii
Have more important options near the start

Record version of code and job parameters in output

If we're trying out different versions of the program and submitting several jobs, it may be useful to keep an automatic record of how each output was produced. This could mean, for example, recording the submission parameters (number of cores, etc.) or the hash of the source commit that produced the executable (encoded within the executable or written to the output files).
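One way to record the commit hash is to pass it to the preprocessor at build time and print it at start-up. The sketch below is only an illustration: the macro name TROVE_GIT_HASH and the program name are hypothetical, and the build is assumed to run the preprocessor (for example `-cpp` with gfortran or `-fpp` with ifort) and to define the macro from `git rev-parse --short HEAD`.

```fortran
! Sketch only: TROVE_GIT_HASH is a hypothetical macro that the build would
! define as a quoted string, e.g. from `git rev-parse --short HEAD`.
#ifndef TROVE_GIT_HASH
#define TROVE_GIT_HASH "unknown"
#endif

program provenance_banner
  implicit none
  character(len=512) :: cmd

  ! The full command line used to launch this executable.
  call get_command(cmd)

  write(*, '(2a)') 'Source commit: ', TROVE_GIT_HASH
  write(*, '(2a)') 'Command line : ', trim(cmd)
end program provenance_banner
```

If the macro is not defined, the banner falls back to "unknown", so builds made outside a git checkout still work.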

Implement linting in CI

It may be useful to contributors to have a github action checking the warning output from compilation.

OpenMP affinities have not been updated recently

With TROVE being run on many newer systems, it's likely that the OpenMP affinity pinning is out of date. This should be looked into to ensure we're getting best performance. A good resource on affinity is at https://pages.tacc.utexas.edu/~eijkhout/pcse/html/omp-affinity.html

Documentation updates

The documentation Section 2.2 talks about a 'training folder' that does not exist
Describes input file but doesn't mention makefile or how to run the executable (i.e. command line arguments)
Contents page could be hyperlinks to relevant section

Construction of supermatrix should use MPI

Even when its diagonalization is done through MPI (see #20), the supermatrix is constructed using OpenMP, which can(?) be problematic for large sizes.

Investigate order of combining classes when building contracted representation

It may be that there is a more efficient algorithm than the current approach.

Github actions should test code built using intel compiler

Currently, GitHub Actions only compiles the code with gfortran, not with Intel's ifort. The binaries produced by the two compilers can differ substantially, so the code would be best tested with both. Although it's not entirely straightforward, Intel provides sample CI configurations that make it fairly easy to use the Intel toolchain in GitHub Actions. This should be implemented.

Update build process

Simplify dependencies
Ideally have a single Makefile for parallel and non-parallel versions

Continuous Integration

Once we have some tests (#16), we can look into setting up a CI server to run some of them automatically on a push to GitHub.

Parallel diagonalizer has to be called as an external application

The parallel version of the diagonalizer, PDSYEV, is currently used when the supermatrix is too large for OpenMP to handle. However, it is built as a separate application, and calling it requires stopping the TROVE workflow.

We should integrate the diagonalizer into this codebase, or use it as a library, so that users can take advantage of it implicitly while calling TROVE as usual.

Deprecated code in some files

Some files (perturbation.f90, fields.f90, possibly others) contain code that is no longer relevant, e.g. variants that are no longer under consideration. We could use some static analysis tools (or get some hints from the compiler output), but for some of it we will likely require more particular insight.

Intel MKL's function `ddoti` is not in standard BLAS

The issue

As far as I can tell, ddoti is the only function in use (in main branch d2b00946dd308b666b901b5d6b17cc939574d734) that requires sparse BLAS. It is included in Intel MKL but not in the standard LAPACK/BLAS libraries installable via system package managers.

Commenting out this function allows TROVE to build without Intel MKL and with standard, system BLAS.

Why is this an issue?

See #14

Possible solutions

Refactor out the use of ddoti
Implement ddoti
Find and add a version of sparse BLAS (here is a list of potential libraries)

Make fails with -j flag

Make fails on main branch (d2b0094) when using the -j<n> flag, a flag which tells make to parallelise its build process over n processors:

pot_H2O_Conway.f90(6): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [MOLTYPE]
  use moltype

Without the flag, the code builds correctly. This implies the dependencies listed in the makefile are not accurate.

TROVE should check if compiled with MPI when MPIIO enabled

Currently, TROVE does not check whether it was actually compiled with MPI when MPIIO is enabled in an input file. This should be checked early, and TROVE should exit if MPIIO is enabled but the executable was built without MPI.
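A minimal sketch of such a check is given below. It assumes the TROVE_USE_MPI_ preprocessor macro is what distinguishes MPI builds; the subroutine name and the mpiio_requested flag are hypothetical stand-ins for whatever the input parser actually provides.

```fortran
! Sketch only: stops early when the input asks for MPI-IO but the
! executable was built without MPI (TROVE_USE_MPI_ not defined).
subroutine check_mpiio_support(mpiio_requested)
  implicit none
  logical, intent(in) :: mpiio_requested   ! hypothetical flag read from the input file
#ifndef TROVE_USE_MPI_
  if (mpiio_requested) then
    write(*, '(a)') 'Error: MPIIO is enabled in the input, but TROVE was built without MPI.'
    stop 1
  end if
#endif
end subroutine check_mpiio_support

program demo_check
  implicit none
  ! With MPIIO not requested, the check passes in any build.
  call check_mpiio_support(.false.)
  write(*, '(a)') 'Configuration accepted.'
end program demo_check
```

Calling the check straight after the input file is parsed would keep the failure close to its cause, rather than surfacing later as an I/O error.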

Different file formats for MPI and non-MPI versions

The parallel version of the code writes results in the MPI format, whereas the sequential version uses the native Fortran format (which may vary between compilers?). This makes it difficult to share intermediate results between different runs, requiring the use of TROVE-MPI2FTN.

Ideally, both versions will use the same file format. One way may be to have parallel processes write to the same file (in native format), enforcing the right order, as is done in PDSYEV. Another could be to use HDF5, which appears to support parallel I/O and would be a more standard format generally.

Need a standardised way to instantiate the correct `ioHandler`

Currently, we're using the pattern

#ifdef TROVE_USE_MPI_
          allocate(ioHandler, &
            source=ioHandlerMPI(&
            job%kineteigen_file, err, &
            action='write', position='rewind', status='replace', form='unformatted'))
#else
          allocate(ioHandler, &
            source=ioHandlerFTN(&
            job%kineteigen_file, err, &
            action='write', position='rewind', status='replace', form='unformatted'))
#endif

to instantiate ioHandler. This would be better wrapped in a single function that allocates the correct handler.
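A wrapper of this kind might look like the following self-contained sketch. The types here are empty stand-ins that only mimic the pattern: the base type ioHandlerBase, the subroutine allocateHandler, and the fname component are hypothetical, and the real handlers take more arguments (action, position, status, form) than shown. The file needs to be preprocessed for the #ifdef to take effect.

```fortran
! Sketch only: stand-in types illustrate the pattern; the real TROVE
! handler types and their constructor arguments may differ.
module io_factory_sketch
  implicit none
  private
  public :: ioHandlerBase, ioHandlerFTN, ioHandlerMPI, allocateHandler

  type, abstract :: ioHandlerBase                ! hypothetical common base type
    character(len=:), allocatable :: fname
  end type ioHandlerBase

  type, extends(ioHandlerBase) :: ioHandlerFTN   ! stand-in for the native Fortran handler
  end type ioHandlerFTN

  type, extends(ioHandlerBase) :: ioHandlerMPI   ! stand-in for the MPI-IO handler
  end type ioHandlerMPI

contains

  ! The only place where the preprocessor selects the handler type.
  subroutine allocateHandler(handler, fname, err)
    class(ioHandlerBase), allocatable, intent(out) :: handler
    character(len=*), intent(in) :: fname
    integer, intent(out) :: err

#ifdef TROVE_USE_MPI_
    allocate(ioHandlerMPI :: handler, stat=err)
#else
    allocate(ioHandlerFTN :: handler, stat=err)
#endif
    if (err == 0) handler%fname = fname
  end subroutine allocateHandler

end module io_factory_sketch

program demo_factory
  use io_factory_sketch
  implicit none
  class(ioHandlerBase), allocatable :: ioHandler
  integer :: err

  ! The file name is illustrative only.
  call allocateHandler(ioHandler, 'kineteigen.chk', err)

  select type (ioHandler)
  type is (ioHandlerMPI)
    write(*, '(2a)') 'MPI handler for ', ioHandler%fname
  type is (ioHandlerFTN)
    write(*, '(2a)') 'Fortran handler for ', ioHandler%fname
  end select
end program demo_factory
```

Call sites would then contain a single call with no preprocessor conditionals, and a new back end would only require a change inside allocateHandler.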

Automated testing

There are several things that could be done to test the behaviour of the code more rigorously after changes. Whatever the details, we will need a framework that lets us compare the output for a particular example against expected outputs. This could mean comparing the binary checkpoints, or the text files, or possibly extracting the relevant parts from the output log.

It probably makes sense to consider two sets of tests:

Small (low-accuracy) tests that can be run locally, comparing with expected outputs
Larger examples that require HPC facilities
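For the small tests, the comparison step could be as simple as checking computed values against stored references within a tolerance. The sketch below assumes that energies have already been extracted into arrays; the module name, function name, tolerance, and the numbers used are placeholders, not values taken from any TROVE benchmark.

```fortran
! Sketch only: element-wise comparison of computed values against a
! reference within an absolute tolerance.
module regression_compare_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! True when both arrays have the same length and agree within tol.
  pure logical function energies_match(calc, ref, tol) result(ok)
    real(dp), intent(in) :: calc(:), ref(:)
    real(dp), intent(in) :: tol
    ok = .false.
    if (size(calc) == size(ref)) ok = all(abs(calc - ref) <= tol)
  end function energies_match

end module regression_compare_sketch

program demo_compare
  use regression_compare_sketch
  implicit none
  real(dp) :: ref(3), calc(3)

  ref  = [100.0_dp, 200.0_dp, 300.0_dp]   ! placeholder reference values
  calc = ref + 1.0e-7_dp                  ! placeholder "computed" values

  if (energies_match(calc, ref, 1.0e-6_dp)) then
    write(*, '(a)') 'PASS'
  else
    write(*, '(a)') 'FAIL'
  end if
end program demo_compare
```

A tolerance-based check is more forgiving than a byte-wise comparison of binary checkpoints, which may differ between compilers even when the physics agrees.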

Standardise singular/plural folder names

TROVE
├── inputs
│   ├── CO2
│   │   └── g-tensors
│   └── H2O
├── manual
├── test
│   ├── benchmarks
│   │   └── H2CO
│   │       ├── input
│   │       └── outputs
│   ├── outputs
│   │   └── H2CO
│   └── scripts
│       └── H2CO

Would be much easier to remember relative paths if they are either all plural or all singular.

TROVE does not compile without Intel MKL

The issue

Intel MKL is currently a dependency of the latest (d2b00946dd308b666b901b5d6b17cc939574d734) version of TROVE due to the use of the ddoti function (see #13 ).

Why is this a problem?

@Trovemaster has expressed interest in ensuring TROVE is not dependent on Intel MKL, particularly in the event that TROVE is running on an AMD system (where Intel MKL appears to suffer performance issues).
