
wavefunction91 / gauxc

25 stars · 8 watchers · 17 forks · 8.94 MB

GauXC is a modern, modular C++ library for the evaluation of quantities related to the exchange-correlation (XC) energy (e.g. the XC potential) in the Gaussian basis set discretization of Kohn-Sham density functional theory (KS-DFT) on heterogeneous architectures.

License: Other

CMake 0.43% C++ 89.85% Cuda 5.28% Python 0.72% Shell 0.04% Makefile 0.05% C 3.63%
dft gpu integrator


gauxc's Issues

Exact versions of dependencies not specified

Because exact versions of the dependencies are not specified, changes in those repositories may cause problems in GauXC. For example: recent updates to gau2grid seem to have caused gau2grid.h to not be found in GauXC.

Implement superposition of atomic potentials guess

In J. Chem. Theory Comput. 15, 1593 (2019), I pointed out that the superposition of atomic potentials (SAP) initial guess can be easily formed on a Becke grid.

Later on, I tabulated more accurate potentials with fully numerical calculations. These potentials are available as a BSD licensed C++ implementation in
https://github.com/psi4/psi4/blob/master/psi4/src/psi4/libfock/sap.cc

The guess is evaluated similarly to an LDA Fock matrix; one just needs to form

$$ V_{\mu\nu}^{\text{SAP}}=\int\chi_{\mu}({\bf r})\,V^{\text{SAP}}({\bf r})\,\chi_{\nu}({\bf r})\,{\rm d}^{3}r $$

where the pointwise potential is a sum over screened atomic potentials

$$ V^{\text{SAP}}({\bf r})=-\sum_{A}\frac{Z_{A}^{\text{eff}}(r_{A})}{r_{A}} $$

in which $Z_{A}^{\text{eff}}(r_{A})$ is the effective charge of atom $A$ seen at distance $r_A$; the tabulated potentials have a limited range of 40 bohr. The guess is then obtained by forming the Fock matrix

$$ {\bf F} = {\bf T} + {\bf V}^\text{SAP} $$

and diagonalizing it to find orbitals and eigenvalues.
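
To make the quadrature concrete, a minimal sketch of the $V^{\text{SAP}}$ matrix build follows; the `z_eff` tabulations and the dense `chi` collocation array are assumptions of the sketch, not GauXC API.

#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y, z; };

// V^SAP(r) = -sum_A Z_A^eff(r_A) / r_A with screened effective charges;
// the tabulated potentials vanish beyond 40 bohr.
double sap_potential(const Point& r, const std::vector<Point>& atoms,
                     const std::vector<double(*)(double)>& z_eff) {
  double v = 0.0;
  for (std::size_t A = 0; A < atoms.size(); ++A) {
    const double dx = r.x - atoms[A].x, dy = r.y - atoms[A].y, dz = r.z - atoms[A].z;
    const double rA = std::sqrt(dx*dx + dy*dy + dz*dz);
    if (rA > 0.0 && rA < 40.0) v -= z_eff[A](rA) / rA;
  }
  return v;
}

// Accumulate V_{mu nu} += w_g chi_mu(r_g) V^SAP(r_g) chi_nu(r_g) over the grid.
void build_sap_matrix(const std::vector<Point>& grid, const std::vector<double>& w,
                      const std::vector<std::vector<double>>& chi, // chi[g][mu]
                      const std::vector<Point>& atoms,
                      const std::vector<double(*)(double)>& z_eff,
                      std::size_t nbf, std::vector<double>& V /* nbf*nbf */) {
  for (std::size_t g = 0; g < grid.size(); ++g) {
    const double vw = w[g] * sap_potential(grid[g], atoms, z_eff);
    for (std::size_t mu = 0; mu < nbf; ++mu)
      for (std::size_t nu = 0; nu < nbf; ++nu)
        V[mu*nbf + nu] += vw * chi[g][mu] * chi[g][nu];
  }
}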

It would also be possible to implement a superposition of atomic densities (SAD) guess in GauXC by computing the pointwise sum of densities, approximating the exchange potential with the LDA, and computing the Coulomb part with analogous machinery to SAP.

Density Inputs / Potential Outputs Should be Consistent and Flexible for RKS and UKS

As of #78, UKS adopts a canonical Pauli definition of the density input / VXC output. RKS currently takes $D_\alpha$, which is not consistent with the Pauli definition. Looking forward to GKS, the Pauli formulation is superior because all input densities are Hermitian (which is not the case in the spin-separated convention), while one can go either way for RKS/UKS. We should also support the latter to allow for simple integration with spin-separated codes (which are the norm).

  1. RKS needs to (optionally) accept $D_s$ as input. This could be resolved by a strong-type template parameter toggling the expected input (see the sketch after this list). Luckily, the potential remains the same for either the spin-separated or the Pauli definition of the density / potential.
  2. UKS needs to (optionally) accept $D_\alpha$ / $D_\beta$ as input and return $V_\alpha$ / $V_\beta$ as output. This does change the expected output by $\pm$ factors when the option is enabled.
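
A minimal sketch of what the strong-type toggle from item 1 could look like; all names here are hypothetical, not the GauXC API.

#include <type_traits>
#include <utility>

// Strong-type tags selecting the expected RKS density input convention.
struct PauliDensity  {}; // caller passes D_s
struct SpinSeparated {}; // caller passes D_alpha

template <typename Matrix, typename Convention = SpinSeparated>
class RKSIntegrator {
public:
  std::pair<double, Matrix> eval_exc_vxc(const Matrix& D) {
    static_assert(std::is_same_v<Convention, PauliDensity> ||
                  std::is_same_v<Convention, SpinSeparated>);
    // Under the 1/2-factor Pauli convention of #78, D_s = (D_alpha + D_beta)/2
    // equals D_alpha for a closed shell, so both tags accept the same numbers
    // here and only the documented interpretation changes; the tag becomes
    // load-bearing once the scaling conventions diverge.
    // ... dispatch to the common RKS driver ...
    return {0.0, D};
  }
};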

Coulomb potential evaluator

Many atomic-orbital codes such as ADF and FHI-aims solve Poisson's equation numerically with the scheme of Becke and Dickson from J. Chem. Phys. 89, 2993–2997 (1988); see J. Chem. Theory Comput. 10, 1994 (2014) and Comput. Phys. Commun. 190, 33 (2015), respectively.

In this scheme, one uses an atomic partitioning (like in the polyatomic quadrature) to decompose the density into atomic fragments, and then computes the Coulomb potential using the Laplace decomposition:

  1. for each atom, define a radial spline basis and a maximum angular momentum
  2. for each radial shell, compute the projections onto $Y_{lm}$
  3. compute the potential $V_{lm}(r)$

The total Coulomb potential in each molecular quadrature point is obtained by summing over the atomic potentials. The Coulomb matrix is then obtained by computing the LDA-type expression.

Because this method is independent of the atomic basis, the most logical place to implement it is within GauXC. Although it is typically used with Slater-type orbitals or numerical atomic orbitals, it might be useful also for Gaussian-type orbitals.
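
For illustration, the per-point assembly could look like the sketch below; the radial splines $V_{lm}(r)$ and the real spherical harmonics are passed as callables since their concrete forms are implementation choices, not anything GauXC currently provides.

#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

struct Point { double x, y, z; };

// Total Coulomb potential at one molecular quadrature point:
//   V(r) = sum_A sum_{lm} V^A_{lm}(r_A) Y_{lm}(theta_A, phi_A)
double coulomb_potential(
    const Point& r, const std::vector<Point>& atoms, int lmax,
    const std::function<double(std::size_t, int, int, double)>& V_lm,  // spline V^A_{lm}(r)
    const std::function<double(int, int, double, double)>& real_Ylm) { // Y_{lm}(theta, phi)
  double v = 0.0;
  for (std::size_t A = 0; A < atoms.size(); ++A) {
    const double dx = r.x - atoms[A].x, dy = r.y - atoms[A].y, dz = r.z - atoms[A].z;
    const double rA = std::sqrt(dx*dx + dy*dy + dz*dz);
    const double th = std::acos(dz / rA);  // polar angle about atom A
    const double ph = std::atan2(dy, dx);  // azimuthal angle about atom A
    for (int l = 0; l <= lmax; ++l)
      for (int m = -l; m <= l; ++m)
        v += V_lm(A, l, m, rA) * real_Ylm(l, m, th, ph);
  }
  return v;
}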

Add downstream integration tests

As was made apparent in #103, sometimes we need to make large changes, but checking that those changes don't break things downstream is a bit challenging. We should add tests to make sure that we can (at least) compile downstream projects against GauXC PRs, to alert us to potential problems without loudly breaking things every time we do something sweeping.

The first thing would be to define a set of downstream projects and to document them. Right now, the primary downstream integrations we need to be concerned about are

Happy to have suggestions for how we might want to go about this.

Value for `H` radius does not agree with other codes

We take the value of the H radius (0.25 Å) from Slater, J. C., J. Chem. Phys. 41, 3199 (1964). PySCF and NWChem use 0.35 Å, and ChronusQ uses the Bohr radius (0.529 Å; note the factor of 2 in the source). I have no strong opinion on what is "correct" here, but we need to expose the ability to make this flexible to ensure agreement between codes.

N.B. This only affects MHL for the time being

Thanks @ajaypanyala, @dmejiar, @elambros, and @aodongliu for pointing this out

[CUDA] Need a robust check for MPI <-> device mapping in CUDA

The check introduced in 7ba2f43 and reverted in 94a1f86 is not robust; it fails for the following resource configuration on Summit: https://jsrunvisualizer.olcf.ornl.gov/?s1f0o11n6c7g1r11d1b27l0=

Running MPI ranks in shared memory is not mutually exclusive with each MPI rank acknowledging a single (unique) device.

The current workaround is to ensure proper device mapping prior to the integrator call, based on a known MPI <-> device affinity (i.e. replicate the mapping according to a known affinity).
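
For reference, one generic way to replicate such an affinity outside GauXC is a shared-memory sub-communicator; this is a standard MPI/CUDA pattern, not the reverted check itself, and it assumes a simple round-robin rank-to-device layout.

#include <mpi.h>
#include <cuda_runtime.h>

// Map each rank's node-local index onto the visible devices (call after MPI_Init).
void map_rank_to_device(MPI_Comm comm) {
  MPI_Comm node_comm;
  MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, &node_comm);

  int local_rank = 0;
  MPI_Comm_rank(node_comm, &local_rank);

  int ndev = 0;
  cudaGetDeviceCount(&ndev);
  cudaSetDevice(local_rank % ndev); // round-robin ranks over devices

  MPI_Comm_free(&node_comm);
}

On layouts like the Summit resource sets above, the job launcher's rank/GPU placement still has to be consistent with whatever affinity the caller replicates.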

[CMake] Export build tree

@wavefunction91 Thank you for addressing this! After changing to this commit, I'm getting a CMake issue with calling export on one of my targets that depends on gauxc. The error message says gauxc is not in any export set. After adding the below to src/CMakeLists.txt I was able to compile:

export(EXPORT gauxc-targets
      NAMESPACE GauXC::
      FILE "${PROJECT_BINARY_DIR}/GauXCTargets.cmake")

Originally posted by @samslattery in #31 (comment)

SCF Convergence Issues in Psi4-GauXC Interface with High AM

Hello! In working on my GauXC interface to Psi4, I have been having SCF convergence issues with system/basis set combinations where f shells are introduced into the basis set. My particular study case for this problem is benzene/6-311G(2df, p), for which I have left the Psi4 input here:

molecule mol {
  0 1
  H   1.0131690738465817   -1.5835021154162925   -1.7978401310386911
  H   -1.0131690738465817   1.5835021154162925   -5.012160365319985
  H   -1.8164621323959855   -0.7357020536229176   -4.931121359413318
  H   1.8164621323959855   0.7357020536229176   -1.8788791369453572
  H   -0.7729940563410085   -2.3568841717855844   -3.3212372420741265
  H   0.7729940563410085   2.3568841717855844   -3.4887632542845504
  C   0.5690300414747392   -0.9024360657756142   -2.502675182411813
  C   -0.5690300414747392   0.9024360657756142   -4.307325313946864
  C   -1.0176030741697608   -0.4210740306907108   -4.263060310720532
  C   1.0176030741697608   0.4210740306907108   -2.546940185638147
  C   -0.44857303269502175   -1.3122060956424129   -3.358011244754464
  C   0.44857303269502175   1.3122060956424129   -3.4519892516042137
  symmetry c1
  no_reorient
  no_com
}

set {
    scf_type dfdirj+snlink
    snlink_radial_points 35
    snlink_spherical_points 110
    guess sad
    df_scf_guess false
    basis 6-311g(2df, p)
    screening schwarz
    incfock false
}

energy = energy('scf')

Here are some extra details from my studies:

  • Inputting a modified version of the 6-311G(2df, p) basis set with the f functions manually removed allows the calculation to converge to the correct answer.
  • Normally, the 6-311G(2df, p) basis set is spherical, and using it with benzene leads to oscillatory convergence as the calculation approaches the convergence point, which is why it fails. Density matrix damping with a value of 25% does not resolve the issue.
  • Manually inputting the basis set and relabeling it as Cartesian leads to more severe SCF convergence issues, eventually causing Psi4's DIIS to throw an exception.

`nprim_pair_max` is too small

The parameter `nprim_pair_max` is too small for many combinations of atoms and basis sets. For example, C/cc-pVQZ, Ar/cc-pVDZ, and Fe/cc-pVDZ will lead to the following error:

terminate called after throwing an instance of 'GauXC::generic_gauxc_exception'
  what():  Generic GauXC Exception (Too Many Primitive Pairs)
  File     /Users/meji656/Sources/exachem-dev/build/TAMM_External-prefix/src/TAMM_External-build/GauXC_External-prefix/src/GauXC_External/include/gauxc/shell_pair.hpp
  Function void GauXC::ShellPair<F>::generate(const_shell_ref, const_shell_ref) [with F = double; const_shell_ref = const GauXC::Shell<double>&]
  Line     70

Evaluation of density variables based on orbital coefficients

As far as I understand, GauXC currently operates by evaluating the electron density on the grid from the density matrix

$$ n({\bf r}) = \sum_{\mu \nu} P_{\mu \nu} \chi_\mu({\bf r}) \chi_\nu ({\bf r}) $$

This approach achieves high FLOP rates, since it can be formulated with efficient intermediates: first compute the matrix multiplication $p_\mu({\bf r}) = \sum_\nu P_{\mu \nu} \chi_\nu({\bf r})$ and then get the electron density from $n({\bf r}) = \sum_\mu p_\mu ({\bf r}) \chi_\mu({\bf r})$, for $N_\text{AO}^2 N_\text{grid}$ operations in total.

However, when you have a large basis set and few occupied orbitals (the extreme case is the Perdew-Zunger self-interaction correction, where you need to evaluate Fock matrices for individual occupied orbitals), you can compute the density faster from

$$ n({\bf r}) = \sum_i f_i |\psi_i ({\bf r})|^2 = \sum_i f_i \left|\sum_\mu C_{\mu i} \chi_\mu({\bf r})\right|^2. $$

Evaluating the orbitals takes $N_\text{MO} N_\text{AO} N_\text{grid}$ effort, which is the rate-determining step; the cost is thereby reduced by a factor of $N_\text{MO}/N_\text{AO}$. The speedup is realized in dense basis sets, where sparsity is not significant.
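
Schematically, the two evaluation paths per grid point look as follows (dense, unscreened, row-major arrays; purely illustrative, not the GauXC kernels):

#include <cstddef>

// Density-matrix path: p_mu = sum_nu P_{mu nu} chi_nu, then n = sum_mu p_mu chi_mu.
// O(N_AO^2) work per grid point.
double density_from_P(const double* P, const double* chi, std::size_t nao) {
  double n = 0.0;
  for (std::size_t mu = 0; mu < nao; ++mu) {
    double p_mu = 0.0;
    for (std::size_t nu = 0; nu < nao; ++nu) p_mu += P[mu*nao + nu] * chi[nu];
    n += p_mu * chi[mu];
  }
  return n;
}

// Orbital path: n = sum_i f_i |sum_mu C_{mu i} chi_mu|^2.
// O(N_MO N_AO) work per grid point, a win when N_MO << N_AO.
double density_from_C(const double* C, const double* f, const double* chi,
                      std::size_t nao, std::size_t nmo) {
  double n = 0.0;
  for (std::size_t i = 0; i < nmo; ++i) {
    double psi = 0.0;
    for (std::size_t mu = 0; mu < nao; ++mu) psi += C[mu*nmo + i] * chi[mu];
    n += f[i] * psi * psi;
  }
  return n;
}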

Improve CMake Unit Tests

Currently, we only test the ability of GauXC to be built as a subproject and as an installed dependency with default options. With the addition of exported properties in #101, it would make sense to add additional tests to check valid combinations to ensure that these variables get populated properly.

Add logging capabilities

GauXC is a complicated project with many interrelated components. It would drastically improve debugging to include a logger to facilitate program introspection. This would also enable GauXC to notify users of program state / warnings in a manner that is runtime configurable (i.e. not always blasting stdout).

spdlog comes to mind, as it's been used elsewhere throughout the NWChemEx stack. Logging should also be made optional so as to avoid requiring additional dependencies.
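
A minimal sketch of what runtime-configurable verbosity with spdlog could look like; the messages here are hypothetical, not existing GauXC instrumentation.

#include <spdlog/spdlog.h>

int main() {
  // Default to warnings and errors only; users can opt into more verbosity.
  spdlog::set_level(spdlog::level::warn);

  spdlog::info("LoadBalancer: partitioned {} tasks", 1024);      // suppressed at warn level
  spdlog::warn("ShellPair: primitive pair count near capacity"); // emitted
  return 0;
}

Making the dependency optional could then amount to compiling such calls out behind a macro (e.g. a hypothetical GAUXC_HAS_LOGGING) so that builds without spdlog pay nothing.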

UKS inputs require non-canonical scaling factors

Currently, the UKS API has the following calling signature

[EXC, VXC_S, VXC_Z] = INTEGRATOR(D_S, D_Z);

where $D_s = \frac{1}{2}(D_\alpha + D_\beta)$ and $D_z = \frac{1}{2}(D_\alpha - D_\beta)$, etc. Although these are the formally correct factors in the Pauli expansion

$$ D^{UKS} = D_s \otimes I_2 + D_z \otimes \sigma_z $$

it is more canonical to handle the additional factors of $\frac{1}{2}$ externally, such that $D_s$ and $D_z$ carry the physical interpretations of the total and spin densities, respectively. We should adopt this convention to avoid unnecessary confusion (HT @ajaypanyala)
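
Under the proposed convention, the wrapping a spin-separated code would perform reduces to plain sums and differences with no extra factors; a sketch, with all names illustrative:

#include <cstddef>
#include <vector>

// Form physical total- and spin-densities: D_s = D_a + D_b, D_z = D_a - D_b.
void spin_to_pauli(const std::vector<double>& Da, const std::vector<double>& Db,
                   std::vector<double>& Ds, std::vector<double>& Dz) {
  for (std::size_t k = 0; k < Da.size(); ++k) {
    Ds[k] = Da[k] + Db[k];
    Dz[k] = Da[k] - Db[k];
  }
}

// Recover spin-separated potentials from the returned V_s, V_z under the
// same convention: V_a = V_s + V_z, V_b = V_s - V_z.
void pauli_to_spin(const std::vector<double>& Vs, const std::vector<double>& Vz,
                   std::vector<double>& Va, std::vector<double>& Vb) {
  for (std::size_t k = 0; k < Vs.size(); ++k) {
    Va[k] = Vs[k] + Vz[k];
    Vb[k] = Vs[k] - Vz[k];
  }
}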

Adaptive grids from DeMon

J. Chem. Phys. 121, 681–690 (2004) describes an adaptive grid algorithm based on converging the diagonal Fock matrix elements to a specified threshold. The number of radial points is chosen by a semiempirical scheme, while the number of angular points is determined by trial and error.

An older algorithm in J. Chem. Phys. 108, 3226–3234 (1998) does the same for overlap matrices. This is useful in some use cases, but doesn't afford sufficient accuracy for Fock matrices.

CMake Variable Naming Concerns

In light of the discussion in #101, it seems that there may be some concerns with the naming scheme used for certain GauXC variables within the CMake build system. This is relevant since there is intent to export some of these variables as properties within the context of a CMake build. The current scheme of `GAUXC_ENABLE_X`, as pointed out by @loriab, implies that the variable is editable by the end user.

Implement VV10 non-local correlation

The Vydrov-van Voorhis non-local correlation model is used in many density functionals. Since it is evaluated from the density on a quadrature grid and does not depend on the basis set, the logical place to put it would be in GauXC.

$$ E_\text{nlc} = \frac 1 2 \int n({\bf r}) K({\bf r}, {\bf r}') n({\bf r}') {\rm d}^3r {\rm d}^3r' $$

Other types of non-local correlation models might also be implemented in GauXC. On the solid-state side, libvdwxc provides implementations of many non-local correlation functionals, but those implementations assume a uniform grid like the ones used in plane-wave and pseudo-atomic-orbital approaches.
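
For orientation, the energy expression maps onto a brute-force double sum over the quadrature; the kernel is left as a callable since VV10's $K$ depends on the density and its gradient at both points. This is a sketch only; a production implementation would screen and batch the pair loop.

#include <cstddef>
#include <functional>
#include <vector>

// E_nlc = 1/2 sum_g sum_g' w_g w_g' n_g K(g, g') n_g'
double nlc_energy(const std::vector<double>& w, const std::vector<double>& n,
                  const std::function<double(std::size_t, std::size_t)>& K) {
  double E = 0.0;
  for (std::size_t g = 0; g < n.size(); ++g)
    for (std::size_t gp = 0; gp < n.size(); ++gp)
      E += 0.5 * w[g] * w[gp] * n[g] * K(g, gp) * n[gp];
  return E;
}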

flexible ordering

Background: for solid harmonics, Psi4 uses Gaussian ordering rather than the standard CCA ordering used by most other open-source QC packages.

Again, this is a medium-term issue -- nothing that's blocking development.

Performance regression in CPU threads

Currently, race conditions are avoided in the CPU code via `omp critical`; this is too heavy-handed and severely limits strong scaling. The CPU integrators for EXC/VXC/sn-K need to be made lock-free to address this issue.
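
One possible lock-free strategy is per-thread accumulation buffers reduced after the task loop, sketched below; this is illustrative and not necessarily the approach GauXC will adopt.

#include <cstddef>
#include <vector>
#include <omp.h>

void accumulate_vxc(std::size_t ntask, std::size_t nbf,
                    std::vector<double>& VXC /* nbf*nbf, zeroed */) {
  const int nt = omp_get_max_threads();
  std::vector<std::vector<double>> local(nt, std::vector<double>(nbf*nbf, 0.0));

  #pragma omp parallel for schedule(dynamic)
  for (std::size_t it = 0; it < ntask; ++it) {
    std::vector<double>& buf = local[omp_get_thread_num()];
    // ... integrate task `it` and accumulate its contribution into buf,
    //     with no locks since buf is private to this thread ...
    (void)buf;
  }

  // Serial reduction of the per-thread buffers into the shared matrix.
  for (int t = 0; t < nt; ++t)
    for (std::size_t k = 0; k < VXC.size(); ++k)
      VXC[k] += local[t][k];
}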

Unify Internal Drivers for RKS and UKS

The UKS implementation introduced in #59 replicates most of the code used for the RKS integrator. While the user APIs need to be separate (as implemented), the internal implementations can be unified by checking for, e.g., nullptr / zero dimensions in the passed data pointers.

Reduce Static `ShellPair` Memory Consumption

As was pointed out in #56, the default `nprim_pair_max = 64` is too small for many applications. #57 added a temporary fix, but it should not be used in production GPU applications. ShellPair memory consumption must be made dynamic, and the sn-K kernels need to be optimized for small primitive-pair counts.

Remove automatic ShellPair computation in LoadBalancer

Currently, the LoadBalancer automatically computes the ShellPairCollection. Only sn-K needs the ShellPairs computed, and it is a rather memory-intensive operation. The ShellPairCollection should be removed from the LoadBalancer and refactored into something sn-K specific.

GauXC::XCIntegrator<MatrixType>::eval_exc_vxc gives different results with same input

When `eval_exc_vxc` is called multiple times on an object of type `GauXC::XCIntegrator` (with the same input density), different energy values are produced. For example, simply adding the below to the file tests/xc_integrator.cxx will demonstrate the issue.

  auto [ EXC2, VXC2 ] = integrator.eval_exc_vxc( P );
  std::cout << "EXC2 = " << EXC2 << std::endl;
  auto [ EXC3, VXC3 ] = integrator.eval_exc_vxc( P );
  std::cout << "EXC3 = " << EXC3 << std::endl;

The output I got:

EXC2 = -24.4746
EXC3 = -23.6536

C language must be enabled to discover GauXC

The C language must be enabled in order to install GauXC; otherwise the following error occurs:

root@a400f371d43c:/SCF# cmake -GNinja -H. -Bbuild -DCMAKE_TOOLCHAIN_FILE="${toolchain_file}" -DCMAKE_INSTALL_PREFIX=${INSTALL_PATH}
-- Attempting to find installed simde
-- Found MPI: TRUE (found version "3.1") found components: CXX
-- simde installation found
-- Attempting to find installed gauxc
-- Found OpenMP: TRUE (found version "4.5")
-- Performing Test BLAS_LOWER_UNDERSCORE
CMake Error at /install/lib/cmake/gauxc/linalg-cmake-modules/util/CommonFunctions.cmake:156 (try_compile):
Unknown extension ".c" for file

/install/lib/cmake/gauxc/linalg-cmake-modules/util/func_check.c

try_compile() works only for enabled languages. Currently these are:

CXX NONE

See project() command to enable other languages.
Call Stack (most recent call first):
/install/lib/cmake/gauxc/linalg-cmake-modules/util/CommonFunctions.cmake:241 (check_function_exists_w_results)
/install/lib/cmake/gauxc/linalg-cmake-modules/FindBLAS.cmake:85 (check_fortran_functions_exist)
/usr/bin/cmake/share/cmake-3.21/Modules/CMakeFindDependencyMacro.cmake:47 (find_package)
/install/lib/cmake/gauxc/gauxc-config.cmake:27 (find_dependency)
build/_deps/cmaize-src/cmake/cmaize/package_managers/cmake/dependency/dependency_class.cmake:113 (find_package)
build/cmakepp/fxn_calls/_cpp_dependency_find_dependency_dependency_desc__cpp_0cncj_1694708641.cmake:1 (cpp_dependency_find_dependency_dependency_desc)
build/_deps/cmakepp_lang-src/cmake/cmakepp_lang/utilities/call_fxn.cmake:77 (include)
build/_deps/cmakepp_lang-src/cmake/cmakepp_lang/object/call.cmake:125 (cpp_call_fxn)
build/_deps/cmakepp_lang-src/cmake/cmakepp_lang/object/object.cmake:56 (_cpp_object_call)
build/cmakepp/classes/Dependency.cmake:40 (_cpp_object)
build/_deps/cmaize-src/cmake/cmaize/package_managers/cmake/cmake_package_manager.cmake:182 (Dependency)
build/cmakepp/fxn_calls/_cpp_cmakepackagemanager_find_installed_cmakepackagemanager_desc_packagespecification_args__cpp_zluai_1694708641.cmake:1 (cpp_cmakepackagemanager_find_installed_cmakepackagemanager_desc_packagespecification_args)
build/_deps/cmakepp_lang-src/cmake/cmakepp_lang/utilities/call_fxn.cmake:77 (include)
build/_deps/cmakepp_lang-src/cmake/cmakepp_lang/object/call.cmake:125 (cpp_call_fxn)
build/_deps/cmakepp_lang-src/cmake/cmakepp_lang/object/object.cmake:56 (_cpp_object_call)
build/cmakepp/classes/CMakePackageManager.cmake:40 (_cpp_object)
build/_deps/cmaize-src/cmake/cmaize/user_api/find_or_build_dependency.cmake:128 (CMakePackageManager)
build/_deps/cmaize-src/cmake/cmaize/user_api/find_or_build_dependency.cmake:77 (cmaize_find_or_build_dependency_cmake)
CMakeLists.txt:58 (cmaize_find_or_build_dependency)

CMake Error at /install/lib/cmake/gauxc/linalg-cmake-modules/util/CommonFunctions.cmake:156 (try_compile):
Unknown extension ".c" for file

/install/lib/cmake/gauxc/linalg-cmake-modules/util/func_check.c

try_compile() works only for enabled languages. Currently these are:

CXX NONE

See project() command to enable other languages.
......

-- Attempting to fetch and build gauxc
-- GauXC Enabling OpenMP
-- Found OpenMP: TRUE (found version "4.5")
-- GauXC Enabling MPI
-- Found MPI: TRUE (found version "3.1")
-- Performing Test BLAS_LOWER_UNDERSCORE
-- Performing Test BLAS_LOWER_UNDERSCORE -- not found
-- Performing Test BLAS_LOWER_NO_UNDERSCORE
-- Performing Test BLAS_LOWER_NO_UNDERSCORE -- not found
-- Performing Test BLAS_UPPER_UNDERSCORE
-- Performing Test BLAS_UPPER_UNDERSCORE -- not found
-- Performing Test BLAS_UPPER_NO_UNDERSCORE
-- Performing Test BLAS_UPPER_NO_UNDERSCORE -- not found
CMake Error at /usr/bin/cmake/share/cmake-3.21/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find BLAS (missing: BLAS_LINK_OK)
Call Stack (most recent call first):
/usr/bin/cmake/share/cmake-3.21/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
build/_deps/linalg-cmake-modules-src/FindBLAS.cmake:119 (find_package_handle_standard_args)
build/_deps/gauxc-src/src/xc_integrator/local_work_driver/host/CMakeLists.txt:8 (find_package)

-- Configuring incomplete, errors occurred!

Evaluation of atomic overlap matrices

This might already be implemented, however... many applications require the evaluation of atomic overlap matrices

$$ S_{\mu \nu}^{A} = \int w_A({\bf r}) \chi_\mu ({\bf r}) \chi_\nu({\bf r}) {\rm d}^3 r $$

where $A$ is the atom and $\mu$ and $\nu$ are basis function indices.

An example use case is our generalized Pipek-Mezey orbital localization method, which replaces the original ill-defined Mulliken (or Löwdin) charges with a variety of mathematically well-defined partial charge estimates. It turns out that the localized orbitals are remarkably insensitive to the partial charge method, which can thereby be chosen by computational convenience, such as the Becke charges defined by the above overlap matrices

$$ Q_{ij}^{A} = \sum_{\mu\nu} C_{\mu i} S_{\mu \nu}^{A} C_{\nu j} $$

We have also extended this method to forming generalized Pipek-Mezey Wannier functions.
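
Given Becke partition weights $p_A({\bf r}_g)$ and the collocation matrix on the quadrature grid, the atomic overlaps reduce to a weighted sum; a sketch, with all arrays assumed precomputed:

#include <cstddef>
#include <vector>

// S^A_{mu nu} = sum_g w_g p_A(r_g) chi_mu(r_g) chi_nu(r_g)
void atomic_overlap(const std::vector<double>& w, const std::vector<double>& pA,
                    const std::vector<std::vector<double>>& chi, // chi[g][mu]
                    std::size_t nbf, std::vector<double>& S /* nbf*nbf */) {
  for (std::size_t g = 0; g < w.size(); ++g) {
    const double wp = w[g] * pA[g];
    for (std::size_t mu = 0; mu < nbf; ++mu)
      for (std::size_t nu = 0; nu < nbf; ++nu)
        S[mu*nbf + nu] += wp * chi[g][mu] * chi[g][nu];
  }
}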

flexible dependency detection

Background: Most Psi4 builds (by devs, users, and packagers) detect pre-built installations of its dependency packages, rather than building them with FetchContent (hereafter FC), including gau2grid (g2g).

  • From email discussions, it sounds like gauxc includes a pre-generated g2g source and expects to build that through FC. It'd be handy to have the option to detect a prebuilt and to not have two g2g libraries running around.
  • Psi4 has no problems requiring CMake ~3.24 to use FetchContent(... FIND_PACKAGE) if that'll help.
  • I can change any CMake for the g2g repo if that'll help.

Implement grid adapted cut-plane method

The paper on numerical integration in FHI-aims, J. Comput. Phys. 228, 8367 (2009), describes a grid adapted cut-plane method which is apparently better than the octrees currently used by GauXC.

The obvious drawback (from the algorithmic point of view) of the octree method is that the coordinate axes are given a special role as the directions determining the three planes to cut the set $S$. Also, tying the local origin to the center of mass of $S$ does not necessarily result in even-sized subsets $S_l$. Both shortcomings are relatively easy to overcome by (1) using only a single plane to cut $S$ but adapting the orientation of the plane to $S$ and (2) adjusting the location of the plane so that the resulting partitions are even-sized. The details are given in Algorithm 3. In Step 6 we use the same criteria as in the case of the octree method. We note that the adapted cut-plane method is a variation of a method presented in the lecture notes by Kahan [39] and closely related to the "Principal Direction Divisive Partitioning" algorithm used in data mining [40].

The best method we have obtained, the adapted cut-plane method, is rather close to the theoretical optimum for our test systems, indicating a good level of heuristic approach. The octree method suffers from the tendency to generate batches with very few points leading to inefficiency and has the drawback of unnecessarily replicating the geometry of the system.
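
A simplified reading of a single adapted cut-plane split is sketched below: estimate the principal direction by power iteration on the point covariance, then split at the median projection so both halves are even-sized. This is not a faithful reimplementation of the FHI-aims Algorithm 3.

#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

using Vec3 = std::array<double, 3>;

void cut_plane_split(const std::vector<Vec3>& pts,
                     std::vector<std::size_t>& left,
                     std::vector<std::size_t>& right) {
  const std::size_t n = pts.size();
  Vec3 c{0, 0, 0}; // centroid
  for (const Vec3& p : pts)
    for (int k = 0; k < 3; ++k) c[k] += p[k] / n;

  double C[3][3] = {}; // covariance of the centered points
  for (const Vec3& p : pts)
    for (int i = 0; i < 3; ++i)
      for (int j = 0; j < 3; ++j) C[i][j] += (p[i] - c[i]) * (p[j] - c[j]);

  Vec3 d{1, 1, 1}; // power iteration for the dominant eigenvector
  for (int it = 0; it < 50; ++it) {
    Vec3 t{0, 0, 0};
    for (int i = 0; i < 3; ++i)
      for (int j = 0; j < 3; ++j) t[i] += C[i][j] * d[j];
    const double nrm = std::sqrt(t[0]*t[0] + t[1]*t[1] + t[2]*t[2]);
    for (int k = 0; k < 3; ++k) d[k] = t[k] / nrm;
  }

  std::vector<double> proj(n); // signed distance along the cut-plane normal
  for (std::size_t i = 0; i < n; ++i)
    proj[i] = (pts[i][0]-c[0])*d[0] + (pts[i][1]-c[1])*d[1] + (pts[i][2]-c[2])*d[2];

  std::vector<std::size_t> idx(n);
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  std::nth_element(idx.begin(), idx.begin() + n/2, idx.end(),
                   [&](std::size_t a, std::size_t b) { return proj[a] < proj[b]; });
  left.assign(idx.begin(), idx.begin() + n/2);  // points below the median plane
  right.assign(idx.begin() + n/2, idx.end());   // points above
}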

Add a fast Laplacian evaluator

The mGGA implementation introduced in #84 evaluates the full hessian of the collocation matrix to evaluate the Laplacian. In principle, we should be able to hack gau2grid to produce the Laplacian directly (requiring less memory and fewer data transposes)
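
The saving is easy to see: the Laplacian is just the trace of the hessian, so a direct kernel would form one array instead of six hessian components. For reference, the reduction the current approach effectively performs:

#include <cstddef>
#include <vector>

// lap(chi)_g = (d2/dx2 + d2/dy2 + d2/dz2) chi at grid point g
void laplacian_from_hessian(const std::vector<double>& hxx,
                            const std::vector<double>& hyy,
                            const std::vector<double>& hzz,
                            std::vector<double>& lap) {
  for (std::size_t k = 0; k < lap.size(); ++k)
    lap[k] = hxx[k] + hyy[k] + hzz[k];
}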

[SYCL] Benzene Fails with OpenCL Backend as of Beta10

The error message in the unit tests:

/home/dbwylbl/GauXC/tests/xc_integrator.cxx:95: FAILED:
due to unexpected exception with message:
  Provided range is out of integer limits. Pass `-fno-sycl-id-queries-fit-in-
  int' to disable range check. -30 (CL_INVALID_VALUE)

[CUDA] Handle CUDA Architectures

Currently, we don't pass or determine any specific flags for CUDA architectures. The simplest way to handle this is to bump the CMake requirement to 3.18 and enforce a minimum compute capability via `CMAKE_CUDA_ARCHITECTURES` if it is not set. The use of FP64 `atomicAdd` puts the minimum compute capability at 60.

Add support for AMD AOCL libraries

It would be beneficial to support the AMD AOCL LAPACK/BLAS libraries libblis.so and libflame.so, as opposed to using Intel's MKL for AMD CPUs. This will be especially beneficial for NERSC's Perlmutter, which will use AMD Milan CPUs.

If you want to develop on Cori GPU's A100 nodes, I have installed AOCL and it is available as a module:

$ module use /global/cfs/cdirs/mpccc/dwdoerf/cori-gpu/modulefiles
$ module load aocl

I've tried to simply specify them using the following CMake parameters, but it fails in the LAPACK test:

cgpu20:GauXC$ cat build-aocl.sh
#!/bin/bash
#
#Currently Loaded Modulefiles:
#  1) esslurm         2) dgx/1.0         3) cmake/3.18.2    4) gcc/8.3.0       5) jdk/1.8.0_202   6) cuda/11.0.2     7) aocl/2.2
#

set -x
BUILD_DIR=install_aocl

#  -DOpenBLAS_INCLUDE_DIR=${AOCL_HOME}/include \
cmake -H. \
  -B${BUILD_DIR} \
  -DGAUXC_ENABLE_MPI=OFF \
  -DGAUXC_ENABLE_CUDA=ON \
  -DCMAKE_CUDA_ARCHITECTURES=80 \
  -DGAUXC_ENABLE_MAGMA=OFF \
  -DReferenceBLAS_LP64_LIBRARIES=${AOCL_HOME}/lib/libblis.so \
  -DReferenceLAPACK_LP64_LIBRARIES=${AOCL_HOME}/lib/libflame.so

cmake --build ./${BUILD_DIR}

cgpu20:GauXC$ ./build-aocl.sh 
+ BUILD_DIR=install_aocl
+ cmake -H. -Binstall_aocl -DGAUXC_ENABLE_MPI=OFF -DGAUXC_ENABLE_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=80 -DGAUXC_ENABLE_MAGMA=OFF -DReferenceBLAS_LP64_LIBRARIES=/global/project/projectdirs/mpccc/dwdoerf/cori-gpu/aocl-linux-gcc/2.2/lib/libblis.so -DReferenceLAPACK_LP64_LIBRARIES=/global/project/projectdirs/mpccc/dwdoerf/cori-gpu/aocl-linux-gcc/2.2/lib/libflame.so
-- The C compiler identification is GNU 7.4.1
-- The CXX compiler identification is GNU 7.4.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The CUDA compiler identification is NVIDIA 11.0.194
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/common/software/osuse15_dgx/cuda/11.0.2/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- GauXC Disabling MPI
-- Could not find IntegratorXX... Building
-- INTEGRATORXX REPO = https://github.com/wavefunction91/IntegratorXX.git
-- INTEGRATORXX REV  = 02ca3b0884a0261d8125cc2ec8397dc3166f2147
-- Performing Test INTEGRATORXX_HAS_NO_MISSING_BRACES
-- Performing Test INTEGRATORXX_HAS_NO_MISSING_BRACES - Success
-- Could not find ExchCXX... Building
-- EXCHCXX REPO = https://github.com/wavefunction91/ExchCXX.git
-- EXCHCXX REV  = 8966c818fe7531f3e85c0d8bc600bc310a99ba65
-- Setting (unspecified) option CMAKE_BUILD_TYPE: Release
-- Setting (unspecified) option BUILD_SHARED_LIBS: OFF
-- Setting option BUILD_TESTING: OFF
-- Setting (unspecified) option BUILD_FPIC: ON
-- Setting (unspecified) option NAMESPACE_INSTALL_INCLUDEDIR: /
-- Setting (unspecified) option ENABLE_GENERIC: OFF
-- Setting (unspecified) option ENABLE_XHOST: ON
-- Performing Test CMAKE_C_FLAGS [-xHost] - Failed
-- Performing Test CMAKE_C_FLAGS [-march=native] - Failed
CMake Warning at install_aocl/_deps/libxc-src/cmake/psi4OptionsTools.cmake:62 (message):
  Option unfulfilled as none of -xHost;-march=native valid
Call Stack (most recent call first):
  install_aocl/_deps/libxc-src/cmake/psi4OptionsTools.cmake:105 (add_C_or_CXX_flags)
  install_aocl/_deps/libxc-src/cmake/psi4OptionsTools.cmake:136 (add_C_flags)
  install_aocl/_deps/libxc-src/cmake/psi4OptionsTools.cmake:154 (add_flags)
  install_aocl/_deps/libxc-src/CMakeLists.txt:25 (option_with_flags)


-- Performing Test CMAKE_CXX_FLAGS [-xHost] - Failed
-- Performing Test CMAKE_CXX_FLAGS [-march=native] - Failed
CMake Warning at install_aocl/_deps/libxc-src/cmake/psi4OptionsTools.cmake:62 (message):
  Option unfulfilled as none of -xHost;-march=native valid
Call Stack (most recent call first):
  install_aocl/_deps/libxc-src/cmake/psi4OptionsTools.cmake:114 (add_C_or_CXX_flags)
  install_aocl/_deps/libxc-src/cmake/psi4OptionsTools.cmake:139 (add_CXX_flags)
  install_aocl/_deps/libxc-src/cmake/psi4OptionsTools.cmake:154 (add_flags)
  install_aocl/_deps/libxc-src/CMakeLists.txt:25 (option_with_flags)


-- Setting (unspecified) option ENABLE_FORTRAN: OFF
-- Setting (unspecified) option DISABLE_VXC: OFF
-- Setting (unspecified) option DISABLE_FXC: OFF
-- Setting (unspecified) option DISABLE_KXC: ON
-- Setting (unspecified) option DISABLE_LXC: ON
-- Performing Test standard_math_library_linked_to_automatically
-- Performing Test standard_math_library_linked_to_automatically - Failed
-- Performing Test standard_math_library_linked_to_as_m
-- Performing Test standard_math_library_linked_to_as_m - Success
-- Performing Test HAVE_CBRT
-- Performing Test HAVE_CBRT - Success
-- Version: Full 5.0.0
-- SO Version: Full 9:0:0 Major 9
-- Found Perl: /usr/bin/perl (found version "5.26.1") 
-- Found CUDAToolkit: /usr/common/software/osuse15_dgx/cuda/11.0.2/include (found version "11.0.194") 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- BLAS_LIBRARIES Not Given: Will Perform Search
-- Checking if OpenMP is GNU
-- Checking if OpenMP is GNU -- YES
-- Could NOT find IntelMKL (missing: IntelMKL_LIBRARIES IntelMKL_INCLUDE_DIR) 
-- Could NOT find IBMESSL (missing: IBMESSL_LIBRARIES IBMESSL_INCLUDE_DIR) 
-- Could NOT find BLIS (missing: BLIS_INCLUDE_DIR) 
-- Could NOT find OpenBLAS (missing: OpenBLAS_INCLUDE_DIR) 
-- Found ReferenceBLAS: /global/project/projectdirs/mpccc/dwdoerf/cori-gpu/aocl-linux-gcc/2.2/lib/libblis.so   
-- Performing Test BLAS_LOWER_UNDERSCORE
-- Performing Test BLAS_LOWER_UNDERSCORE -- found
-- Found BLAS: TRUE   
-- LAPACK_LIBRARIES Not Given: Checking for LAPACK in BLAS
-- Performing Test LAPACK_LOWER_UNDERSCORE
-- Performing Test LAPACK_LOWER_UNDERSCORE -- not found
-- Performing Test LAPACK_LOWER_NO_UNDERSCORE
-- Performing Test LAPACK_LOWER_NO_UNDERSCORE -- not found
-- Performing Test LAPACK_UPPER_UNDERSCORE
-- Performing Test LAPACK_UPPER_UNDERSCORE -- not found
-- Performing Test LAPACK_UPPER_NO_UNDERSCORE
-- Performing Test LAPACK_UPPER_NO_UNDERSCORE -- not found
-- BLAS Does Not Have A Full LAPACK Linker -- Performing Search
-- Found ReferenceLAPACK: /global/project/projectdirs/mpccc/dwdoerf/cori-gpu/aocl-linux-gcc/2.2/lib/libflame.so  found components: lp64 
-- Performing Test LAPACK_LOWER_UNDERSCORE
-- Performing Test LAPACK_LOWER_UNDERSCORE -- not found
-- Performing Test LAPACK_LOWER_NO_UNDERSCORE
-- Performing Test LAPACK_LOWER_NO_UNDERSCORE -- not found
-- Performing Test LAPACK_UPPER_UNDERSCORE
-- Performing Test LAPACK_UPPER_UNDERSCORE -- not found
-- Performing Test LAPACK_UPPER_NO_UNDERSCORE
-- Performing Test LAPACK_UPPER_NO_UNDERSCORE -- not found
CMake Error at /global/common/sw/cray/cnl7/haswell/cmake/cray-cnl7-haswell/gcc-8.3.0/cmake/3.18.2/iteh6ngn/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:165 (message):
  Could NOT find LAPACK (missing: LAPACK_LINK_OK)
Call Stack (most recent call first):
  /global/common/sw/cray/cnl7/haswell/cmake/cray-cnl7-haswell/gcc-8.3.0/cmake/3.18.2/iteh6ngn/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:458 (_FPHSA_FAILURE_MESSAGE)
  install_aocl/_deps/linalg-cmake-modules-src/FindLAPACK.cmake:153 (find_package_handle_standard_args)
  src/integrator/host/gauxc-host_integrator.cmake:1 (find_package)
  src/integrator/CMakeLists.txt:10 (include)


-- Configuring incomplete, errors occurred!
See also "/global/homes/d/dwdoerf/src/cori-dgx/GauXC/install_aocl/CMakeFiles/CMakeOutput.log".
See also "/global/homes/d/dwdoerf/src/cori-dgx/GauXC/install_aocl/CMakeFiles/CMakeError.log".
+ cmake --build ./install_aocl
gmake: *** No targets specified and no makefile found.  Stop.

Catch2 Tag for Fetching Content

I was making a clean build, and it looks like Catch2 has recently retired its master branch and moved to devel as the default. This was causing the configuration step to fail on my machine.

Specifying the v2.13.2 tag in the FetchContent call here seems to fix the problem.
