
HiOp - HPC solver for optimization

HiOp is an optimization solver for certain classes of mathematical optimization problems expressed as nonlinear programs. HiOp is a lightweight HPC solver that leverages the application's existing data parallelism to parallelize the optimization iterations using specialized parallel linear algebra kernels.

Please cite the user manual whenever HiOp is used:

@TECHREPORT{hiop_techrep,
  title={{HiOp} -- {U}ser {G}uide},
  author={Petra, Cosmin G. and Chiang, Nai-Yuan and Wang, Jingyi},
  year={2018},
  institution = {Center for Applied Scientific Computing, Lawrence Livermore National Laboratory},
  number = {LLNL-SM-743591}
}

In addition, when using the quasi-Newton solver please cite:

@ARTICLE{Petra_18_hiopdecomp,
title = {A memory-distributed quasi-Newton solver for nonlinear programming problems with a small number of general constraints},
journal = {Journal of Parallel and Distributed Computing},
volume = {133},
pages = {337-348},
year = {2019},
issn = {0743-7315},
doi = {10.1016/j.jpdc.2018.10.009},
url = {https://www.sciencedirect.com/science/article/pii/S0743731518307731},
author = {Cosmin G. Petra},
}

and when using the PriDec solver please cite:

@article{wang2023,
  author = {J. Wang and C. G. Petra},
  title = {A Sequential Quadratic Programming Algorithm for Nonsmooth Problems with Upper-$\mathcal{C}^2$ Objective},
  journal = {SIAM Journal on Optimization},
  volume = {33},
  number = {3},
  pages = {2379-2405},
  year = {2023},
  doi = {10.1137/22M1490995}
}
@INPROCEEDINGS{wang2021,
  author={J. Wang and N. Chiang and C. G. Petra},
  booktitle={2021 20th International Symposium on Parallel and Distributed Computing (ISPDC)}, 
  title={An asynchronous distributed-memory optimization solver for two-stage stochastic programming problems}, 
  year={2021},
  volume={},
  number={},
  pages={33-40},
  doi={10.1109/ISPDC52870.2021.9521613}
}

Build/install instructions

HiOp uses a CMake-based build system. A standard build can be done by invoking the following in the 'build' directory:

$> cmake ..
$> make 
$> make test
$> make install

This sequence will build HiOp, run integrity and correctness tests, and install the headers and the library in the directory '_dist-default-build' in HiOp's root directory.

The command make test runs extensive tests of the various modules of HiOp to check integrity and correctness. The test suite ranges from unit tests to solving concrete optimization problems and checking the performance of HiOp solvers on these problems against known solutions. By default make test runs mpirun locally, which may not work on some HPC machines. For these machines, HiOp allows using bsub to schedule make test on the compute nodes; to enable this, the user should pass -DHIOP_TEST_WITH_BSUB=ON to cmake when building and run make test in a bsub shell session, for example,

bsub -P your_proj_name -nnodes 1 -W 30
make test
CTRL+D

The installation can be customized using the standard CMake options. For example, one can provide an alternative installation directory for HiOp by using

$> cmake -DCMAKE_INSTALL_PREFIX=/usr/lib/hiop ..

Selected HiOp-specific build options

  • Enable/disable MPI: -DHIOP_USE_MPI=[ON/OFF] (by default ON)
  • GPU support: -DHIOP_USE_GPU=ON. MPI can be either off or on. For more build system options related to GPUs, see "Dependencies" section below.
  • Enable/disable "developer mode" build that enforces more restrictive compiler rules and guidelines: -DHIOP_DEVELOPER_MODE=ON. This option is by default off.
  • Additional checks and self-diagnostics inside HiOp meant to detect abnormalities and help to detect bugs and/or troubleshoot problematic instances: -DHIOP_DEEPCHECKS=[ON/OFF] (by default ON). Disabling HIOP_DEEPCHECKS usually provides 30-40% execution speedup in HiOp. For full strength, it is recommended to use HIOP_DEEPCHECKS with debug builds. With non-debug builds, in particular the ones that disable the assert macro, HIOP_DEEPCHECKS does not perform all checks and, thus, may overlook potential issues.

For example:

$> cmake -DHIOP_USE_MPI=ON -DHIOP_DEEPCHECKS=ON ..
$> make 
$> make test
$> make install

Other useful options to use with CMake

  • -DCMAKE_BUILD_TYPE=Release will build the code with compiler optimizations enabled
  • -DCMAKE_CXX_FLAGS="-O3" will enable a high level of compiler code optimization

Dependencies

A complete list of dependencies is maintained here.

For a minimal build, HiOp requires LAPACK and BLAS. These dependencies are automatically detected by the build system. MPI is optional and enabled by default; to disable it, use the cmake option '-DHIOP_USE_MPI=OFF'.

HiOp has support for NVIDIA GPU-based computations via CUDA and Magma. To enable the use of GPUs, use cmake with '-DHIOP_USE_GPU=ON'. The build system will automatically search for the CUDA Toolkit. For non-standard CUDA Toolkit installations, use '-DHIOP_CUDA_LIB_DIR=/path' and '-DHIOP_CUDA_INCLUDE_DIR=/path'. For "very" non-standard CUDA Toolkit installations, one can also specify the directory of the cuBLAS libraries with '-DHIOP_CUBLAS_LIB_DIR=/path'.

Using RAJA and Umpire portability libraries

Portability libraries allow running HiOp's linear algebra either on host (CPU) or a device (GPU). RAJA and Umpire are disabled by default. You can turn them on together by passing -DHIOP_USE_RAJA=ON to CMake. If the two libraries are not automatically found, specify their installation directories like this:

$> cmake -DHIOP_USE_RAJA=ON -DRAJA_DIR=/path/to/raja/dir -Dumpire_DIR=/path/to/umpire/dir ..

If GPU support is enabled, RAJA will run all HiOp linear algebra kernels on the GPU; otherwise, RAJA will run the kernels on the CPU using an OpenMP execution policy.

Support for GPU computations

When GPU support is on, HiOp requires the Magma linear solver library and the CUDA Toolkit. Both are detected automatically in most cases. The typical cmake command to enable GPU support in HiOp is

$> cmake -DHIOP_USE_GPU=ON ..

When Magma is not detected, one can specify its location by passing -DHIOP_MAGMA_DIR=/path/to/magma/dir to cmake.

For custom CUDA Toolkit installations, the locations to the (missing/not found) CUDA libraries can be specified to cmake via -DNAME=/path/cuda/directory/lib, where NAME can be any of

CUDA_cublas_LIBRARY
CUDA_CUDART_LIBRARY
CUDA_cudadevrt_LIBRARY
CUDA_cusparse_LIBRARY
CUDA_cublasLt_LIBRARY
CUDA_nvblas_LIBRARY
CUDA_culibos_LIBRARY

Below is an example of specifying the cuBLAS, cuBLASLt, and NVBLAS libraries, which were NOT_FOUND because of a non-standard CUDA Toolkit installation:

$> cmake -DHIOP_USE_GPU=ON -DCUDA_cublas_LIBRARY=/usr/local/cuda-10.2/targets/x86_64-linux/lib/lib64 -DCUDA_cublasLt_LIBRARY=/export/home/petra1/work/installs/cuda10.2.89/targets/x86_64-linux/lib/ -DCUDA_nvblas_LIBRARY=/export/home/petra1/work/installs/cuda10.2.89/targets/x86_64-linux/lib/ .. && make -j && make install

A detailed example of how to compile HiOp straight out of the box on summit.olcf.ornl.gov is available here.

RAJA and Umpire dependencies are usually detected by HiOp's cmake build system.

Kron reduction

Kron reduction functionality of HiOp is disabled by default. One can enable it by using

$> rm -rf *; cmake -DHIOP_WITH_KRON_REDUCTION=ON -DUMFPACK_DIR=/Users/petra1/work/installs/SuiteSparse-5.7.1 -DMETIS_DIR=/Users/petra1/work/installs/metis-4.0.3 .. && make -j && make install

Metis is usually detected automatically and need not be specified under normal circumstances.

UMFPACK (part of SuiteSparse) and METIS need to be provided as shown above.

Interfacing with HiOp

HiOp supports three types of optimization problems, each with a separate input format in the form of the C++ interfaces hiopInterfaceDenseConstraints, hiopInterfaceSparse, and hiopInterfaceMDS. These interfaces are specified in hiopInterface.hpp and documented and discussed in the user manual.

The hiopInterfaceDenseConstraints interface supports NLPs with billions of variables, with or without bounds, but only a limited number (<100) of general equality and inequality constraints. The underlying algorithm is a limited-memory quasi-Newton interior-point method that generally scales well computationally (though it may not algorithmically) on thousands of cores. This interface uses MPI for parallelization.

The hiopInterfaceSparse interface supports general sparse, large-scale NLPs. This functionality is similar to that of the state-of-the-art Ipopt (without being as robust and flexible as Ipopt). Acceleration for this class of problems can be achieved via OpenMP or CUDA; however, this is work in progress and you are encouraged to contact HiOp's developers for up-to-date information.

The hiopInterfaceMDS interface supports mixed dense-sparse NLPs and achieves parallelization using GPUs via the RAJA portability abstraction layer.

More information on the HiOp interfaces is available here.

Running HiOp tests and applications

HiOp uses the NVBLAS library when built with CUDA support. If you don't specify the location of the nvblas.conf configuration file, you may get annoying warnings. HiOp provides a default nvblas.conf file and installs it at the same location as the HiOp libraries. To use it, set the environment variable as

$ export NVBLAS_CONFIG_FILE=<hiop install dir>/lib/nvblas.conf

or, if you are using C-shell, as

$ setenv NVBLAS_CONFIG_FILE <hiop install dir>/lib/nvblas.conf

Existing issues

Users are highly encouraged to report any issues they encounter when using HiOp. One known issue is a minor inconsistency between HiOp and the linear solver package STRUMPACK. When STRUMPACK is compiled with MPI (and ScaLAPACK), the user must set the flag HIOP_USE_MPI to ON when compiling HiOp. Otherwise HiOp won't load the MPI module and will return an error when linking to STRUMPACK, since the latter requires a valid MPI module. Similarly, if both Magma and STRUMPACK are linked to HiOp, the user must ensure that all packages are compiled with the same CUDA compiler. Other issues and their current status can be found at https://github.com/LLNL/hiop

Acknowledgments

HiOp has been developed under the financial support of:

  • Department of Energy, Office of Advanced Scientific Computing Research (ASCR): Exascale Computing Program (ECP) and Applied Math Program.
  • Department of Energy, Advanced Research Projects Agency-Energy (ARPA‑E)
  • Lawrence Livermore National Laboratory Institutional Scientific Capability Portfolio (ISCP)
  • Lawrence Livermore National Laboratory, through the LDRD program

Contributors

HiOp is written by Cosmin G. Petra ([email protected]), Nai-Yuan Chiang ([email protected]), and Jingyi "Frank" Wang ([email protected]) from LLNL and has received important contributions from Asher Mancinelli (PNNL), Slaven Peles (ORNL), Cameron Rutherford (PNNL), Jake K. Ryan (PNNL), and Michel Schanen (ANL).

Copyright

Copyright (c) 2017-2021, Lawrence Livermore National Security, LLC. All rights reserved. Produced at the Lawrence Livermore National Laboratory. LLNL-CODE-742473. HiOp is free software; you can modify it and/or redistribute it under the terms of the BSD 3-clause license. See COPYRIGHT and LICENSE for complete copyright and license information.


hiop's Issues

Is `hiopVectorPar::startingAtCopyFromStartingAt` correctly implemented?

It seems that the local variables howManyToCopy and howManyToCopyDest are flipped in the function hiopVectorPar::startingAtCopyFromStartingAt.

  • howManyToCopy is the number of elements at the destination that will be overwritten.
  • howManyToCopyDest is the number of source elements that will be written to the destination.

The assertion in this function enforces:

  assert(howManyToCopy <= howManyToCopyDest);

which is, I think, the opposite of what is intended (despite the names of these variables suggesting otherwise).

Also, the function arguments seem to have misleading meanings, since start_idx_src is the destination offset and start_idx_dest is the source offset.

This (possible) bug does not affect the code, because the only time this function is called, the source and destination are of the same size and both offsets are zero.

@cnpetra

`addToSymDenseMatrixUpperTriangle` and `transAddToSymDenseMatrixUpperTriangle` for sparse matrix have never been used

The functions addToSymDenseMatrixUpperTriangle and transAddToSymDenseMatrixUpperTriangle for sparse matrices have never been used.
The optimization routines under 'hiopKKTLinMDS' only need this function for the dense part (see here), and we always assume the sparse part of HessMDS is a diagonal matrix (in order to compute its inverse efficiently).

New functions addToSymSparseMatrixUpperTriangle may be required for the sparse linear algebra implementation, which is handled in #85. The RAJA variants may be implemented later.

How to select vector pattern

It might be helpful to specify and better document how to select a vector pattern. Currently a pattern is selected by a vector of 1.0s and 0.0s in double precision. The rules are a little vague as to what happens when a pattern vector element is neither one nor zero. Possible solutions I see are:

  1. Strictly enforce pattern vectors have elements zero (not selected) and one (selected) only.
    • Pros: No changes to existing vector kernel implementations, provides more flexibility for future GPU implementation of vector kernels.
    • Cons: Potentially bug prone, need to add assertions (at least in debug mode) enforcing pattern vector values to be either one or zero.
  2. SUNDIALS way: zero means not selected, > zero means selected.
    • Pros: Safer than 1., no changes to existing vector kernel implementations needed.
    • Cons: Provides less flexibility for GPU implementations, undefined when value < 1.
  3. Define pattern vector as a vector of booleans.
    • Pros: Safe implementation.
    • Cons: Less flexibility for GPU implementations; more invasive changes to the code needed.

The argument related to GPU implementations goes something like this: if pattern vector elements are either one or zero, then instead of using conditionals, which could cause warp divergence, one could simply do an elementwise multiply with the pattern vector to select the pattern. For memory-bound computations, this could lead to a better performing implementation, I think (see the sketch below).
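
For illustration, a minimal sketch of the branch-free selection, assuming pattern elements are strictly zero or one (a standalone function, not HiOp's actual kernel):

#include <cstddef>

// Branch-free pattern selection: with pattern[i] in {0.0, 1.0}, the
// elementwise multiply zeroes out unselected entries without a
// conditional, so a GPU version of this loop would not diverge.
void selectPattern(double* x, const double* pattern, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    x[i] *= pattern[i]; // kept if pattern[i] == 1.0, zeroed if 0.0
}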

@cnpetra @ashermancinelli

User can't set verbosity level without re-compiling HIOP

Hey,
Just ran into this one.

In hiopNlpFormulation.cpp, you create a hiop logger with the verbosity level as a parameter. However, you immediately grab the verbosity level option from the default-constructed hiopOptions, which prevents anyone from actually specifying the verbosity level:

  options = new hiopOptions(/*filename=NULL*/);

  hiopOutVerbosity hov = (hiopOutVerbosity) options->GetInteger("verbosity_level");
  log = new hiopLogger(this, hov, stdout);

Cmake fails to find the installed lapack on Lassen

CMake cannot find the installed LAPACK on Lassen.
In configuration, it shows:
Found LAPACK libraries: /usr/lib64/libessl.so;/usr/lib64/libblas.so;
However, it gives an error message when compiling the code:
hiopKKTLinSys.cpp:(.text+0x7c24): undefined reference to `dposvx_'

The same error happens when a different version of LAPACK is loaded via the "module" command.

Possible issue in Magma solver interface

There may be an issue in the Magma no-pivoting solver interface, or even a bug in Magma. The other possibility is that Example 4 cannot be solved using a no-pivoting linear solver. When switching between hybrid and cpu modes in Example 4, the number of iterations and the convergence rate change.

It seems that the Magma no-pivoting solver diverges. I am not sure, though, whether the cpu compute mode uses the no-pivoting function or always uses Bunch-Kaufman.

Below is the Example 4 output for hybrid and cpu compute modes, respectively.

hybrid compute mode - Magma factorization and Lapack solve:

iter    objective     inf_pr     inf_du   lg(mu)  alpha_du   alpha_pr linesrch
   0  3.9800005e+02 4.990e+02  4.000e+00  -1.00  0.000e+00  0.000e+00  -(-)
[Warning] KKT_MDS_XYcYd linsys: MagmaNopiv size 503 (403 cons) (safe_mode=0)
   1  3.6824691e+02 5.536e+02  3.990e+00  -1.00  1.579e-03  2.372e-03  1(s)
   2  3.3431951e+02 5.126e+02  3.989e+00  -1.00  6.706e-05  4.510e-05  1(s)
   3  6.2429733e+02 4.067e+02  3.987e+00  -1.00  6.218e-05  6.111e-05  1(s)
   4  1.5031313e+03 3.637e+02  1.647e+01  -1.00  1.336e-04  3.048e-05  1(s)
[Warning] Requesting additional accuracy and stability from the KKT linear system at iteration 4 (safe mode ON)
[Warning] KKT_MDS_XYcYd linsys: MagmaBuKa size 503 (403 cons) (safe_mode=1)
   5  1.0674827e+02 1.112e+02  1.509e+01  -1.00  3.784e-01  6.944e-01  1(s)
   6  5.0647226e+00 5.952e+01  1.038e+01  -1.00  3.490e-01  4.646e-01  1(s)
   7  4.1846536e+00 5.900e+01  9.514e+00  -1.00  4.810e-02  8.682e-03  1(s)
   8  4.1624661e+00 5.893e+01  6.328e+00  -1.00  9.055e-03  1.154e-03  1(s)
   9 -4.9917037e+01 2.292e-01  1.064e+01  -1.00  6.468e-03  9.961e-01  1(s)
iter    objective     inf_pr     inf_du   lg(mu)  alpha_du   alpha_pr linesrch
  10 -4.9924417e+01 2.221e-01  2.031e+01  -1.00  7.325e-01  3.102e-02  1(s)
  11 -4.3510181e+01 1.070e-12  1.171e+00  -1.00  8.872e-01  1.000e+00  1(s)
  12 -4.3197637e+01 4.252e-14  1.000e-06  -1.00  1.000e+00  1.000e+00  1(f)
  13 -4.9686273e+01 1.054e-12  1.748e+00  -2.55  9.219e-01  1.000e+00  1(f)
  14 -4.9973780e+01 1.013e-13  2.828e-08  -2.55  1.000e+00  1.000e+00  1(f)
  15 -4.9992605e+01 3.642e-14  1.504e-09  -3.82  1.000e+00  1.000e+00  1(f)
  16 -4.9993471e+01 8.882e-16  1.504e-09  -3.82  1.000e+00  1.000e+00  1(f)
  17 -4.9993739e+01 2.442e-15  1.729e-03  -5.73  9.710e-01  1.000e+00  1(f)
  18 -4.9994734e+01 1.179e-13  1.845e-11  -5.73  1.000e+00  1.000e+00  1(f)
  19 -4.9994724e+01 2.887e-15  1.845e-11  -5.73  1.000e+00  1.000e+00  1(f)
iter    objective     inf_pr     inf_du   lg(mu)  alpha_du   alpha_pr linesrch
  20 -4.9994868e+01 6.661e-16  3.077e-04  -6.00  8.548e-01  1.000e+00  1(f)
  21 -4.9994888e+01 6.883e-14  1.000e-11  -6.00  1.000e+00  1.000e+00  1(f)
Successfull termination.
Total time 2.521 sec 
Hiop internal time:     total 2.515 sec     avg iter 0.120 sec 
    internal total std dev across ranks 0.000 percent
Fcn/deriv time:     total=0.004 sec  ( obj=0.000 grad=0.000 cons=0.001 Jac=0.002 Hess=0.001) 
    Fcn/deriv total std dev across ranks 0.000 percent
Fcn/deriv #: obj 56 grad 22 eq cons 57 ineq cons 57 eq Jac 22 ineq Jac 22
Total KKT time 2.506 sec 
        update init 1.725sec     update linsys 0.077 sec     fact 0.655 sec 
        solve rhs-manip 0.007 sec     triangular solve 0.042 sec 

cpu compute mode - Lapack factorization and solve

iter    objective     inf_pr     inf_du   lg(mu)  alpha_du   alpha_pr linesrch
   0  3.9800005e+02 4.990e+02  4.000e+00  -1.00  0.000e+00  0.000e+00  -(-)
   1  1.8397700e+01 1.818e+02  1.457e+00  -1.00  3.341e-01  6.357e-01  1(s)
   2 -4.3390041e+01 5.305e+01  4.253e-01  -1.00  7.654e-01  7.081e-01  1(s)
   3 -4.9598500e+01 6.175e-13  2.539e-01  -1.00  8.152e-01  1.000e+00  1(s)
   4 -4.9406538e+01 5.160e-13  3.078e+01  -1.00  9.388e-01  5.909e-02  1(f)
   5 -4.3566387e+01 3.950e-13  1.000e-06  -1.00  1.000e+00  1.000e+00  1(f)
   6 -4.9711480e+01 6.206e-13  1.601e+00  -2.55  9.261e-01  1.000e+00  1(f)
   7 -4.9974904e+01 1.774e-13  2.828e-08  -2.55  1.000e+00  1.000e+00  1(f)
   8 -4.9992701e+01 3.353e-14  1.504e-09  -3.82  1.000e+00  1.000e+00  1(f)
   9 -4.9993494e+01 1.643e-14  1.504e-09  -3.82  1.000e+00  1.000e+00  1(f)
iter    objective     inf_pr     inf_du   lg(mu)  alpha_du   alpha_pr linesrch
  10 -4.9993744e+01 1.310e-14  1.732e-03  -5.73  9.710e-01  1.000e+00  1(f)
  11 -4.9994740e+01 2.887e-15  1.845e-11  -5.73  1.000e+00  1.000e+00  1(f)
  12 -4.9994726e+01 1.377e-14  1.845e-11  -5.73  1.000e+00  1.000e+00  1(f)
  13 -4.9994868e+01 7.772e-15  3.029e-04  -6.00  8.509e-01  1.000e+00  1(f)
  14 -4.9994888e+01 2.941e-12  1.000e-11  -6.00  1.000e+00  1.000e+00  1(f)
Successfull termination.
Total time 0.131 sec 
Hiop internal time:     total 0.128 sec     avg iter 0.009 sec 
    internal total std dev across ranks 0.000 percent
Fcn/deriv time:     total=0.002 sec  ( obj=0.000 grad=0.000 cons=0.000 Jac=0.001 Hess=0.001) 
    Fcn/deriv total std dev across ranks 0.000 percent
Fcn/deriv #: obj 15 grad 15 eq cons 16 ineq cons 16 eq Jac 15 ineq Jac 15
Total KKT time 0.123 sec 
        update init 0.009sec     update linsys 0.049 sec     fact 0.059 sec 
        solve rhs-manip 0.004 sec     triangular solve 0.002 sec 

C/Fortran interface

A C/Fortran interface to this would be great. I think it could be added in a similar way to how Ipopt did it.

HiOp segfaults on nlpDenseCons_ex1 tests when built with default options

Building HiOp with default options on macOS 10.12.6 and running the tests yields errors for the nlpDenseCons_ex1 tests. Log files can be found at: https://gist.github.com/goxberry/8bdc80e6dcd4d15ed0a7c5130009d6aa

The configuration I'm using is built by spack, so I have some flexibility in choosing libraries, but all of these libraries are included via RPATH directives. My impression is that linking isn't an issue, but I could be wrong about that.

Do *SymDenseMatrixUpperTriangle methods work for MPI partitioned matrices?

It seems that the *SymDenseMatrixUpperTriangle methods add elements of a rectangular matrix (pointed to by this) into the upper triangular part of the matrix W (passed as the input argument). The methods use only local data indices and may not work for distributed-memory partitioned matrices unless the caller provides row and column start indices that guarantee data is written to the upper triangular part of W.

It would be good to better document the preconditions for calling these functions, as they are nontrivial. Are these methods intended for use when both, neither, or only W is MPI partitioned?

hiopMatrixMDS copyFrom method will always fail

In hiopMatrixMDS::copyFrom, we have:

  virtual void copyFrom(const hiopMatrixMDS& m)
  {
    mSp->copyFrom(*m.mSp);
    mDe->copyFrom(*m.mDe);
  }

Yet, hiopMatrixSparseTriplet::copyFrom has:

void hiopMatrixSparseTriplet::copyFrom(const hiopMatrixSparseTriplet& dm)
{
  assert(false && "this is to be implemented - method def too vague for now");
}

Therefore this method will always fail. Should it not be a method of hiopMatrixMDS, or should there be an implementation of hiopMatrixSparseTriplet::copyFrom that does not immediately assert?

Compiling HiOp on Mac (v 10.14.6)

I'm getting linker errors of the following type:
[ 68%] Linking CXX executable nlpDenseCons_ex3.exe
Undefined symbols for architecture x86_64:
"daxpy", referenced from:
hiop::hiopVectorPar::axpy(double, hiop::hiopVector const&) in libhiop.a(hiopVector.cpp.o)
hiop::hiopMatrixDense::addMatrix(double, hiop::hiopMatrix const&) in libhiop.a(hiopMatrix.cpp.o)

OpenBLAS, LAPACK, and OpenMPI are installed.

This is the cmake output:
cmake_out.txt

Coding style to distinguish class member variables

It might be helpful to consider a coding style that makes class member variables distinguishable (e.g., ending each member variable name with _). IMHO it would significantly improve the readability of the code, especially in functions that are a couple of hundred lines long and operate on a few dozen variables.

Considering the size of the HiOp code, making wholesale changes would require some work, but even setting style guidelines for future contributions would be quite helpful.

Nlp mixed sparse-dense tests fail with GPU enabled

When building with MPI on and GPU off, all tests pass as expected. With MPI and GPU on, however, all of the NlpMDS tests fail:

The following tests FAILED:
	 11 - NlpMixedDenseSparse4_1 (Failed)
	 12 - NlpMixedDenseSparse4_2 (Failed)
	 13 - NlpMixedDenseSparse5_1 (Failed)

crash with NDEBUG flag

In a couple of places, (essential) code is placed inside 'assert(...)'. This causes undefined behavior and/or crashes when HiOp is compiled with NDEBUG, as illustrated below. @junkudo
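
For illustration, the problematic pattern and its fix (hypothetical code, not a specific HiOp call site):

#include <cassert>

int initialize_solver(); // stand-in for any call with side effects

void example()
{
  // BAD: with NDEBUG defined, assert() expands to nothing, so the call
  // inside it is compiled out and 'status' keeps its initial value.
  int status = -1;
  assert((status = initialize_solver()) == 0);

  // GOOD: perform the side effect unconditionally, then assert on it.
  status = initialize_solver();
  assert(status == 0);
  (void)status; // avoid an unused-variable warning when NDEBUG is set
}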

nlpMDS_ex5.exe fails on Broadwell/Volta100

The nlpMDS_ex5.exe test fails to converge on an Intel platform with a Volta GPU. HiOp is built with GPU support and Kron reduction enabled. When running with 1 MPI rank on one GPU device, the following error message is obtained:

...

[Warning] hiopLinSolverMagmaBuka error: 191 entry in the factorization's diagonal
is exactly zero. Division by zero will occur if it a solve is attempted.
[Warning] hiopLinSolverMagmaBuka error: 191 entry in the factorization's diagonal
is exactly zero. Division by zero will occur if it a solve is attempted.
[Warning] KKT_MDS_XYcYd linsys: Detected negative eigenvalues in (1,1) sparse block.
  83 -1.3559463e+03 1.260e-02  5.259e-03  -2.55  1.000e+00  1.000e+00  1(f)
[Warning] hiopLinSolverMagmaBuka error: 191 entry in the factorization's diagonal
is exactly zero. Division by zero will occur if it a solve is attempted.
[Warning] hiopLinSolverMagmaBuka error: 191 entry in the factorization's diagonal
is exactly zero. Division by zero will occur if it a solve is attempted.
[Warning] KKT_MDS_XYcYd linsys: Detected negative eigenvalues in (1,1) sparse block.
  84 -1.3560463e+03 4.662e-02  2.022e-03  -3.82  1.000e+00  1.000e+00  1(h)
[Warning] hiopLinSolverMagmaBuka error: 191 entry in the factorization's diagonal
is exactly zero. Division by zero will occur if it a solve is attempted.
[Warning] hiopLinSolverMagmaBuka error: 191 entry in the factorization's diagonal
is exactly zero. Division by zero will occur if it a solve is attempted.
[Warning] KKT_MDS_XYcYd linsys: Detected negative eigenvalues in (1,1) sparse block.
Panic: minimum step size reached. The problem may be infeasible or the gradient inaccurate. Will exit here.
  85 -1.3560463e+03 4.662e-02  2.028e-03  -3.82  1.000e+00  5.551e-17  54(?)
Couldn't solve the problem.
Linesearch returned unsuccessfully (small step). Probable cause: inaccurate gradients/Jacobians or infeasible problem.
Total time 6.014 sec 
Hiop internal time:     total 6.002 sec     avg iter 0.071 sec 
    internal total std dev across ranks 0.000 percent
Fcn/deriv time:     total=0.005 sec  ( obj=0.001 grad=0.000 cons=0.001 Jac=0.002 Hess=0.001) 
    Fcn/deriv total std dev across ranks 0.000 percent
Fcn/deriv #: obj 172 grad 86 eq cons 173 ineq cons 173 eq Jac 86 ineq Jac 86
Total KKT time 5.986 sec 
	update init 0.001sec     update linsys 0.293 sec     fact 5.673 sec 
	solve rhs-manip 0.013 sec     triangular solve 0.005 sec 

solve4 trouble: returned -4 (with objective is -1.356046289760e+03)
srun: error: dl08: task 0: Exited with exit code 255

The following dependencies were used to build HiOp:

$ module list
Currently Loaded Modulefiles:
  1) gcc/7.3.0              3) cmake/3.15.3           5) metis/5.1.0
  2) cuda/10.2.89           4) openmpi/3.1.3          6) magma/2.5.2_cuda10.2

Please let me know what additional data would be helpful.

CC @ashermancinelli

Build issue

Two reports of build failures came in the last week via email. Coincidentally, I've just encountered the exact same problem on Summit (using cmake 3.9.2):

CMake Error at CMakeLists.txt:7 (cmake_policy):
  Policy "CMP0074" is not known to this version of CMake.


-- The C compiler identification is GNU 4.8.5
-- The CXX compiler identification is GNU 4.8.5
(...)
-- Found LAPACK libraries: /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-9.1.0/openblas-0.3.9-aymovpat33osbzgh5gsmhyvstsol4sfp/lib/libopenblas.so;/autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-9.1.0/openblas-0.3.9-aymovpat33osbzgh5gsmhyvstsol4sfp/lib/libopenblas.so
CMake Error at src/Optimization/CMakeLists.txt:2 (target_link_libraries):
  Object library target "hiopOptimization" may not link to anything.


-- Configuring incomplete, errors occurred!
See also "/ccs/home/cpetra/work/projects/hiop/build/CMakeFiles/CMakeOutput.log".
See also "/ccs/home/cpetra/work/projects/hiop/build/CMakeFiles/CMakeError.log".

@ashermancinelli

Review RAJA kernels

Review the RAJA kernels (currently in the raja-dev branch) and flag potential bottlenecks. The purpose of this review is pre-screening for potential performance issues, to give us a heads up on what to pay attention to when profiling.

The current implementation of the RAJA kernels was done with the objective of ensuring accurate computations. Some kernels, such as hiopVectorRajaPar::projectIntoBounds, are implemented in a way that is not quite "GPU friendly". Help from the RAJA developers in identifying other potential bottlenecks, and suggestions on how to implement these kernels better, is very much appreciated.

RAJA kernels are implemented in the following HiOp classes:

  • hiopVectorRajaPar
  • hiopMatrixRajaDense
  • hiopMatrixRajaSparseTriplet
  • hiopMatrixRajaSymSparseTriplet

Currently, RAJA kernels run only within unit tests. See tests in:

  • testVector.cpp
  • testMatrix.cpp
  • testMatrixSparse.cpp

CC @davidbeckingsale @rhornung67

DPOSVXEpsilon1Full parameter with illegal value

Hi,

I'm using the HiOp library together with MFEM to solve an optimization problem that involves the resolution of a PDE. The optimization works very well and in a few iterations I get convergence. However, I receive the following warning after each iteration and I don't know what it means or how to remove it:

"On entry to DPOSVXEpsilon1Full parameter number 6 had an illegal value"

thanks for your time,
Jesus

HiOp GPU-enabled build fails

My HiOp build from the dev/NewtonMDS branch fails at the compile stage with the message:

make[2]: *** No rule to make target `/usr/lib64/libopenblas.so -lmagma -L/share/apps/cuda/9.2/lib64 -lculibos -lcublas -lcublasLt -lnvblas -lcusparse -lcudart -lcudadevrt', needed by `src/LinAlg/test_hiopLinAlgComplex.exe'.  Stop.

It seems that there is some mix-up with the CMake paths. I used the following configuration for the build:

CC=mpicc CXX=mpicxx FC=mpif90 cmake              \
-DHIOP_USE_MPI=1                                 \
-DHIOP_USE_GPU=1                                 \
-DHIOP_MAGMA_DIR="/.../exasgd/newell/magma" \
-DCMAKE_INSTALL_PREFIX=$HIOP_DIR                 \
../hiop

I used cmake 3.13.4, gcc 7.4.0, openmpi 3.1.3, cuda 9.2, OpenBLAS 0.3.3, and LAPACK 3.4.2 for the build. The configure step works fine, but something goes wrong when compiling the complex linear algebra.

Verbose output gives me this:

make -f src/LinAlg/CMakeFiles/test_hiopLinAlgComplex.exe.dir/build.make src/LinAlg/CMakeFiles/test_hiopLinAlgComplex.exe.dir/depend
make[2]: Entering directory `/.../exasgd/src/hiop/build_newell'
cd /.../exasgd/src/hiop/build_newell && /.../cmake/3.13.4/bin/cmake -E cmake_depends "Unix Makefiles" /.../exasgd/src/hiop/hiop /.../exasgd/src/hiop/hiop/src/LinAlg /.../exasgd/src/hiop/build_newell /.../exasgd/src/hiop/build_newell/src/LinAlg /.../exasgd/src/hiop/build_newell/src/LinAlg/CMakeFiles/test_hiopLinAlgComplex.exe.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/.../exasgd/src/hiop/build_newell'
make -f src/LinAlg/CMakeFiles/test_hiopLinAlgComplex.exe.dir/build.make src/LinAlg/CMakeFiles/test_hiopLinAlgComplex.exe.dir/build
make[2]: Entering directory `/.../exasgd/src/hiop/build_newell'
make[2]: *** No rule to make target `/usr/lib64/libopenblas.so -L/.../exasgd/newell/magma/lib -lmagma -L/.../cuda/9.2/lib64 -lculibos -lcublas -lcublasLt -lnvblas -lcusparse -lcudart -lcudadevrt', needed by `src/LinAlg/test_hiopLinAlgComplex.exe'.  Stop.

solve() method implemented in hiopLinSolver.hpp file.

Should we consider moving the implementation of the solve() method below from hiopLinSolver.hpp to a source file? Simplifying the API would likely help with porting to GPU and with managing compile-time dependencies. See also #43.

  void solve ( hiopVector& x_ )
  {
    assert(M.n() == M.m());
    assert(x_.get_size()==M.n());
    int N=M.n(), LDA = N, info;
    if(N==0) return;

    hiopVectorPar* x = dynamic_cast<hiopVectorPar*>(&x_);
    assert(x != NULL);

    char uplo='L'; // M is upper in C++ so it's lower in fortran
    int NRHS=1, LDB=N;
    DSYTRS(&uplo, &N, &NRHS, M.local_buffer(), &LDA, ipiv, x->local_data(), &LDB, &info);
    if(info<0) {
      nlp->log->printf(hovError, "hiopLinSolverIndefDenseLapack: DSYTRS returned error %d\n", info);
      assert(false);
    } else if(info>0) {
      nlp->log->printf(hovError, "hiopLinSolverIndefDenseLapack: DSYTRS returned error %d\n", info);
    }   
  }
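
A minimal sketch of the proposed split, keeping the declaration in the header and moving the definition to a source file (class and method names taken from the snippet above; everything else elided):

// hiopLinSolver.hpp -- declaration only; no LAPACK calls in the header
class hiopVector; // forward declaration suffices here

class hiopLinSolverIndefDenseLapack
{
public:
  void solve(hiopVector& x_); // defined in hiopLinSolver.cpp
  // ... members as before ...
};

// hiopLinSolver.cpp -- the DSYTRS call and the other compile-time
// dependencies now live only in this translation unit
void hiopLinSolverIndefDenseLapack::solve(hiopVector& x_)
{
  // ... body moved verbatim from the header ...
}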

CC @ashermancinelli

Developer Readme Tests

In README_developers.md, the required tests involve using clang tools for address sanitization. Clang does not work readily with OpenMP, which is now a dependency of the new RAJA linear algebra library. To run these tests we have to disable HIOP_USE_RAJA. @cnpetra, would you still like to have clang tool checks for the non-RAJA parts of HiOp, or should we pursue other tests as prerequisites for submitting PRs?

Symmetric Sparse Matrix Kernel Implementations

addToSymDenseMatrixUpperTriangle and transAddToSymDenseMatrixUpperTriangle for the symmetric sparse triplet classes both seem not to take the symmetric nature of the matrices into account when adding them to the output matrices.

For reference, timesVec has the following section of code that takes this into account:

y[iRow_[i]] += alpha * x[jCol_[i]] * values_[i];
if(iRow_[i]!=jCol_[i])
  y[jCol_[i]] += alpha * x[iRow_[i]] * values_[i];

A way of fixing this issue would be to have the existing addToSymDenseMatrixUpperTriangle look something like the following, with a similar fix for transAddToSymDenseMatrixUpperTriangle:

void hiopMatrixSymSparseTriplet::addToSymDenseMatrixUpperTriangle(int row_start, int col_start, 
						  double alpha, hiopMatrixDense& W) const
{
  assert(row_start>=0 && row_start+nrows<=W.m());
  assert(col_start>=0 && col_start+ncols<=W.n());
  assert(W.n()==W.m());

  double** WM = W.get_M();
  for(int it=0; it<nnz; it++) {
    assert(iRow[it]<=jCol[it] && "sparse symmetric matrices should contain only upper triangular entries");
    int i = iRow[it]+row_start;
    int j = jCol[it]+col_start;
    assert(i<W.m() && j<W.n()); assert(i>=0 && j>=0);
    assert(i<=j && "symMatrices not aligned; source entries need to map inside the upper triangular part of destination");
    WM[i][j] += alpha*values[it];
    if(iRow[it] != jCol[it])
    {
      i = jCol[it]+row_start;
      j = iRow[it]+col_start;
      assert(i<W.m() && j<W.n()); assert(i>=0 && j>=0);
      assert(i<=j && "symMatrices not aligned; source entries need to map inside the upper triangular part of destination");
      WM[i][j] += alpha*values[it];
    }
  }
}

If this fix is not implemented, only one half of the symmetric sparse matrix will be added to the destination matrix every time this function is called.

Magma interface implementation in hiopLinSolver.hpp

The Magma solver interface is implemented in hiopLinSolver.hpp, which seems to pollute the HiOp API. One consequence is that any application depending on HiOp will depend on the Magma API and will need the Magma include files to build (if HiOp is built with Magma). Perhaps it would be a good idea to move the implementation of the Magma interface to a separate compilation unit (a *.cpp file)?

@cnpetra @ashermancinelli

Update Tags

I am installing updated HiOp versions on our PNNL machines. Would now be an appropriate time to update the tags to v0.3? Updating the tags after every large PR would be helpful for tracking versions. I think v0.2 points at c52a6f6, which is from last December.

MPI matrix multiplication fails

For a minimal example, I set up:

A (M x N), X (N x N), W (M x N)

such that W may be a clone of A:

hiop::hiopMatrixDense A(M_local, N_global, partition, comm);
hiop::hiopMatrixDense X(N_global, N_global, partition, comm);
hiop::hiopMatrixDense* W = A.alloc_clone();

A.setToConstant(1.);
W->setToConstant(1.);
X.setToConstant(1.);

// Beta = 0 to just test matmul portion
A.timesMat(0., *W, 1., X);

I expect W to have all its elements set to N_global:

//     W        = 0 * W + A  * X
double expected =         1. * 1. * N_glob;

This succeeds when running on a single machine. When I attempt to run this in an MPI environment, however, the following assertion in timesMat_local is thrown:

assert(W.n_local==W.n_global && "requested multiplication should be done in parallel using timesMat");

Error accumulation in sequential `logBarrier` vector method

The method hiopVectorPar::logBarrier seems to accumulate error and fails unit tests. In the current unit test the error is of order 1e-11, whereas the expected accuracy is around machine precision (as is the case for all other vector kernel tests).

Consider modifying the sequential algorithm to use Kahan summation, as sketched below. Check whether optimization at -O2 or -O3 would destroy the precision restoration in Kahan summation.
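
A sketch of compensated (Kahan) summation for reference, assuming the kernel reduces to a plain sum of log terms (the actual HiOp kernel may differ, e.g. it may apply a selection pattern):

#include <cmath>
#include <cstddef>

// Kahan summation: the running compensation 'c' recovers the low-order
// bits lost each time a term is added to the accumulator. Note that
// flags such as -ffast-math can optimize the compensation away.
double logBarrierKahan(const double* x, std::size_t n)
{
  double sum = 0.0, c = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    double y = std::log(x[i]) - c; // apply the correction to the next term
    double t = sum + y;            // low-order bits of y are lost here...
    c = (t - sum) - y;             // ...and recovered here
    sum = t;
  }
  return sum;
}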

According to @cnpetra, for the logBarrier method accuracy is more important than performance, because it directly affects the convergence of the overall algorithm.

Suppressing GPU Fact info when using MAGMA

When running HiOp with MAGMA, the following output is printed to stdout:

GPU FACT in 0.0493771 sec at TFlops: 0.000712128
GPU FACT in 0.0434769 sec at TFlops: 0.00080877
GPU FACT in 0.0436565 sec at TFlops: 0.000805443
GPU FACT in 0.043938 sec at TFlops: 0.000800282
GPU FACT in 0.0430558 sec at TFlops: 0.00081668
GPU FACT in 0.0432639 sec at TFlops: 0.000812752
GPU FACT in 0.0433745 sec at TFlops: 0.00081068
GPU FACT in 0.0432896 sec at TFlops: 0.000812269
GPU FACT in 0.0591214 sec at TFlops: 0.000594756
GPU FACT in 0.0609099 sec at TFlops: 0.000577292

I'd like this output to be suppressed, or perhaps enabled based on the verbosity level.

I think it is printed via this line.

MA86 Z headers not working with C++

MA86 Z headers are not working with C++ when std::complex includes are present.

A temporary, dirty-ish fix is below.

  1. In the MA86 'include' directory, copy 'hsl_mc69z.h' and 'hsl_ma86z.h' to 'hsl_mc69z.hpp' and 'hsl_ma86z.hpp', then edit 'hsl_mc69z.hpp' and 'hsl_ma86z.hpp' as in steps 2, 3, and 4 below.

  2. After #include <complex.h> add these lines:

#ifdef __cplusplus
extern "C" {
#endif

  3. Replace "complex" with "_Complex" wherever it occurs in "double complex" or "complex double".

  4. Before the last #endif insert these lines:

#ifdef __cplusplus
}
#endif
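
Putting the steps together, the edited header ends up shaped roughly like this (a sketch; all MA86 declarations elided):

/* hsl_ma86z.hpp -- sketch of the header after steps 2-4 */
#include <complex.h>

#ifdef __cplusplus
extern "C" {          /* step 2: give the C declarations C linkage */
#endif

/* ... original MA86 declarations, with every "double complex" or
   "complex double" rewritten as "double _Complex" (step 3) ... */

#ifdef __cplusplus
}                     /* step 4: close the extern "C" block */
#endif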

Sparse algebra interface and lambda .= 0 at first iteration

Finally, I have a good description of this case. The MOI interface works fine using the dense algebra. With the sparse one, I have the following issue.

I pass this sparsity pattern to HiOp:

iHSS = Int32[0, 1, 2, 6, 9, 15, 18, 7, ..., 14, 16, 21, 22, 23, 9, 13, 14, 18, 22, 23]
jHSS = Int32[0, 1, 2, 6, 6, 6, 6,..., 22, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23

The relevant entries are only the first 3.

At the first iteration, on the IEEE 9-bus case with 3 generators and the lambdas initialized to 0, I get the following Hessian:

nzval = [2200.0, 1700.0, 2450.0, 0.0, ..., -0.0, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0]

Because the multipliers lambda are 0, this seems right to me, given there are 3 generators.

HiOp then aborts with this message:

iter    objective     inf_pr     inf_du   lg(mu)  alpha_du   alpha_pr linesrch
   0  8.3631250e+03 1.550e+00  3.530e+03  -1.00  0.000e+00  0.000e+00  -(-)
[Warning] hiopLinSolverIndefDense error: 11 entry in the factorization's diagonal
is exactly zero. Division by zero will occur if it a solve is attempted.
[Warning] hiopLinSolverIndefDense error: 11 entry in the factorization's diagonal
is exactly zero. Division by zero will occur if it a solve is attempted.
julia: /scratch/mschanen/git/hiop/src/Optimization/hiopPDPerturbation.hpp:184: bool hiop::hiopPDPerturbation::compute_perturb_singularity(double&, double&, double&, double&): Assertion `delta_cc == 0. && delta_cd == 0.' failed.

If I set the multipliers to nonzero at the first iteration, it works fine, and the problem even converges to the right solution. So I'm wondering what is going on. Is this expected behavior?

Thanks!

Insufficient Documentation for hiopMatrixSymSparseTriplet::transAddToSymDenseMatrixUpperTriangle

For the function transAddToSymDenseMatrixUpperTriangle, it is unclear from the current documentation which matrix should be transposed. From what is in the kernel, it appears that it is the symmetric sparse triplet matrix that is transposed. If this is the case, then this may be related to issue #77.

If it is the output matrix that should be transposed, then the kernel should be modified and the documentation updated accordingly.

GPU compute mode option

It would be good to add "gpu" as a "compute_mode" option in HiOp. However, this option is not independent of "mem_space" selection. Here is the summary of compute mode and memory space option compatibilities:

  1. "cpu" option works only with "default", "host" and likely "um" memory space options.
  2. "hybrid" works with all but "device" memory space option.
  3. "gpu", if added, would work with "um", "pinned" or "device" options.

We can simply document this and expect the user to select a meaningful combination, or we can add some logic to the HiOp options class that warns the user when an incompatible combination is selected and falls back to the next best thing (see the sketch after the notes below).

If HiOp is built without RAJA support, only "default" memory space is available.

Options "um", "pinned" or "device" are available only if HiOp is built with GPU support (for now it's CUDA only) turned on.

CC @ashermancinelli @cnpetra

API for hiopMatrixSparseTriplet has implementation specific arguments

Some of the methods of the hiopMatrixSparseTriplet class have an argument which is a reference to a specific matrix implementation (see e.g. the transAddToSymDenseMatrixUpperTriangle method). This could potentially lead to cumbersome solutions when porting this class to hardware accelerators (e.g. GPUs).

A minimally invasive way to go about this would be to add an enum with matrix type IDs to the matrix base class, as well as a virtual method that returns the matrix type ID. Something similar was done in SUNDIALS.

With such a modification, the method transAddToSymDenseMatrixUpperTriangle can take a reference to the virtual hiopMatrix class as the input argument, and then check in the implementation whether a compatible matrix type was passed. The implementation of transAddToSymDenseMatrixUpperTriangle can then select the computation specific to the matrix layout or throw an exception if the matrix type is incompatible.

This would keep the API cleaner and provide more extensibility, as the sketch below illustrates. The downside of this approach is that passing an incompatible matrix type would be caught at runtime instead of at compile time. A more comprehensive solution would be to use template parameters to specify the matrix layout, but that would require more significant changes to the code.
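
A minimal sketch of the enum-based runtime check (all names hypothetical, not HiOp's actual API):

#include <stdexcept>

enum class MatrixTypeID { Dense, SparseTriplet, SymSparseTriplet };

struct MatrixBase
{
  virtual ~MatrixBase() = default;
  virtual MatrixTypeID typeID() const = 0; // each subclass reports its layout
};

// The method accepts the abstract type and validates compatibility at
// runtime, instead of naming a concrete implementation in its signature.
void transAddToSymDenseMatrixUpperTriangle(MatrixBase& W)
{
  if (W.typeID() != MatrixTypeID::Dense)
    throw std::runtime_error("incompatible matrix type");
  // ... computation specific to the dense layout ...
}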

@cnpetra @ashermancinelli

hiopMatrixDense shiftRows segfault

When testing the shiftRows method of hiopMatrixDense, I am running into a segfault on our Power9 systems:

233	    A.shiftRows(shift);
(gdb) 

Program received signal SIGSEGV, Segmentation fault.
0x0000000010017474 in hiop::hiopMatrixDense::shiftRows (this=0x7fffffffe410, shift=4) at /people/manc568/projects/hiop/src/LinAlg/hiopMatrix.cpp:256
256	    assert(test1==M[shift<0?0:m_local][0] && "a different copy technique than memcpy is needed on this system");

Should we add a fallback method (a possible sketch below) in case memcpy will not work? How do you suggest we test this method? @cnpetra
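
As a starting point, a row-by-row fallback might look like this (a sketch against the row-pointer layout; the M, m_local, and n_local names are assumptions about the class internals):

#include <cstring>

// Shift rows by 'shift' positions, copying one row at a time in a
// shift-safe order instead of relying on a single bulk memcpy.
void shiftRowsFallback(double** M, int m_local, int n_local, int shift)
{
  if (shift > 0) {
    for (int i = m_local - 1; i >= shift; --i)  // walk bottom-up
      std::memcpy(M[i], M[i - shift], n_local * sizeof(double));
  } else {
    for (int i = 0; i < m_local + shift; ++i)   // walk top-down
      std::memcpy(M[i], M[i - shift], n_local * sizeof(double));
  }
}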

hiopMatrixSparse interface decisions

In our internal branch, the linear algebra factory has a method (createMatrixSparse) to create an instance of the abstract class hiopMatrixSparse, choosing the appropriate implementation (e.g. RAJA vs default).

Seeing that there are other implementations of sparse matrices, how should we alter the factory class API to create other kinds of sparse matrices?

A few options:

  • Pass options as template parameters (storage type of csr/coo, complex/real-only, sym/non-sym, etc)
  • Pass options as parameters of enum types
  • Pass options as parameters of string types
  • Create different factory methods for different types of sparse matrices
    • i.e. methods like createMatrixSparseSym, createMatrixSparse, and others

CC @pelesh

timesMat segmentation fault

Setting up three matrices like so:

hiop::hiopMatrixDense A(M_global, K_global, k_partition, comm);
hiop::hiopMatrixDense M(K_global, N_global, n_partition, comm);
hiop::hiopMatrixDense W(M_global, N_global, n_partition, comm);

I then set values and attempt to call timesMat:

A.setToConstant(A_val);
W.setToConstant(W_val);
M.setToConstant(M_val);
A.timesMat(beta, W, alpha, M);

real_type expected = (beta * W_val) + (alpha * A_val * M_val * N_global);
const int fail = verifyAnswer(&W, expected);

Which results in a segfault.

Cannot retrieve constraint multipliers from solution_callback

I am trying to access the constraint multipliers through solution_callback, but the multipliers array lam (and even the constraints array g) is NULL. Below is the debugging frame, which shows that g and lam passed to solution_callback are NULL.

Process 56591 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00000001001411f5 libopflow.dylib`OPFLOWHIOPInterface::solution_callback(this=0x000000010460a430, status=Solve_Success, n=24, xsol=0x000000010460d140, z_L=0x000000010460dd10, z_U=0x000000010460de10, m=36, g=0x0000000000000000, lam=0x0000000000000000, obj_value=5297.4067102865956) at opflow-hiop.cpp:441:18

This is while running the NewtonMDS solver.

Am I missing setting something, or does this need to be added to HiOp?

Consider simplifying how dense matrix is accessed in HiOp interface

HiOp uses an array of row pointers to access dense matrix data through its interface (e.g. constraint Jacobian data). Perhaps a more flexible solution, especially for GPU implementations, would be to pass a raw pointer to the matrix data.

Such a change would imply that all matrix rows are stored in a contiguous memory block. This assumption is already made in several places in the HiOp code, so passing a pointer to the data block instead of an array of row pointers would make it clearer to the user how to store Jacobian data, in addition to being more GPU friendly. The sketch below contrasts the two access patterns.
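
A sketch of the two access patterns, assuming row-major contiguous storage (illustrative functions, not HiOp API):

// Current style: element (i,j) through an array of row pointers.
double getRowPtr(double** M, int i, int j) { return M[i][j]; }

// Proposed style: one contiguous block of nrows*ncols doubles, with
// element (i,j) at offset i*ncols + j. This layout maps directly onto
// a single device allocation in a GPU implementation.
double getRaw(const double* M, int ncols, int i, int j)
{
  return M[i * ncols + j];
}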

compilation error

With cmake -DHIOP_USE_MPI=NO .. I get the compilation error below:

[ 86%] Building CXX object tests/CMakeFiles/testMatrix.dir/testMatrix.cpp.o
In file included from /ccs/home/cpetra/work/projects/hiop/tests/LinAlg/matrixTestsDense.hpp:59:0,
                 from /ccs/home/cpetra/work/projects/hiop/tests/testMatrix.cpp:61:
/ccs/home/cpetra/work/projects/hiop/tests/LinAlg/matrixTests.hpp:63:20: fatal error: optional: No such file or directory
 #include <optional>
                    ^
compilation terminated.
make[2]: *** [tests/CMakeFiles/testMatrix.dir/testMatrix.cpp.o] Error 1
make[1]: *** [tests/CMakeFiles/testMatrix.dir/all] Error 2
make: *** [all] Error 2

`copyFrom` and similar methods are not part of the abstract hiopMatrix interface

Heavy use is made of copyFrom and similar methods, yet the abstract interface does not mandate that a hiopMatrix implementation have this method. In contrast, hiopVector does have copyTo and copyFrom methods in its abstract interface. I believe this could lead to implementation-specific code, which will cause problems when we attempt to migrate to other implementations of hiopMatrix.
