
Dakotathon

Dakotathon provides a Basic Model Interface and a Python API for a subset of the methods included in the Dakota iterative systems analysis toolkit, including vector and centered parameter studies, sampling, polynomial chaos, stochastic collocation, and the PSUADE MOAT method.

Dakotathon is currently beta-level software supported on Linux and macOS. API documentation is available at http://csdms-dakota.readthedocs.io.

Installation

Install Dakotathon into an Anaconda Python distribution with

$ conda install -c csdms-stack dakotathon

or install from source with

$ git clone https://github.com/csdms/dakotathon.git
$ cd dakotathon
$ python setup.py install

Dakotathon requires Dakota 6.1 or greater. Install Dakota through conda with

$ conda install -c csdms-stack -c conda-forge dakota

or follow the instructions on the Dakota website for downloading and installing a precompiled Dakota binary for your system.
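
To check that Dakota is installed and on your path, try

$ which dakota
$ dakota -version

(the -version flag prints Dakota's version information).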

Execution: standalone

Import Dakotathon into a Python session with:

>>> from dakotathon import Dakota

Create a Dakota instance, specifying a Dakota analysis method:

>>> d = Dakota(method='vector_parameter_study')
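
Method and variable settings are exposed as attributes on the Dakota instance and can be customized before writing the input file; the attribute names below are assumptions patterned on Dakota's keywords, so check the API documentation for your version:

>>> d.method.final_point = (1.1, 1.3)       # assumed attribute name
>>> d.variables.descriptors = ('x1', 'x2')  # assumed attribute name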

To run a sample case, create a Dakota input file from the default vector parameter study and call Dakota:

>>> d.write_input_file()
>>> d.run()

Dakota output is written to two files, dakota.out (run information) and dakota.dat (tabular output), in the current directory.
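
The tabular output is plain text and straightforward to load for analysis, e.g., with pandas (a sketch assuming the default whitespace-delimited format with a single header row):

>>> import pandas as pd
>>> results = pd.read_csv('dakota.dat', sep=r'\s+')  # columns: eval id, variables, responses
>>> results.describe()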

For more in-depth examples of using Dakotathon as a standalone Python package, see the Jupyter Notebooks in the examples directory of this repository.

Note

If you're using Anaconda IPython on macOS, set the DYLD_LIBRARY_PATH environment variable in your session before calling the run method with:

>>> from dakotathon.utils import add_dyld_library_path
>>> add_dyld_library_path()

See #17 for more information.

Execution: in PyMT

Dakotathon can also be called as a component in PyMT. For example, to perform a centered parameter study on the Hydrotrend component, start with imports:

import os
from pymt.components import CenteredParameterStudy, Hydrotrend
from dakotathon.utils import configure_parameters

then create instances of the Hydrotrend and Dakota components:

h, c = Hydrotrend(), CenteredParameterStudy()

Next, set up a dict of parameters for the experiment:

experiment = {
  'component': type(c).__name__,
  'run_duration': 10,                # years
  'auxiliary_files': 'HYDRO0.HYPS',  # the default Waipaoa hypsometry
  'descriptors': ['starting_mean_annual_temperature',
                  'total_annual_precipitation'],
  'initial_point': [15.0, 2.0],
  'steps_per_variable': [2, 5],
  'step_vector': [2.5, 0.2],
  'response_descriptors': ['channel_exit_water_sediment~suspended__mass_flow_rate',
                           'channel_exit_water__volume_flow_rate'],
  'response_statistics': ['median', 'mean']
}

and use a helper function to format the parameters for Dakota and for Hydrotrend:

cparameters, hparameters = configure_parameters(experiment)

Set up the Hydrotrend component:

cparameters['run_directory'] = h.setup(os.getcwd(), **hparameters)

Create the Dakota template file from the Hydrotrend input file:

cfg_file = 'HYDRO.IN'  # get from pymt eventually
dtmpl_file = cfg_file + '.dtmpl'
os.rename(cfg_file, dtmpl_file)
cparameters['template_file'] = dtmpl_file

Set up the Dakota component:

c.setup(cparameters['run_directory'], **cparameters)

then initialize, run, and finalize the Dakota component:

c.initialize('dakota.yaml')
c.update()
c.finalize()

Dakota output is written to two files, dakota.out (run information) and dakota.dat (tabular output), in the current directory.

For more in-depth examples of using Dakotathon with PyMT, see the Python scripts in the examples directory of this repository.

Contributing

Dakotathon is open source software, released under an MIT license. Contributions are welcome. Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

dakotathon's Issues

Default PolynomialChaos and StochasticCollocation experiments fail to update

The default PolynomialChaos and StochasticCollocation experiments fail to run. For example, entering this code:

from dakotathon.bmi import PolynomialChaos

m = PolynomialChaos()
m.initialize()
m.update()

results in a CalledProcessError.

Here is the dakota.in file produced by the code above:

# Dakota input file
environment
  tabular_data
    tabular_data_file = 'dakota.dat'

method
  polynomial_chaos
    sample_type = random
    samples = 10
    quadrature_order = 2

variables
  uniform_uncertain = 2
    descriptors = 'x1' 'x2'
    lower_bounds = -2.0 -2.0
    upper_bounds = 2.0 2.0

interface
  id_interface = 'CSDMS'
  direct
  analysis_driver = 'rosenbrock'

responses
  response_functions = 1
    response_descriptors = 'y1'
  no_gradients
  no_hessians

Running Dakota directly from the command line with this dakota.in file gives

$ dakota -i dakota.in -o dakota.out
Error: failure in parallel configuration lookup in Iterator::set_communicators().

Googling this error didn't help.

DYLD_LIBRARY_PATH is not present in IPython

Dakota uses DYLD_LIBRARY_PATH to reference its shared libraries on Mac OS X. It appears that Anaconda IPython removes this environment variable. It's present in Python, however. For example:

$ echo $DYLD_LIBRARY_PATH
/usr/local/dakota-6.1.0.Darwin.i386/bin:/usr/local/dakota-6.1.0.Darwin.i386/lib

$ which python ipython
/Applications/anaconda/bin/python
/Applications/anaconda/bin/ipython

$ python -c "import os; print(os.environ.get('DYLD_LIBRARY_PATH'))"
/usr/local/dakota-6.1.0.Darwin.i386/bin:/usr/local/dakota-6.1.0.Darwin.i386/lib
$ ipython -c "import os; print(os.environ.get('DYLD_LIBRARY_PATH'))"
None

As a workaround, I can set DYLD_LIBRARY_PATH with

import os

os.environ['DYLD_LIBRARY_PATH'] = (os.environ['DAKOTA_DIR'] + '/bin:' +
                                   os.environ['DAKOTA_DIR'] + '/lib')

For more information, see ipython/ipython#8878.

Check on conda channels used in build/install process

In response to install failures on Travis, I started using the conda-forge channel instead of the default channel to get dependent packages for dakotathon (see #48). My hunch is there's a bug in the default channel. When it's fixed, I'll revert to using it instead of conda-forge.

Dynamic naming of directories and files

Dakotathon presently does not support the dynamic naming of many directories and files. These include:

  • run directory
  • work directory
  • configuration file name
  • parameter file name
  • output file name

Associated with this issue is the hard-coded placement of the configuration file, input file, and output file into the current directory (rather than into a dynamically specified run directory, which could default to the current directory).

This functionality would greatly improve Dakotathon in two ways: a dynamic run_directory would allow a Dakota object to be created and run from a looping script (see the sketch below), and a dynamic work_directory would allow output to be placed in a different file structure (e.g., in a scratch directory).
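
A sketch of the requested looping pattern (the run_directory keyword here is the proposed feature, not the current API):

from dakotathon import Dakota

for i in range(6):
    # 'run_directory' is hypothetical -- the feature requested above.
    d = Dakota(method='sampling', run_directory='run-{}'.format(i))
    d.write_input_file()
    d.run()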

PSUADE MOAT method

The Dakotathon package currently does not include support for the PSUADE MOAT sensitivity analysis method.

Performing optimization using Dakotathon?

We're interested in optimizing a Python model using Dakota for wind turbine design at NREL.
I see that most of Dakotathon's capabilities center around parameter studies or UQ.

Do you know of any forks or branches that include optimization capability through Dakota?
If not, could you comment on how much effort you think it would take to implement that in our own fork of Dakotathon?

Thanks! Please let me know if I should post this question elsewhere.

Use BMI v2.0

Dakotathon should be updated to use BMI v2.0 (bmipy=2.0).

Add BMI metadata for PSUADE MOAT method

Dakota's PSUADE MOAT method was added to dakotathon with #61. However, it needs BMI metadata to become fully integrated into the Dakotathon component and callable from PyMT.

Support Windows

Dakota runs on Windows. Python runs on Windows. Most of the CSDMS software stack should currently run on Windows. Try to get Dakotathon running on Windows.

Use pytest instead of nose

nose is in maintenance mode; pytest is recommended as a replacement.

pytest can be dropped in as a direct replacement for the nosetests call, but almost all of the unit tests in the dakotathon package use nose assertions. These need to be replaced with plain assert statements.
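
For example, a typical conversion (with illustrative names):

# Before, using nose assertions:
from nose.tools import assert_equal, assert_true
assert_equal(component, 'hydrotrend')
assert_true(os.path.exists(config_file))

# After, using plain asserts, which pytest introspects on failure:
assert component == 'hydrotrend'
assert os.path.exists(config_file)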

Update numpydoc docstrings

I made some mistakes in creating docstrings, e.g., in documenting the __init__ methods of classes, such as VectorParameterStudy. See this reference for how I can fix these problems.
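
For reference, numpydoc documents constructor parameters in the class-level docstring rather than in a separate __init__ docstring; a minimal sketch, with illustrative parameter names:

class VectorParameterStudy(object):

    """Define a Dakota vector parameter study.

    Parameters
    ----------
    final_point : tuple of float, optional
        End point of the vector in parameter space.
    n_steps : int, optional
        Number of steps along the vector.

    """

    def __init__(self, final_point=(1.1, 1.3), n_steps=10):
        # Parameters are described above, in the class docstring.
        self.final_point = final_point
        self.n_steps = n_steps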

dakota.run() crashes when run in a test loop

I have used dakotathon to create a looping script that launches a series of dakota runs.

As a test, I ran this script with 6 iterations of the loop, with the analysis_driver file modified so that Dakota just created output files and the model run itself was skipped. Each iteration asks Dakota to create a 100-sample experiment (so 600 run folders would be created).

Halfway through, this crashes my computer entirely, requiring a restart.

If I instead modify the script to initialize the Dakota object and create the dakota.in file, but not execute dakota.run(), and then run each of the six dakota.in files from the command line, I have no problems; all of the Dakota runs complete in less time than it takes for my computer to crash.

I've been able to fix this problem by replacing the original run function with the following:

def run(self):
    """Run the Dakota experiment.

    The run is executed in the directory given by the run_directory
    keyword, and a run log and an error log are created.
    """
    # Relies on module-level imports of os and subprocess.
    os.chdir(self.run_directory)

    with open(self.run_log, "w") as file_out:
        with open(self.error_log, "w") as error_out:
            subprocess.call(['dakota',
                             '-i', self.input_file,
                             '-o', self.output_file],
                            stdout=file_out,
                            stderr=error_out)

Here I've also added functionality for:

  1. dynamic naming of the run directory
  2. dynamic naming of a stdout file
  3. dynamic naming of a stderr file

The stdout file provides the same functionality as the standard &> run.log redirection in a command-line Dakota call such as:
dakota -i dakota.in -o dakota.out &> run.log

This change to run() fixes the crashing problem on my computer.

The code I've used is pretty extensive; let me know if you'd like me to pare it down to a minimal, complete example that reproduces this problem.

Fix test failing with Dakota 6.9

I recently built conda binaries for Dakota 6.9. However, using this version of Dakota (where previously v6.4 was used) causes a test, test_dakota.test_running_in_different_directory, to fail. The only difference between passing and failing is the Dakota version. This isn't quite a dakotathon issue, since I list support for Dakota 6.4, but the test should be updated to work for v6.9, as well.

Dakota aborts on Mac OS X when using 'sampling' method

Dakota aborts with

Trace/BPT trap: 5

when running an experiment with the sampling technique on Mac OS X. The console output shows a missing symbol:

dyld: lazy symbol binding failed: Symbol not found: _dpotrf_
  Referenced from: /usr/local/dakota-6.1.0.Darwin.i386/lib/libteuchos.dylib
  Expected in: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib

dyld: Symbol not found: _dpotrf_
  Referenced from: /usr/local/dakota-6.1.0.Darwin.i386/lib/libteuchos.dylib
  Expected in: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib

For a reproduce case, try dakota-experiments/experiments/hydrotrend-sampling. The same experiment runs to completion on Linux.

Removal of default parameters

Latin Hypercube Sampling adds default parameter values for probability levels. I don't want to specify probability levels, so I'd like a standard method for removing optional parameter values.

I have found that

d.method.probability_levels = ()

works to remove these parameters from the dakota.in file.

Would it be possible to provide recommendations for removing default parameter values?

Removal of blocks

When writing its Dakota input file, the dakota object always prints all of the standard blocks (e.g., 'method', 'variables', 'interface', 'responses'). It would be preferable to have a way to delete one or more of these blocks.

For example, in order to trick Dakota into running in parallel on beach, my analysis_driver script launches a qsub call and then creates temporary output so that Dakota will move on. After the actual model run is finished, this temporary output is overwritten with real output, which is then used for subsequent analysis. Thus, I don't want to include the environment block, which creates a tabular data structure.

Python 3 compatibility

The dakotathon package is not presently compatible with Python 3. This is understandable given the Python versions Dakotathon was originally intended to support. Only a few changes are needed to provide compatibility.

See forthcoming PR.

This Issue supersedes #50 (which deals only with install issues).

Why is the dakota.yaml file created?

The dakota.yaml configuration file is not needed for standard command-line operation with an analysis driver script, yet Dakotathon's dakota.setup() method creates both a YAML configuration file and the standard dakota.in input file.

Would it be possible to add a paragraph to the documentation that states when each of these files is used and when each of them is not needed?

In my use case I will be looping through many hundreds of Dakota runs, so keeping the number of files created to a minimum is valuable.

python fork analysis driver

Hi

I have been using Dakota in combination with Python for a while now, with the following setup:

# INTERFACE
interface,
   fork
   evaluation_scheduling peer static
   analysis_drivers = 'run_opt.py'
   parameters_file = 'params.in'
   results_file = 'results.out'
   #asynchronous evaluation_concurrency = 24
   aprepro

The "run_opt.py" file is very similar to dakota's python example.

Now, I would like to use the fork interface with Dakotathon, though I am not sure if this is possible at all. It would be great if you could give me a hint!

In addition, could you say how actively Dakotathon is being developed?

Thanks in advance!
Fab

Allow combinations of variable types

Currently, only one type of variable is allowed in an experiment. For example, this:

variables
  uniform_uncertain = 2
    lower_bounds = -1.0 -1.0
    upper_bounds = 1.0 1.0
    descriptors = 'x1' 'x2'

is a variable block that can be produced by this package. However, Dakota supports compound blocks; for example, from the delft3d-polynomial-chaos-1 experiment in mdpiper/dakota-experiments:

variables
  normal_uncertain = 2
    descriptors   'Sand-SedDia'   'Silt-SedDia'
    means               1.0e-4          3.0e-5
    std_deviations     2.28e-5         2.73e-5
    lower_bounds       6.25e-5          8.0e-6
    upper_bounds        2.0e-4         6.25e-5
  uniform_uncertain = 1
    descriptors   'Mud-TcrEro'
    lower_bounds       1.0e-1
    upper_bounds       1.0e+0

We need to update the variables subpackage to allow compound variable blocks to be created.
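
For example, a hypothetical Python-side API for this feature (the class names and the list-valued variables attribute are assumptions patterned on the existing variables subpackage, not current API):

from dakotathon import Dakota
# Hypothetical imports; the variables subpackage would need classes like these.
from dakotathon.variables import NormalUncertain, UniformUncertain

d = Dakota(method='polynomial_chaos')
d.variables = [
    NormalUncertain(descriptors=('Sand-SedDia', 'Silt-SedDia'),
                    means=(1.0e-4, 3.0e-5),
                    std_deviations=(2.28e-5, 2.73e-5),
                    lower_bounds=(6.25e-5, 8.0e-6),
                    upper_bounds=(2.0e-4, 6.25e-5)),
    UniformUncertain(descriptors=('Mud-TcrEro',),
                     lower_bounds=(1.0e-1,),
                     upper_bounds=(1.0e+0,)),
]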
