rnr-refresh's Introduction

rnr-refresh

Reanalysis and Reforecasting: a Research Environment for Flexible Reanalysis Experiments on Systems that are Heterogeneous

rnr-refresh's People

Contributors

sk8forether

rnr-refresh's Issues

Hera Cylc 8 Platform Configuration

Description

For the MVP, create a minimal platform configuration for the RDHPCS Hera platform that can be used when running the Launch task.

Background

Previously, the UFS-RNR software was not designed to be flexibly cross-platform, and we want to design that into rnr-refresh from the outset.

Next Steps

Start by understanding the Cylc 8 Platform Configuration options, and customize them as needed to get the Launch task functioning at a minimum.

Basic Testing Experiment Configuration

Description

We need a very basic starting point for running an experiment, and the configuration file is a good place to point new users to when they are getting started. For the sake of accomplishing the MVP, this just needs to contain the basic information that gets read in to run the Launch task.

Remaining Questions

  • What is the best format for these configs? Should we use YAML? INI for Cylc (no python parsing)? Something else?
  • Will we need multiple configs for the same experiment if running on different platforms? Or can the platform be specified at runtime (with configurations already set up for the platforms)?
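One way to explore the second question is to keep a single experiment config and pass the platform on the command line. The sketch below is only an illustration of that option: the platform names, the `--platform` flag, and the `parse_args` helper are assumptions, not existing rnr-refresh code.

```python
import argparse

# Hypothetical platform names; configurations for them are assumed to exist already.
KNOWN_PLATFORMS = ("hera", "azure")


def parse_args(argv=None):
    """Parse one experiment config path plus the platform chosen at runtime."""
    parser = argparse.ArgumentParser(description="Run an experiment (illustrative only).")
    parser.add_argument("config", help="path to the experiment configuration file")
    parser.add_argument(
        "--platform",
        choices=KNOWN_PLATFORMS,
        required=True,
        help="platform whose pre-built configuration should be used",
    )
    return parser.parse_args(argv)


# The same experiment file can be reused on either platform.
args = parse_args(["basic_experiment.yaml", "--platform", "hera"])
print(args.config, args.platform)
```

Restricting `choices` to known platforms means a typo in the platform name fails at the command line rather than inside the workflow.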

Looking Forward

After this is complete, we should consider turning this basic experiment into a good testing experiment and/or a quick-start for learning the software.

Implement some python to run Cylc 8 tasks

Description

For the sake of accomplishing the MVP, this just needs to be able to

  • Read in a configuration file
  • Call Cylc to run the Launch task
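A minimal sketch of these two steps might look like the following. It assumes a JSON config purely as a placeholder (the format is still an open question), and the keys `workflow_name` and `workflow_source` are hypothetical. The commands use `cylc install` and `cylc play` from the Cylc 8 command line; the exact options rnr-refresh will need are not decided here.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def read_config(path):
    """Load the experiment configuration (JSON is only a placeholder format)."""
    return json.loads(Path(path).read_text())


def build_cylc_commands(config):
    """Return the Cylc 8 commands that would install and then start the workflow."""
    name = config["workflow_name"]      # hypothetical key
    source = config["workflow_source"]  # hypothetical key
    return [
        ["cylc", "install", source, "--workflow-name", name],
        ["cylc", "play", name],
    ]


def launch(config, runner=subprocess.run):
    """Run each command in order; `runner` can be swapped out for testing."""
    for command in build_cylc_commands(config):
        runner(command, check=True)


# Show the commands without executing them, so Cylc is not required here.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "experiment.json"
    config_path.write_text(
        json.dumps({"workflow_name": "launch-demo", "workflow_source": "/path/to/workflow/source"})
    )
    for command in build_cylc_commands(read_config(config_path)):
        print(" ".join(command))
```

Keeping command construction separate from execution makes the Python side testable on machines where Cylc is not installed.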

Remaining Questions

  • Class structure for Launch task?
  • Will any python code be needed within the Launch task? How do we avoid redundancy?

Next Steps

After this is complete, we should consider adding more robust testing and debug output, including errors and logs.

Create a `global.cylc` file for project

Description

Need to create a global.cylc file for the rnr-refresh software so that all tasks and workflows can access the available platforms with the same configurations. This is not directly a part of the generic launch task MVP, but will enable it. This task will need to define configurations for:

  • scheduler(s)
  • platforms and (potentially) platform groups
  • task events (e.g., mail)
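To make the shape of such a file concrete, the sketch below renders a small nested dictionary in Cylc's nested-section syntax. The section and setting names (`[scheduler][[run hosts]]`, `[platforms]`, `job runner`, `install target`, `[task events]`) follow the Cylc 8 global configuration layout, but every value is a placeholder: the host names are invented, and the job runner for the Azure cluster is an assumption. Platform groups are left out because it is not yet clear whether they are needed. The `render_cylc` helper is hypothetical and only exists to print the example.

```python
def render_cylc(settings, depth=1):
    """Render a nested dict as Cylc config lines: [a], [[b]], then key = value."""
    lines = []
    indent = "    " * (depth - 1)
    # Plain settings come first so they stay attached to the current section.
    for key, value in settings.items():
        if not isinstance(value, dict):
            lines.append(f"{indent}{key} = {value}")
    # Subsections follow, with one more pair of brackets per nesting level.
    for key, value in settings.items():
        if isinstance(value, dict):
            lines.append(f"{indent}{'[' * depth}{key}{']' * depth}")
            lines.extend(render_cylc(value, depth + 1))
    return lines


# Placeholder values only; none of these host names are real.
GLOBAL_CONFIG = {
    "scheduler": {"run hosts": {"available": "localhost"}},
    "platforms": {
        "hera": {
            "hosts": "hera-login-placeholder",
            "job runner": "slurm",
            "install target": "hera",
        },
        "azure": {
            "hosts": "pw-azure-placeholder",
            "job runner": "slurm",  # assumption for the Parallel Works cluster
            "install target": "azure",
        },
    },
    "task events": {"mail events": "failed"},
}

print("\n".join(render_cylc(GLOBAL_CONFIG)))
```

Once a real file exists, `cylc config` (listed in the next steps) is the way to confirm that Cylc parses it as intended.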

Background

Previously (in Cylc 7), the Cylc "suite" was defined in a suite.rc file. In Cylc 8, Cylc "workflows" have replaced suites, and are defined in flow.cylc files that define the Workflow Configuration. Every workflow first references the global.cylc configuration. Thus, storing core configurations that each workflow will use in this global configuration makes the most sense.

From the doc on Global Configurations in the platforms configuration section:

Many of these settings have replaced those of the same name from the old Cylc 7 suite.rc[runtime][<namespace>][job]/[remote] and global.rc[hosts][<host>] sections.

Here's an example of part of a global config file that was used in Cylc 7 for UFS-RNR (at least, it appears that Steve Lawrence used it).

Next Steps

  • Understand global.cylc files in Cylc 8
  • Create one for rnr-refresh
  • Test that it works when called directly by Cylc on the command-line (cylc config)
  • Test that it works for 2 platforms

Decide on UFS Model version to start out with

Description

The rnr-refresh software needs to run the UFS Model starting at a version we know we can use going forward for testing our DA methods and especially our Large Ensemble runs. An MVP for the Forecast task should have at least 1 functional model version with a reasonable namelist and scientific workflow.

Background

The UFS-RNR software was typically tied to a specific UFS model version per tag, and any updates to the model version required some code re-writing and extensive testing. The last UFS version compiled and supported in UFS-RNR was p7c, while Phil is using p8+ for Replay runs on Azure. The UFS codebase is continually updated, and newer versions are available all the time. We can reasonably expect to have to update the version at some point in the future.

Looking Forward

  • Do we want to follow the same method of tying a supported model version to a release of rnr-refresh? Such as saying "version 1.0.0 of rnr-refresh only supports using UFS version p8b" or something.
    • If so, do we allow the flexibility to use other versions, just be clear that they are not supported?
    • If not, do we want to maintain multiple versions of the UFS model that we support? How do we develop a testing plan for saying which, if any, versions of UFS are supported with a release?
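If the "supported but flexible" option is chosen, the check could be as small as the sketch below. The mapping and the `check_ufs_version` helper are hypothetical; the release `1.0.0` and version `p8b` simply echo the example in the question above and do not represent a decision.

```python
import warnings

# Hypothetical support table: release of rnr-refresh -> supported UFS versions.
SUPPORTED_UFS = {"1.0.0": {"p8b"}}


def check_ufs_version(release, ufs_version, strict=False):
    """Return True if supported; otherwise warn, or raise when strict is set."""
    supported = ufs_version in SUPPORTED_UFS.get(release, set())
    if not supported:
        message = (
            f"UFS {ufs_version} is not a supported model version "
            f"for rnr-refresh {release}"
        )
        if strict:
            raise ValueError(message)
        warnings.warn(message)
    return supported


print(check_ufs_version("1.0.0", "p8b"))
```

With `strict=False`, other versions still run but the user is told clearly that they are unsupported; `strict=True` corresponds to allowing only the supported version.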

Cylc file for generic Launch task

Description

Need to create a flow.cylc style file for the generic launch task as part of the MVP. This task will need to:

  • create directories
  • create a json object of inputs

Background

Previously (in Cylc 7), the Cylc "suite" was defined in a suite.rc file. In Cylc 8, Cylc "workflows" have replaced suites, and are defined in flow.cylc files that define the Workflow Configuration.

Next Steps

  • Understand flow.cylc files in Cylc 8
  • Create one for the Launch task
  • Test that it works when called directly by Cylc on the command-line
  • Test that it works correctly when called by python
  • Test that it works for 2 platforms

Azure Cylc 8 Platform Configuration

Description

For the MVP, create a minimal platform configuration for the Parallel Works (PW) Azure platform that can be used when running the Launch task. This is possibly non-trivial as it may require a new configuration for each new PW cluster we create. If possible, we may want to consider how to make this more flexible from the outset as well.

Background

Previously, the UFS-RNR software was not designed to be flexibly cross-platform, and we want to design that into rnr-refresh from the outset.

Next Steps

Start by understanding the Cylc 8 Platform Configuration options, and customize them as needed to get the Launch task functioning at a minimum. This platform should probably be configured after the Hera configuration is accomplished, just because it is likely to be more complex. At a minimum, it should work for the Launch task on at least one Azure cluster; we may have to open a new issue to make it more flexible later.

Difficulty with passing environment variables between Jinja2 and Cylc 8 config file sections

Description of Problem

It turns out that in Cylc 8 it is non-trivial to pass certain types of variables around within the global, workflow, or task configuration, even within the same file. It becomes even more difficult to do this within conditional statements, user environment variables, or Cylc runtime environment variables.

The Cylc 8 docs are full of examples that use Jinja2 conditional statements to do all kinds of nifty things within the workflow, but these conditional statements have harsh limitations on which types of variables can be used. Cylc's implementation of Jinja2 also limits which features are available in the global configuration, so very little can be passed back and forth without the configuration failing. This issue mostly documents the testing that was done on using these conditional statements within the global config, including customizing runtime configurations based on platform selection (covered in more detail in another issue).

Testing

(TODO: Will add more details here)

References

From the docs

Stack Overflow Questions

Environment Variables

Note that all of these methods were tested and failed due to Cylc's specialized implementation of Jinja2.

Logic operators

Other

Decide on how to organize obs staging

Description

We will need to decide how to perform this task, after taking a closer look at how it was done in UFS-RNR.

Background

Todo

Looking Forward

Todo

Decide on what the new "Launch" task should accomplish

Description

The rnr-refresh software needs to reproduce at least some of the base functionality of the UFS-RNR software to be able to launch a Cylc workflow (the Cylc 8 term for what Cylc 7 called a "suite"). A minimum viable product (MVP) of the launch task, to demonstrate how rnr-refresh will work, means being able to accomplish the launch task on at least 2 platforms: 1 HPC system (Hera) and 1 Parallel Works cloud (Azure). This issue captures the decision-making process about what to include in the new launch task as we compare against and build on the UFS-RNR version.

Background

In UFS-RNR, the main python runscript is called with an experiment YAML input, and the launch task itself is accomplished across several files. The main runscript (via the run function in the CylcRun class) does the following:

  • collects CLI args
  • collects vars from YAML
  • "builds" cylc suite
  • launches cylc suite (via a call to the launch_cylc function, which in turn calls the run function in cylcutil.cylc_interface.CylcLauncher)

The cylcutil Class is imported from ush by the main runscript. The run function in CylcLauncher does the following:

  • calls register_suite function, which
    • adds to log and error files in specific directories
    • sends a bash command via subprocess that calls the cylc_app class to register the cylc suite with the YAML object for the experiment, the suite path, and run dir
  • calls the run_suite function, which
    • adds to the same log and error files
    • sends another bash command via subprocess that calls cylc_app again with the run command and the YAML experiment object
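The pattern described above, running each command through subprocess while appending to shared log and error files, can be sketched generically as below. This is not the UFS-RNR code: `run_logged` and the file names are hypothetical, the commands are passed as argument lists rather than bash strings, and a harmless Python one-liner stands in for the real register and run commands.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_logged(command, log_file, err_file):
    """Run a command, appending its stdout to log_file and stderr to err_file."""
    with open(log_file, "a") as log, open(err_file, "a") as err:
        return subprocess.run(command, stdout=log, stderr=err).returncode


with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "cylc.log"
    err_file = Path(tmp) / "cylc.err"
    # Stand-in commands: a real launcher would pass its register and run commands.
    for step in ("register", "run"):
        run_logged([sys.executable, "-c", f"print('{step} step')"], log_file, err_file)
    print(log_file.read_text())
```

Opening the files in append mode is what lets both steps add to the same log and error files, as the original functions do.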

Those Cylc commands in turn initiate the Cylc suite(s) including the runtime suite (where the launch task is actually defined), which then:

  • initiates a bash script for the job, which then
  • runs a python script for the job, which then
    • creates a launcher from ufs_rnr.launcher.Launcher function
    • calls the run function for the Launcher

The actual Launcher Class run function then calls other functions to:

  • build directories for the experiment
  • create a json file for other tasks to reference
  • call the run function from the class UFSRNRCylcStatus, which then does some more Singularity-related tasks (a full list is not included here)

Relevant files in UFS-RNR

These links are stored here as a reference for mapping out the Launch task functionality present in UFS-RNR, and to help facilitate decisions about what to bring forward (and how).

Starting

The launch task is a core task that is used in all runs of UFS-RNR, thus the best path to follow is starting from a baseline experiment. The experiments are initiated by the runscript with a YAML path input. The configuration files and the runscript can be found in the cylc directory as follows:

Python cylcutil Class

The cylcutil class is imported by the main run script, which launches the cylc suite.

Cylc Suite Files

Several Cylc suites are relevant because there is a main suite for the experiment combined with other general runtime suites.

Environment Files

Environment files that are used in the bash job are platform-specific, even for the task itself.

Job Files

Python ufs_rnr Class

Looking Forward

Basically, the question is: how much do we want to reproduce and/or bring forward?

How to customize runtime configurations to be dependent on platform choice

Description

The basic idea is that we need to decide how to smoothly switch between runtime configurations based on the selected platform for the workflow. Cylc 8 enables customizing the runtime environment to match the platform and, hypothetically, attempts to make it easy to customize platform configurations. However, the failures in the testing for #11 raise questions about how smoothly this can actually be configured in Cylc 8.
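One option worth weighing is to resolve the platform-specific settings in Python before anything is handed to Cylc, which avoids relying on Jinja2 conditionals. The sketch below shows only that idea: the setting names, paths, and the `runtime_for` helper are hypothetical, and it is not a claim about how Cylc itself resolves platform settings.

```python
from copy import deepcopy


def merge(base, override):
    """Return a copy of base with override applied, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults shared by every platform.
DEFAULT_RUNTIME = {
    "platform": "localhost",
    "environment": {"SCRATCH": "/tmp", "MODULES": ""},
}

# Hypothetical per-platform overrides; only the differences are listed.
PLATFORM_OVERRIDES = {
    "hera": {"platform": "hera", "environment": {"SCRATCH": "/path/to/hera/scratch"}},
    "azure": {"platform": "azure", "environment": {"SCRATCH": "/path/to/azure/scratch"}},
}


def runtime_for(platform):
    """Combine the shared defaults with the overrides for one platform."""
    return merge(DEFAULT_RUNTIME, PLATFORM_OVERRIDES[platform])


print(runtime_for("hera"))
```

Listing only the differences per platform keeps the shared settings in one place, at the cost of moving some configuration logic out of the Cylc files.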

TODO: Add more details.

References

TODO: Will add a bunch of open tabs...
