l2ss-py-autotest's Introduction

l2ss-py-autotest

This repository contains functional/integration tests for l2ss-py. It also includes GitHub Actions workflows that automatically run these tests whenever a new collection is associated with the l2ss-py UMM-S record.

How it works

  1. Every 5 minutes, the cmr_association_diff.py script runs against UAT and OPS. It compares the collection concept IDs in tests/cmr/l2ss-py/*_associations.txt with the associations in CMR (see diff.yml).
  2. For every collection concept ID that is associated in CMR but does NOT appear in the .txt file in this repository, a new PR is opened in this repository with the new collection concept ID as its title and branch name.
  3. When a pull request is created or updated in this repository and the base branch name starts with diff/uat or diff/ops, the tests are executed for that collection (see verify.yml).
  4. The test results are recorded as a status check on the PR:
    1. If all tests pass: the PR is labeled "verified" and automatically merged.
    2. If any test fails or hits an unknown error: the PR is labeled "bug" and "failed verification" and remains open.
    3. If any tests are skipped: the PR is labeled "unverified" and remains open.
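
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in cmr_association_diff.py or verify.yml: the function names new_associations and labels_for, the outcome strings, and the rule that a failure takes precedence over a skip are all assumptions made for the example.

```python
def new_associations(local_lines, cmr_ids):
    """Return concept IDs associated in CMR but not yet recorded locally.

    local_lines: lines read from an *_associations.txt file.
    cmr_ids: concept IDs currently associated in CMR (fetched elsewhere).
    """
    recorded = {line.strip() for line in local_lines if line.strip()}
    seen = set()
    missing = []
    for concept_id in cmr_ids:
        # Keep CMR order and report each new ID only once.
        if concept_id not in recorded and concept_id not in seen:
            seen.add(concept_id)
            missing.append(concept_id)
    return missing


def labels_for(outcomes):
    """Map per-test outcomes to (PR labels, auto-merge flag).

    Assumed outcome strings: 'passed', 'failed', 'error', 'skipped'.
    Assumption: a failure or error outranks a skip, and an empty
    result set is treated as unverified.
    """
    if any(o in ("failed", "error") for o in outcomes):
        return ["bug", "failed verification"], False
    if not outcomes or any(o == "skipped" for o in outcomes):
        return ["unverified"], False
    return ["verified"], True
```

Each ID returned by new_associations would correspond to one PR, and labels_for decides what happens to that PR once its tests finish.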

What to do if tests fail

If a test fails (meaning an assertion did not succeed) or an unknown error occurs, action must be taken: determine the cause of the failure and fix it. A failing test generally indicates an issue with either the metadata or l2ss-py itself and may require additional steps. In some cases, the test may need to be updated to account for a unique edge case.

What to do if tests are skipped

Generally, a skipped test indicates that verification could not be completed. Tests get skipped in a few situations (for example, in UAT when no UMM-Var records are associated with the collection). When this happens, one of two things can be done:

  • Comment on the PR explaining why it is acceptable not to verify that collection, and ask a repository admin to merge the PR manually.
  • Fix whatever caused the test to be skipped. For example, if it was skipped because there are no UMM-Var entries in UAT, add UMM-Var entries to UAT and re-run the failed check.

l2ss-py-autotest's Issues

Lat/lon variables can be in root group

The current code assumes that all variables are in the last group of the file. SNDR has lat/lon variables at the same level as some groups, so they get missed. The code should first check for lat/lon variables in the root group and only then check the groups in turn. This fixes SNDR, ACOS, and OCO3.

Regression test for OPS ISSUES

FAILED:
C1729925099-GES_DISC (ML2CH3CL)
C1627516298-GES_DISC (S5P_L2__NO2____HiR)
C1729925474-GES_DISC (ML2HOCL)
C1729925175-GES_DISC (ML2HCL)
C1729925263-GES_DISC (ML2HNO3)
C1729925101-GES_DISC (ML2CH3CN)
C1729926922-GES_DISC (ML2T)
C1729925154-GES_DISC (ML2H2O)
C1627516290-GES_DISC (S5P_L2__AER_LH_HiR)
C1239966791-GES_DISC (OMCLDRR)
C1239966787-GES_DISC (OMCLDO2)
C1239966829-GES_DISC (OMOCLO)
C1729925130-GES_DISC (ML2CO)
C2087216530-GES_DISC (S5P_L2__CH4____HiR)
C1239966755-GES_DISC (OMAERO)
C1729925083-GES_DISC (ML2BRO)
C2646932894-POCLOUD (CYGNSS_L2_SURFACE_FLUX_CDR_V1.2)
C1729925686-GES_DISC (ML2N2O)
C1627516287-GES_DISC (S5P_L2__CO_____HiR)
C1729925584-GES_DISC (ML2IWC)
C1239966842-GES_DISC (OMNO2)
C1442068505-GES_DISC (S5P_L2__CH4___)
C1239966818-GES_DISC (OMTO3)
C1729926181-GES_DISC (ML2OH)
C1239966768-GES_DISC (OMAERUV)
C1729925129-GES_DISC (ML2CLO)
C1729926699-GES_DISC (ML2SO2)
C1729925806-GES_DISC (ML2O3)
C1729925104-GES_DISC (ML2CH3OH)
C1729925368-GES_DISC (ML2HO2)
C1239966779-GES_DISC (OMHCHO)
C1239966794-GES_DISC (OMDOAO3)
C1239966810-GES_DISC (OMPIXCOR)
C1729925152-GES_DISC (ML2GPH)
C2832221740-POCLOUD (SMAP_RSS_L2_SSS_V6)
C1729926467-GES_DISC (ML2RHI)
C1729925178-GES_DISC (ML2HCN)
C1239966837-GES_DISC (OMSO2)
C1239966827-GES_DISC (OMO3PR)

Regression test for UAT ISSUES

FAILED:
C1215720341-GES_DISC (OMSO2)
C1229246436-GES_DISC (S5P_L2__SO2____HiR)
C1215720123-GES_DISC (OMCLDRR)
C1215720346-GES_DISC (OMCLDO2)
C1240921715-GES_DISC (S5P_L2__CH4____HiR)
C1234666455-GES_DISC (ML2CO)
C1234666477-GES_DISC (ML2IWC)
C1234666453-GES_DISC (ML2CH3OH)
C1229246434-GES_DISC (S5P_L2__O3_TOT_HiR)
C1234666469-GES_DISC (ML2O3)
C1229246431-GES_DISC (S5P_L2__NO2____HiR)
C1215720348-GES_DISC (OMAERO)
C1234666465-GES_DISC (ML2HCN)
C1234666476-GES_DISC (ML2HOCL)
C1234666458-GES_DISC (ML2H2O)
C1215720327-GES_DISC (OMOCLO)
C1215720121-GES_DISC (OMO3PR)
C1220280436-GES_DISC (S5P_L2__O3_TOT)
C1215720126-GES_DISC (OMNO2)
C1234666468-GES_DISC (ML2HO2)
C1220280437-GES_DISC (S5P_L2__SO2___)
C1229246435-GES_DISC (S5P_L2__HCHO___HiR)
C1220280440-GES_DISC (S5P_L2__AER_LH)
C1240921330-GES_DISC (S5P_L2__AER_AI_HiR)
C1234666340-GES_DISC (ML2CH3CL)
C1234666454-GES_DISC (ML2CLO)
C1240921714-GES_DISC (S5P_L2__AER_LH_HiR)
C1215720117-GES_DISC (OMAERUV)
C1240921350-GES_DISC (S5P_L2__CO_____HiR)
C1240921713-GES_DISC (S5P_L2__NO2____HiR)
C1234666333-GES_DISC (ML2BRO)
C1234666374-GES_DISC (ML2T)
C1234666344-GES_DISC (ML2CH3CN)
C1215720349-GES_DISC (OMDOAO3)
C1220280439-GES_DISC (S5P_L2__CH4___)
C1234666359-GES_DISC (ML2GPH)
C1215715440-GES_DISC (OMPIXCOR)
C1234666464-GES_DISC (ML2HCL)
C1236469821-GES_DISC (S5P_L2__O3_TOT_HiR)
C1215720343-GES_DISC (OMTO3)
C1220280430-GES_DISC (S5P_L2__HCHO__)
C1215720125-GES_DISC (OMHCHO)
C1236469820-GES_DISC (S5P_L2__CLOUD__HiR)
C1234666373-GES_DISC (ML2SO2)
C1234666467-GES_DISC (ML2HNO3)
C1220280433-GES_DISC (S5P_L2__NO2___)
C1234666479-GES_DISC (ML2N2O)
C1234666372-GES_DISC (ML2RHI)
SKIPPED:
C1229246433-GES_DISC ()
C1229246440-GES_DISC ()
C1240560242-GES_DISC (GPM_2AGPROFGPMGMI)
C1229246439-GES_DISC (S5P_L2__AER_LH_HiR)
C1254854962-LARC_CLOUD (TEMPO_O3TOT_L2)
C1261459666-OB_CLOUD ()
C1240560227-GES_DISC (GPM_2AGPROFAQUAAMSRE_CLIM)

Regression test for OPS ISSUES

Updated on 06-01-2024

FAILED:
C2646932894-POCLOUD (CYGNSS_L2_SURFACE_FLUX_CDR_V1.2)
C1918210292-GES_DISC (S5P_L2__SO2____HiR)
C2036882048-POCLOUD (CYGNSS_L2_CDR_V1.0)
C2832221740-POCLOUD (SMAP_RSS_L2_SSS_V6)
C2087216530-GES_DISC (S5P_L2__CH4____HiR)
C1729926922-GES_DISC (ML2T)
C1918210023-GES_DISC (S5P_L2__HCHO___HiR)
C1627516298-GES_DISC (S5P_L2__NO2____HiR)
C1442068508-GES_DISC (S5P_L2__SO2___)
C1442068505-GES_DISC (S5P_L2__CH4___)
C1729925806-GES_DISC (ML2O3)

Add a nightly job to re-run existing collections through the tests

In order to make sure that the existing collections don't start failing tests due to outside influences, create a nightly job to re-run existing collections through the tests.

This issue may run into trouble due to the sheer volume of collections and the API timing out. It may require some sort of pagination or batching so that not every job tries to run at once.
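
One way to picture the batching idea is the sketch below. It is only an illustration under assumptions: the helper name batches and the batch size are hypothetical, and how each batch would be scheduled (for example as separate workflow jobs) is left open.

```python
def batches(concept_ids, size):
    """Yield consecutive slices of at most `size` concept IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(concept_ids), size):
        yield concept_ids[start:start + size]


# Example: five collections split into groups of two.
example = list(batches(["C1", "C2", "C3", "C4", "C5"], 2))
```

Running one batch at a time, or capping how many run in parallel, would keep every collection from hitting the API at once.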

Update the l2ss-py autotest test so it warns the user if the UMM-Var coordinates are missing or not set up properly

We need this so that we can see the issue directly in the GitHub Action instead of having to run the Harmony request ourselves and look through the Harmony error logs.

Should add it around here:
https://github.com/podaac/l2ss-py-autotest/blob/main/tests/verify_collection.py#L373

Not all DAACs will require UMM-Var coordinates to be defined, so we will just send a warning instead of failing. If it turns out that all DAACs should have them defined, then we should raise an error early instead.

Regression test for UAT ISSUES

Updated on 06-01-2024

FAILED:
C1234666374-GES_DISC (ML2T)
C1256946216-ASDC_DEV2 (CLARREO_SIMTEST_L1A)
C1236469820-GES_DISC (S5P_L2__CLOUD__HiR)
C1262900002-LARC_CLOUD (TEMPO_O3PROF_L2)
C1220280433-GES_DISC (S5P_L2__NO2___)
C1220280439-GES_DISC (S5P_L2__CH4___)
C1220280440-GES_DISC (S5P_L2__AER_LH)
C1240921715-GES_DISC (S5P_L2__CH4____HiR)
C1229246434-GES_DISC (S5P_L2__O3_TOT_HiR)
C1240921350-GES_DISC (S5P_L2__CO_____HiR)
C1229246436-GES_DISC (S5P_L2__SO2____HiR)
C1229246435-GES_DISC (S5P_L2__HCHO___HiR)
C1215720436-GES_DISC (OMNO2G)
C1240921330-GES_DISC (S5P_L2__AER_AI_HiR)
C1220280430-GES_DISC (S5P_L2__HCHO__)
C1229246431-GES_DISC (S5P_L2__NO2____HiR)
C1236469823-GES_DISC (S5P_L2__SO2____HiR)
C1240921713-GES_DISC (S5P_L2__NO2____HiR)
C1240921714-GES_DISC (S5P_L2__AER_LH_HiR)
C1258237271-POCLOUD ()
C1220280437-GES_DISC (S5P_L2__SO2___)
C1258816710-ASDC_DEV2 (PREFIRE_SAT1_2B-ATM)
C1220280436-GES_DISC (S5P_L2__O3_TOT)
C1236469821-GES_DISC (S5P_L2__O3_TOT_HiR)
C1238621219-POCLOUD ()
SKIPPED:
C1229246440-GES_DISC ()
C1240560227-GES_DISC (GPM_2AGPROFAQUAAMSRE_CLIM)
C1261459666-OB_CLOUD ()
C1254854962-LARC_CLOUD (TEMPO_O3TOT_L2)
C1229246439-GES_DISC (S5P_L2__AER_LH_HiR)
C1240560242-GES_DISC (GPM_2AGPROFGPMGMI)
C1229246433-GES_DISC ()

Add slack notification or Github Issue for nightly test failures/skips

We have a slack notification that looks like this, coming from Jenkins:

Failed to run Nightly Build for L2SS Notebook on the following collections
UAT: C1238687282-POCLOUD C1238538224-POCLOUD C1238538232-POCLOUD C1240817851-POCLOUD C1238538240-POCLOUD C1242735870-POCLOUD C1238538230-POCLOUD C1238538233-POCLOUD C1240739526-POCLOUD C1238538225-POCLOUD C1238658088-POCLOUD C1240739691-POCLOUD C1238543220-POCLOUD C1240739688-POCLOUD C1256420924-POCLOUD C1256420925-POCLOUD C1240739704-POCLOUD C1240739719-POCLOUD C1240739726-POCLOUD C1240739734-POCLOUD C1240739764-POCLOUD C1240739577-POCLOUD C1245295750-POCLOUD C1240739713-POCLOUD C1240739709-POCLOUD C1240739768-POCLOUD C1245295751-POCLOUD C1240739611-POCLOUD C1240739606-POCLOUD C1244459498-POCLOUD C1244810554-POCLOUD C1238621178-POCLOUD C1238621182-POCLOUD C1238621091-POCLOUD C1238538231-POCLOUD C1238543223-POCLOUD C1238538241-POCLOUD C1241042620-POCLOUD C1241042621-POCLOUD C1256524295-POCLOUD C1238621186-POCLOUD C1238658080-POCLOUD C1238658086-POCLOUD C1234208437-POCLOUD C1243175554-POCLOUD C1256507988-POCLOUD C1261072645-POCLOUD C1259115166-POCLOUD C1259115177-POCLOUD C1256122852-POCLOUD C1256507989-POCLOUD C1256783381-POCLOUD C1256507990-POCLOUD C1261072646-POCLOUD C1261072654-POCLOUD C1259115167-POCLOUD C1261072648-POCLOUD C1261072656-POCLOUD C1261072655-POCLOUD C1256783382-POCLOUD C1256783386-POCLOUD C1256783391-POCLOUD C1256783388-POCLOUD C1261072658-POCLOUD C1256445396-POCLOUD C1261072659-POCLOUD
OPS: C2068529568-POCLOUD C2746966926-POCLOUD C2601581863-POCLOUD C2601584109-POCLOUD C2799465428-POCLOUD C2746966927-POCLOUD C2601583089-POCLOUD C2799465497-POCLOUD C2746966657-POCLOUD C2799465507-POCLOUD C2784494745-POCLOUD
https://devops1.jpl.nasa.gov:8443/job/Data_Transformation_Visualization_Analysis/job/tva-automation-tools/job/harmony_l2ss_nightly/278/console

We need something similar from GitHub Actions.
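
As a rough sketch of what the GitHub Actions side could produce, the function below builds a message in the same shape as the Jenkins one. The function name build_failure_message, the header wording, and the input dictionary are assumptions for illustration. It only builds the JSON body (a "text" field, as accepted by Slack incoming webhooks); actually posting it is left out.

```python
import json


def build_failure_message(failures, run_url):
    """Build a Slack-style JSON payload listing failed collections per environment.

    failures: dict mapping an environment name (e.g. "UAT") to a list of
              concept IDs; insertion order is kept in the output.
    run_url:  link to the workflow run that produced the failures.
    """
    lines = ["Failed to run nightly l2ss-py-autotest on the following collections"]
    for env, concept_ids in failures.items():
        if concept_ids:  # Skip environments with nothing to report.
            lines.append(f"{env}: {' '.join(concept_ids)}")
    lines.append(run_url)
    return json.dumps({"text": "\n".join(lines)})


payload = build_failure_message(
    {"UAT": ["C1238687282-POCLOUD", "C1238538224-POCLOUD"],
     "OPS": ["C2068529568-POCLOUD"]},
    "https://example.invalid/run/1",  # placeholder URL, not a real run
)
```

The same text could equally be used as the body of a GitHub issue if that route is preferred over Slack.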

Verify_collections assumes the last group will be the dataset with lat and lon

Lines 341 through 347 need to be improved to check each group for the lat/lon variables and then read the matching netCDF group into an xarray dataset. The lat and lon variable names should be defined before this code block.

with netCDF4.Dataset(subsetted_filepath) as f:
    for g in f.groups:
        ds = xarray.open_dataset(subsetted_filepath, group=g, decode_times=False)
        if len(ds.variables):
            group = g
            subsetted_ds = ds
        else:
            ds.close()

lat_var_name, lon_var_name = get_lat_lon_var_names(subsetted_ds, collection_variables)
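
A possible shape for the improved search is sketched below. It is not the repository's fix: find_lat_lon_group is a hypothetical helper, and the candidate name lists are assumed to be known beforehand. It relies only on each node exposing .variables and .groups mappings, which netCDF4 Dataset and Group objects do; the stand-in objects here avoid needing a real file.

```python
from collections import deque
from types import SimpleNamespace


def find_lat_lon_group(root, lat_names, lon_names):
    """Return (group_path, lat_name, lon_name) for the first group holding both.

    The root group is checked first, then child groups breadth-first.
    Returns None if no group contains both a lat and a lon variable.
    """
    queue = deque([("/", root)])
    while queue:
        path, node = queue.popleft()
        names = set(node.variables)
        lat = next((n for n in lat_names if n in names), None)
        lon = next((n for n in lon_names if n in names), None)
        if lat and lon:
            return path, lat, lon
        for name, child in node.groups.items():
            queue.append((path.rstrip("/") + "/" + name, child))
    return None


# Stand-ins for files: one with lat/lon in the root, one with them in a group.
root_style = SimpleNamespace(
    variables={"lat": None, "lon": None},
    groups={"aux": SimpleNamespace(variables={"flag": None}, groups={})},
)
group_style = SimpleNamespace(
    variables={},
    groups={"data": SimpleNamespace(
        variables={"latitude": None, "longitude": None}, groups={})},
)
root_hit = find_lat_lon_group(root_style, ["lat", "latitude"], ["lon", "longitude"])
group_hit = find_lat_lon_group(group_style, ["lat", "latitude"], ["lon", "longitude"])
```

Once a group is found, only that group would need to be opened as an xarray dataset, rather than whichever group happens to come last.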

GitHub always shows merge conflicts when multiple PRs opened

Problem:

When multiple new associations are found and the tests pass for all of them, the first PR that gets auto-merged adds a line to the end of the associations.txt file. This causes a merge conflict for any other open PR and prevents them from getting auto-merged.

Analysis:

Git generally does not work well with append-only files. Most solutions suggest using a custom merge driver:

https://stackoverflow.com/q/11841127
https://stackoverflow.com/q/34795604

However, it is not possible to have github.com use a custom merge driver for auto-merging. We need some mechanism that avoids merge conflicts if we want to continue using the "auto-merge on tests pass" strategy.

A few options were discussed:

  • Add a line at the end of the file (e.g. '# End of file') and always insert the concept IDs before that line to avoid the merge conflict. This did not work because git still treats it as a merge conflict.
  • Prepend to the file instead of appending. This did not work because git still sees it as a merge conflict.
  • Use a rebase action to rebase all open PRs after pushes to main. This did not work either, because the rebase failed with a merge conflict.
  • Commit an individual file per collection instead of using a single file that gets appended to. This avoids merge conflicts because each PR commits a distinct file.
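
The last option can be illustrated with a small sketch. The directory layout and the helper names record_association and list_associations are assumptions for the example, not the repository's actual structure: each association becomes an empty file named after the concept ID, so two PRs never touch the same path.

```python
import tempfile
from pathlib import Path


def record_association(directory, concept_id):
    """Create one file per collection, named after its concept ID."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / concept_id
    path.touch(exist_ok=True)
    return path


def list_associations(directory):
    """Return the recorded concept IDs, derived from the file names."""
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    record_association(tmp, "C2832221740-POCLOUD")
    record_association(tmp, "C1234666374-GES_DISC")
    recorded = list_associations(tmp)
```

Because adding a file never edits an existing one, merging one PR leaves the others conflict-free.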

Add test to recheck all already associated collections every 3 days

Add nightly test to recheck all already associated collections to make sure that all current associations are still passing. This helps to make sure collections are still in good health.

Also, report any failed collections to Slack and open a new PR in failed_test state.

Regression test for UAT ISSUES

FAILED:
C1215720343-GES_DISC (OMTO3)
C1242387586-POCLOUD (SWOT_SIMULATED_L2_KARIN_SSH_GLORYS_CALVAL_V1)
C1234666479-GES_DISC (ML2N2O)
C1234666477-GES_DISC (ML2IWC)
C1215720341-GES_DISC (OMSO2)
C1234666455-GES_DISC (ML2CO)
C1234666453-GES_DISC (ML2CH3OH)
C1234666340-GES_DISC (ML2CH3CL)
C1215720123-GES_DISC (OMCLDRR)
C1234666465-GES_DISC (ML2HCN)
C1234666467-GES_DISC (ML2HNO3)
C1229246436-GES_DISC (S5P_L2__SO2____HiR)
C1215720126-GES_DISC (OMNO2)
C1242387601-POCLOUD (SWOT_SIMULATED_L2_KARIN_SSH_ECCO_LLC4320_CALVAL_V1)
C1234666374-GES_DISC (ML2T)
C1215720348-GES_DISC (OMAERO)
C1242387624-POCLOUD (SWOT_SIMULATED_L2_NADIR_SSH_GLORYS_SCIENCE_V1)
C1234666373-GES_DISC (ML2SO2)
C1242387620-POCLOUD (SWOT_SIMULATED_L2_NADIR_SSH_ECCO_LLC4320_CALVAL_V1)
C1234666476-GES_DISC (ML2HOCL)
C1242387602-POCLOUD (SWOT_SIMULATED_L2_NADIR_SSH_GLORYS_CALVAL_V1)
C1234666469-GES_DISC (ML2O3)
C1215720346-GES_DISC (OMCLDO2)
C1215720125-GES_DISC (OMHCHO)
C1215720117-GES_DISC (OMAERUV)
C1234666333-GES_DISC (ML2BRO)
C1234666468-GES_DISC (ML2HO2)
C1215720121-GES_DISC (OMO3PR)
C1234666464-GES_DISC (ML2HCL)
C1234666359-GES_DISC (ML2GPH)
C1220280433-GES_DISC (S5P_L2__NO2___)
C1215715440-GES_DISC (OMPIXCOR)
C1220280439-GES_DISC (S5P_L2__CH4___)
C1234666458-GES_DISC (ML2H2O)
C1234666344-GES_DISC (ML2CH3CN)
C1242387621-POCLOUD (SWOT_SIMULATED_L2_KARIN_SSH_GLORYS_SCIENCE_V1)
C1242387592-POCLOUD (SWOT_SIMULATED_L2_KARIN_SSH_ECCO_LLC4320_SCIENCE_V1)
C1234666454-GES_DISC (ML2CLO)
C1215720349-GES_DISC (OMDOAO3)
C1242387600-POCLOUD (SWOT_SIMULATED_L2_NADIR_SSH_ECCO_LLC4320_SCIENCE_V1)
C1234666372-GES_DISC (ML2RHI)
C1215720327-GES_DISC (OMOCLO)
SKIPPED:
C1240560227-GES_DISC (GPM_2AGPROFAQUAAMSRE_CLIM)
C1229246433-GES_DISC ()
C1254854962-LARC_CLOUD (TEMPO_O3TOT_L2)
C1240560242-GES_DISC (GPM_2AGPROFGPMGMI)
C1229246439-GES_DISC (S5P_L2__AER_LH_HiR)
C1261459666-OB_CLOUD ()
C1229246440-GES_DISC ()
