maap-ci-stage-io

A bootstrapping repository for running the stage-in and stage-out steps of the MAAP CI/CD build. This README also contains helpful information and tutorials on how to debug CWL files.

How to Debug CWL Files

CWL hides much of what it does at runtime, so a practical debugging strategy is to inject JavaScript into the CWL file. Take the following official CWL example:

cwlVersion: v1.0
class: CommandLineTool
baseCommand: cat

hints:
  DockerRequirement:
    dockerPull: alpine

inputs:
  in1:
    type: File
    inputBinding:
      position: 1
      # Asks CWL to submit this object's 'basename' field as an argument to the baseCommand
      valueFrom: $(self.basename)

requirements:
  InitialWorkDirRequirement:
    listing:
      - $(inputs.in1)

outputs:
  out1: stdout

Problem

If you copy this example as-is and adapt it to call a script that expects a filename as an argument, there is a good chance that it will fail with a "file does not exist" error, even though CWL successfully detected and mounted the file.

Why does this happen? self.basename is a JavaScript expression that reads the basename field of the self object, which here is the File passed to in1. The basename field holds only the filename. It does not include the directory inside the Docker container where the file is mounted. If your container uses a working directory that differs from the directory where in1 is mounted, the bare filename does not resolve and your script cannot find the file.
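
The failure can be reproduced outside of CWL with a minimal sketch in plain JavaScript (for example under Node.js). The object below is a hypothetical stand-in for what CWL exposes as self, using the filename and mount path from the debug output shown later in this README. The helper resolveFromCwd and the working directory /app are assumptions made for illustration only.

```javascript
// Hypothetical stand-in for the File object that CWL exposes as `self`.
const inputFile = {
  basename: "geoBoundaries-GAB-ADM0.geojson",
  path: "/PpwdnD/geoBoundaries-GAB-ADM0.geojson",
};

// Mimics how a process resolves a path argument: absolute paths are used
// as-is, relative ones are joined onto the current working directory.
function resolveFromCwd(cwd, arg) {
  return arg.startsWith("/") ? arg : cwd.replace(/\/$/, "") + "/" + arg;
}

// "/app" is an assumed working directory that differs from the mount point.
const viaBasename = resolveFromCwd("/app", inputFile.basename);
const viaPath = resolveFromCwd("/app", inputFile.path);

console.log(viaBasename); // /app/geoBoundaries-GAB-ADM0.geojson
console.log(viaPath);     // /PpwdnD/geoBoundaries-GAB-ADM0.geojson
```

The first result points at a location where nothing was mounted, which is what produces the "file does not exist" error. The second is the location of the mounted file.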

Solution

To diagnose this issue using JavaScript, add the following entry to the requirements dictionary within the CWL:

requirements:
  InlineJavascriptRequirement: {}

This allows CWL to run JavaScript function bodies written inside ${...} (note the curly braces, not parentheses). Now we can replace $(self.basename) with the following:

baseCommand: echo
inputs:
  in1:
    type: File
    inputBinding:
      position: 1
      # Defines a JavaScript function body whose return value is submitted as the argument to the baseCommand
      valueFrom: ${ return "\"" + JSON.stringify(self) + "\""; }
# other stuff...
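
The body of a ${...} expression behaves like the body of a function that has self in scope. As a rough approximation in plain JavaScript (not a description of how any particular CWL runner evaluates expressions), the debug expression can be tried on its own. The helper evalExpressionBody and the sample object are hypothetical.

```javascript
// Hypothetical helper: treats the text between ${ and } as a function body
// and calls it with `self` bound to the given object.
function evalExpressionBody(body, selfValue) {
  return new Function("self", body)(selfValue);
}

// Hypothetical sample of the object CWL passes as `self` for a File input.
const sampleFile = { class: "File", basename: "example.geojson" };

// Same statement as in the valueFrom field: wrap the JSON text in quotes.
const body = 'return "\\"" + JSON.stringify(self) + "\\"";';
const argument = evalExpressionBody(body, sampleFile);

console.log(argument); // "{"class":"File","basename":"example.geojson"}"
```

Because baseCommand is set to echo, the returned string is simply printed, which makes the whole object visible in the tool's output.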

With this change, the full object associated with in1 is passed to the baseCommand as a string. As an example, the following string appears when the baseCommand runs (prettified for the purposes of this README):

{
  "class": "File",
  "location": "file:///data/home/hysdsops/zhan/artifact-deposit-repo/jplzhan/gedi-subset/main/geoBoundaries-GAB-ADM0.geojson",
  "size": 51350,
  "basename": "geoBoundaries-GAB-ADM0.geojson",
  "nameroot": "geoBoundaries-GAB-ADM0",
  "nameext": ".geojson",
  "path": "/PpwdnD/geoBoundaries-GAB-ADM0.geojson",
  "dirname": "/PpwdnD"
}

Notice that the object has many more fields than just basename. We can guess that self.dirname refers to the directory inside the Docker container where the input file is mounted (and you can test that it is), and that self.path is the absolute path to the mounted file inside the container. In addition, you can use other fields such as self.size and self.nameext to perform operations that depend on the size and type of the file.
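
The relationships between these fields can be checked with a short sketch in plain JavaScript, using the values from the output above. The 1 MiB size limit and the helper name describeInput are hypothetical, chosen only to illustrate operations on size and nameext.

```javascript
// File object copied from the debug output above (location omitted).
const debugFile = {
  class: "File",
  size: 51350,
  basename: "geoBoundaries-GAB-ADM0.geojson",
  nameroot: "geoBoundaries-GAB-ADM0",
  nameext: ".geojson",
  path: "/PpwdnD/geoBoundaries-GAB-ADM0.geojson",
  dirname: "/PpwdnD",
};

// dirname + "/" + basename reproduces path; nameroot + nameext reproduces basename.
const pathMatches = debugFile.dirname + "/" + debugFile.basename === debugFile.path;
const nameMatches = debugFile.nameroot + debugFile.nameext === debugFile.basename;

// Hypothetical check: accept only GeoJSON files up to an assumed size limit.
function describeInput(file, maxBytes) {
  const isGeoJson = file.nameext === ".geojson";
  const smallEnough = file.size <= maxBytes;
  return isGeoJson && smallEnough ? "accepted" : "rejected";
}

console.log(pathMatches, nameMatches);              // true true
console.log(describeInput(debugFile, 1024 * 1024)); // accepted
```

The same comparisons can be written inside a ${...} expression in the CWL file, where self takes the place of debugFile.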

This information corresponds to the Fields table described under File type inputs in the official documentation, but using JavaScript lets us inspect the actual values at runtime and identify serious problems in the examples provided by the CWL tutorials.

Namely, we learn that the following block is non-essential and can be removed, because the file is accessible through $(self.path) alone:

  InitialWorkDirRequirement:
    listing:
      - $(inputs.in1)

Contributors

jplzhan
