cwl-upgrader's Introduction

Common Workflow Language

Main website: https://www.commonwl.org

GitHub repository for www.commonwl.org: https://www.github.com/common-workflow-language/cwl-website

CWL v1.0.x: https://github.com/common-workflow-language/common-workflow-language (this repository)

CWL v1.1.x: https://github.com/common-workflow-language/cwl-v1.1/

CWL v1.2.x: https://github.com/common-workflow-language/cwl-v1.2/

[Video] Common Workflow Language explained in 64 seconds

The Common Workflow Language (CWL) is a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.

CWL is developed by a multi-vendor working group consisting of organizations and individuals aiming to enable scientists to share data analysis workflows. The CWL project is maintained on GitHub and we follow the Open-Stand.org principles for collaborative open standards development. Legally, CWL is a member project of Software Freedom Conservancy and is formally managed by the elected CWL leadership team; however, everyday project decisions are made by the CWL community, which is open for participation by anyone.

CWL builds on technologies such as JSON-LD for data modeling and Docker for portable runtime environments.

User Guide

The CWL user guide provides a gentle introduction to learning how to write CWL command line tool and workflow descriptions.

CWLの日本語での解説ドキュメント is a 15 minute introduction to the CWL project in Japanese.

CWL Recommended Practices

A series of video lessons about CWL is available in Russian as part of the Управление вычислениями(Computation Management) free online course.

Citation

To reference the CWL project in a scholarly work, please use the following citation:

Michael R. Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reyes, Bogdan Gavrilović, Carole Goble, and The CWL Community. (2022): Methods Included: Standardizing Computational Reuse and Portability with the Common Workflow Language. Commun. ACM 65, 6 (June 2022), 54–63. https://doi.org/10.1145/3486897

To cite version 1.0 of the CWL standards specifically, please use the following citation inclusive of the DOI.

Peter Amstutz, Michael R. Crusoe, Nebojša Tijanić (editors), Brad Chapman, John Chilton, Michael Heuer, Andrey Kartashov, Dan Leehr, Hervé Ménager, Maya Nedeljkovich, Matt Scales, Stian Soiland-Reyes, Luka Stojanovic (2016): Common Workflow Language, v1.0. Specification, Common Workflow Language working group. https://w3id.org/cwl/v1.0/ doi:10.6084/m9.figshare.3115156.v2

A collection of existing references to CWL can be found at https://zotero.org/groups/cwl

Code of Conduct

The CWL Project is dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, age, race, or religion. We do not tolerate harassment of participants in any form. This code of conduct applies to all CWL Project spaces, including the Google Group, the Gitter chat room, the Google Hangouts chats, both online and off. Anyone who violates this code of conduct may be sanctioned or expelled from these spaces at the discretion of the leadership team.

For more details, see our Code of Conduct.

For the following content:

  • Support, Community and Contributing
  • CWL Implementations
  • Repositories of CWL Tools and Workflows
  • Software for working with CWL
    • Editors and viewers
    • Utilities
    • Converters and code generators
    • Code libraries
  • Projects the CWL community is participating in
  • Participating Organizations
  • Individual Contributors
  • CWL Advisors
  • CWL Leadership team

Please see https://www.commonwl.org

cwl-upgrader's People

Contributors

alaindomissy, dependabot-preview[bot], dependabot[bot], mr-c, tetron, tom-tan

cwl-upgrader's Issues

cwl-runner validate fails after upgrade

The files all exist and are in the correct folder structure after I moved them. Now when I run cwl-runner --validate I get the error below, and org.w3id.cwl.cwl.File is nowhere in this yml file.

Tool definition file:///Users/****/****/UKBB/analysis-workflows/definitions_v1.1/tools/vep.cwl failed validation:
definitions_v1.1/tools/vep.cwl:5:1:                       checking field `requirements`
...more lines
...more lines
definitions_v1.1/types/vep_custom_annotation.yml:20:11:                   Field `type` references
                                                                          unknown identifier `org.w3id.cwl.cwl.File`,
                                                                          tried
                                                                          file:///Users/brian/Bolton/UKBB/analysis-workflows/definitions_v1.1/types/vep_custom_annotation.yml#vep_custom_annotation/annotation/org.w3id.cwl.cwl.File,

The line is under type: File below

  annotation:
    type:
      type: record
      name: info
      fields:
        file:
          type: File
          label: 'File to be used for annotation, include index file'

However, when I run cwl-runner --validate on the vep.cwl tool itself, it works.

CommandLineTool field types become illegal after upgrading to v1.2

Starting from CWL v1.1, the fields in the ResourceRequirement of a command line tool no longer accept the 'string' type. However, when upgrading with cwl-upgrader, these string values are kept as they are, and the upgraded document cannot pass "cwltool --validate".

Example:
The CWL v1.0 code from https://github.com/broadinstitute/cromwell/blob/32d5d0cbf07e46f56d3d070f457eaff0138478d5/centaur/src/main/resources/standardTestCases/cwl_resources/cwl_resources.cwl has these fields:

ResourceRequirement:
      coresMin: 2
      coresMax: 2
      ramMin: 7GB
      ramMax: 7GB

After upgrading, these fields are not fixed, so the document fails "cwltool --validate":

* the `ramMin` field is not valid because
                                         - tried long but
                                           the value `'7GB'` is not long
                                         - tried float but
                                           the value `'7GB'` is not float or double
                                         - tried Expression but
                                           value `7GB` does not contain an expression in the form
                                           $() or ${}

Also, the type changes are not mentioned in the changelog from v1.0 to v1.1

Changelog: 
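One way an upgrader could handle this is to normalise unit-suffixed strings into plain numbers before writing the upgraded document. The sketch below is not part of cwl-upgrader: `to_mebibytes` and `fix_resource_requirement` are hypothetical helpers, and the unit table is an assumption (decimal `GB`, binary `GiB`) that may not match how the original runner interpreted `7GB`. CWL expresses the RAM and directory size fields in mebibytes, so the result is rounded up to a whole number of mebibytes.

```python
import math
import re
from typing import Any, Dict

# Assumed unit table: decimal for KB/MB/GB/TB, binary for KiB/MiB/GiB/TiB.
_FACTORS = {
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40,
}
_SIZE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMGT]i?B)\s*$")

# ResourceRequirement fields that CWL measures in mebibytes.
_SIZE_FIELDS = ("ramMin", "ramMax", "tmpdirMin", "tmpdirMax", "outdirMin", "outdirMax")


def to_mebibytes(value: Any) -> Any:
    """Convert a string such as '7GB' to whole mebibytes; pass anything else through."""
    if not isinstance(value, str):
        return value  # already numeric
    match = _SIZE.match(value)
    if match is None:
        return value  # e.g. an expression like $(inputs.ram); leave untouched
    number, unit = match.groups()
    return math.ceil(float(number) * _FACTORS[unit] / 2**20)


def fix_resource_requirement(requirement: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a ResourceRequirement with size strings made numeric."""
    fixed = dict(requirement)
    for field in _SIZE_FIELDS:
        if field in fixed:
            fixed[field] = to_mebibytes(fixed[field])
    return fixed
```

Expressions and values that are already numeric pass through unchanged, so the helper is safe to apply to every ResourceRequirement it meets.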

[Feature Request] returning packed object instead of dumping to files

Currently upgrade_document has the following signature and always dumps the upgraded step objects into output_dir.

def upgrade_document(
    document: Any, v1_only: bool, v1_1_only: bool, output_dir: str, imports: Set[str]
) -> Any:

However, this makes it harder to integrate with other related tools such as the cwl_utils.parser.load_document family.
It would be nice if upgrade_document had an option to return an upgraded, packed CWL object instead of writing the upgraded CWL objects directly to files.

Workflow missing required 'in' and 'out' fields

I successfully used CWL to convert this file but when I try executing it I get this:

Tool definition failed validation:

E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.cwl:7:1: Object
                                                                `E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.cwl`
                                                                is not valid because
                                                                  tried `Workflow` but
E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.cwl:190:1:     the `steps` field is not
                                                                      valid because
                                                                        tried array of
                                                                        <WorkflowStep> but
E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.cwl:191:3:         item is invalid
                                                                          because
                                                                            * missing required
                                                                            field `in`
                                                                            * missing required
                                                                            field `out`
E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.cwl:193:3:           * invalid
                                                                            field `inputs`,
                                                                            expected one of: 'id',
                                                                            'in', 'out',
                                                                            'requirements',
                                                                            'hints', 'label',
                                                                            'doc', 'run',
                                                                            'scatter',
                                                                            'scatterMethod'
E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.cwl:195:3:           * invalid
                                                                            field `outputs`,
                                                                            expected one of: 'id',
                                                                            'in', 'out',
                                                                            'requirements',
                                                                            'hints', 'label',
                                                                            'doc', 'run',
                                                                            'scatter',
                                                                            'scatterMethod
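The validator output suggests that the workflow steps still carry the draft-3 field names `inputs` and `outputs`, which v1.0 renamed to `in` and `out`. A minimal sketch of that rename follows; `upgrade_step` is a hypothetical helper, not the function cwl-upgrader uses, and it ignores any changes needed inside the entries themselves.

```python
from typing import Any, Dict

# Draft-3 step field names and their v1.0 replacements.
_STEP_RENAMES = {"inputs": "in", "outputs": "out"}


def upgrade_step(step: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a workflow step with draft-3 field names renamed."""
    return {_STEP_RENAMES.get(key, key): value for key, value in step.items()}
```

The dict comprehension keeps the original key order, so the upgraded step reads in the same order as the source.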

Upgrade a cwl 1.2 workflow fails

Using cwl-upgrader-1.2.2

$ cwl-upgrader rsem-merge.cwl
Processing rsem-merge.cwl
Unsupported cwlVersion: v1.2
Traceback (most recent call last):
File "/Users/golharr/workspace/NGS/RNASeq/env/bin/cwl-upgrader", line 8, in <module>
sys.exit(main())
File "/Users/golharr/workspace/NGS/RNASeq/env/lib/python3.6/site-packages/cwlupgrader/main.py", line 59, in main
return run(parse_args(args))
File "/Users/golharr/workspace/NGS/RNASeq/env/lib/python3.6/site-packages/cwlupgrader/main.py", line 80, in run
document, args.v1_only, args.v1_1_only, args.dir, imports
File "/Users/golharr/workspace/NGS/RNASeq/env/lib/python3.6/site-packages/cwlupgrader/main.py", line 118, in upgrade_document
process_imports(document, imports, inner_updater, output_dir)
UnboundLocalError: local variable 'inner_updater' referenced before assignment

I would expect this to validate the cwl 1.2 workflow and ensure nothing is wrong, since there isn't an upgrade needed.
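The UnboundLocalError is the usual symptom of a variable that is assigned only inside some branches of an if/elif chain. A sketch of one way to avoid it is shown below; `select_updater`, `LATEST` and the `updaters` mapping are hypothetical names, not taken from the cwl-upgrader source. A document that is already at the latest version yields `None`, so the caller can skip the upgrade instead of crashing.

```python
from typing import Callable, Dict, Optional

Document = Dict[str, object]
Updater = Callable[[Document], Document]

LATEST = "v1.2"  # assumed newest version known to the tool


def select_updater(version: str, updaters: Dict[str, Updater]) -> Optional[Updater]:
    """Pick the updater for `version`; None means nothing to do."""
    if version == LATEST:
        return None  # already up to date, no upgrade needed
    try:
        return updaters[version]
    except KeyError:
        # Every path either returns or raises, so no name is left unbound.
        raise ValueError(f"Unsupported cwlVersion: {version}") from None
```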

unicode failure on Python 3.5

Some tools I converted worked fine on my machine that uses Python 3.4, but on another the same files failed with Python 3.5 with this message:

$ cwl-upgrader prodigal.cwl > prodigal.cwl.1
Traceback (most recent call last):
  File "/usr/local/bin/cwl-upgrader", line 9, in <module>
    load_entry_point('cwl-upgrader==0.2', 'console_scripts', 'cwl-upgrader')()
  File "/usr/local/lib/python3.5/dist-packages/cwlupgrader/main.py", line 12, in main
    draft3_to_v1_0(document)
  File "/usr/local/lib/python3.5/dist-packages/cwlupgrader/main.py", line 17, in draft3_to_v1_0
    _draft3_to_v1_0(document)
  File "/usr/local/lib/python3.5/dist-packages/cwlupgrader/main.py", line 40, in _draft3_to_v1_0
    setupCLTMappings(document)
  File "/usr/local/lib/python3.5/dist-packages/cwlupgrader/main.py", line 63, in setupCLTMappings
    param['type'] = shortenType(param['type'])
  File "/usr/local/lib/python3.5/dist-packages/cwlupgrader/main.py", line 73, in shortenType
    if isinstance(typeObj, (str, unicode)) or not isinstance(typeObj, Sequence):
NameError: name 'unicode' is not defined

This is one of the example files: prodigal.cwl
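Python 3 has no `unicode` builtin; `str` already covers all text there, so the check in the last traceback frame can drop the name. The sketch below shows a Python 3 form of the same condition; `is_single_type` is a hypothetical name, and only the condition itself comes from the traceback.

```python
from collections.abc import Sequence
from typing import Any


def is_single_type(type_obj: Any) -> bool:
    """True for a type name or a mapping, False for a list of types."""
    # A str is itself a Sequence, so it has to be checked first.
    return isinstance(type_obj, str) or not isinstance(type_obj, Sequence)
```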

Re-upgrading CWL files

When I run cwl-upgrader on a set of CommandLineTools that I have in a directory, I can upgrade them one at a time and test that they are okay.

I then have my workflows one directory above and upgrade them one at a time. During this process, the individual tools in the tools folder get upgraded and rewritten to the current working directory instead of the directory they were read from. Two things about this:

  1. Since the tools were already upgraded, I don't think they need to be rewritten to the current working directory. That pollutes the filesystem.
  2. The tools that do need to be upgraded should be rewritten to the directory they were read from.

Is it possible to provide options for both of the above, namely,

  1. Only upgrade the cwl provided as input, and not dependent cwl files
  2. Write upgraded CWL files to the same directory where they were read from; and instead of overwriting the existing CWL file, give it a new name or output to STDOUT.
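For the second request, the output location can be derived from the input path rather than from the current working directory. The following is only a sketch of that idea: `sibling_output_path` and the `upgraded` tag are hypothetical, and the function uses POSIX-style paths to keep the example deterministic.

```python
from pathlib import PurePosixPath


def sibling_output_path(source: str, tag: str = "upgraded") -> str:
    """Return a path next to `source` with `tag` inserted before the suffix."""
    path = PurePosixPath(source)
    # with_name keeps the parent directory and swaps only the file name.
    return str(path.with_name(f"{path.stem}.{tag}{path.suffix}"))
```

Because the new name differs from the original, the source file is never overwritten.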

remove or upgrade remaining pre-v1.0 CWL documents

~/workflows$ git grep draft-
tools/deeptools-bamcoverage.cwl:cwlVersion: 'cwl:draft-3'
workflows/lobSTR/allelotype.cwl:cwlVersion: "cwl:draft-3"
workflows/lobSTR/lobSTR-tool.cwl:cwlVersion: "cwl:draft-3"
workflows/lobSTR/lobSTR-workflow.cwl:cwlVersion: "cwl:draft-3"
workflows/lobSTR/samtools-index.cwl:cwlVersion: "cwl:draft-3"
workflows/lobSTR/samtools-sort.cwl:cwlVersion: "cwl:draft-3"
workflows/presentation-demo/README:Updated 2016-04-10 by Michael R. Crusoe for CWL draft-3
workflows/presentation-demo/filtercount.cwl:cwlVersion: "cwl:draft-3"
workflows/scidap/bam-genomecov-bigwig-rna-dutp.cwl:cwlVersion: "cwl:draft-3"
workflows/scidap/bam-genomecov-bigwig.cwl:cwlVersion: "cwl:draft-3"
workflows/scidap/custom-genome-fromVCF-alea.cwl:cwlVersion: "cwl:draft-3"
workflows/scidap/ucsc-liftover-bed.cwl:cwlVersion: "cwl:draft-3"
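The same check can be scripted so that remaining draft documents are found by their `cwlVersion` field rather than by any mention of "draft-" (the README hit above is a false positive). This is a minimal sketch; `draft_version` is a hypothetical helper that works on file contents passed in as text.

```python
import re
from typing import Optional

# Matches lines such as: cwlVersion: "cwl:draft-3"
_DRAFT = re.compile(r"""^cwlVersion:\s*['"]?(?:cwl:)?(draft-[\w.]+)""", re.MULTILINE)


def draft_version(text: str) -> Optional[str]:
    """Return the draft version declared in a CWL document, or None."""
    match = _DRAFT.search(text)
    return match.group(1) if match else None
```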

Imports inside the requirements block cause cwl-upgrader to fail

The $import key seems to cause this to fail. For example, here is the minimal CWL that will cause this problem:

cwlVersion: "cwl:draft-3"
class: CommandLineTool
requirements:
- $import: file.yml

The error I get is:

Traceback (most recent call last):
  File "/usr/local/bin/cwl-upgrader", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/cwlupgrader/main.py", line 24, in main
    document = draft3_to_v1_0(document)
  File "/usr/local/lib/python3.6/dist-packages/cwlupgrader/main.py", line 34, in draft3_to_v1_0
    _draft3_to_v1_0(document)
  File "/usr/local/lib/python3.6/dist-packages/cwlupgrader/main.py", line 57, in _draft3_to_v1_0
    hints_and_requirements_clean(document)
  File "/usr/local/lib/python3.6/dist-packages/cwlupgrader/main.py", line 161, in hints_and_requirements_clean
    if entry["class"] == "CreateFileRequirement":
KeyError: 'class'
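The KeyError arises because an `$import` entry has no `class` key. A defensive version of the loop could look up the key with `dict.get`, as in the sketch below. `clean_requirements` is a hypothetical, simplified stand-in for the function in the traceback: it only renames the class and does not convert the fields inside the requirement or resolve the import.

```python
from typing import Any, Dict, List


def clean_requirements(entries: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rename CreateFileRequirement entries; pass everything else through."""
    cleaned = []
    for entry in entries:
        # .get returns None for entries such as {"$import": "file.yml"}.
        if entry.get("class") == "CreateFileRequirement":
            entry = {**entry, "class": "InitialWorkDirRequirement"}
        cleaned.append(entry)
    return cleaned
```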

long form arrays aren't being converted correctly

The previous version of one of my tools worked fine in draft-3, but after I upgraded it with the utility (which reported no errors), the new version fails with this output:

/usr/local/bin/cwl-runner 1.0.20170119234115
Resolved 'E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.cwl' to 'file:///home/jorvis/GALES/E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.cwl'
Tool definition failed initialization:
Tool definition file:///home/jorvis/GALES/bin/../cwl/tools/attributor-prok-cheetah.cwl failed validation:
cwl/tools/attributor-prok-cheetah.cwl:3:1:   Object `cwl/tools/attributor-prok-cheetah.cwl` is not
                                             valid because
                                               tried `CommandLineTool` but
cwl/tools/attributor-prok-cheetah.cwl:254:1:       the `inputs` field is not valid because
cwl/tools/attributor-prok-cheetah.cwl:266:3:         item is invalid because
cwl/tools/attributor-prok-cheetah.cwl:268:5:           invalid field `items`, expected one
                                                       of: 'label', 'secondaryFiles', 'format',
                                                       'streamable', 'doc', 'id', 'inputBinding',
                                                       'default', 'type'
ERROR: Return code 1 when running the following command: cwl-runner --outdir E_coli_k12_dh10b.fna.partialer E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.cwl E_coli_k12_dh10b.fna.partialer/prok-annotation-cheetah.json
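The error points at an `items` field sitting directly on an input parameter. Assuming the draft-3 input declared `type: array` with `items` as a sibling key (the source file is not shown, so this is a guess at its shape), one possible fix is to nest both under `type`. `nest_array_type` is a hypothetical helper for illustration.

```python
from typing import Any, Dict


def nest_array_type(param: Dict[str, Any]) -> Dict[str, Any]:
    """Move a sibling `items` key inside a nested array type definition."""
    if param.get("type") == "array" and "items" in param:
        fixed = {key: value for key, value in param.items() if key != "items"}
        fixed["type"] = {"type": "array", "items": param["items"]}
        return fixed
    return param  # nothing to change
```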

shebang line duplicated

When I upgrade from v1.1 to v1.2, the shebang line is duplicated. The line is not duplicated when upgrading from v1.0 to v1.1.
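A post-processing step could collapse repeated shebang lines in the written output. The sketch below illustrates the idea on plain text; `drop_repeated_shebang` is a hypothetical helper and says nothing about where the duplication happens inside cwl-upgrader.

```python
def drop_repeated_shebang(text: str) -> str:
    """Keep the first shebang line and drop identical lines that follow it."""
    lines = text.splitlines(keepends=True)
    if not lines or not lines[0].startswith("#!"):
        return text  # no shebang, nothing to do
    first, rest = lines[0], lines[1:]
    while rest and rest[0] == first:
        rest.pop(0)
    return first + "".join(rest)
```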

Expected <block end>, but found '<scalar>'

Here is the CWL in question. Then, when I try to convert it I get:

$ cwl-upgrader prok-annotation-cheetah.cwl
Traceback (most recent call last):
File "/usr/local/bin/cwl-upgrader", line 11, in <module>
load_entry_point('cwl-upgrader==0.2.2', 'console_scripts', 'cwl-upgrader')()
File "/usr/local/lib/python3.5/dist-packages/cwlupgrader/main.py", line 13, in main
document = ruamel.yaml.round_trip_load(entry)
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/main.py", line 123, in round_trip_load
return load(stream, RoundTripLoader, version, preserve_quotes=preserve_quotes)
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/main.py", line 81, in load
return loader.get_single_data()
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/constructor.py", line 55, in get_single_data
node = self.get_single_node()
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/composer.py", line 50, in get_single_node
document = self.compose_document()
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/composer.py", line 70, in compose_document
node = self.compose_node(None, None)
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/composer.py", line 105, in compose_node
node = self.compose_mapping_node(anchor)
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/composer.py", line 164, in compose_mapping_node
item_value = self.compose_node(node, item_key)
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/composer.py", line 103, in compose_node
node = self.compose_sequence_node(anchor)
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/composer.py", line 134, in compose_sequence_node
node.value.append(self.compose_node(node, index))
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/composer.py", line 105, in compose_node
node = self.compose_mapping_node(anchor)
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/composer.py", line 157, in compose_mapping_node
while not self.check_event(MappingEndEvent):
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/parser.py", line 116, in check_event
self.current_event = self.state()
File "/usr/local/lib/python3.5/dist-packages/ruamel/yaml/parser.py", line 505, in parse_block_mapping_key
token.start_mark)
ruamel.yaml.parser.ParserError: while parsing a block mapping
in "prok-annotation-cheetah.cwl", line 182, column 5
expected <block end>, but found '<scalar>'
in "prok-annotation-cheetah.cwl", line 183, column 27

Unifying target version parameter for `upgrade_document`

The current implementation of upgrade_document has the following signature:

def upgrade_document(
    document: Any, v1_only: bool, v1_1_only: bool, output_dir: str, imports: Set[str]
) -> Any:

IMO this does not scale to future releases, because a new parameter would have to be added for each release version.

How about changing the signature as follows?

def upgrade_document(
    document: Any, output_dir: str, imports: Set[str], target_version: Optional[str] = "latest"
) -> Any:

The target_version parameter takes a version string such as "v1.0", "v1.1", "v1.2" or "latest" (same as "v1.2").
This lets upgrade_document support future release versions without changing its signature.

If this change is acceptable, I will send a pull request for it.
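To illustrate how a single target_version parameter could drive the upgrade, here is a minimal sketch. Everything in it is hypothetical: `upgrade`, `ORDER` and `STEPS` are invented for the example, and the per-version steps only bump `cwlVersion` where the real tool would rewrite the document.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]

ORDER: List[str] = ["v1.0", "v1.1", "v1.2"]  # known versions, oldest first


def _set_version(version: str) -> Callable[[Document], Document]:
    """Placeholder upgrade step that only rewrites cwlVersion."""
    def step(document: Document) -> Document:
        return {**document, "cwlVersion": version}
    return step


# Maps each version to the step that upgrades it to the next one.
STEPS = {"v1.0": _set_version("v1.1"), "v1.1": _set_version("v1.2")}


def upgrade(document: Document, target_version: str = "latest") -> Document:
    """Apply upgrade steps in order until the target version is reached."""
    target = ORDER[-1] if target_version == "latest" else target_version
    current = document["cwlVersion"]
    if current not in ORDER or target not in ORDER:
        raise ValueError(f"Unsupported version: {current} -> {target}")
    start, stop = ORDER.index(current), ORDER.index(target)
    if start > stop:
        raise ValueError(f"Cannot downgrade from {current} to {target}")
    for version in ORDER[start:stop]:
        document = STEPS[version](document)
    return document
```

Adding a future release would then mean appending one entry to `ORDER` and one step to `STEPS`, with no change to the function signature.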
