cwlexec's Issues

Incorrect Null evaluation of file

#37 seems to be resolved now with my minimal example, but when I ran my larger workflow again it still failed. After dissecting each piece, I found that an unrelated parameter seems to be causing the same error.

When I change the example CLT / WF from #37 to allow the CLT an optional boolean, and then edit the WF to interpret the boolean as z: {valueFrom: $(true)}, I get the same stderr output as in #37. However, the new z parameter is (I think) completely unrelated to the parameter involved in that error.

UndefinedVariableError2.5.tar.gz

Failed to "Fill out the scatter gather result in the script"

command (just the usual):

cwlexec -p --workdir /home/user/<username>/output/ TranscriptsAnnotation-i5only-wf.cwl TranscriptsAnnotation-i5only-wf.test.job.yaml

cwlexec fails at the scattered functionalAnalysis step and reports the following:

[15:49:15.857] INFO  - The step (functionalAnalysis/runInterproscan) scatter of 1 jobs.
[15:49:15.857] INFO  - Started job (functionalAnalysis/runInterproscan_1) with
bsub \
-cwd \
/home/user/maxim/output/a286910d-d3e3-41a4-b707-1e0a7654e4d4/functionalAnalysis/runInterproscan/scatter1 \
-o \
%J_out \
-e \
%J_err \
-env \
all,TMPDIR=/home/user/maxim/output/a286910d-d3e3-41a4-b707-1e0a7654e4d4 \
-R \
mem > 8192 \
-n \
3 \
/bin/sh -c 'interproscan.sh --outfile /home/user/maxim/output/a286910d-d3e3-41a4-b707-1e0a7654e4d4/functionalAnalysis/runInterproscan/transcript-01.p2_transcript-01.p2.i5_annotations --disable-precalc --goterms --pathways --tempdir /home/user/maxim/output/a286910d-d3e3-41a4-b707-1e0a7654e4d4/functionalAnalysis/runInterproscan --input /home/user/maxim/output/a286910d-d3e3-41a4-b707-1e0a7654e4d4/functionalAnalysis/splitSeqs/transcript-01.p2_transcript-01.p2.fasta --applications PfamA --formats TSV'
[15:49:15.877] INFO  - Job (functionalAnalysis/runInterproscan_1) was submitted. Job <1421> is submitted to default queue <normal>.
[15:49:15.877] INFO  - Started to wait for jobs by
bwait \
-w \
done(1421)
[15:50:07.854] INFO  - Fill out the scatter gather result in the script /home/user/maxim/output/a286910d-d3e3-41a4-b707-1e0a7654e4d4/functionalAnalysis/runInterproscan/functionalAnalysis/runInterproscan
[15:50:07.855] ERROR - Failed to wait for job functionalAnalysis/runInterproscan <1415>, Failed to write file "/home/user/maxim/output/a286910d-d3e3-41a4-b707-1e0a7654e4d4/functionalAnalysis/runInterproscan/functionalAnalysis/runInterproscan": /home/user/maxim/output/a286910d-d3e3-41a4-b707-1e0a7654e4d4/functionalAnalysis/runInterproscan/functionalAnalysis/runInterproscan (No such file or directory)
[15:50:07.855] ERROR - The workflow (TranscriptsAnnotation-i5only-wf) exited with <255>.
[15:50:07.855] WARN  - killing waiting job (functionalAnalysis/runInterproscan) <1415>.
[15:50:07.855] WARN  - killing waiting job (functionalAnalysis/combineResults) <1418>.

Fail to write scatter values upon scattering on files

We have a 3-step pipeline (map, foo, reduce) where map creates N files, foo transforms a file into another file, and reduce cats all files into one. When we scatter with CWLEXEC on foo, it only performs foo on one file and delivers it (with success) to reduce, even though a Java error is raised:

16:28:26.595 default [pool-4-thread-2] ERROR c.i.s.c.e.u.outputs.OutputsCapturer - Fail to write scatter values

java.nio.file.FileAlreadyExistsException: /home/jmichael/CWLEXEC/FailToWriteScatterValuesError/workdir/a5efe1f2-2a7d-42d9-906b-049c154ffff2/foo/1.foo.txt
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
    at java.nio.file.Files.newByteChannel(Files.java:361)
    at java.nio.file.Files.createFile(Files.java:632)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.writeScatterValues(OutputsCapturer.java:459)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.findScatterOutputValue(OutputsCapturer.java:444)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.findScatterOuputValue(OutputsCapturer.java:262)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.captureCommandOutputsByType(OutputsCapturer.java:181)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.captureCommandOutputs(OutputsCapturer.java:94)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.captureStepOutputs(LSFBwaitExecutorTask.java:373)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.makeStepSuccessful(LSFBwaitExecutorTask.java:142)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.waitSteps(LSFBwaitExecutorTask.java:132)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.run(LSFBwaitExecutorTask.java:97)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

FailToWriteScatterValuesError.tar.gz

Fails to run when input CWLType is an array of File, Directory

Hi,

We have a simple tool (attached) that performs echo on an input and accepts either a File or a Directory. When we run it, it returns the error "too many types for one paramter". I've tested this with an array that is [File, string], [Directory, string], or [string, int] and it seems to work, but the combination [File, Directory] throws this error.
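For reference, the failing input is declared roughly like this (a sketch; the input name is illustrative and the attached tool may differ):

inputs:
  input_item:
    type:
      - File
      - Directory
    inputBinding:
      position: 1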

TooManyTypesError.tar.gz

Clarification: Why not use "bsub -w"?

My understanding

As far as I can tell from perusing the cwlexec source and the description of its behavior in the README:

  • cwlexec obtains job dependency information from the CWL input file.
  • it then submits one job for each stage in parallel (bsub).
  • some kind of "job wrapper" then waits for any upstream dependencies to finish (bwait) and actually starts the user's job once all dependencies are completed (bresume).

(If I misunderstand, please correct me!).

My question

LSF has built-in job dependency monitoring via bsub -w. Why does cwlexec dynamically monitor dependency states instead of offloading the job to LSF?

As a note, this would have the side effect of permitting reasoning about the CWL job from the LSF side using bjdepinfo, which might be useful in its own right. Unless bjdepinfo already tracks dependencies listed by bwait -- does it?
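For comparison, native dependency chaining on the LSF side would look something like this minimal sketch (the job names and scripts are illustrative, not taken from cwlexec):

# submit the upstream job under a name, then let LSF hold the downstream job
bsub -J step1 step1.sh
bsub -J step2 -w "done(step1)" step2.sh   # step2 starts only after step1 completes
bjdepinfo 1234                            # show step2's dependency (1234 = step2's job ID)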

Unable to evaluate record types in InitialWorkDirRequirement

Hi,

We have a simple command line tool (attached) that takes a record type as input with two strings, one for a file name and one for a directory name. Using InitialWorkDirRequirement should set up the directory for use by the command, but it fails to evaluate the record in this context.

At line 202 in attached outfile.txt:
09:23:29.529 default [main] DEBUG c.i.s.c.e.util.evaluator.JSEvaluator - Evaluated js expression "$(inputs.parameters.out_dir)" to A null object

However, it is able to parse the record properly for creating the base command, just not for the above step.
InitialWorkDirError.tar.gz

Cannot use Array type with inputs

For tools which require a flag before each item in an array, we can use the methods described in the array-inputs tutorial. This works well with cwltool, but the flags do not seem to get passed to the baseCommand in cwlexec.

Attached is an example command 'foo' which takes multiple --INPUT files and cats them all to a single --OUTPUT file. With cwltool it works, but with cwlexec none of the --INPUT flags are passed.
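For reference, the tutorial pattern in question is roughly the following (a sketch with illustrative names; the attached tool may differ slightly):

inputs:
  input_files:
    type:
      type: array
      items: File
      inputBinding:
        prefix: --INPUT    # repeated before each array item
    inputBinding:
      position: 1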

MultiInputError.tar.gz

Works with Jsrun?

Hello,
Could you say a little about how this works with jsrun? I am working on the Summit supercomputer at ORNL. Has anyone run this on Summit?

Thanks.

Quotes not recognized in baseCommand

We have a simple workflow which uses baseCommand: [awk, '{print $2}'], but the interpreted command does not preserve the quotes. Instead, it interprets the baseCommand as "baseCommand" : [ "awk", "{print $2}" ] (line 40 in the attached outfile.txt) and attempts to execute awk {print $2} (line 148), which fails.
BaseCommandError.tar.gz

Optional array workflow input does not use tool's default value

Hi,

We have a simple example workflow (foo_wf.cwl) that takes an optional string array as input. It calls foo.cwl, which sets a default for the input array. This workflow works when given the string array as input, but when given no inputs it throws the following error in errfile.txt:

com.ibm.spectrumcomputing.cwl.model.process.parameter.type.NullValue cannot be cast to java.util.List

This only happens at the workflow level with an optional array input. Since the input is optional for the workflow, I believe it should be passed as null to the step calling foo.cwl, which should then fall back to the default array declared by the command line tool. This is the behavior for a non-array optional input.
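A minimal sketch of the two declarations involved, with illustrative names and values (the attached example may differ):

# foo_wf.cwl (workflow input, optional):
inputs:
  arr:
    type: string[]?

# foo.cwl (tool input, with a default that should apply when arr is null):
inputs:
  arr:
    type: string[]
    default: ["a", "b"]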

OptionalArrayInputError.tar.gz

cwltool and cwlexec not scattering jobs in same way

I have a simple workflow with 2 inputs: one of type File, and the other an array of files. I want to run a command in which each of the files in the array is used against the single input file from input1.
When I run cwltool it performs as expected: it runs each file in the array against the single input1, in serial.
When I run cwlexec it gives no errors or messages, and the job immediately terminates.

the tool is:


cwlVersion: v1.0
class: CommandLineTool

hints:
  SoftwareRequirement:
    packages:
      bedtools:
        version: [ "2.25.0" ]

inputs:
  outputGenomeCov:
    type: File
    inputBinding:
      position: 1
      prefix: -a

  regionsBedFile:
    type: File
    inputBinding:
      position: 2
      prefix: -b

  allPositions:
    type: string
    default: "-c"
    inputBinding:
      position: 3
      prefix: -c

outputs:
  allDepthOutput:
    type: File
    outputBinding: {glob: $(inputs.regionsBedFile.basename)_AtoB.txt}

stdout: $(inputs.regionsBedFile.basename)_AtoB.txt

baseCommand: [bedtools, intersect]

And the workflow is:


cwlVersion: v1.0
class: Workflow

requirements:
 - class: ScatterFeatureRequirement

inputs:
  outputGenomeCov: File
  regionsBedFile: File[]

outputs:
  intersectAB:
    type: File[]
    outputSource: intersect/allDepthOutput

steps:
  intersect:
    run: 2_bedtoolsIntersect.cwl
    scatter: regionsBedFile
    in:
      outputGenomeCov: outputGenomeCov
      regionsBedFile: regionsBedFile
    out: [allDepthOutput]

The .yml file:

outputGenomeCov:
  class: File
  path: /path/to/input.txt

regionsBedFile:
 - {class: File, path: /path/to/bedfile1.bed}
 - {class: File, path: /path/to/bedfile2.bed}
 - {class: File, path: /path/to/bedfile3.bed}

I have another workflow in which the program creates designated output files, but bedtools prints its output to stdout. The other workflow works well, but this one fails.
Thanks,
Dennis

Undefined javascript variable error

Working on #34 again and I can now reproduce the same error in my larger workflow.

Step 1 is split_reads which correctly scatters over the files now after the workaround proposed.

Step 2 scatters over those files and tries to generate a string for output_file from one of the files generated in step 1. I am using the following JavaScript to generate this string; it works in cwltool, so I had assumed it was the correct approach:

      output_file:
        valueFrom: |
          ${  
            var s = inputs.R1_file.nameroot;
            s = s.replace(".R1","");
            return s + ".out";
          }   

However, the error I get with cwlexec is:

[var runtime={"tmpdir":"/home/jmichael/cwl-workdir/79cb5eaa-3438-497f-8be8-85fd9a5523c7","tmpdirSize":"15005232752754688","outdirSize":"15005232752754688","cores":"1","outdir":"/home/jmichael/cwl-workdir/79cb5eaa-3438-497f-8be8-85fd9a5523c7","ram":"1024"};, var inputs={"R
12:20:50.855 default [pool-5-thread-1] ERROR c.i.s.c.e.e.lsf.LSFBwaitExecutorTask - Failed to wait for job process_reads <66124407>, Failed to evaluate the expression "${
  var s = inputs.R1_file.nameroot;
  s = s.replace(".R1","");
  return s + ".out";
}
": TypeError: Cannot read property "replace" from undefined in <eval> at line number 3

so it looks like it is not correctly populating inputs.R1_file. Am I using the correct approach here? It appears to be the same general issue as in #34, but I don't think I can use the same workaround, since I don't take anything in the inputs section in the scatter and so can't use that as a source.

UndefinedVariableError.tar.gz

cwlexec doesn't support inputs of type Array<enum>

cwlexec reports the following and exits:
The variable type of the field [type] is not valid, "a valid CWL type" is required.

if the input port is defined like this:

inputs:
  - id: applications
    type:
      type: array
      items:
        type: enum
        name: applications
        symbols:
          - PfamA
          - TIGRFAM

cwltool and cwl-runner, on the other hand, accept those type definitions.

basename is not recognized in JS evaluation of directory

The Directory specification requires a basename attribute, but cwlexec currently evaluates it to a null object since basename is not among the fields it populates.

The attached shows a simple example of attempting to evaluate the basename of a directory where all required fields except basename are included, so cwlexec fails:

09:50:13.340 default [pool-4-thread-1] DEBUG c.i.s.c.e.util.evaluator.JSEvaluator - Evaluate js expression "$(inputs.out_dir.basename)" with context
[var inputs={"sample":"MySample","out_dir":{"location":"/research/rgs01/home/clusterHome/kbrown1/DirectoryBasenameError/outdir/MySample","path":"/home/kbrown1/DirectoryBasenameError/workdir/d9623834-8552-4554-b8b5-c58184d22730/MySample","srcPath":"/research/rgs01/home/clusterHome/kbrown1/DirectoryBasenameError/outdir/MySample","listing":[],"class":"Directory"}};]
09:50:13.353 default [pool-4-thread-1] DEBUG c.i.s.c.e.util.evaluator.JSEvaluator - Evaluated js expression "$(inputs.out_dir.basename)" to A null object
09:50:13.353 default [pool-4-thread-1] ERROR c.i.s.c.e.e.lsf.LSFBwaitExecutorTask - Failed to wait for job touch_sample <42464255>, null
09:50:13.354 default [pool-4-thread-1] ERROR c.i.s.c.e.e.lsf.LSFBwaitExecutorTask - The exception stacks:
java.lang.NullPointerException: null
    at java.lang.String.replace(String.java:2240)
    at com.ibm.spectrumcomputing.cwl.exec.util.evaluator.JSEvaluator.parsePlaceholder(JSEvaluator.java:136)
    at com.ibm.spectrumcomputing.cwl.exec.util.evaluator.JSEvaluator.parseExpr(JSEvaluator.java:171)
    at com.ibm.spectrumcomputing.cwl.exec.util.evaluator.JSEvaluator.evaluate(JSEvaluator.java:56)
    at com.ibm.spectrumcomputing.cwl.exec.util.evaluator.CommandOutputBindingEvaluator.evalGlob(CommandOutputBindingEvaluator.java:64)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.captureCommandOutputs(OutputsCapturer.java:92)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.captureStepOutputs(LSFBwaitExecutorTask.java:376)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.makeStepSuccessful(LSFBwaitExecutorTask.java:140)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.waitSteps(LSFBwaitExecutorTask.java:133)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.run(LSFBwaitExecutorTask.java:98)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

DirectoryBasenameError.tar.gz

coresMin used instead of ramMin

Hi,

We were testing the ResourceRequirement field in the CWL document and noticed that when using ramMin, the bsub command submits -R mem>coresMin.

Looks like the error could be fixed simply by replacing coresMin with ramMin here: https://github.com/IBMSpectrumComputing/cwlexec/blob/e3c19121ac9ec8db24f09c542931345a43bb4ef0/src/main/java/com/ibm/spectrumcomputing/cwl/exec/service/CWLLSFCommandServiceImpl.java#L177

Attached is a test case. Even though ramMin is set to 100, it uses coresMin (either the given value, or null when it is not given, which produces an error).
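A sketch of the requirement in question (100 is the value from the test case; everything else is omitted):

requirements:
  - class: ResourceRequirement
    ramMin: 100    # expected submission: bsub -R "mem > 100"
                   # observed submission uses the coresMin value instead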

ramMinError.tar.gz

Empty JS array

Following up on #38, I find that a new error is produced in my actual workflow, which I have now duplicated here. Specifically, it looks like the JS interpreter is passing values as [], so a Java IndexOutOfBoundsException is thrown when trying to evaluate the contents of the array. See lines 1739 and 1742 below (contained in cwlexec.out):

1739 12:41:59.517 default [pool-5-thread-1] DEBUG c.i.s.c.e.util.command.CommandUtil - Prepare input (id=output_file, type=string, value=[]) for process_reads
1740 12:41:59.517 default [pool-5-thread-1] DEBUG c.i.s.c.e.util.command.CommandUtil - Prepare input (id=threads, type=null, value=2) for process_reads
1741 12:41:59.517 default [pool-5-thread-1] DEBUG c.i.s.c.e.util.command.CommandUtil - Prepare input (id=K, type=null, value=NULL) for process_reads
1742 12:41:59.517 default [pool-5-thread-1] DEBUG c.i.s.c.e.util.command.CommandUtil - Prepare input (id=Y, type=null, value=[]) for process_reads
1743 12:41:59.517 default [pool-5-thread-1] DEBUG c.i.s.c.e.util.command.CommandUtil - Prepare input (id=k, type=null, value=NULL) for process_reads
1744 12:41:59.517 default [pool-5-thread-1] DEBUG c.i.s.c.e.util.command.CommandUtil - Prepare input (id=M, type=null, value=NULL) for process_reads
1745 12:41:59.517 default [pool-5-thread-1] DEBUG c.i.s.c.e.util.command.CommandUtil - Prepare input (id=R, type=null, value=NULL) for process_reads
1746 12:41:59.517 default [pool-5-thread-1] DEBUG c.i.s.c.e.util.command.CommandUtil - Prepare input (id=I, type=null, value=NULL) for process_reads
1747 12:41:59.518 default [pool-5-thread-1] DEBUG c.i.s.c.e.util.command.CommandUtil - Prepare input (id=fastq, type=File, value=File:/home/jmichael/cwl-workdir/b388f356-940f-4daf-9a99-c722488fc0d7/split_reads/scatter1/input1_R1.fastq.gz) for process_reads

UndefinedVariableError3.tar.gz

Regression after June 7th +29 CWL conformance tests fail

On June 7th we ran the CWL conformance tests against 4ea1396 and there were 20 failures (the same as before).

Today we ran the CWL conformance tests against the latest code, 023b1b5, and there are 48 failures (28 more).

https://ci.commonwl.org/job/cwlexec/96/console

Newly failed tests

Default queue is used in a workflow even though another queue is specified in a config file

I have a simple foo.sh script which is wrapped by foo.cwl. foo_wf.cwl is a workflow which scatters over foo.cwl. When specifying a queue in a config file, all scatter jobs correctly hit that queue, but the final "Scatter gather job action" is sent to my default queue, not the queue specified in the config file.

See attached for a fully reproducible example (aside from the queues 'priority' and 'short' that it specifies).

WrongQueueError.tar.gz

CWLEXEC doesn't correctly run several subworkflows

Hi!
I ran into an issue when testing several subworkflows in CWLEXEC. My pipeline works fine with cwltool but fails with CWLEXEC.
The structure of pipeline is very simple:
step 1:
-- subworkflow 1:
------ copy file from input to another fille
step 2:
-- subworkflow 2:
------ grep the output of step 1 (by condition), output is stdout
------ copy result to another file

The error is:

------------------------------------------------------------
Successfully completed.
Resource usage summary:
    CPU time :                                   0.02 sec.
    Max Memory :                                 -
    Average Memory :                             -
    Total Requested Memory :                     -
    Delta Memory :                               -
    Max Swap :                                   -
    Max Processes :                              -
    Max Threads :                                -
    Run time :                                   7 sec.
    Turnaround time :                            1 sec.
The output (if any) is above this job summary.

[13:32:02.086] INFO  - Fill out commands in the script <path>/step-wf-2/step-subwf-1/step-wf-2_step-subwf-1:
grep 2  <command>
[13:32:02.090] INFO  - Resuming job (step-wf-2/step-subwf-1) <1896579> with
bresume \
1896579
[13:32:02.236] INFO  - Started to wait for jobs by
bwait \
-w \
done(1896579)
[13:32:04.773] INFO  - The job (step-wf-2/step-subwf-1) <1896579> is done with stdout from LSF:
------------------------------------------------------------
Job <<path>/step-wf-2/step-subwf-1/step-wf-2_step-subwf-1> was submitted from host <host> by user <user> in cluster <cluster> at Wed Sep 11 13:32:00 2019
Job was executed on host(s) <host>, in queue <queue>, as user <user> in cluster <cluster> at Wed Sep 11 13:32:03 2019
<dirr> was used as the home directory.
<path/step-wf-2/step-subwf-1> was used as the working directory.
Started at Wed Sep 11 13:32:03 2019
Terminated at Wed Sep 11 13:32:03 2019
Results reported at Wed Sep 11 13:32:03 2019
------------------------------------------------------------
# LSBATCH: User input
path/step-wf-2/step-subwf-1/step-wf-2_step-subwf-1
------------------------------------------------------------
Successfully completed.
Resource usage summary:
    CPU time :                                   0.02 sec.
    Max Memory :                                 -
    Average Memory :                             -
    Total Requested Memory :                     -
    Delta Memory :                               -
    Max Swap :                                   -
    Max Processes :                              -
    Max Threads :                                -
    Run time :                                   2 sec.
    Turnaround time :                            3 sec.
The output (if any) is above this job summary.

[13:32:04.837] ERROR - Failed to wait for job step-wf-2/step-subwf-2 <1896578>, null
[13:32:04.837] ERROR - The workflow (test-pipeline) exited with <255>.
[13:32:04.837] WARN  - killing waiting job (step-wf-2/step-subwf-2) <1896578>.

I didn't hit this problem when running steps without subworkflows, but this case is very important for me because I use a similar structure with more complicated workflows and tools.

All scripts are attached in the archive.
for_issue.zip

Thank you!
Kate

CWLEXEC fails with Hibernate exception for workflows with more than 20 steps

We bump into this issue whenever we try to execute workflows with more than 20 steps in CWLEXEC (21 steps, for instance). Sometimes CWLEXEC also hangs after it has reported the error. Could it be that sessions are not closed properly after each database transaction? We created a simple test workflow so it is easy for you to reproduce.
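As a data point, Hibernate's built-in connection pool (DriverManagerConnectionProviderImpl, visible in the stack trace below) defaults to 20 connections if I recall correctly, which would neatly explain the 20-step threshold. If that is the cause, raising the pool size in the Hibernate configuration might work around it — a guess, not something verified against the cwlexec source:

# hypothetical hibernate.properties override (standard Hibernate property name)
hibernate.connection.pool_size=50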

Here is the workflow:
test-workflow.zip

This is the command we are running:

cwlexec -debug -L -p -w <work-dir> -o <output-dir> test-workflow.cwl

CWLEXEC reports the following and exits or sometimes just hangs:

17:14:04.579 default [pool-3-thread-20] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_20) was submitted. Job <6076314> is submitted to default queue <research-rh74>.
17:14:04.579 default [pool-3-thread-16] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_16) was submitted. Job <6076305> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-12] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_12) was submitted. Job <6076312> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-14] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_14) was submitted. Job <6076313> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-21] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_21) was submitted. Job <6076319> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-18] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_18) was submitted. Job <6076322> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-17] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_17) was submitted. Job <6076317> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-13] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_13) was submitted. Job <6076321> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-2] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_2) was submitted. Job <6076316> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-5] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_5) was submitted. Job <6076310> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-1] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_1) was submitted. Job <6076324> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-15] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_15) was submitted. Job <6076318> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-4] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_4) was submitted. Job <6076320> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-19] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_19) was submitted. Job <6076307> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-3] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_3) was submitted. Job <6076325> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-7] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_7) was submitted. Job <6076306> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-6] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_6) was submitted. Job <6076315> is submitted to default queue <research-rh74>.
17:14:04.580 default [pool-3-thread-11] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_11) was submitted. Job <6076309> is submitted to default queue <research-rh74>.
17:14:04.581 default [pool-3-thread-10] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_10) was submitted. Job <6076311> is submitted to default queue <research-rh74>.
17:14:04.581 default [pool-3-thread-9] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_9) was submitted. Job <6076323> is submitted to default queue <research-rh74>.
17:14:04.584 default [pool-3-thread-8] INFO  c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Job (touch_8) was submitted. Job <6076308> is submitted to default queue <research-rh74>.
17:14:04.737 default [pool-3-thread-6] ERROR c.i.s.c.e.e.lsf.LSFBsubExecutorTask - Failed to submit the step touch_6, The internal connection pool has reached its maximum size and no connection is currently available!
17:14:04.743 default [pool-3-thread-6] ERROR c.i.s.c.e.e.lsf.LSFBsubExecutorTask - The exception stacks:
org.hibernate.HibernateException: The internal connection pool has reached its maximum size and no connection is currently available!
	at org.hibernate.engine.jdbc.connections.internal.PooledConnections.poll(PooledConnections.java:82)
	at org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl.getConnection(DriverManagerConnectionProviderImpl.java:186)
	at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:35)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:106)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:136)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getConnectionForTransactionManagement(LogicalConnectionManagedImpl.java:254)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.begin(LogicalConnectionManagedImpl.java:262)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.begin(JdbcResourceLocalTransactionCoordinatorImpl.java:214)
	at org.hibernate.engine.transaction.internal.TransactionImpl.begin(TransactionImpl.java:56)
	at org.hibernate.internal.AbstractSharedSessionContract.beginTransaction(AbstractSharedSessionContract.java:409)
	at com.ibm.spectrumcomputing.cwl.exec.service.CWLInstanceService.updateCWLProcessInstance(CWLInstanceService.java:83)
	at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBsubExecutorTask.runStep(LSFBsubExecutorTask.java:108)
	at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBsubExecutorTask.run(LSFBsubExecutorTask.java:56)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
17:14:04.744 default [pool-3-thread-6] DEBUG c.i.s.c.e.e.lsf.LSFWorkflowRunner - broadcast event EXIT, touch_6
17:14:04.746 default [pool-3-thread-6] ERROR c.i.s.c.e.e.lsf.LSFWorkflowRunner - The workflow (test-wf) exited with <255>.

ExpressionTool cannot return multiple arrays

Here is a simple CWL script "int_to_array.cwl" to convert an int to an int array and a string array:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: ExpressionTool

requirements:
  - class: InlineJavascriptRequirement

inputs:
  number:
    type: int
    label: a positive integer

outputs:
  int_array:
    type: int[]
  str_array:
    type: string[]

expression: |
  ${ var s_arr = [], i_arr = [];
     for (var i = 0; i < inputs.number; i++) {
       s_arr.push('hello' + i + '.txt');
       i_arr.push(i);
     }
     return { "int_array": i_arr, "str_array": s_arr };
  }

This works with cwltool but not with cwlexec-0.2.2:
$ cwltool int_to_array.cwl int_to_array.yml
/research/rgs01/project_space/yu3grp/software_JY/yu3grp/conda_env/yulab_env/bin/cwltool 1.0.20190228155703
Resolved 'int_to_array.cwl' to 'file:///research/rgs01/home/clusterHome/lding/develop/cwl/practices/expression/int_to_array.cwl'
{
  "int_array": [
    0,
    1,
    2,
    3
  ],
  "str_array": [
    "hello0.txt",
    "hello1.txt",
    "hello2.txt",
    "hello3.txt"
  ]
}
Final process status is success

$ cwlexec int_to_array.cwl int_to_array.yml
[17:24:24.592] INFO - Workflow ID: 20fed44a-9f27-4797-b886-28846559711f
[17:24:24.593] INFO - Name: int_to_array
[17:24:24.593] INFO - Description file path: /research/rgs01/home/clusterHome/lding/develop/cwl/practices/expression/int_to_array.cwl
[17:24:24.594] INFO - Input settings file path: /research/rgs01/home/clusterHome/lding/develop/cwl/practices/expression/int_to_array.yml
[17:24:24.594] INFO - Output directory: /home/lding/cwl-workdir/20fed44a-9f27-4797-b886-28846559711f
[17:24:24.594] INFO - Work directory: /home/lding/cwl-workdir/20fed44a-9f27-4797-b886-28846559711f
[17:24:24.594] INFO - Workflow "int_to_array" started to execute.
[17:24:24.871] INFO - Job (int_to_array) was submitted. Job <78813896> is submitted to queue .
[17:24:29.446] ERROR - Failed to wait for job int_to_array <78813896>, java.lang.String cannot be cast to java.lang.Long
[17:24:29.446] ERROR - The job (int_to_array) exited.

ExpressionTool cannot access input directory listing files

Hi,

I have a simple ExpressionTool (attached) I wanted to test with the recently added feature. It takes a directory as input and returns the directory's listing as a file array. However, when running it I get an error that states

09:54:43.736 default [pool-4-thread-1] ERROR c.i.s.c.e.e.lsf.LSFWorkflowRunner - Failed to capture output for job (directory_to_files): The file "/home/kbrown1/ExpressionToolDirectoryError/workdir/1bd21483-9530-48e9-bc2c-f7d129b428e9/2.tmp" cannot be accessed.

It also exits with exit code 0 instead of a non-zero exit code, but returns no output.

As far as I can tell, it seems to be an issue with the input directory listing. I tested swapping the listing attribute for basename, to simply return a string of the directory's name, and this worked. So it seems to know what outputs to collect; it just can't actually collect them.
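For reference, the core of such an ExpressionTool looks roughly like this (a sketch with illustrative names; the attached tool may differ):

cwlVersion: v1.0
class: ExpressionTool
requirements:
  - class: InlineJavascriptRequirement
inputs:
  dir: Directory
outputs:
  files: File[]
expression: |
  ${ return { "files": inputs.dir.listing }; }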

ExpressionToolDirectoryError.tar.gz

cwlexec fails to fail when a required input is not provided.

A simple workflow which requires a string input runs even when no input is provided. cwltool, as expected, refuses to run the same example.

[jmichael(BASH)@nodecn011]: cwlexec foo.cwl 
[15:51:20.864] INFO  - Workflow ID: ca52aa98-d283-4610-afde-b56a6b8e1ad9
[15:51:20.865] INFO  - Name: foo
[15:51:20.865] INFO  - Description file path: /research/rgs01/home/clusterHome/jmichael/cwlexec_bugs/non-optional-inputs/foo.cwl
[15:51:20.865] INFO  - Output directory: /home/jmichael/cwl-workdir/ca52aa98-d283-4610-afde-b56a6b8e1ad9
[15:51:20.865] INFO  - Work directory: /home/jmichael/cwl-workdir/ca52aa98-d283-4610-afde-b56a6b8e1ad9
[15:51:20.865] INFO  - Workflow "foo" started to execute.
[15:51:20.870] INFO  - Started job (foo) with
bsub \
-cwd \
/home/jmichael/cwl-workdir/ca52aa98-d283-4610-afde-b56a6b8e1ad9 \
-o \
%J_out \
-e \
%J_err \
-env \
TMPDIR=/home/jmichael/cwl-workdir/ca52aa98-d283-4610-afde-b56a6b8e1ad9 \
echo
[15:51:20.993] INFO  - Job (foo) was submitted. Job <61886769> is submitted to queue <normal>.
[15:51:21.009] INFO  - Started to wait for jobs by
bwait \
-w \
done(61886769)
[15:51:25.188] INFO  - The job (foo) <61886769> is done with stdout from LSF:

{ }
[jmichael(BASH)@nodecn011]: echo $?
0
[jmichael(BASH)@nodecn011]: cat foo.cwl 
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

baseCommand: echo

inputs:
  foo:
    type: string
    inputBinding:
      position: 1

outputs: []
[jmichael(BASH)@nodecn011]: cwltool foo.cwl 
/hpcf/apps/python/install/3.5.2/bin/cwltool 1.0.20180525185854
Resolved 'foo.cwl' to 'file:///research/rgs01/home/clusterHome/jmichael/cwlexec_bugs/non-optional-inputs/foo.cwl'
usage: foo.cwl [-h] --foo FOO [job_order]
foo.cwl: error: argument --foo is required
[jmichael(BASH)@nodecn011]: cwlexec --version
0.2.0
[jmichael(BASH)@nodecn011]: 

File literal not written to safe path

Test [89/128] Test file literal as input

Test failed: /home/jenkins/cwlexec-0.1/cwlexec --outdir=/tmp/tmpgf1k1pyg --quiet v1.0/cat3-tool.cwl v1.0/file-literal.yml
Test file literal as input
Returned non-zero
Failed to write file "/common-workflow-language-master/v1.0/v1.0/file1-78a66506": /common-workflow-language-master/v1.0/v1.0/file1-78a66506 (Permission denied)

Here the user doesn't have permissions to write to /common-workflow-language-master/v1.0/v1.0/
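For context, a file literal is an input File defined by inline contents rather than a path, along these lines (a sketch, not the exact conformance-test fixture):

file1:
  class: File
  basename: file1-sample.txt
  contents: |
    hello world

Since a literal has no source path, the runner has to materialize it in a directory it can write to; here cwlexec apparently tried to create it next to the CWL description instead.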

ScatterFeature does not accept list of files as input: Java ClassCastException thrown

Not sure if I am doing anything wrong, but this works with the CWL reference implementation. Any advice would be appreciated.

Here is the workflow and a YAML job description:
issue_43.zip

command:
$ unzip issue_43.zip
$ cd issue_43
$ cwlexec -X -p --workdir /home/user/output/ cmsearch-multimodel-wf.cwl jobs/cmsearch-multimodel-wf.test.job.yaml

cwlexec stops and reports the following:

16:13:44.227 default [main] DEBUG c.i.s.c.e.util.command.CommandUtil - The command input argument: 1000 for step cmsearch
16:13:44.227 default [main] DEBUG c.i.s.c.e.util.command.CommandUtil - The command input argument: 1000 for step cmsearch
16:13:44.227 default [main] DEBUG c.i.s.c.e.util.command.CommandUtil - The input (id=covariance_model_database, type=File, value=File:/home/user/maxim/output/cwlexec/7fd4f443-eab1-4fd5-b767-7a68add47c5d/cmsearch/tRNA5.c.cm) of step cmsearch
16:13:44.227 default [main] DEBUG c.i.s.c.e.util.command.CommandUtil - The command input argument: /home/user/maxim/output/cwlexec/7fd4f443-eab1-4fd5-b767-7a68add47c5d/cmsearch/tRNA5.c.cm for step cmsearch
16:13:44.227 default [main] DEBUG c.i.s.c.e.util.command.CommandUtil - The input (id=query_sequences, type=File, value=File:/home/user/maxim/output/cwlexec/7fd4f443-eab1-4fd5-b767-7a68add47c5d/cmsearch/mrum-genome.fa) of step cmsearch
16:13:44.227 default [main] DEBUG c.i.s.c.e.util.command.CommandUtil - The command input argument: /home/user/maxim/output/cwlexec/7fd4f443-eab1-4fd5-b767-7a68add47c5d/cmsearch/mrum-genome.fa for step cmsearch
16:13:44.227 default [main] DEBUG c.i.s.c.e.util.command.CommandUtil - Has Shell Command, build commands as:
[cmsearch, --tblout, mrum-genome.fa.cmsearch_matches.tbl, -o, mrum-genome.fa.cmsearch.out, --cpu, 1, --noali, --hmmonly, -Z, 1000, /home/user/maxim/output/cwlexec/7fd4f443-eab1-4fd5-b767-7a68add47c5d/cmsearch/tRNA5.c.cm, /home/user/maxim/output/cwlexec/7fd4f443-eab1-4fd5-b767-7a68add47c5d/cmsearch/mrum-genome.fa]
java.util.ArrayList cannot be cast to com.ibm.spectrumcomputing.cwl.model.process.parameter.type.file.CWLFile
16:13:44.235 default [Thread-3] DEBUG c.i.s.cwl.exec.CWLExec - Stop cwlexec...
16:13:44.236 default [Thread-3] DEBUG c.i.s.cwl.exec.CWLExec - cwlexec has been stopped

Bad Scatter in subworkflow

I have a CommandLineTool (foo.cwl), a Workflow (inner.cwl) which calls foo.cwl, and another workflow (outter.cwl) which scatters over inner.cwl. Both foo.cwl and inner.cwl work as intended, but outter.cwl does not properly scatter over inner.cwl. That is, instead of scattering over 'input_file' as it should, it simply calls 'foo' with all possible 'input_file's as input.

It should be:

foo --INPUT file1.txt
foo --INPUT file2.txt

but instead, it invokes foo as

foo --INPUT file1.txt file2.txt

This may be related to issue #20, which I saw was recently moved to 'enhancement' rather than 'known issue'. In either case, I'm hoping we will be able to scatter over subworkflows in this way, as it will be very useful for our pipelines.
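The scattering step in outter.cwl is, in essence (a sketch using the names from this report; the attached file may differ):

requirements:
  - class: ScatterFeatureRequirement
  - class: SubworkflowFeatureRequirement
steps:
  inner:
    run: inner.cwl
    scatter: input_file
    in:
      input_file: input_files
    out: [output_file]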

In the attached 'BadScatter.tar.gz' you will see that '01_foo.sh' and '02_inner.sh' give the expected output, but '03_outer.sh' fails: the invocation is a single call to foo with all possible input files, though the intent is for them to be scattered over as described above.

BadScatter.tar.gz

$(inputs.other_file.nameroot) evaluates to Null upon file scatter

We have a simple example where we are trying to concatenate english.txt with french.txt, german.txt, and spanish.txt via scatter, using $(inputs.other_file.nameroot) in the CommandLineTool being scattered. This works as desired in cwltool, but in CWLEXEC we get no output files and the following information in debug mode (also found in the attached 'outfile.txt'):

16:15:47.191 default [main] DEBUG c.i.s.c.e.util.evaluator.JSEvaluator - Evaluated js expression "$(inputs.other_file.nameroot)" to A null object

NullJSObjectError.tar.gz

Collecting output for an array of glob patterns based on inputs fails

Hi,

We have a simple example that takes a string as input and outputs two files with names based on the input string. When using glob to find both files based on the name patterns $(inputs.name)_1.txt and $(inputs.name)_2.txt, the glob seems to interpret these as literal strings rather than evaluating $(inputs.name) in each case:

outfile.txt

182   "outputBinding" : {
183     "glob" : {
184       "patterns" : [ "$(inputs.name)_1.txt", "$(inputs.name)_2.txt" ],

then returns:

218 {
219   "out_file" : [ ]
220 }

If $(inputs.name) is changed to the exact string it works, but cwlexec should evaluate these expressions for pattern matching.

GlobOutputArrayError.tar.gz

SingularityRequirement

We would like to run containers safely in a multi-user LSF cluster. Docker has many security issues stemming from dockerd running as root. Singularity is an alternative that is gaining more and more popularity in science.

In the foreseeable future we need a way to execute Singularity containers (and thus, via Singularity's features, also Docker!) in our multi-user LSF cluster. We therefore need a SingularityRequirement analogous to the DockerRequirement in e.g. cwltool.

Optional workflow input with tool-level default evaluates to null

Hi,

I have a simple example (attached) where the input to a workflow is an optional string and the command line tool has a default value for this input. When I use no outputs, or a fixed output file name (as in #19), it works. If I modify the command line tool to glob for $(inputs.foo), to return a file matching the name of the input string, it evaluates the input to null and fails to build the command. It should evaluate the tool's default input when the optional workflow input is not given.

EvalOptionalWorkflowInputError.tar.gz

dockerOptions.sh as documented fails on short options

the pre-exec script dockerOptions.sh

#!/bin/bash
for OPTION in $LSB_CONTAINER_OPTIONS
do
    echo $OPTION
done

works fine with long options (--env=VAR=value) but fails with short options (-e VAR=value) due to the whitespace between option and value. Instead, it should simply print out $LSB_CONTAINER_OPTIONS as-is:

#!/bin/bash
echo "$LSB_CONTAINER_OPTIONS"

Arrays are not scattered when passed to subworkflows

Hi,

We have a simple example workflow that seems to be passing array inputs to lower-level scripts without scattering them.

top_workflow.cwl calls -> subworkflow.cwl calls -> echocat.cwl calls -> echocat.sh which takes 3 inputs (string, file, file).

subworkflow.cwl has a single step which takes a string input and a File[] input and passes them to the command line tool. This works fine with CWLEXEC. When I use top_workflow.cwl to scatter over an array of strings or an array of arrays of files, they do not get scattered but are instead passed directly to the command line tool, where the shell script fails because it cannot use them this way: the string array arrives as a single string, and the File array of arrays as a single array. Attached is the example; in the output.txt file, at line 646, the command is built incorrectly.

SubworkflowArrayScatterError.tar.gz

Error: Could not find or load main class

Command used:

./cwlexec /home/johnsoni/Innovation-Pipeline/workflows/QC/qc_workflow_wo_waltz.cwl ~/Innovation-Pipeline/test/workflows/EZ_QC_test.yaml
Error: Could not find or load main class com.ibm.spectrumcomputing.cwl.Application

I've downloaded and extracted the 0.2.2 release. Is there any advice on this error?

Don't *copy* result files.

Currently, cwlexec copies files from the work directories to the output directory (here if I am correct).

If possible, avoid copying output files. These files can be huge (we usually have 100 GB files, but they can be much bigger; this is common with human whole-genome sequencing data), and copying is a real waste of space and time. While space may not be a problem, because copies can be deleted after processing, time is more of a problem on network-based storage with tight requirements for short processing times (e.g. for routine cancer diagnostics).

Alternatives are (at least on POSIX filesystems):

  • Symlinking, should always work. One may think about using relative symlinks, in case the output and work directories will be moved.
  • Hardlinking, if the work and output directories are on the same filesystem.

I am not sure what the standard says about it, but even if the standard says "do copy", for some of our workflows we'd rather drop CWL than accept copies.

It may be desirable to give the user the choice between copying, symlinking, and hardlinking. However, a symlink being replaced by the pointed-to file is only a small problem, so symlinking seems a reasonable default.

For both linking approaches, file ownership may be more of an issue, because the access rights are identical for all hard/soft links to the same data.
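On POSIX filesystems the alternatives are one-liners (a sketch; the paths are illustrative):

ln -s /work/uuid/step/result.bam /output/result.bam             # absolute symlink
ln -s --relative /work/uuid/step/result.bam /output/result.bam  # relative symlink (GNU ln)
ln /work/uuid/step/result.bam /output/result.bam                # hard link (same filesystem only)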

Fails to resolve dependencies when one subworkflow relies on output of another

Hi,

We have a workflow which has two steps. Each step calls a subworkflow that calls a command line tool. When the steps/subworkflows are independent of each other the run succeeds (two_input_workflow.cwl), but when one step relies on the output of another step (one_input_workflow.cwl) it fails to resolve the workflow, giving the following error:

09:18:12.103 default [pool-2-thread-1] ERROR c.i.s.c.e.e.CWLInstanceSchedulerTask - Fail to run one_input_workflow (Failed to resolve the step (flow2) dependencies.)
09:18:12.105 default [pool-2-thread-1] ERROR c.i.s.c.e.e.CWLInstanceSchedulerTask - The exception stacks:
com.ibm.spectrumcomputing.cwl.model.exception.CWLException: Failed to resolve the step (flow2) dependencies.
	at com.ibm.spectrumcomputing.cwl.exec.util.CWLStepBindingResolver.resolveStepInput(CWLStepBindingResolver.java:178)
	at com.ibm.spectrumcomputing.cwl.exec.util.CWLStepBindingResolver.resolveStepInput(CWLStepBindingResolver.java:142)
	at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowStepRunner.prepareStepCommand(LSFWorkflowStepRunner.java:158)
	at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowStepRunner.resovleExpectDependencies(LSFWorkflowStepRunner.java:111)
	at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowStepRunner.<init>(LSFWorkflowStepRunner.java:65)
	at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowRunner.addSteps(LSFWorkflowRunner.java:272)
	at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowRunner.<init>(LSFWorkflowRunner.java:99)
	at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowRunner.runner(LSFWorkflowRunner.java:92)
	at com.ibm.spectrumcomputing.cwl.exec.executor.CWLInstanceSchedulerTask.schedule(CWLInstanceSchedulerTask.java:76)
	at com.ibm.spectrumcomputing.cwl.exec.executor.CWLInstanceSchedulerTask.run(CWLInstanceSchedulerTask.java:62)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

The full output (1inp.out) is attached, along with both the failing and the succeeding example. As far as we can tell, this only happens with subworkflows.

SubworkflowDependenciesError.tar.gz

Failed to bind value

After implementing the workaround in #34, I was able to get past that step and ran into a new problem at the next step in the workflow. In trying to rebuild a minimal example for this from scratch, I've run into new problems at the same step as #34 again.

My workflow is:

  1. Split a bam file into R1/R2 fastq files (simulated with a 'split_reads.sh' script so that my examples do not depend on external software). This is accomplished with the split_reads.cwl CLT.

  2. Scatter over multiple input files using scatter_split.cwl. I was hoping the workaround in #34 would let me get past this part.

3+) Continue simulating my real workflow to reproduce the issues I'm seeing.

Step 2, above, is where I ran into issues in #34. In rebuilding a different workflow, I am hitting new issues. Specifically, my CLT and the scatter_split.cwl workflow both work in cwltool, but the scatter_split.cwl WF fails in CWLEXEC with the error Failed to bind value for [R1_file], The value cannot be found..

I've compared this workflow with the working flow from #34 and I think they are very similar, so I'm not sure why this one fails. Is this an issue in CWLEXEC or in my own code? The script 02_scatter_split_reads.sh in the attached example should reproduce the issue.

BindValueFailure.tar.gz

Undefined file on scatter

In an attempt to overcome #20 (using the same general example as in #33), I have moved the scatter down to the lowest level. However, I found that when building a filename with valueFrom inside the scatter, it returns undefined.

steps:
  foo:
    run: foo.cwl
    scatter: input_file
    in: 
      input_file: input_files
      output_filename:
        valueFrom: ${return inputs.input_file.nameroot + ".out";}
    out:
      [output_file]

However, this is not reproducible in cwltool, where the filename is correctly built. This seems like a cwlexec-specific issue, but it could also be that I am not following best practices when building a string from within a scatter.

UndefinedFile.tar.gz

add the -n option to model.conf files

I have tried to set up an LSF config for a workflow that looks like this:

{
   "queue": "standard",
   "steps": {
       "step1": {
             "rerunnable": false,
             "res_req": "rusage[mem=20000]",
             "num_processors": 4
        }
   }
}

And it fails, so I stepped into your code here:
https://github.com/IBMSpectrumComputing/cwlexec/blob/master/src/main/java/com/ibm/spectrumcomputing/cwl/model/conf/FlowExecConf.java

and found that nothing there handles the -n option for LSF, which would distribute a job across multiple processors.

I'm tagging this as a feature enhancement because without it you can't make a job distributable without hard-coding it in the CWL source, which we don't want backend users to have to do.

It seems like you could do it by adding:

    private int processors;
    ...

    public int getProcessors() {
        return processors;
    }

    /**
     * Sets the LSF number of processors (-n) option
     *
     * @param processors
     *                     The LSF num_processors requirement option
     */
    public void setProcessors(int processors) {
        this.processors = processors;
    }

to the files FlowExecConf.java and StepExecConf.java.

I don't know if anything else needs changing, but this seems like a great feature to add.

Cannot support code fragment in expression

If an argument's valueFrom expression is a code fragment, e.g.

arguments:
 - prefix: -c
   valueFrom: |
     import json
     fileString = []
     with open("$(inputs.inputFile.path)", "r") as inputFile:
          for line in inputFile:
               fileString.append(line)
     with open("cwl.output.json", "w") as output:
         json.dump({"fileString": fileString}, output)

cwlexec cannot evaluate it correctly.
