snakemake-gridengine's Introduction

sge

This profile configures Snakemake to run on (Sun) Grid Engine.

Setup

Deploy profile

To deploy this profile, run

mkdir -p ~/.config/snakemake
cd ~/.config/snakemake
cookiecutter https://github.com/Snakemake-Profiles/sge.git

Then, you can run Snakemake with

snakemake --profile sge ...

Cookiecutter options

  • profile_name : A name to address the profile via the --profile Snakemake option.
  • cluster_config : Path to a YAML or JSON configuration file analogous to the one given to the Snakemake --cluster-config option. It is also used to define custom resources on the SGE cluster.

Default snakemake arguments

Default arguments to snakemake may be adjusted in the <profile path>/config.yaml file.

Cluster Files

Per-rule configuration can be defined in a cluster file and passed in using --cluster-config. This is a YAML file in which each key is a rule name, followed by the SGE settings that add to or override the settings in the profile. You can also add options to the __default__ entry. Note that these are added to the defaults and are inherited by all named rules.

An example local cluster config file (cluster.yaml) looks like:

__default__:
  q: private.q

rule1:
  gpu: 1

rule2:
  time: "4:0:0"

which will be used by specifying snakemake --profile sge --cluster-config cluster.yaml.

Parsing arguments to SGE (qsub)

Arguments are overridden in the following order. Aliases for option and resource names are also defined, and custom ones can be added:

  1. QSUB_DEFAULTS in sge-submit.py
  2. Profile cluster_config file __default__ entries
  3. Snakefile threads and resources (time, mem)
  4. Profile cluster_config file entries
  5. --cluster-config passed to Snakemake (deprecated since Snakemake 5.10)
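
The layering above can be sketched as a chain of dictionary updates. This is a minimal illustration, not the code in sge-submit.py: it assumes that entries later in the list take precedence over earlier ones, and the function name merge_layers and all setting values are hypothetical.

```python
from typing import Any, Dict, List


def merge_layers(layers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge settings so that later layers override earlier ones."""
    merged: Dict[str, Any] = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical values for each layer, in the order listed above.
qsub_defaults = {"q": "all.q", "h_rt": "1:0:0"}       # 1. QSUB_DEFAULTS
profile_default = {"q": "private.q"}                  # 2. profile __default__
rule_resources = {"h_rt": "4:0:0", "h_vmem": 4000}    # 3. Snakefile resources
profile_rule = {}                                     # 4. profile rule entries
local_cluster_config = {"h_vmem": 8000}               # 5. --cluster-config

settings = merge_layers(
    [qsub_defaults, profile_default, rule_resources, profile_rule, local_cluster_config]
)
print(settings)
```

With these example values, the queue comes from the profile's __default__ entry, the runtime from the rule, and the memory request from the local cluster config.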

Resource and option mapping

To allow more expressive resource requests we map some simple names to the SGE options and resources. These can be used for example in cluster.yaml to make the configuration simpler to read.

Notes

Custom SGE resources can be specified in __resources__ only in the profile folder (i.e. any __resources__ in a local --cluster-config cluster.yaml will be ignored, but you can request the resources defined in the global profile). Custom resources are specified as a YAML dictionary where the key is the resource name as defined in SGE and the values are any aliases you want to use for this resource. The key will always be available as a name even if you don't specify it as an alias. If a key already exists in the resource list, the aliases are simply appended to that resource.

For example:

__resources__:
  coproc_v100: 
    - "gpu"
    - "nvidia_gpu"

This allows you to request the resource with coproc_v100=1, gpu=1 or nvidia_gpu=1 in the cluster config files or in Snakemake rule resources; all of these set -l coproc_v100=1 for qsub.
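
The alias behaviour described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the profile's actual implementation: the helper names build_alias_lookup and to_qsub_args are hypothetical, and only the -l resource form is shown.

```python
from typing import Any, Dict, List


def build_alias_lookup(resources: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every alias (and the SGE name itself) to the SGE resource name."""
    lookup: Dict[str, str] = {}
    for sge_name, aliases in resources.items():
        lookup[sge_name] = sge_name  # the key is always usable as a name
        for alias in aliases:
            lookup[alias] = sge_name
    return lookup


def to_qsub_args(requests: Dict[str, Any], lookup: Dict[str, str]) -> List[str]:
    """Translate requested resources into qsub '-l name=value' arguments."""
    args: List[str] = []
    for name, value in requests.items():
        if name not in lookup:
            raise KeyError(f"Unknown SGE option or resource: {name}")
        args += ["-l", f"{lookup[name]}={value}"]
    return args


# The __resources__ entry from the example above.
custom_resources = {"coproc_v100": ["gpu", "nvidia_gpu"]}
lookup = build_alias_lookup(custom_resources)

print(to_qsub_args({"gpu": 1}, lookup))
```

Requesting coproc_v100, gpu or nvidia_gpu yields the same qsub argument, while an unlisted name raises a KeyError.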

Memory (s_vmem, h_vmem and aliases) must be given in megabytes. Note: this is to support Snakemake versions >= 7, which set a default mem_mb resource. In older versions of the grid engine profile, memory was given in gigabytes.

Custom SGE options can be specified in __options__ in the profile folder in the same way as resources.

For example:

__options__:
  jc: 
    - "jc"
    - "job_class"

A full list of the default supported SGE options and resource requests with their aliases is:

SGE Option        Accepted aliases
binding           binding
cwd               cwd
e                 e, error
hard              hard
j                 j, join
m                 m, mail_options
M                 M, email
notify            notify
now               now
N                 N, name
o                 o, output
P                 P, project
p                 p, priority
pe                pe, parallel_environment
pty               pty
q                 q, queue
R                 R, reservation
r                 r, rerun
soft              soft
v                 v, variable
V                 V, export_env
qname             qname
hostname          hostname
calendar          calendar
min_cpu_interval  min_cpu_interval
tmpdir            tmpdir
seq_no            seq_no
s_rt              s_rt, soft_runtime, soft_walltime
h_rt              h_rt, time, runtime, walltime
s_cpu             s_cpu, soft_cpu
h_cpu             h_cpu, cpu
s_data            s_data, soft_data
h_data            h_data, data
s_stack           s_stack, soft_stack
h_stack           h_stack, stack
s_core            s_core, soft_core
h_core            h_core, core
s_rss             s_rss, soft_resident_set_size
h_rss             h_rss, resident_set_size
slots             slots
s_vmem            s_vmem, soft_memory, soft_virtual_memory
h_vmem            h_vmem, mem_mb, mem, memory, virtual_memory
s_fsize           s_fsize, soft_file_size
h_fsize           h_fsize, disk_mb, file_size

Non Requestable Resources

On some cluster configurations some resources may be non-requestable.

snakemake-gridengine's People

Contributors

drjbarker, laelbarlow

snakemake-gridengine's Issues

If qsub fails no meaningful error is given

If qsub fails, for example because a required resource (e.g. h_rt) is not found, Snakemake only raises the exception from the subprocess, which reports that qsub returned a non-zero exit status.

We should see if there's an easy way to determine why qsub failed and print that for the end user.
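
One possible approach, sketched below, is to capture the standard error stream of the submit command and include it in the exception. This is an illustration, not the profile's current behaviour: the function name run_and_report is hypothetical, and a small Python one-liner stands in for a failing qsub call, with an invented error text.

```python
import subprocess
import sys
from typing import List


def run_and_report(cmd: List[str]) -> str:
    """Run a command and raise an error that includes its stderr on failure."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        detail = result.stderr.strip() or "(no stderr output)"
        raise RuntimeError(
            f"command {cmd[0]!r} exited with status {result.returncode}: {detail}"
        )
    return result.stdout.strip()


# Stand-in for a failing qsub call; the error text is made up for illustration.
fake_qsub = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('Unable to run job: unknown resource h_rt'); sys.exit(1)",
]

try:
    run_and_report(fake_qsub)
    message = ""
except RuntimeError as err:
    message = str(err)

print(message)
```

The raised message then carries both the exit status and whatever the command wrote to standard error, which is usually where qsub explains a rejected request.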

A valid repository for "https://github.com/drjbarker/snakemake-gridengine.git" could not be found

Not strictly an issue on this repo but on https://github.com/Snakemake-Profiles/sge

When I try and install it, I get an error:

$ cookiecutter https://github.com/drjbarker/snakemake-gridengine.git
You've downloaded /home/ubuntu/.cookiecutters/snakemake-gridengine before. Is it okay to delete and re-download it? [yes]: yes
A valid repository for "https://github.com/drjbarker/snakemake-gridengine.git" could not be found in the following locations:
/home/ubuntu/.cookiecutters/snakemake-gridengine

I have no experience of cookiecutter so I am not even sure what it is trying to do....

[Question] How to specify qsub options within rules

I wanted to ask a question about how to set different SGE options for each rule in my Snakefile.

For example, suppose I want each of my rules to have a different job name when submitted. If I were submitting these jobs directly, I would use the -N option of the qsub command.

With Snakemake I know that one way to do this is to specify it in the cluster.yaml file. For instance, if the name of a rule in my Snakefile is run_test and I would like the qsub job to have the name testing, I can add the following to cluster.yaml after the __default__ settings:

"run_test":
  name: "testing"

I was wondering if it would be possible to specify options like this in the Snakefile instead? I'm particularly interested in specifying different -pe options for different rules directly in the Snakefile rather than through the cluster.yaml file.

I have tried this:

rule run_test:
    input: ...
    output: ...
    resources:
        name='testing'
    shell: ...

but get the following error: KeyError: 'Unknown SGE option or resource: name'. I'm not sure what I'm doing wrong?

Thank you very much for your time and for making this code available!

SGE profiles with Snakemake 8...

I'm having an issue setting up an SGE profile with Snakemake 8.10.0. I tried to use the executor flag, but I couldn't set it to qsub or sge.

I get this error:
snakemake: error: unrecognized arguments: --cluster=qsub --sge-submit=sge/sge-submit.py --sge-status=sge/sge-status.py --sge-cancel=sge/sge-cancel.py

I also couldn't find official docs to set that up.

Thanks for your help!

'thread' specification not recognized?

Hi,

I'm trying to run jobs using the SGE profile, but I run into issues when using 'threads':

Specifying 'threads' in the snakemake rule seems to be recognized by snakemake (i.e. snakemake output shows 'threads=8', for example), but when I look in the cluster log only 2 threads are 'propagated' to the cluster job (I assume this is a default value to be used in case nothing else is specified)
I thought maybe I need to adjust the profile for this, so I added pe: 'threaded {threads}' under __options__ in the profile's cluster.yaml, but to no avail.

My question: how do I correctly specify multiple threads using the SGE profile? Should I simply use the rule's 'thread' property?
Or is there a better way?

Let me know if you need any more information.

Many thanks in advance and best!
johann

Future support of snakemake 7 new default resources mem_mib disk_mib

Hi,

I am working on a POC which uses Snakemake 7.30.1, and I saw that there is a commit specifically to support mem_mb and disk_mb.

Apparently, in 7.30.1 there are also mem_mib and disk_mib resources, which break the wrapper:

Traceback (most recent call last):
  File "/home/derlin/snakie/snakemake-profile-demo/brenner/sge-submit.py", line 243, in <module>
    update_double_dict(qsub_settings, parse_qsub_settings(job_properties.get("resources", {}), option_mapping={}))
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/derlin/snakie/snakemake-profile-demo/brenner/sge-submit.py", line 159, in parse_qsub_settings
    raise KeyError(f"Unknown SGE option or resource: {skey}")
KeyError: 'Unknown SGE option or resource: mem_mib'

resources: mem_mb=1000, mem_mib=954, disk_mb=1000, disk_mib=954, tmpdir=<TBD>

I am wondering if there is a workaround?

Thanks
Derrick
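
One possible workaround, sketched here as an assumption and not confirmed as the profile's fix, is to drop the automatically added resources that the submit script does not recognise before they are parsed. The names IGNORED_RESOURCES and drop_ignored are hypothetical; the resource values are taken from the report above.

```python
from typing import Any, Dict, Iterable

# Assumption: resources that Snakemake adds automatically and that the
# submit script has no mapping for.
IGNORED_RESOURCES = ("mem_mib", "disk_mib")


def drop_ignored(
    resources: Dict[str, Any], ignored: Iterable[str] = IGNORED_RESOURCES
) -> Dict[str, Any]:
    """Return a copy of the resources without the ignored keys."""
    ignored_set = set(ignored)
    return {key: value for key, value in resources.items() if key not in ignored_set}


# Resource values as shown in the report above.
job_resources = {"mem_mb": 1000, "mem_mib": 954, "disk_mb": 1000, "disk_mib": 954}

filtered = drop_ignored(job_resources)
print(filtered)
```

Only mem_mb and disk_mb remain after filtering, and both have aliases in the table of supported resources.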

Custom resources could be read from a file

At the moment we have to hard code custom resources into the sge-submit.py script. It would be convenient if we could read these in from a file so that the profile is easier to adapt for different clusters.
