I am working on launching a Snakemake (v7.8.5) workflow via a profile.
I get an error when I try to set a workflow-wide resource limit. My profile looks like this:
cluster: '/opt/sge/bin/lx-amd64/qsub -q {q} -e {log} -o {log} -j y'
jobs: 10
jobname: '{rule}.{params.jobname}.{jobid}'
keep-going: True
latency-wait: 600
resources: mem_mb=108000
When I do a dry run with the following command:
snakemake -np -s Snakefile --profile ~/path/to/spades/profile/
I get an error as soon as the workflow reaches the first rule where I specify resources (I am not specifying resources for all rules). The error can be found below:
Traceback (most recent call last):
File "/home/ngs/.conda/envs/diagnostics/lib/python3.7/site-packages/snakemake/__init__.py", line 810, in snakemake
keepincomplete=keep_incomplete,
File "/home/ngs/.conda/envs/diagnostics/lib/python3.7/site-packages/snakemake/workflow.py", line 1133, in execute
raise e
File "/home/ngs/.conda/envs/diagnostics/lib/python3.7/site-packages/snakemake/workflow.py", line 1129, in execute
success = self.scheduler.schedule()
File "/home/ngs/.conda/envs/diagnostics/lib/python3.7/site-packages/snakemake/scheduler.py", line 423, in schedule
self._finish_jobs()
File "/home/ngs/.conda/envs/diagnostics/lib/python3.7/site-packages/snakemake/scheduler.py", line 538, in _finish_jobs
self._free_resources(job)
File "/home/ngs/.conda/envs/diagnostics/lib/python3.7/site-packages/snakemake/scheduler.py", line 578, in _free_resources
value = self.calc_resource(name, value)
File "/home/ngs/.conda/envs/diagnostics/lib/python3.7/site-packages/snakemake/scheduler.py", line 878, in calc_resource
if value > gres:
TypeError: '>' not supported between instances of 'TBDString' and 'int'
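For reference, the kind of rule that triggers it looks roughly like this (rule name, file paths, and values below are placeholders, not my actual rule):

```
rule assemble:
    input: "reads/{sample}.fastq"
    output: "assembly/{sample}/contigs.fasta"
    params:
        jobname="assemble_{sample}"   # referenced by jobname: in the profile
    resources:
        mem_mb=16000                  # per-rule resource; most rules omit this
    shell:
        "spades.py -s {input} -o assembly/{wildcards.sample}"
```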
It seems to me that Snakemake is confusing per-rule resources with workflow-wide resources.
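The failing comparison itself is easy to reproduce in isolation. My assumption (suggested by the class name in the traceback and the <TBD> values in the dry-run output) is that TBDString is a str subclass Snakemake uses as a placeholder for resource values that are not yet known:

```python
# Minimal reproduction of the comparison that fails in
# scheduler.calc_resource. TBDString here is my stand-in, assumed
# to behave like Snakemake's placeholder class of the same name.
class TBDString(str):
    pass

value = TBDString("<TBD>")   # per-rule resource not yet resolved in a dry run
gres = 108000                # the workflow-wide limit from the profile

try:
    value > gres             # comparing a str subclass to an int fails
except TypeError as e:
    print(e)  # → '>' not supported between instances of 'TBDString' and 'int'
```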
I have modified the profile as follows (based on the official Snakemake documentation):
cluster: '/opt/sge/bin/lx-amd64/qsub -q {q} -e {log} -o {log} -j y --resources mem_mb=108000'
jobs: 10
jobname: '{rule}.{params.jobname}.{jobid}'
keep-going: True
latency-wait: 600
#resources: mem_mb=108000
I no longer get an error on a dry run; however, I am not sure that Snakemake is actually setting the workflow-wide memory to the value that I am passing to cluster in the profile. I assume it is not, because I see resources: mem_mb=<TBD>, disk_mb=<TBD>, tmpdir=/tmp under rule all in the output of my dry run:
[Wed Sep 25 10:20:04 2024]
localrule all:
input: [list of all input files]
jobid: 0
reason: Input files updated by another job: [list of all output files]
resources: mem_mb=<TBD>, disk_mb=<TBD>, tmpdir=/tmp
On top of that, I do get an error when I actually submit the workflow: qsub: invalid option argument "--resources". That part is clear to me: resources is not a qsub argument.
However, I am still puzzled as to why Snakemake cannot parse resources from the profile YAML file, while it parses other flags such as jobs, jobname, keep-going, etc. without issue.
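My mental model of how a profile is consumed (a hypothetical sketch, not Snakemake's actual implementation) is that each key: value pair is turned back into a command-line flag, which is why I expected resources: mem_mb=108000 to behave exactly like --resources mem_mb=108000:

```python
def profile_to_argv(profile: dict) -> list:
    """Hypothetical sketch of profile-to-CLI conversion; Snakemake's
    real handling differs in details (e.g. quoting, nested values)."""
    argv = []
    for key, value in profile.items():
        flag = "--" + key
        if value is True:                  # boolean flags take no argument
            argv.append(flag)
        elif isinstance(value, list):      # repeatable flags: one arg per item
            argv.append(flag)
            argv.extend(str(v) for v in value)
        else:
            argv.extend([flag, str(value)])
    return argv

print(profile_to_argv({"jobs": 10, "keep-going": True,
                       "resources": ["mem_mb=108000"]}))
# → ['--jobs', '10', '--keep-going', '--resources', 'mem_mb=108000']
```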
Just to confirm, --resources is present as a Snakemake command-line argument in the version that I am using:
(diagnostics) [ngs@vngs20x ~/installed/snakemake/spades]$ snakemake --help|grep resources
[--resources [NAME=INT [NAME=INT ...]]]
[--set-resources RULE:RESOURCE=VALUE [RULE:RESOURCE=VALUE ...]]
[--default-resources [NAME=INT [NAME=INT ...]]]
--resources [NAME=INT [NAME=INT ...]], --res [NAME=INT [NAME=INT ...]]
Define additional resources that shall constrain the
E.g. --resources mem_mb=1000. Rules can use resources
by defining the resource keyword, e.g. resources:
What would be the best way to set workflow-wide resources via a profile? Do I have a syntax issue in my profile?
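My best guess, since --resources takes a list of NAME=INT pairs on the command line, is that the profile needs the YAML list form rather than a plain scalar, but I have not been able to confirm this:

```yaml
# guess: list form for a repeatable NAME=INT argument
resources:
  - mem_mb=108000
```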