I was wondering what the cleanest way is to run two different shell commands depending on whether a parameter is given in a Snakemake config file. At the moment I am using this setup:
rule deeptools_bamCoverage_pe:
    input:
        bam="DATA/BAM/{sample_sp}_pe.bam",
        bai="DATA/BAM/{sample_sp}_pe.bam.bai"
    output:
        "DATA/BIGWIG/{sample_sp}_pe_RPGC.bw"
    log:
        "snakemake_logs/deeptools_bamCoverage/{sample_sp}_pe.log"
    run:
        # .get() avoids a KeyError when excluded_regions is absent from the config
        if config.get("excluded_regions"):
            # Subtract the excluded length (taken from the BED file name) from the genome size
            shell("bamCoverage --normalizeUsing RPGC -bl "+config["excluded_regions"]+
                  " --effectiveGenomeSize $((2913022398-"+config["excluded_regions"].split(".")[0].split("_")[-1]+")) -e -b {input.bam} -o {output} 2>{log}")
        else:
            shell("bamCoverage --normalizeUsing RPGC --effectiveGenomeSize 2913022398 -e -b {input.bam} -o {output} 2>{log}")
My config file may or may not contain the line

excluded_regions: DATA/GENOMES/DAC_excl_HG38_71570285.bed

depending on whether the user wants to filter out some regions from the analysis. 71570285 is the total length of the excluded regions, which is needed for the normalisation calculations (this could be passed in the config file too, but I am trying to keep it light so as not to scare away my potential users).
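To make the file-name convention and the arithmetic concrete, here is a minimal sketch in plain Python of the two cases described above. The helper names `excluded_length` and `bamcoverage_options` are hypothetical, not part of Snakemake or deepTools, and the sketch assumes the excluded length is always the last underscore-separated field of the BED file name. Unlike the rule above, it takes the base name first, so a dot or underscore in a directory name would not interfere.

```python
import os

# Effective genome size used in the rule above.
GENOME_SIZE = 2913022398


def excluded_length(bed_path):
    """Return the trailing integer of a name like DAC_excl_HG38_71570285.bed.

    Assumption: the total excluded length is the last "_"-separated
    field of the file name, as in the example config entry.
    """
    stem = os.path.splitext(os.path.basename(bed_path))[0]
    return int(stem.rsplit("_", 1)[-1])


def bamcoverage_options(config):
    """Build the only part of the bamCoverage call that differs between cases."""
    bed = config.get("excluded_regions")  # None if the key is absent
    if not bed:
        return f"--effectiveGenomeSize {GENOME_SIZE}"
    size = GENOME_SIZE - excluded_length(bed)
    return f"-bl {bed} --effectiveGenomeSize {size}"


if __name__ == "__main__":
    example = {"excluded_regions": "DATA/GENOMES/DAC_excl_HG38_71570285.bed"}
    print(bamcoverage_options(example))
    print(bamcoverage_options({}))
```

With the example path, the effective genome size works out to 2913022398 - 71570285 = 2841452113. Since only this option string differs between the two commands, a value built this way could in principle be handed to the rule (for instance through a `params:` entry), though whether that counts as the cleanest approach is exactly what is being asked here.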
Is there a cleaner way to give Snakemake two alternative shell commands and have it run the appropriate one?