I am running a large merge of all my VCFs with Jasmine, but I am running into problems. So far I have tried two methods. Each one either takes too long (I am not sure it will ever complete) or fails with a Java heap space error, even after increasing the maximum heap size to 128 GB.
Jasmine GitHub repository: https://github.com/mkirsche/Jasmine
My question is, am I going about this in the wrong way? Is there a different method I could try?
General Parameters (need to stay at these levels):
# Jasmine parameters
MAX_DIST=1000
MIN_CLUSTER_SIZE=1
MIN_AF=0.0
MIN_GQ=0
MIN_VAR_QUAL=0
MAX_READS=1000000
THREADS=10
Method 1:
Merge all 1000 VCFs with Jasmine in a single run.
Current run time (still running): 19 days
Method 2:
Merge the 1000 VCFs into 21 VCFs with Jasmine (51 VCFs per merge).
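For illustration, the batching step in Method 2 could be sketched roughly as below. This is only a sketch under assumptions: the helper names `make_batches` and `print_merge_commands` and the list file `all_vcfs.txt` are hypothetical, and the Jasmine argument names (`file_list`, `out_file`, `max_dist`, `threads`) should be checked against the Jasmine usage message. The script only prints the merge commands; it does not run them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Values taken from the "General Parameters" list above.
MAX_DIST="${MAX_DIST:-1000}"
THREADS="${THREADS:-10}"

# Hypothetical helper: split a list of VCF paths (one per line) into
# batch lists of at most batch_size lines, named <prefix>aaa, <prefix>aab, ...
make_batches() {
    local list="$1" batch_size="$2" prefix="$3"
    split -l "$batch_size" -a 3 "$list" "$prefix"
}

# Print (do not run) one Jasmine command per batch list.
# Assumption: Jasmine is invoked as `jasmine file_list=... out_file=...`;
# verify the argument names before running any of the printed commands.
print_merge_commands() {
    local prefix="$1" batch
    for batch in "$prefix"???; do
        echo "jasmine file_list=$batch out_file=$batch.merged.vcf max_dist=$MAX_DIST threads=$THREADS"
    done
}
```

Usage would be along the lines of `make_batches all_vcfs.txt 51 batch_` followed by `print_merge_commands batch_`, where `all_vcfs.txt` holds one VCF path per line.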
Method 2.1, failed attempt:
Merge the 21 VCFs into 1 VCF with Jasmine. This fails with the following error:
etc
Merging graph ID: chrY_chrY_TRA_--:8
Merging graph ID: chrY_chrY_TRA_--:9
Merging complete - outputting results
Number of sets with multiple variants: 869288
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects
The error persists even after increasing Java's maximum heap size from 32 GB to 64 GB to 128 GB, using the following:
java -Xmx128G -jar /homes/eblak01/.conda/envs/stats/bin/jasmine.jar
Method 2.2, where I am now:
Merge the 21 VCFs into 7 VCFs with Jasmine.
This merge completed successfully.
Method 2.2.1, failed attempt:
Merge the 7 VCFs into 1 VCF with Jasmine.
Same OutOfMemoryError as in Method 2.1, again running with:
java -Xmx128G -jar /homes/eblak01/.conda/envs/stats/bin/jasmine.jar
Method 2.2.2, trying it out right now:
Merge the 7 VCFs into 3 VCFs with Jasmine (1 VCF is left unmerged).
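The counts in the later rounds follow from ceiling division. As a small sketch, assuming a group size of 3 per merge (an assumption inferred from 21 going to 7 and 7 going to 3, not something stated above), the arithmetic works out like this; `num_batches` and `last_batch_size` are hypothetical helper names:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Number of output VCFs when n inputs are merged in groups of k (ceiling division).
num_batches() {
    local n="$1" k="$2"
    echo $(( (n + k - 1) / k ))
}

# Size of the last (possibly short) group.
last_batch_size() {
    local n="$1" k="$2"
    local r=$(( n % k ))
    if [ "$r" -eq 0 ]; then echo "$k"; else echo "$r"; fi
}

num_batches 21 3      # prints 7
num_batches 7 3       # prints 3
last_batch_size 7 3   # prints 1 (the single VCF that is not merged in this round)
```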