I am using gregexpr() to find multiple pattern matches in large text streams:
matchList = gregexpr( pattern, textstream )
After locating the matches (hundreds of them), I want to extract the matched strings:
matchStrings = regmatches( textstream, matchList )
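For reference, here is a minimal sketch of the serial version on toy data. The pattern and the three-element textstream are placeholders I made up for illustration, not my real input; the point is only to show the structure that gregexpr() and regmatches() produce.

```r
# Toy data; the pattern and text are placeholders, not my real input.
pattern    <- "[0-9]+"
textstream <- c("a1b22c333", "no digits here", "x9y88")

# One list element per element of textstream; each element holds the match
# start positions and carries a "match.length" attribute.
matchList <- gregexpr(pattern, textstream)

# Serial extraction: a list of character vectors, one per input element.
matchStrings <- regmatches(textstream, matchList)
str(matchStrings)
```

Note that regmatches() needs textstream and matchList to have the same length, and it reads the "match.length" attribute stored on each list element.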
I am wondering whether the match extraction can be parallelized with mclapply() or something similar:
matchStrings = mclapply( mc.cores = numcores, matchList, regmatches, x = textstream )
I’ve also tried:
matchStrings = mclapply( mc.cores = numcores, matchList[[1]], regmatches, x = textstream )
but when I try this, I get errors from all of the cores.
I suspect it has something to do with how I am passing matchList to mclapply().
Suggestions appreciated.
I expect mclapply() to return some sort of list or vector structure containing all of the matches found by gregexpr().
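To make the expected result concrete, here is a rough sketch of the shape I am after, assuming textstream is a character vector with several elements. It iterates over indices and passes a length-one slice of both textstream and matchList to regmatches(), so each call sees matching lengths and the "match.length" attribute is kept. The toy data and the value of numcores are placeholders, and I have not confirmed that this is any faster than the serial call.

```r
library(parallel)

# Toy data; placeholders for illustration only.
pattern    <- "[0-9]+"
textstream <- c("a1b22c333", "no digits here", "x9y88")
matchList  <- gregexpr(pattern, textstream)

# Placeholder core count; forking with mc.cores > 1 is not available on Windows.
numcores <- 1L

# matchList[i] (single brackets) keeps a length-one list, so regmatches()
# receives x and m of equal length; [[1]] unwraps the single result.
matchStrings <- mclapply(
  seq_along(textstream),
  function(i) regmatches(textstream[i], matchList[i])[[1]],
  mc.cores = numcores
)
str(matchStrings)
```

If textstream is a single long string, matchList has only one element, so splitting at this level would give nothing to distribute across cores.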