def regex = 'crt1234[a-z]_(\\w\\w)_DTS(.*)'
Pattern pattern = Pattern.compile(regex)
Matcher matcher = pattern.matcher("crt1234_DH_DTS")
matcher.findAll()
How do I loop through the matches? When I try the following it throws an IllegalStateEx
for(i in 1..<matcher.count)
group += matcher.group(i)
I am able to concat the groups using the following code
matcher.findAll() { newGroup -> group += newGroup }
The call to the group() method will always fail if there has been no attempt to match the input, which is what you are experiencing. Before any call to group(), you need to actually attempt a match. This can be done by calling find() on the matcher before calling group().
Here is an example found on Geeks for Geeks (modified for your use case):
import java.util.regex.Matcher
import java.util.regex.Pattern
// Get the regex to be checked
String regex = "crt1234[a-z]?_(\\w\\w)_DTS(.*)";
// Create a pattern from regex
Pattern pattern = Pattern.compile(regex);
// Get the String to be matched
String stringToBeMatched = "crt1234_DH_DTS";
// Create a matcher for the input String
Matcher matcher = pattern.matcher(stringToBeMatched);
//find match
matcher.find()
//get matched group at index 1 (0 will be full match, not group)
System.out.println(matcher.group(1)); //outputs: DH
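To make the failure mode concrete, here is a minimal sketch in plain Java (the Matcher class is the same java.util.regex.Matcher that Groovy uses). The class and method names are made up for illustration. It shows group(1) throwing before any match attempt and working after find():

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class GroupBeforeFind {
    // Pattern from the example above; "[a-z]?" makes the letter optional.
    static final Pattern PATTERN = Pattern.compile("crt1234[a-z]?_(\\w\\w)_DTS(.*)");

    // Returns true if group(1) throws because no match has been attempted yet.
    static boolean failsBeforeFind(String input) {
        Matcher matcher = PATTERN.matcher(input);
        try {
            matcher.group(1);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Returns group 1 after a successful find(), or null if nothing matches.
    static String firstGroup(String input) {
        Matcher matcher = PATTERN.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(failsBeforeFind("crt1234_DH_DTS")); // true
        System.out.println(firstGroup("crt1234_DH_DTS"));      // DH
    }
}
```

Checking the return value of find() before calling group() also avoids the same exception when the input does not match at all.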
After some more investigation, it seems that accessing the matcher.count property (a Groovy enhancement) consumes the matcher: the groups are cleared, and a subsequent matcher.find() does not find anything more. To work around this, call matcher.reset() after accessing matcher.count; then matcher.find() works as expected and you can access the matched groups using matcher.group(). Here is an example:
String regex = "crt1234[a-z]?_(\\w\\w)_DTS(.*)";
Pattern pattern = Pattern.compile(regex);
String stringToBeMatched = "crt1234_DH_DTS";
Matcher matcher = pattern.matcher(stringToBeMatched);
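// get the number of matches (this breaks matcher.group() until matcher.reset() and matcher.find() are called)
Integer count = matcher.count
// reset the matcher and find the first match again
matcher.reset()
matcher.find()
// print the full match (index 0) and every capturing group;
// groupCount() is the number of groups, whereas count is the number of matches
for (Integer i = 0; i <= matcher.groupCount(); i++) {
    System.out.println(matcher.group(i));
}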
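The behaviour described above can be imitated in plain Java. This is only a sketch under an assumption: countMatches below is a hypothetical stand-in for Groovy's count property, written as a loop over find(), which is what the observed behaviour suggests but is not taken from the Groovy source. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class CountThenReset {
    static Matcher newMatcher(String input) {
        return Pattern.compile("crt1234[a-z]?_(\\w\\w)_DTS(.*)").matcher(input);
    }

    // Assumed stand-in for Groovy's matcher.count: reset, then count find() hits.
    static int countMatches(Matcher matcher) {
        matcher.reset();
        int n = 0;
        while (matcher.find()) {
            n++;
        }
        return n; // the last find() failed, so there is no current match any more
    }

    // Returns true if group() throws right after counting.
    static boolean groupFailsAfterCount(Matcher matcher) {
        countMatches(matcher);
        try {
            matcher.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Counts, then resets and finds again, and returns group 1 (or null).
    static String groupAfterReset(Matcher matcher) {
        countMatches(matcher);
        matcher.reset();
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(countMatches(newMatcher("crt1234_DH_DTS")));         // 1
        System.out.println(groupFailsAfterCount(newMatcher("crt1234_DH_DTS"))); // true
        System.out.println(groupAfterReset(newMatcher("crt1234_DH_DTS")));      // DH
    }
}
```

Note that the count here is 1 (one match), while the pattern has two capturing groups, which is why the number of matches is not a safe upper bound for a loop over groups.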
- In Java (and by extension Groovy), Matcher iterates over matches of your pattern, not groups.
- Groups (AKA capturing groups) are those things you put in parentheses: (\w\w) and (.*). You can iterate over them, sure, but the number of groups doesn't vary from string to string; it's always the ones you defined in your pattern. The number of matches can obviously vary.
- In Java (and by extension Groovy), Matcher follows a streaming-like approach: you can progress by calling .find(), but you can't go backwards. Methods like getCount() and size() consume the "stream" and you have to reset() to iterate over it again.
- You can make your life a little easier if you use Groovy as Groovy, not as C:
// create a pattern with a `~` operator and use `slashy` strings
def pattern = ~/crt1234[a-z]?_(\w\w)_DTS(.*)/
// create a matcher with a `=~` operator
def matcher = 'crt1234_DH_DTS' =~ pattern
// iterate with an `Iterator`
matcher.each {
println it[1] // `1` here refers to group 1
}
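For comparison, the distinction between matches and groups can be sketched in plain Java with two nested loops: the outer one advances through matches with find(), the inner one walks the fixed set of capturing groups with groupCount(). The class name, method name, and the two-line sample input are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class MatchesVsGroups {
    static final Pattern PATTERN = Pattern.compile("crt1234[a-z]?_(\\w\\w)_DTS(.*)");

    // Returns one list per match, holding the text of groups 1..groupCount().
    static List<List<String>> allGroups(String input) {
        List<List<String>> result = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(input);
        while (matcher.find()) {                              // one iteration per match
            List<String> groups = new ArrayList<>();
            for (int i = 1; i <= matcher.groupCount(); i++) { // fixed by the pattern
                groups.add(matcher.group(i));
            }
            result.add(groups);
        }
        return result;
    }

    public static void main(String[] args) {
        // "." does not match a newline by default, so each line yields its own match.
        System.out.println(allGroups("crt1234_DH_DTS\ncrt1234a_XY_DTS_1"));
        // prints [[DH, ], [XY, _1]]
    }
}
```

The inner list always has two entries because the pattern defines two groups; only the number of outer entries depends on the input.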