I’m trying (unsuccessfully) to write a regex that splits a text into groups of up to 5 letters/digits. Only letters (a-zA-Z) and digits (0-9) count toward a group; spaces and any other characters must be ignored.
For example, the following text should produce these groups:
The book is on the table and 12 is a number
group 1: "Thebo"
group 2: "okiso"
group 3: "nthet"
group 4: "ablea"
group 5: "nd12i"
group 6: "sanum"
group 7: "ber"
Is it possible to create a regex for this purpose?
I solved this using Clojure, but I was not able to write a single regex that solves the problem on its own, that is, without extra logic in a programming language.
My question was whether I simply lacked sufficient regex knowledge, or whether this kind of problem really has to be solved the way I did it (with help from a programming language).
Thank you to everyone who responded.
I consider my question resolved.
TEST STRING:
--1-abcd--2ABCD34--
REGULAR EXPRESSION (1):
(?:[^a-zA-Z\d]*[a-zA-Z\d]){1,5}
- Match 1: --1-abcd
- Match 2: --2ABCD
- Match 3: 34
You would then have to remove the unwanted characters from each match.
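As a rough sketch of that clean-up step, here is how it might look in Python (my choice of language for illustration; the asker used Clojure). The names `GROUP_RE`, `UNWANTED_RE` and `groups_of_five` are made up for this example, and `re.ASCII` is added so that `\d` only means 0-9.

```python
import re

# Regex (1): up to 5 letters/digits, each optionally preceded by junk.
# re.ASCII restricts \d to 0-9 (an assumption matching the question).
GROUP_RE = re.compile(r"(?:[^a-zA-Z\d]*[a-zA-Z\d]){1,5}", re.ASCII)

# Everything that is not a letter or digit, to be stripped from each match.
UNWANTED_RE = re.compile(r"[^a-zA-Z\d]", re.ASCII)


def groups_of_five(text):
    """Return groups of up to 5 letters/digits, ignoring other characters."""
    return [UNWANTED_RE.sub("", m.group(0)) for m in GROUP_RE.finditer(text)]


print(groups_of_five("--1-abcd--2ABCD34--"))
# ['1abcd', '2ABCD', '34']
print(groups_of_five("The book is on the table and 12 is a number"))
# ['Thebo', 'okiso', 'nthet', 'ablea', 'nd12i', 'sanum', 'ber']
```

So the regex finds the boundaries, and one substitution per match does the rest.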
REGULAR EXPRESSION (2):
[^a-zA-Z\d]*([a-zA-Z\d])[^a-zA-Z\d]*([a-zA-Z\d]?)[^a-zA-Z\d]*([a-zA-Z\d]?)[^a-zA-Z\d]*([a-zA-Z\d]?)[^a-zA-Z\d]*([a-zA-Z\d]?)
- Match 1: --1-abcd
- Group 1: 1
- Group 2: a
- Group 3: b
- Group 4: c
- Group 5: d
- Match 2: --2ABCD
- Group 1: 2
- Group 2: A
- Group 3: B
- Group 4: C
- Group 5: D
- Match 3: 34--
- Group 1: 3
- Group 2: 4
- Group 3: empty string
- Group 4: empty string
- Group 5: empty string
You would then have to remove the unwanted characters from each match or concatenate the capturing groups from each match.
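The concatenation variant could be sketched like this, again in Python purely for illustration. `ALNUM`, `SKIP`, `PATTERN` and `groups_via_captures` are hypothetical names; the pattern is assembled from pieces only to keep it readable and is meant to be the same as regex (2).

```python
import re

ALNUM = r"[a-zA-Z\d]"      # one wanted character
SKIP = r"[^a-zA-Z\d]*"     # any run of unwanted characters

# Regex (2): one required capture followed by four optional captures.
# re.ASCII restricts \d to 0-9 (an assumption matching the question).
PATTERN = re.compile(
    SKIP + "(" + ALNUM + ")" + (SKIP + "(" + ALNUM + "?)") * 4,
    re.ASCII,
)


def groups_via_captures(text):
    """Join the five capturing groups of each match into one string."""
    # Optional groups that matched nothing come back as "", so join is safe.
    return ["".join(m.groups()) for m in PATTERN.finditer(text)]


print(groups_via_captures("--1-abcd--2ABCD34--"))
# ['1abcd', '2ABCD', '34']
print(groups_via_captures("The book is on the table and 12 is a number"))
# ['Thebo', 'okiso', 'nthet', 'ablea', 'nd12i', 'sanum', 'ber']
```

Either way, the regex alone does not hand back the clean groups; a small post-processing step is still needed.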