I’m a bit confused about how regular expressions work with Python methods. In particular, there’s an exercise from LeetCode (https://leetcode.com/problems/patients-with-a-condition/description/) in which we have to find whether the word DIAB1 exists in a string. Seems easy, right? However, I can’t seem to get the correct regular expression to fit the str.match() method. I ended up using one of the posted solutions, which uses str.contains() instead of str.match(), but I’d like to get the full regular expression correct.
The three strings that cause problems are:
- 'DIAB100 MYOP'
- 'ACNE DIAB100'
- 'SADIAB100'
The regular expression is supposed to match the first two but not the third. I’ve tried the following ones so far:
- r'(?<=\s|^)DIAB1' - ERROR: look-behind requires fixed-width pattern
- r'(^|\s)DIAB1' - Fails to match 'ACNE DIAB100'
- r'\bDIAB1' - Also fails to match 'ACNE DIAB100'
- r'(^| )DIAB1' - Also fails to match 'ACNE DIAB100'
The only solution that worked for me was to use str.contains(r'(^DIAB1)|( DIAB1)'). If I wanted to use str.match() instead, what would be the correct regular expression?
Like re.match, pandas’s .str.match determines if each string starts with a match of a regular expression.
r'\bDIAB1' with .str.contains is essentially the correct regular expression solution. You could use match to emulate contains by accepting arbitrary content at the start of the string with r'.*\bDIAB1', but there’s just no reason to.
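To make the difference concrete, here is a minimal sketch using only the standard re module, on the assumption that .str.match behaves like re.match on each element and .str.contains (with a regex pattern) behaves like re.search. The variable names are just for illustration.

```python
import re

# The three problem strings from the question.
conditions = ["DIAB100 MYOP", "ACNE DIAB100", "SADIAB100"]

word_start = re.compile(r"\bDIAB1")   # DIAB1 starting at a word boundary
any_prefix = re.compile(r".*\bDIAB1")  # same, but allowing arbitrary leading text

# re.match only tries the pattern at position 0 (like .str.match).
match_results = [bool(word_start.match(s)) for s in conditions]

# re.search scans the whole string (like .str.contains).
search_results = [bool(word_start.search(s)) for s in conditions]

# A leading .* lets match emulate search.
emulated_results = [bool(any_prefix.match(s)) for s in conditions]

print(match_results)     # [True, False, False]
print(search_results)    # [True, True, False]
print(emulated_results)  # [True, True, False]
```

The second and third results agree, which is why the .* prefix buys nothing over simply calling contains.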