I have a problem with the static typing of Python regular expressions.
Just to be clear, this question has nothing to do with the regular expression itself: my code runs perfectly, even though it does not pass mypy's strict checks.
Let’s start with the basics. I have a class defined as follows:
from __future__ import annotations
import re
from typing import AnyStr

class MyClass:
    def __init__(self, regexp: AnyStr | re.Pattern[AnyStr]) -> None:
        if not isinstance(regexp, re.Pattern):
            regexp = re.compile(regexp)
        self._regexp: re.Pattern[str] | re.Pattern[bytes] = regexp
The user can construct the class by passing either a compiled re pattern or an AnyStr value (str or bytes).
I want the class to store the compiled pattern in the private _regexp attribute, so if the user did not provide a compiled pattern, I compile it before assigning it to the attribute.
So far so good, even though I would have expected self._regexp to have type re.Pattern[AnyStr] rather than the union of the two pattern types. Anyhow, up to this point mypy is happy.
Now, in some (or most) cases, the user provides the regexp string via a TOML configuration file, which is read and parsed into a dictionary. For this case I have a class method constructor defined as follows:
    @classmethod
    def from_dict(cls, d: dict[str, str]) -> MyClass:
        r = d.get('regexp')
        if r is None:
            raise KeyError('missing regexp')
        return cls(regexp=r)
The dictionary has type dict[str, str].
I have to check that the dictionary contains the right key, because get returns None when the key is missing.
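For reference, the two snippets above assemble into the following self-contained script. It is only a reproduction sketch: the class and method are copied from above, while the pattern `a+b` and the sample dictionary are invented for illustration. It shows that all three ways of building the object work at runtime, so the issue is purely a static typing one.

```python
from __future__ import annotations

import re
from typing import AnyStr


class MyClass:
    def __init__(self, regexp: AnyStr | re.Pattern[AnyStr]) -> None:
        # Compile only when the caller passed a raw str or bytes pattern.
        if not isinstance(regexp, re.Pattern):
            regexp = re.compile(regexp)
        self._regexp: re.Pattern[str] | re.Pattern[bytes] = regexp

    @classmethod
    def from_dict(cls, d: dict[str, str]) -> MyClass:
        r = d.get('regexp')
        if r is None:
            raise KeyError('missing regexp')
        # This is the call that mypy flags with [arg-type].
        return cls(regexp=r)


# Illustrative values only (assumptions, not taken from the real config file).
from_str = MyClass('a+b')
from_bytes = MyClass(b'a+b')
from_compiled = MyClass(re.compile('a+b'))
from_config = MyClass.from_dict({'regexp': 'a+b'})
```

In every case the private attribute ends up holding a compiled pattern, which is the behaviour described above.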
I get the error:
error: Argument "regexp" to "MyClass" has incompatible type "str"; expected "AnyStr | Pattern[AnyStr]" [arg-type]
That looks bizarre, because str should be compatible with AnyStr. But let’s say that I change the dictionary type to dict[str, AnyStr].
Instead of fixing the problem, this multiplies it, because I now get two errors:
error: Argument "regexp" to "MyClass" has incompatible type "str"; expected "AnyStr | Pattern[AnyStr]" [arg-type]
error: Argument "regexp" to "MyClass" has incompatible type "bytes"; expected "AnyStr | Pattern[AnyStr]" [arg-type]
I am about to give up. It feels like I am going in circles: whenever I think I have fixed something, I have only moved the problem elsewhere.
Can you help me with this puzzle?
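One direction that might be worth trying is to drop the AnyStr TypeVar from the constructor and spell out the str and bytes cases as two overloads. The sketch below is a hypothetical variant of the class above, not a confirmed fix: it runs correctly, but whether it satisfies mypy --strict is an assumption that has not been verified here.

```python
from __future__ import annotations

import re
from typing import overload


class MyClass:
    # Hypothetical variant: two explicit overloads replace the AnyStr TypeVar.
    @overload
    def __init__(self, regexp: str | re.Pattern[str]) -> None: ...
    @overload
    def __init__(self, regexp: bytes | re.Pattern[bytes]) -> None: ...

    def __init__(
        self, regexp: str | bytes | re.Pattern[str] | re.Pattern[bytes]
    ) -> None:
        # Compile only when the caller passed a raw str or bytes pattern.
        if not isinstance(regexp, re.Pattern):
            regexp = re.compile(regexp)
        self._regexp: re.Pattern[str] | re.Pattern[bytes] = regexp

    @classmethod
    def from_dict(cls, d: dict[str, str]) -> MyClass:
        r = d.get('regexp')
        if r is None:
            raise KeyError('missing regexp')
        # r is a plain str here, which is intended to match the first overload.
        return cls(regexp=r)
```

The runtime behaviour is unchanged. The only difference is that the type checker no longer has to solve for a constrained TypeVar at the call in from_dict.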