I completely agree with Python’s choice not to include end in range(start, end). It is extremely convenient for indexing. However, I ran into a case where it is less convenient: checking identifiers, which may contain only digits, letters and underscores. I came up with this solution: create the following string translation dictionary:
identifier_eraser = ( # map ASCII codes to empty string
# digits ['0','9'] = [48, 58[
{code: "" for code in range(48, 58)} |
# uppercase ['A','Z'] = [65, 91[
{code: "" for code in range(65, 91)} |
# lowercase ['a','z'] = [97,123[
{code: "" for code in range(97,123)} |
# underscore
{ord("_"): ""}
)
The logic is that if you have a non-empty string s and the translation s.translate(identifier_eraser) is empty, then s is a valid identifier (usually there is the additional condition that the first symbol cannot be a digit, but that’s easy to check and combine with a logical AND).
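As a minimal sketch of that check: the name is_identifier is only for illustration, and rejecting the empty string is my assumption, since an empty string also translates to an empty string. The `|` operator on dictionaries needs Python 3.9 or later.

```python
# Same table as above: every ASCII digit, letter and the underscore map to "".
identifier_eraser = (
    {code: "" for code in range(48, 58)}     # digits '0'..'9'
    | {code: "" for code in range(65, 91)}   # uppercase 'A'..'Z'
    | {code: "" for code in range(97, 123)}  # lowercase 'a'..'z'
    | {ord("_"): ""}
)


def is_identifier(s: str) -> bool:
    """Hypothetical helper: True if s is a non-empty ASCII identifier."""
    return (
        s != ""                                   # assumption: reject the empty string
        and s[0] not in "0123456789"              # first symbol cannot be a digit
        and s.translate(identifier_eraser) == ""  # nothing is left after erasing
    )
```

With this, is_identifier("foo_1") is true, while "1foo" and "a-b" are rejected.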
However, to make it more readable I decided to change it to:
identifier_eraser = ( # map ASCII codes to empty string
# digits ['0','9'] = [48, 58[
{code: "" for code in range(ord('0'), ord('9') + 1)} |
# uppercase ['A','Z'] = [65, 91[
{code: "" for code in range(ord('A'), ord('Z') + 1)} |
# lowercase ['a','z'] = [97,123[
{code: "" for code in range(ord('a'), ord('z') + 1)} |
# underscore
{ord('_'): ""}
)
and suddenly I have to add a +1. This was not a problem before because the +1 was implicitly folded into the end number of each range. To avoid it here, the endpoints would have to be ord(':'), ord('[') and ord('{'), which is definitely more obscure.
This illustrates the problem: whether including the end in a range is convenient depends on the application. If your range is a range of indices, then you almost surely do not want the endpoint, while if your range is over content/data/objects, you will likely want the endpoint included.
Is there a sibling function, e.g. interval, that is exactly like range but includes the endpoint (looking around, I can already see there isn’t)? If not, is it planned? If, on the other hand, it has been discussed and rejected, what are the good reasons not to have such a function? Or do you see a way to make the above code nicer without having to discuss the behavior of range? In which case, do you believe any inclusive range can always be made into a pythonically readable non-inclusive range?
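For concreteness, here is a rough sketch of the kind of sibling function I have in mind. Both interval and char_interval are hypothetical names of my own, not existing built-ins, and the handling of negative steps is just one possible design.

```python
def interval(start: int, end: int, step: int = 1) -> range:
    """Hypothetical inclusive counterpart of range: end is included when reachable."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Move the exclusive stop one unit past end, in the direction of travel.
    return range(start, end + (1 if step > 0 else -1), step)


def char_interval(first: str, last: str) -> range:
    """Hypothetical helper: code points from first to last, both included."""
    return interval(ord(first), ord(last))


# The translation table from above, written without any explicit +1.
identifier_eraser = dict.fromkeys(
    [
        *char_interval("0", "9"),
        *char_interval("A", "Z"),
        *char_interval("a", "z"),
        ord("_"),
    ],
    "",
)
```

This keeps the +1 in a single place instead of at every call site, which is the readability gain I am after.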