I am using regular expressions in Python 3.11 (because it supports the (?>...)
atomic group syntax, https://docs.python.org/3/library/re.html) to transform the string below into a dictionary by iterating over the pattern matches:
import re

string = '''Latitude (degrees): 4010.44 Longitude (degrees): 58.000 Radiation database: year month H(h)_m 2005 Jan 57.77 2005 Feb 77.76 2005 Mar 120.58 H(h)_m: Irradiation plane (kWh/m2/mo)'''
for match in re.finditer(r'(?P<key>(?>[A-Z][ a-z_()]*)): *(?P<value>.+?)(?: |$)', string):
    # Key is the short pattern before ":", starting with an uppercase letter.
    # Value must be everything after the ": " and before the next key.
    print(match[1], ":", match[2])
I haven’t been able to get the following output:
Latitude (degrees): 4010.44
Longitude (degrees): 58.000
Radiation database: year month H(h)_m 2005 Jan 57.77 2005 Feb 77.76 2005 Mar 120.58
H(h)_m: Irradiation plane (kWh/m2/mo)
I know this is because the (?P<value>.+?) group matches lazily (as few characters as possible), but if I remove the ?, the <value> group also captures some unintended <key> text.
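To make the two behaviours concrete, here is a small comparison of the lazy and greedy variants on the same string. The atomic group is left out so the snippet does not depend on Python 3.11, and the names KEY, lazy and greedy are only illustrative:

```python
import re

string = '''Latitude (degrees): 4010.44 Longitude (degrees): 58.000 Radiation database: year month H(h)_m 2005 Jan 57.77 2005 Feb 77.76 2005 Mar 120.58 H(h)_m: Irradiation plane (kWh/m2/mo)'''

# Same key pattern as above, minus the atomic group (illustrative simplification).
KEY = r'(?P<key>[A-Z][ a-z_()]*)'

# Lazy value: stops at the first space, so multi-word values are cut short.
lazy = [m['value'] for m in re.finditer(KEY + r': *(?P<value>.+?)(?: |$)', string)]

# Greedy value: runs to the end of the string, swallowing every later key.
greedy = [m['value'] for m in re.finditer(KEY + r': *(?P<value>.+)(?: |$)', string)]

print(lazy)
print(len(greedy))
```

The lazy version truncates "year month ..." to "year", while the greedy version produces a single match whose value contains all the remaining keys.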
How can I make the <value> group match as much as possible but stop before the next <key>?
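One direction that might work, as a sketch rather than a confirmed solution: keep the value lazy, but replace the trailing (?: |$) with a lookahead that only succeeds when the next key (or the end of the string) follows. This assumes a key never appears inside a value followed directly by ":", and it drops the atomic group so it also runs on versions before 3.11. The names key_pattern, pattern and result are hypothetical:

```python
import re

string = '''Latitude (degrees): 4010.44 Longitude (degrees): 58.000 Radiation database: year month H(h)_m 2005 Jan 57.77 2005 Feb 77.76 2005 Mar 120.58 H(h)_m: Irradiation plane (kWh/m2/mo)'''

# A key starts with an uppercase letter, followed by lowercase letters,
# spaces, underscores or parentheses (same character class as above).
key_pattern = r'[A-Z][ a-z_()]*'

pattern = re.compile(
    rf'(?P<key>{key_pattern}): *'          # key, colon, optional spaces
    rf'(?P<value>.+?)'                     # value, expanded lazily ...
    rf'(?= +{key_pattern}:|$)'             # ... until the next "key:" or the end
)

# Build the dictionary from the named groups of each match.
result = {m['key']: m['value'] for m in pattern.finditer(string)}

for key, value in result.items():
    print(key, ":", value)
```

Because the lookahead does not consume anything, the next search resumes right before the following key, and words such as "H(h)_m 2005" or "Jan 57.77" inside a value are skipped since no ":" follows them.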