The problem is that I can only match either the first or the last whitespace, while I'm trying to get both in a single re.sub call.
I've tried this regex, which matches any whitespace that follows a digit, which is not really what I need. Here is the example:
"(?<=\d)\s"mg
I can't use groups 1 and 4 because the number of groups can change with other strings. The first whitespace will always come after the date, which is always formatted the same way, and the last will come before the cost of the item, but the position of the decimal point or the number of digits might change depending on the cost.
Anyone have any thoughts?
You could use capture groups here:
import re
inp = "08/26 Card Purchase blah blah IL Card 0000 $14.00"
output = re.sub(r'^(\S*)\s+(.*?)\s+(\S*)$', r'\1|\2|\3', inp)
print(output) # 08/26|Card Purchase blah blah IL Card 0000|$14.00
This regex approach works by matching:
^ : from the start of the string
(\S*) : an optional non-whitespace term, captured in \1
\s+ : one or more whitespace characters
(.*?) : the middle portion of the string, captured in \2
\s+ : one or more whitespace characters at the end of the string
(\S*) : the final optional non-whitespace term, captured in \3
$ : end of the string
Essentially, the above uses a splicing trick: it captures the three pieces of the string and joins them back together with |, so only the first and last runs of whitespace are replaced.
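If you need this for many lines, the same idea can be wrapped in a small helper. This is only a sketch: the function name delimit_ends, the sep parameter, and the second sample line are made up for illustration and are not from the question. It uses a replacement function instead of a replacement string so that the separator is never interpreted as a backreference.

```python
import re

# Same pattern as above: first token, lazy middle part, last token.
PATTERN = re.compile(r'^(\S*)\s+(.*?)\s+(\S*)$')

def delimit_ends(line, sep="|"):
    """Replace only the first and last whitespace runs in `line` with `sep`.

    Hypothetical helper; lines that do not match the pattern
    (for example, lines with no whitespace) are returned unchanged.
    """
    return PATTERN.sub(lambda m: sep.join(m.groups()), line)

# Sample lines; the second one is invented to show a different cost format.
samples = [
    "08/26 Card Purchase blah blah IL Card 0000 $14.00",
    "09/01 Transfer 12 to savings $1,250.5",
]
for s in samples:
    print(delimit_ends(s))
```

Because the first and last groups only match non-whitespace, digits and spaces in the middle of the description are left alone, however many words it contains.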