I’m currently having an hard time solving and generalizing the following problem:
I have a table, which is stored as a nested list, and I have a text which contains the content of the table as well as other text.
I do not now how the table is convererted in string in the text exactly: sometimes there is a space separating the cells, sometimes a new line, sometime both and sometimes nothing.
My goal is to find the block of text containing the table and substituting it with a placeholder (i.e. table_1).
I was thinking of using regular expressions so I looked into it and in the end I did something like this:
table_as_string = ""
for row in table:
for elem in row:
table_as_string += elem + r"[rntfv ]*"
Before using the sub method I tried to look if the regular expression of the table I’m using as a sample is in the text as follows:
result = re.search(table_as_string, text)
This returns that it didn’t find any matching but I’m sure there should be one. I also tries using s* but nothing. Am I doing something wrong? The text in the cells can have characters like *, , n, etc.., can this be a problem?
Any help is apreciated!
0