I have an Excel file where some of the cells got merged and some cells have warp Text with Strike and Non Strike in it.
I want to write the data that non strike records to one Excel and other Strike records to another Excel along with the strike in the cell.
My output1:
col1 | col2 | col3 | col4 |col5 |
Sampletext1 | Combines TextC2 |Sample3 | text4 |text5 |
Sampletext2 | Combines TextC2 |Sample3_1 | text4_1 |text5 |
Sampletext2 | Combines TextC2 |Sample3_1 | text4_2 |text5 |
My output2:
col1 | col2 | col3 | col4 |col5 |
Sampletext2 | Combines TextC2 |Sample3_1 | text4_3 |text5 |
‘text4_3` should be striked out in the output file as well.
I have tried using openpyxl
in python
from openpyxl import load_workbook
from openpyxl import Workbook
import pandas as pd
input_file = 'myexecel.xslx'
Workbook = load_workbook(input_file)
for i in Workbook.worksheets:
if i.sheet_state == "visible":
sheetname = str(i).replace('<Worksheet "', '').replace('">', '').strip()
ws = Workbook[sheetname]
#unmerging the cells
for merged_cell in ws.merged_cells:
min_row, min_col, max_row, max_col = merged_cell.min_row, merged_cell.min_col, merged_cell.max_row, merged_cell.max_col
data = ws.cell(row=min_row, column=min_col).value
ws.unmerge_cells(start_row=min_row, start_column=min_col, end_row=max_row, end_column=max_col)
for row in ws.iter_rows(min_row=min_row, min_col=min_col, max_row=max_row, max_col=max_col):
for cell in row:
cell.value = data
data_all = [[cell for cell in row] for row in ws.iter_rows(values_only=True)]
df_raw = pd.DataFrame(data_all[1:], columns=headercols)
df_raw["strike_flag"] = [any(cell.font.strikethrough for cell in row) for row in ws.iter_rows(min_row=2)]
with the above code I was able to find whether the cell have strike through or not. But not sure how to separate strike through and non strike through records.