I would like to keep columns that contains word “FAIL”.
Input data:
Values1 | Values2 | Values3 | Status1 | Status2 | Status3 |
---|---|---|---|---|---|
1 | 1 | 1 | PASS | PASS | FAIL |
2 | 2 | 2 | PASS | PASS | PASS |
3 | 3 | 3 | PASS | PASS | PASS |
4 | 4 | 4 | PASS | FAIL | PASS |
Expected output:
Status2 | Status3 |
---|---|
PASS | FAIL |
FAIL | PASS |
Current Output:
Status1 | Status2 | Status3 |
---|---|---|
PASS | PASS | FAIL |
PASS | FAIL | PASS |
My code:
import pandas as pd
values = range(1,5)
status_pass = ["PASS"]*len(values)
status1 = status_pass[1:]+["FAIL"]
status2 = status1[::-1]
df = pd.DataFrame({"Values1":values,"Values2":values,"Values3":values,"Status1":status_pass,"Status2":status1,"Status3":status2})
# drop unwanted rows
words_to_keep = ["FAIL"]
df = df[df.stack().groupby(level=0).apply(
lambda x: all(x.str.contains(w, case=False).any() for w in words_to_keep))]
# Filter by column name
df = df.filter(like='Status', axis=1)