I am new to Python and am trying to compare one column of strings against another column of strings, row by row, across the whole dataset. I would like to create a binary variable that indicates whether the string in the first column was found within the string in the second.
import pandas as pd
df = {'col_one': ['truck', 'car', 'bike', 'boat', 'scooter'], 'col_two': ["['truck', 'boat']", "['bike', 'boat', 'car']", "['scooter']", "['truck', 'car']", "['bike', 'car']"]}
df = pd.DataFrame(data=df)
I have tried code like this to search a Column Two string for a Column One string and have had success:
import re
def comp_strs(str1, str2):
    pattern = re.compile(str1)
    match = re.search(pattern, str2)
    if match:
        print('Found')
    else:
        print('Not found')
str1 = "Truck"
str2 = "'Truck', 'Boat'"
comp_strs(str1, str2)
However, I am struggling to extend this so that it takes col_one and col_two as inputs rather than individual strings, iterates over every row of the dataframe, and stores the binary result in a new column.
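For reference, here is a rough sketch of the row-wise result I have in mind. It is only one possible approach, not necessarily the best one. The new column names `found_substring` and `found_exact` are made up for illustration, and the second variant assumes every value in col_two is a string that looks like a Python list literal, so that `ast.literal_eval` can parse it.

```python
import ast

import pandas as pd

df = pd.DataFrame({
    'col_one': ['truck', 'car', 'bike', 'boat', 'scooter'],
    'col_two': ["['truck', 'boat']", "['bike', 'boat', 'car']", "['scooter']",
                "['truck', 'car']", "['bike', 'car']"],
})

# Variant 1: plain substring test per row.
# 1 if the col_one text occurs anywhere inside the col_two text, else 0.
# 'found_substring' is a hypothetical column name.
df['found_substring'] = [
    int(one in two) for one, two in zip(df['col_one'], df['col_two'])
]

# Variant 2: exact membership per row.
# Assumes col_two holds list-like strings; parse each into a real list first,
# so that e.g. 'car' would not match an entry such as 'cart'.
# 'found_exact' is a hypothetical column name.
df['found_exact'] = [
    int(one in ast.literal_eval(two))
    for one, two in zip(df['col_one'], df['col_two'])
]

print(df)
```

With the sample data above, both variants should flag only the first two rows. They would differ only when one item name is contained inside a longer item name.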
Any guidance or suggestions would be appreciated!