I have some code written with pandas in Python, and I want to convert it to Polars. Can anyone help me with this?
My pandas code is:
df = df1.melt(id_vars=["_id", "basic_info/Date", "basic_info/MO_Name", "basic_info/MO_Email", "basic_info/Programme", "basic_info/Component", "basic_info/Project", "repeat/Activity", "basic_info/Division", "basic_info/District", "basic_info/Upazila", "basic_info/Union_Ward",
"basic_info/Branch_Office", "basic_info/Facility_Name", "basic_info/Data_Type", "basic_info/Respective_Staff", "basic_info/Staff_Designation", "basic_info/Respective_Manager", "basic_info/Respective_Supervisor", "basic_info/Study_Manager",], var_name='Indicator', value_name='Findings')
df.dropna(subset=['Findings'], inplace=True)
mask = df['Findings'].str.startswith('[FF]').fillna(False)
df = df[mask]
df['Findings'] = df['Findings'].str[5:]
df['Findings_id'] = df['_id'].astype(
str) + "-" + df['Indicator'].str[-12:]
cols = ['basic_info/Project', 'basic_info/Data_Type',
"basic_info/Component", "repeat/Activity"]
df[cols] = df[cols].apply(lambda x: x.str.replace('_', ' '))
df = df.drop(columns=['_id', 'Indicator'])
df['basic_info/Date'] = pd.to_datetime(df['basic_info/Date'])
I want to write it with Polars. My attempt at the conversion is below:
df = pl.DataFrame(df1)
# print(df)
df = df.drop('formhub/uuid', '__version__', 'meta/instanceID', '_xform_id_string', '_uuid', '_attachments', '_status', '_geolocation',
'_submission_time', '_tags', '_notes', '_validation_status', '_submitted_by', 'basic_info/Consent', 'basic_info/Village')
df = df.melt(id_vars=["_id", "basic_info/Date", "basic_info/MO_Name", "basic_info/MO_Email", "basic_info/Programme", "basic_info/Component", "basic_info/Project", "basic_info/Activity", "basic_info/Division", "basic_info/District", "basic_info/Upazila", "basic_info/Union_Ward",
"basic_info/Branch_Office", "basic_info/Facility_Name", "basic_info/Data_Type", "basic_info/Respective_Manager", "basic_info/Respective_Supervisor", "basic_info/Study_Manager"], variable_name='Indicator', value_name='Findings')
# print(df)
# df.drop_nulls(subset=['Findings'])
df = df.drop_nulls(subset=['Findings'])
df = df.with_columns(pl.col('Findings').str.starts_with('[FF]').fill_null(False))
print(df)
# mask = df['Findings'].str.startswith('[FF]').fillna(False)
# df = df[mask]
# df = df.select(pl.col('Findings').str[5:])
# df['Findings'] = df['Findings'].str[5:]
df = df.with_columns((pl.col('_id').cast(pl.String) + "-" + pl.col('Indicator').str[-12:]).alias('Findings_id'))
print(df)
But I get some error messages.