OK, experts: which of the following pandas approaches is more performant?
1) Using a Python list comprehension
import pandas as pd
df = pd.read_csv('datasets/large_columnset.csv', index_col=0)
df.columns = [x.lower().strip() for x in df.columns]
df.head()
2) Using built-in pandas DataFrame functionality
df = pd.read_csv('datasets/large_columnset.csv', index_col=0)
df = df.rename(mapper=lambda name: name.lower().strip(), axis='columns')
df.head()
Google Gemini states:
“In most cases, using a Python list comprehension is more performant than using built-in pandas DataFrame functionality for column renaming.”
I've tried both; they produce the same result.
Regarding performance: first of all, I think it only matters when the amount of data can get huge. A header with a few columns, or even a thousand, is not that case.
If we care about how the two work: assigning to df.columns writes the columns attribute in one step, whereas df.rename has to go through several layers of logic and argument handling, and writing the columns is only one of its many options. The logic can be read in the source of rename.
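As a rough way to check this on your own machine, here is a minimal timing sketch. It assumes a synthetic one-row frame with 1,000 made-up column names in place of the CSV from the question, and the helper names assign_columns and rename_columns are hypothetical. Absolute numbers will vary with the machine and the pandas version, so treat it as an illustration and not a benchmark.

```python
import timeit

import pandas as pd

# Synthetic stand-in for the CSV in the question (an assumption for illustration):
# 1,000 oddly cased, space-padded column names and a single row of zeros.
columns = [f"  Column_{i} " for i in range(1000)]
df = pd.DataFrame([[0] * len(columns)], columns=columns)


def assign_columns(frame):
    # Approach 1: build the new labels in a list comprehension and assign them at once.
    # The copy keeps the original frame untouched and is included in the timing.
    out = frame.copy()
    out.columns = [name.lower().strip() for name in out.columns]
    return out


def rename_columns(frame):
    # Approach 2: let DataFrame.rename apply the function to each column label.
    return frame.rename(mapper=lambda name: name.lower().strip(), axis="columns")


# Both approaches should yield identical labels.
assert list(assign_columns(df).columns) == list(rename_columns(df).columns)

# Time each approach over 200 runs.
t_assign = timeit.timeit(lambda: assign_columns(df), number=200)
t_rename = timeit.timeit(lambda: rename_columns(df), number=200)
print(f"assign to df.columns: {t_assign:.4f} s")
print(f"df.rename:            {t_rename:.4f} s")
```

With only a header's worth of labels, both totals are likely to be tiny next to the time spent in read_csv.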
=> Assigning to df.columns should be the faster of the two, since it skips that extra logic. But if we want to rename just a few columns, with a transformation, rename is the tool we need.
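To illustrate the last point, here is a small sketch of renaming only some columns. The frame and its column names are made up for the example. rename accepts a dict through its columns argument and leaves unlisted labels alone, whereas direct assignment has to produce a label for every column.

```python
import pandas as pd

# Hypothetical frame; the column names are invented for illustration.
df = pd.DataFrame({" ID ": [1, 2], "Name": ["a", "b"], "Score": [0.5, 0.7]})

# Rename only the listed columns; labels missing from the dict are left as they are.
renamed = df.rename(columns={" ID ": "id", "Name": "name"})
print(list(renamed.columns))

# The same result through direct assignment needs a value for every label,
# so the untouched ones have to be passed through explicitly.
targets = {" ID ": "id", "Name": "name"}
assigned = df.copy()
assigned.columns = [targets.get(name, name) for name in assigned.columns]
print(list(assigned.columns))
```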