I’ve got a dataframe and I want to keep a “base” (original) copy which I can work from later. However, when I pass it to a function to create a new dataframe, the original base copy gets updated too.
I’ve created a simple example which replicates my problem:
import pandas as pd
# create base df
base=pd.DataFrame([[1,2],[3,4]])
# simple function that does something to the df
def dosomething(df):
    df[0] = [5, 6]
    return df
# create a new df
new=dosomething(base)
# print new and base dfs
print(new)
print(base)
This gives the following output:
   0  1
0  5  2
1  6  4
   0  1
0  5  2
1  6  4
So my “new” dataframe has been updated as I’d expect, but the “base” dataframe has been updated as well (I was expecting it to stay as the original).
What am I missing?