I want to fit some experimental data to a power curve of the form Y = C1*Re^C2, where I have some experimental points of Re and Y.
I want to do this fitting in intervals. That is, I want to find (for example) 3 curves (so 3 different sets of coefficients C1 and C2) fitting Y to Re, depending on the value of Re.
Right now, I am doing this manually, but this is very inconvenient, long and "dirty". In particular, if I want the number of intervals to be arbitrary, it becomes very long to code (imagine dividing the span of Re into 10 or more intervals).
How can I make this smarter and shorter?
Here is the current code (I am using pandas to store the data, but maybe you have better suggestions here). In the code, I find the three sets of coefficients for intervals of Re between 0 and 150, between 150 and 1200, and for Re > 1200.
    from scipy.optimize import curve_fit
    import pandas as pd

    def fitting_function(X, C1, C2):
        Re = X
        Y = C1*(Re**C2)
        return Y

    df_fit = pd.DataFrame()

    # my experimental data:
    Re = [8,39,65,117,196,299,331,480,504,554,771,831,957,1180,1181,1472,1515,1558,1571,1709,1863,1931,2010,2127,2222]
    Y = [435263,85974,52075,28727,18407,13949,13090,10285,9963,9365,8368,8243,8007,7684,7683,7352,7300,7269,7257,7136,7014,6955,6903,6831,6766]
    df_fit["Re"] = Re
    df_fit["Y"] = Y

    # split the data into three intervals of Re
    Re_intervals = [150,1200]
    df_fit_1 = df_fit[df_fit["Re"]<=Re_intervals[0]]
    df_fit_2 = df_fit[(df_fit["Re"]>Re_intervals[0]) & (df_fit["Re"]<=Re_intervals[1])]
    df_fit_3 = df_fit[df_fit["Re"]>Re_intervals[1]]
    df_fit_list = [df_fit_1, df_fit_2, df_fit_3]

    # fit each interval and collect the coefficients
    dict_Coeff = {"C1":[], "C2":[]}
    for df_ in df_fit_list:
        X_fitting = df_["Re"]
        Y_fitting = df_["Y"]
        C, pcov = curve_fit(fitting_function, X_fitting, Y_fitting)
        dict_Coeff["C1"].append(C[0])
        dict_Coeff["C2"].append(C[1])
    print("Coefficients ", dict_Coeff)
I am using Python, but this is more of a logic/algorithm question.
You could use pd.cut to bin your data according to Re, then pass your binned dataframe to a function that returns the list of dataframes you are looking for.
Here is a cleaner version of your code that tackles the repetition:
    from scipy.optimize import curve_fit
    import pandas as pd

    def fitting_function(X, C1, C2):
        Re = X
        Y = C1*(Re**C2)
        return Y

    # my experimental data:
    Re = [8,39,65,117,196,299,331,480,504,554,771,831,957,1180,1181,1472,1515,1558,1571,1709,1863,1931,2010,2127,2222]
    Y = [435263,85974,52075,28727,18407,13949,13090,10285,9963,9365,8368,8243,8007,7684,7683,7352,7300,7269,7257,7136,7014,6955,6903,6831,6766]
    df_fit = pd.DataFrame({'Re': Re, 'Y': Y})

    #### REPLACE THIS PART:
    # Re_intervals = [150,1200]
    # df_fit_1 = df_fit[df_fit["Re"]<=Re_intervals[0]]
    # df_fit_2 = df_fit[df_fit["Re"]<=Re_intervals[1]][df_fit["Re"]>Re_intervals[0]]
    # df_fit_3 = df_fit[df_fit["Re"]>Re_intervals[1]]
    # df_fit_list = [df_fit_1, df_fit_2, df_fit_3]

    #### REPLACE YOUR CODE WITH THIS PART:
    def create_dfs_list(df, labels):
        # one dataframe per bin label, in the order given by labels
        list_df = []
        for label in labels:
            df_bin = df[df['Bin'] == label]
            list_df.append(df_bin)
        return list_df

    labels = [1, 2, 3]
    df_fit_list = (
        df_fit
        .assign(
            # bins are (0, 150], (150, 1200], (1200, inf)
            Bin = lambda df_: pd.cut(df_["Re"], bins=[0, 150, 1200, float('inf')], labels=labels)
        )
        .pipe(create_dfs_list, labels)
    )

    # ---------------------------------------------
    # THE REST OF YOUR CODE
    # dict_Coeff = {"C1":[], "C2":[]}
    # for df_ in df_fit_list:
    #     X_fitting = df_["Re"]
    #     Y_fitting = df_["Y"]
    #     C, pcov = curve_fit(fitting_function, X_fitting, Y_fitting)
    #     dict_Coeff["C1"].append(C[0])
    #     dict_Coeff["C2"].append(C[1])
    # print("Coefficients ", dict_Coeff)
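If you want the number of intervals to be arbitrary, one possible way to go further is to let pd.cut and groupby do the splitting and wrap the fit in a function that takes only the list of bin edges. The sketch below is an illustration, not the only way to do it: the names power_law and fit_power_law_by_interval are made up for this example, and the starting guess taken from a straight-line fit in log-log space is my own addition (curve_fit starts from all ones when p0 is not given, which can be far from the solution for data like this).

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def power_law(re, c1, c2):
    # Model to fit: Y = C1 * Re**C2
    return c1 * re**c2


def fit_power_law_by_interval(re, y, edges):
    """Fit Y = C1 * Re**C2 separately on each interval (edges[i], edges[i+1]].

    Hypothetical helper for illustration; returns one row of coefficients
    per interval that contains at least two points.
    """
    df = pd.DataFrame({"Re": re, "Y": y}, dtype=float)
    bins = pd.cut(df["Re"], bins=edges)  # right-closed intervals by default
    rows = []
    # observed=True skips intervals that contain no data
    for interval, group in df.groupby(bins, observed=True):
        if len(group) < 2:
            continue  # two parameters need at least two points
        x = group["Re"].to_numpy()
        v = group["Y"].to_numpy()
        # Assumption: use a log-log straight-line fit as the starting guess
        slope, intercept = np.polyfit(np.log(x), np.log(v), 1)
        (c1, c2), _ = curve_fit(power_law, x, v, p0=[np.exp(intercept), slope])
        rows.append({"interval": str(interval), "n_points": len(group),
                     "C1": c1, "C2": c2})
    return pd.DataFrame(rows)


Re = [8,39,65,117,196,299,331,480,504,554,771,831,957,1180,1181,1472,1515,1558,1571,1709,1863,1931,2010,2127,2222]
Y = [435263,85974,52075,28727,18407,13949,13090,10285,9963,9365,8368,8243,8007,7684,7683,7352,7300,7269,7257,7136,7014,6955,6903,6831,6766]

# Same three intervals as in the question; add more edges for more intervals
coeffs = fit_power_law_by_interval(Re, Y, edges=[0, 150, 1200, np.inf])
print(coeffs)
```

Going from 3 to 10 intervals then only means passing a longer edges list. Note that the log-log starting guess assumes Re and Y are strictly positive, which holds for the data in the question.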
I hope this helps!