I’m using random forest regression to estimate the Return On Ad Spend (ROAS) given an upper bound on spending that the user supplies. My model takes three input variables: the cost of TV, Radio and Newspaper ads. However, to find the optimal allocation, I use nested for loops to iterate through every dollar amount, which is time-consuming. Is there a faster method to find the highest predicted y value in my program?
import joblib
import numpy as np

def ROASPrediction(Q, TV, Radio, Newspaper):
    rec = "Recommended Investment for Best Sales"
    # Best prediction seen so far and the allocation that produced it
    y_big = 0
    x_b = 0
    y_b = 0
    z_b = 0
    # Try every whole-dollar allocation from half of each budget up to the budget
    for x in range(TV // 2, TV):
        for y in range(Radio // 2, Radio):
            for z in range(Newspaper // 2, Newspaper):
                customer_features = np.array([x, y, z])
                customer_features1 = customer_features.reshape(1, -1)
                #customer_features1 = pd.DataFrame(customer_features)
                # Load the saved model and predict sales for this single allocation
                model_fit1 = joblib.load('/content/drive/MyDrive/LUV BARNWAL/ROAS.joblib')
                y_future_pred = model_fit1.predict(customer_features1)
                print("y_future_pred", y_future_pred)
                if y_future_pred[0] >= y_big:
                    y_big = y_future_pred[0]
                    x_b = x
                    y_b = y
                    z_b = z
    #y_future_pred1 = str(y_future_pred[0]) + "M$"
    #y_roas = y_future_pred[0]*1000000 / (TV+Radio+Newspaper)
    # Format the best result as strings for display
    y_future_pred1 = str(y_big) + "M$"
    y_roas = y_big * 1000000 / (TV + Radio + Newspaper)
    x_b1 = str(x_b)
    y_b1 = str(y_b)
    z_b1 = str(z_b)
    y_roas1 = str(y_roas) + "%"
    return rec, x_b1, y_b1, z_b1, y_future_pred1, y_roas1
The following code trains my random forest model.
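import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

df = pd.read_csv('/Advertising.csv')
df.head()
# Features: ad spend per channel; target: sales
x = df[['TV', 'Radio', 'Newspaper']]
y = df['Sales']  # 1-D target, which avoids scikit-learn's column-vector warning
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.20, random_state=41)
rf_regressor = RandomForestRegressor(n_estimators=100, random_state=42)
rf_regressor.fit(x_train, y_train)
y_pred = rf_regressor.predict(x_test)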
And this is the csv file I’m using.
Is there any way to make the ROASPrediction function more efficient, so that it does not take about 5 minutes to search budgets of just $30 each for TV, Radio and Newspaper?
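For reference, here is a minimal sketch of one direction that might help. Much of the time in the nested loops above likely goes to reloading the model and calling predict on one row per iteration. The sketch builds the same integer grid once and makes a single predict call over all rows. The names best_allocation and StandInModel are hypothetical. StandInModel only exists so the example runs without the ROAS.joblib file; in practice the model would be loaded once with joblib.load before calling the function.

```python
import numpy as np

def best_allocation(model, tv, radio, newspaper):
    """Search the same integer grid as the nested loops with one predict call.

    `model` is any object with a scikit-learn style predict(X) method.
    Assumes every budget is at least 1 so that each range is non-empty.
    """
    # Same ranges as the loops: half of each budget up to (not including) the budget.
    xs = np.arange(tv // 2, tv)
    ys = np.arange(radio // 2, radio)
    zs = np.arange(newspaper // 2, newspaper)
    # All (x, y, z) combinations as one array of shape (n, 3), in loop order.
    grid = np.stack(np.meshgrid(xs, ys, zs, indexing="ij"), axis=-1).reshape(-1, 3)
    # One batched prediction instead of one call per combination.
    preds = np.asarray(model.predict(grid)).ravel()
    # argmax returns the first maximum; the loop with >= keeps the last one on ties.
    best = int(np.argmax(preds))
    x_b, y_b, z_b = (int(v) for v in grid[best])
    return x_b, y_b, z_b, float(preds[best])

class StandInModel:
    """Hypothetical stand-in for the trained forest, used only for this demo."""
    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return 0.05 * X[:, 0] + 0.2 * X[:, 1] + 0.01 * X[:, 2]

# In practice (assumption): model = joblib.load('.../ROAS.joblib'), loaded once.
result = best_allocation(StandInModel(), 30, 30, 30)
print(result)
```

With budgets of $30 each, the grid has 15 x 15 x 15 = 3375 rows, which a single predict call handles in one pass. For much larger budgets the grid may not fit in memory, so it could be processed in chunks or searched with a coarser step.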