I have a serious concern about a model trained with AutoGluon's AutoML.
After training with the 'high_quality' preset, I try to predict the target for a single new data point. Based on my analysis, the predicted value should change when I change the inputs, but it does not. In fact, the prediction never drops below the smallest value of the dependent variable seen in training, under any circumstances. This worries me, because I believe the predicted value should go lower.
This is my data:
Date A B C Target D E
4/10/2018 34 190 5 72 31.075 0.040818654
5/10/2018 34 190 5 72 30.23636364 0.040818654
6/11/2018 34 190 5 72 30.575 0.040818654
7/15/2018 34 185 5 72 30.66 0.040818654
8/26/2018 34 185 5 72 30.6902439 0.040818654
9/30/2018 34 185 5 72 30.60487805 0.040818654
10/19/2018 34 180 5 72 30.6 0.040818654
11/12/2018 34 180 5 72 30.50769231 0.040818654
12/24/2018 34 190 5 72 30.5 0.040818654
1/20/2019 34 190 5 72 30.4 0.040818654
2/9/2019 34 190 5 72 30.4 0.040818654
3/21/2019 34 185 5 72 30.3 0.040818654
4/12/2019 34 185 5 69 30.27142857 0.040818654
5/13/2019 34 185 5 66 30.2 0.040818654
6/3/2019 34 185 5 66 30.26666667 0.040818654
7/26/2019 25 185 5 66 30.11842105 0.040818654
8/10/2019 25 185 5 59 30.04666667 0.040818654
9/27/2019 25 185 5 59 30.31111111 0.040818654
10/7/2019 25 185 5 59 30.27407407 0.040818654
11/17/2019 25 185 5 59 30.12222222 0.040818654
12/6/2019 28 185 5 59 30.05185185 0.040894755
1/17/2020 28 185 5 59 29.6972973 0.041351364
2/21/2020 28 185 5 59 29.31891892 0.041731872
3/7/2020 28 185 5 59 29.24444444 0.041894947
4/10/2020 29 185 5 59 29.11875 0.042264583
5/5/2020 29 170 5 59 29.33076923 0.042536374
6/6/2020 29 170 5 59 29.37894737 0.042884267
7/9/2020 29 175 5 59 29.3 0.043243031
8/4/2020 29 175 5 59 29.40243902 0.043525694
9/13/2020 29 175 5 59 29.5 0.04396056
10/19/2020 29 175 5 59 29.47142857 0.04435194
11/15/2020 29 175 5 59 29.45 0.044645474
12/6/2020 29 175 5 59 29.43333333 0.044873779
1/23/2021 29 175 5 56 29.4 0.045395618
2/4/2021 29 175 5 56 29.4 0.045526078
3/22/2021 29 166 5 56 29.4 0.046026174
4/16/2021 29 166 5 56 29.4 0.046297965
4/25/2021 29 166 5 51 29.4 0.04639581
5/16/2021 29 166 5 51 29.4 0.046624114
6/22/2021 29 157 5 51 29.4 0.047026365
7/5/2021 29 157 5 47 29.4 0.047167697
8/1/2021 28 160 5 47 29.4 0.047486411
9/16/2021 28 160 5 47 29.33076923 0.048091805
10/10/2021 28 160 5 47 29.4 0.048407662
11/10/2021 28 158 5 47 29.4 0.048815645
12/9/2021 28 158 5 47 29.4 0.049197306
1/28/2022 28 158 5 47 29.69918699 0.049855343
2/20/2022 31 156 5 47 29.99837398 0.05015804
3/20/2022 28 156 5 43 30.36260163 0.05052654
4/14/2022 28 155 5 39 30.68780488 0.050855558
5/25/2022 28 155 5 39 30.43790323 0.051395148
7/18/2022 28 142 5 39 28.65241935 0.052105828
8/7/2022 28 142 5 39 27.99112903 0.052369042
9/18/2022 28 142 5 39 26.92727273 0.052921793
10/31/2022 28 142 5 39 27.04871795 0.053487704
11/22/2022 28 142 5 39 27.09583333 0.05377724
12/30/2022 28 146 5.5 39 27.01666667 0.054277348
1/15/2023 28 146 5.5 39 26.94285714 0.05448792
2/1/2023 28 146 5 39 26.82142857 0.054711652
2/15/2023 28 146 5.5 39 26.91 0.054895903
3/3/2023 28 145 5.5 39 27.07 0.055106474
4/9/2023 28 145 5.5 39 27.99230769 0.055593421
5/9/2023 28 145 5.5 39 28.05 0.055988243
6/2/2023 28 145 5 39 28.6952381 0.056304101
6/28/2023 29 130 5.5 39 30.8 0.05664628
7/1/2023 29 130 5.5 39 30.74 0.056685762
7/16/2023 29 130 5.5 39 30.61612903 0.056883173
8/9/2023 29 130 5.5 39 30.24285714 0.05719903
9/25/2023 29 140 5.5 39 30.48888889 0.057817585
9/26/2023 29 140 5.5 39 30.47777778 0.057830745
9/27/2023 29 140 5.5 39 30.46666667 0.057843906
9/28/2023 29 140 5.5 39 30.45555556 0.057857067
10/28/2023 29 105 5.5 39 30.25294118 0.058251889
11/14/2023 29 105 5.5 39 30.34545455 0.058475621
12/6/2023 29 105 5.5 39 30.4 0.058765157
1/24/2024 29 105 5.5 39 30.4 0.058765157
I trained AutoGluon's TabularPredictor to predict the Target variable as a regression problem. I get a very low MAPE, but when I run inference on a new data point, the output never goes below 39, which is the minimum Target value in the training data. Even if I set all the inputs to 0, the output is 46, which I cannot explain.
Training example:
predictor = TabularPredictor(label='Target', problem_type='regression', eval_metric='mean_absolute_percentage_error').fit(train_df, presets='high_quality', time_limit=300)
Inference sample:
new_data_point = {
'A': [20],
'B': [90],
'C': [5.0],
'D': [20],
'E': [0.004]
}
new_data_df = pd.DataFrame(new_data_point)
predictor.predict(new_data_df) gives output:
0 51.603809
Name: Oil Rate, dtype: float32
This output hardly changes no matter how much I change the data point used for inference.
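For context, here is a minimal sketch of how the inference point above compares with the feature ranges in the training data. The four rows are copied from the table and together contain the minimum and maximum of every feature; the helper name `out_of_range_features` is only illustrative. It shows that most of the values I am testing with lie outside anything the model saw during training, which may be relevant to the behaviour.

```python
# Four rows copied from the table above; together they span the
# min and max of every feature column in the full training data.
train_rows = [
    {"A": 34, "B": 190, "C": 5.0, "D": 31.075, "E": 0.040818654, "Target": 72},
    {"A": 25, "B": 185, "C": 5.0, "D": 30.11842105, "E": 0.040818654, "Target": 66},
    {"A": 28, "B": 146, "C": 5.0, "D": 26.82142857, "E": 0.054711652, "Target": 39},
    {"A": 29, "B": 105, "C": 5.5, "D": 30.4, "E": 0.058765157, "Target": 39},
]

# The inference point from the question.
new_point = {"A": 20, "B": 90, "C": 5.0, "D": 20, "E": 0.004}


def out_of_range_features(rows, point):
    """Return {feature: (value, train_min, train_max)} for every value
    in `point` that falls outside the range observed in `rows`."""
    report = {}
    for name, value in point.items():
        column = [row[name] for row in rows]
        lo, hi = min(column), max(column)
        if not lo <= value <= hi:
            report[name] = (value, lo, hi)
    return report


outside = out_of_range_features(train_rows, new_point)
for name, (value, lo, hi) in outside.items():
    print(f"{name}: {value} is outside the training range [{lo}, {hi}]")
```

With this point, A, B, D and E are all below their training minimums; only C is inside its training range.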
Why is the output of predictor.predict(new_data_df) not changing even though I keep changing the values in new_data_df?
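In case it helps with diagnosis, this is a rough sketch of how the prediction could be broken down by individual model instead of looking only at the final ensemble. It assumes a recent AutoGluon version in which `TabularPredictor.model_names()` is available (older releases name it `get_model_names()`) and relies on the `model` argument of `predict`; the helper name `predictions_by_model` is my own.

```python
def predictions_by_model(predictor, data):
    """Map each trained model's name to its prediction for `data`.

    `predictor` is expected to be a fitted TabularPredictor (assumption:
    it exposes model_names() and predict(data, model=...)).
    """
    results = {}
    for name in predictor.model_names():
        results[name] = predictor.predict(data, model=name)
    return results


# Intended usage with the objects from the question:
# for name, pred in predictions_by_model(predictor, new_data_df).items():
#     print(name, pred.iloc[0])
```

If every individual model returns roughly the same value for very different inputs, the issue is not specific to the ensemble weighting.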