We have been using the ADTK model in our Python application since last year to detect anomalies.
Since last week, we have been observing a significant spike in CPU utilisation, and the cause has been identified as the Python process.
After troubleshooting, we determined that the ADTK model is causing the issue. We also use other ML models such as IForest, ECOD, and CBLOF, but only ADTK is responsible for this spike.
Please find the current model code below for reference:
import datetime
from datetime import timedelta

import pandas as pd
import pytz
from adtk.detector import SeasonalAD

# df is the input DataFrame with 'predict_dt' and 'predict_count' columns
data = df.copy()
minutes = 300
df['predict_dt'] = pd.to_datetime(df['predict_dt'])
df = df.set_index(['predict_dt']).sort_index()
seasonal_vol = SeasonalAD(c=1.6, side='negative', trend=True)
seasonal_vol.fit(df['predict_count'])
df['anomalies'] = seasonal_vol.predict(df['predict_count'])
df2 = pd.DataFrame(columns=['predict_dt', 'hour', 'min', 'predict_count', 'label'])
final = datetime.datetime.now(pytz.timezone('America/Los_Angeles')).replace(tzinfo=None) - timedelta(minutes=minutes)
for i in range(len(df)):
    if data['predict_dt'][i] >= final:
        if str(df['anomalies'][i]).lower() == "true":
            # calling SP for further operation (omitted here)
            pass
We are not sure which part of this code is causing the issue.
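One way we could narrow this down is to time each stage (fit, predict, and the final loop) separately and compare CPU seconds. The snippet below is only a minimal sketch of that idea: the `timed` helper and the `busy_work` stand-in are hypothetical names introduced for illustration and are not part of ADTK or of our application. In the real code, each `with` block would wrap the corresponding call instead of `busy_work`.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record wall-clock and CPU seconds spent inside the with-block."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        results[label] = {
            "wall_s": time.perf_counter() - wall_start,
            "cpu_s": time.process_time() - cpu_start,
        }


def busy_work(n):
    # Stand-in workload (hypothetical). In the application, the with-blocks
    # below would wrap seasonal_vol.fit(...), seasonal_vol.predict(...),
    # and the row-by-row loop respectively.
    return sum(i * i for i in range(n))


timings = {}
with timed("fit", timings):
    busy_work(200_000)
with timed("predict", timings):
    busy_work(50_000)
with timed("loop", timings):
    busy_work(10_000)

# Print one line per stage so the most expensive one stands out.
for label, t in timings.items():
    print(f"{label}: wall={t['wall_s']:.4f}s cpu={t['cpu_s']:.4f}s")
```

Comparing the `cpu_s` values per stage should show whether the time goes into fitting, predicting, or the loop that follows.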