Outlier detection on network traffic data with Isolation Forest: why is the accuracy so low?
I wrote code that loads a CIC-IDS dataset of network traffic into a dataframe with 79 features and one label column (BENIGN is normal traffic; every other label is an attack, i.e. an outlier). The goal is to train an Isolation Forest model on this data so that it can detect the outliers. The code runs, but the results are poor: accuracy is mediocre, and recall and F1 score are terrible. I would be very grateful for help from anyone who knows this area well. The code is long, but I hope someone is willing to look through it. I can't figure out exactly where I went wrong.
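For reference, here is a minimal sketch of the kind of pipeline described above, not the actual code from the question. It assumes scikit-learn's `IsolationForest` and uses a small synthetic array in place of the CIC-IDS dataframe; the feature count, the `"ATTACK"` label and the 5% attack share are made up for illustration. One thing it shows is that `predict` returns -1 for outliers and +1 for inliers, so the predictions have to be mapped to the same 0/1 encoding as the labels before metrics are computed.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import precision_score, recall_score, f1_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (assumption, not CIC-IDS):
# 950 benign rows around 0 and 50 attack rows around 6, with 5 features.
X_benign = rng.normal(loc=0.0, scale=1.0, size=(950, 5))
X_attack = rng.normal(loc=6.0, scale=1.0, size=(50, 5))
X = np.vstack([X_benign, X_attack])
labels = np.array(["BENIGN"] * 950 + ["ATTACK"] * 50)

# Binary ground truth: 1 = attack (outlier), 0 = benign.
y_true = (labels != "BENIGN").astype(int)

# contamination is the expected share of outliers; here it is taken from
# the labels purely for illustration.
model = IsolationForest(
    n_estimators=100,
    contamination=float(y_true.mean()),
    random_state=0,
)
model.fit(X)

# predict() returns +1 for inliers and -1 for outliers;
# convert to the same 0/1 encoding as y_true.
raw_pred = model.predict(X)
y_pred = (raw_pred == -1).astype(int)

precision = precision_score(y_true, y_pred, zero_division=0)
recall = recall_score(y_true, y_pred, zero_division=0)
f1 = f1_score(y_true, y_pred, zero_division=0)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

If the metrics on the real data are far worse than on a toy example like this, two things worth checking are the label encoding passed to the metric functions and the share of attack rows: Isolation Forest relies on outliers being rare, so a dataset in which attacks make up a large fraction of rows may not suit it well.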