Let's say I have an anomaly detection (unsupervised learning) dataset with 10 observations and two features. The dataset looks like this:
After running the model, I get the following results (anomaly scores):
The observations with a score of -1 are the ones flagged as anomalies.
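For context, here is a minimal sketch of this kind of setup, assuming scikit-learn's `IsolationForest`. The data, `contamination=0.2`, and `random_state=0` are made-up illustration values, not the actual dataset or settings. It shows that `predict` returns the -1/1 labels, while `score_samples` returns the continuous score those labels are thresholded from.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical data: 10 observations, 2 features (NOT the actual dataset).
X = np.array([
    [1.00, 1.10],
    [0.90, 1.00],
    [1.10, 0.90],
    [1.00, 0.95],
    [0.95, 1.05],
    [1.05, 1.00],
    [0.90, 0.90],
    [1.10, 1.10],
    [8.00, 8.00],    # deliberately placed far from the cluster
    [-6.00, 7.00],   # deliberately placed far from the cluster
])

# contamination=0.2 is an assumed value: it sets the score threshold so that
# roughly 20% of the training points end up labelled -1.
model = IsolationForest(n_estimators=100, contamination=0.2, random_state=0)
model.fit(X)  # no target variable is passed; only X

labels = model.predict(X)        # 1 = inlier, -1 = flagged as anomaly
scores = model.score_samples(X)  # continuous score; lower = more anomalous

for row, label, score in zip(X, labels, scores):
    print(row, label, round(float(score), 3))
```

Note that the -1 label depends on the chosen `contamination` threshold, so a different value can flag a different number of points on the same data.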
I have a question about isolation forest; I may not correctly understand how it works.
- How does isolation forest learn during training (what is its loss function)? How does it make sure that the anomalies detected by the model are actually anomalies?
For example, in the results above, the observations that got a score of -1 might not actually be anomalies. How do we make sure the results are accurate? In unsupervised learning we don't have a target variable, so there is no way to calculate errors. So what is being minimized?
Can anyone please explain to me how it works and provide an example?