How Does Isolation Forest Work?
The Algorithm
Build a forest of random trees (typically 100-256 trees). For each tree: randomly sample a subset of data, then recursively partition by picking a random feature and random split point within the feature range. Stop when each point is isolated or max depth is reached.
An outlier sitting far from the data cluster gets isolated in 2-3 splits. A normal point deep in a dense cluster requires 10-15 splits. The path length (number of splits to isolate) becomes the anomaly score.
Scoring
Average the path length across all trees. Normalize by the expected path length for a dataset of that size (approximately 2 × (ln(n-1) + 0.5772) - 2(n-1)/n for n samples). Score near 1 means anomaly, near 0.5 means normal, near 0 means the point is in a very dense region.
Why It Works
Random splits are inefficient at separating tightly clustered points but efficient at separating outliers. Outliers have more empty space around them, so any random split has good odds of isolating them. This property makes Isolation Forest robust to the curse of dimensionality that plagues distance-based methods.