NAS Failure Modes and Production Mitigations
NAS can fail in expensive ways. Understanding these failure modes helps you avoid wasted compute and misleading results.
Search Space Mismatch
The most common failure: defining a search space that does not contain good architectures. If you only search over 3x3 convolutions but the task needs larger receptive fields, no amount of search will find a good solution.
Fix: Start with a search space based on known good architectures for your task. Include operations that have worked before. Expand cautiously rather than starting maximally broad.
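One cheap sanity check before launching a search is to confirm that a known-good baseline can actually be expressed in the search space. The sketch below is only an illustration: the dictionary layout, the operation names, and the helper functions missing_choices and contains are all hypothetical, not part of any NAS library.

```python
# Hypothetical search space: each layer picks one operation and one width.
SEARCH_SPACE = {
    "op": {"conv3x3", "conv5x5", "dilated_conv3x3", "skip"},
    "width": {32, 64, 128},
}

# A known-good baseline described with the same (assumed) vocabulary.
BASELINE = [
    {"op": "conv3x3", "width": 32},
    {"op": "conv5x5", "width": 64},
    {"op": "conv3x3", "width": 128},
]


def missing_choices(architecture, space):
    """Return (layer_index, key, value) for every choice the space cannot express."""
    missing = []
    for i, layer in enumerate(architecture):
        for key, value in layer.items():
            if value not in space.get(key, set()):
                missing.append((i, key, value))
    return missing


def contains(architecture, space):
    """True if every layer choice of the architecture is available in the space."""
    return not missing_choices(architecture, space)


if __name__ == "__main__":
    # A space restricted to 3x3 convolutions cannot express the baseline.
    narrow = {"op": {"conv3x3"}, "width": {32, 64, 128}}
    print(contains(BASELINE, SEARCH_SPACE))
    print(missing_choices(BASELINE, narrow))
```

If the baseline is not contained, the search cannot rediscover it, which is a strong hint that the space is too narrow for the task.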
Proxy Task Divergence
NAS often searches on a simpler proxy task (smaller dataset, fewer classes) and then transfers the result to the full task. If the proxy and target tasks have different optimal architectures, the search finds the wrong answer.
Signs of divergence: the top architectures from search underperform hand-designed baselines on the full task.
Fix: Validate on a held-out subset of the full task during search.
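One way to act on this validation is to score a handful of candidates on both the proxy task and the held-out subset of the full task, then check whether the two rankings agree. The sketch below uses Spearman rank correlation for that comparison; this metric is an assumption of the example rather than something prescribed above, and the function names are hypothetical. A correlation near 1 suggests the proxy ranks architectures like the full task does; a value near 0 or below suggests divergence.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    # Made-up accuracies for five candidate architectures.
    proxy_acc = [71.2, 69.8, 73.5, 70.1, 72.0]
    full_subset_acc = [88.0, 88.9, 87.1, 89.3, 87.5]
    print(spearman(proxy_acc, full_subset_acc))
```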
Evaluation Noise
Training is stochastic. The same architecture trained twice may differ by 0.5-1% accuracy due to random initialization and data ordering. If the proxy evaluation used in multi-fidelity search has high variance, NAS picks lucky runs, not good architectures.
Fix: Average multiple runs per architecture (2-3 minimum). Use deterministic data ordering. Treat accuracy differences smaller than the run-to-run spread as noise, not as evidence that one architecture is better.
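A minimal sketch of this kind of noise-aware comparison follows. It averages repeated runs and calls one architecture better only when the gap in means exceeds a multiple of the standard error of the difference. The function names and the threshold k=2.0 are assumptions made for illustration; with only 2-3 runs per architecture this is a rough heuristic, not a rigorous significance test.

```python
from statistics import mean, stdev


def summarize(scores):
    """Mean and sample standard deviation over repeated runs of one architecture."""
    return mean(scores), stdev(scores)


def meaningfully_better(a_scores, b_scores, k=2.0):
    """True if mean(a) exceeds mean(b) by more than k standard errors.

    k is an assumed threshold; each list needs at least two runs.
    """
    mean_a, sd_a = summarize(a_scores)
    mean_b, sd_b = summarize(b_scores)
    # Standard error of the difference between the two means.
    se = (sd_a ** 2 / len(a_scores) + sd_b ** 2 / len(b_scores)) ** 0.5
    return (mean_a - mean_b) > k * se


if __name__ == "__main__":
    # Made-up accuracies from three runs per architecture.
    arch_a = [92.1, 92.6, 91.8]
    arch_b = [92.0, 92.4, 92.3]
    arch_c = [94.0, 94.2, 94.1]
    print(meaningfully_better(arch_a, arch_b))
    print(meaningfully_better(arch_c, arch_b))
```

Ranking candidates by a single run would treat the best run of arch_a (92.6) as a win over arch_b, even though the averaged results do not support that ordering.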