Multi-Fidelity Evaluation Strategies in NAS
Fully training every candidate architecture would take years of compute. Multi-fidelity evaluation trades estimation accuracy for speed, using cheaper proxies to estimate final performance.
The Core Insight
Architecture rankings correlate across training budgets. An architecture that is top-10% after 5 epochs is usually top-20% after full training. This correlation allows early stopping: train many architectures briefly, then fully train only promising ones.
Typical speedup: instead of training 100 architectures for 100 epochs each (10,000 total epochs), train 1,000 architectures for 5 epochs each (5,000 epochs), then fully train the top 10 for 100 epochs (1,000 epochs). Total: 6,000 epochs, while exploring 10x more of the search space.
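The two-stage budget above can be sketched in a few lines. This is a minimal illustration, not a real NAS pipeline: proxy_score stands in for a short (e.g. 5-epoch) training run, and the architecture names are invented.

```python
import random

random.seed(0)

# Stand-in for a cheap proxy evaluation; in practice this would be
# a brief training run per candidate architecture.
def proxy_score(arch):
    return random.random()

candidates = [f"arch_{i}" for i in range(1000)]
scores = {a: proxy_score(a) for a in candidates}

# Keep only the top 10 by proxy score for full training.
top10 = sorted(candidates, key=scores.get, reverse=True)[:10]

proxy_cost = len(candidates) * 5   # 1,000 archs x 5 epochs  = 5,000
full_cost = len(top10) * 100       # 10 archs   x 100 epochs = 1,000
print(proxy_cost + full_cost)      # total budget: 6,000 epochs
```

The same loop generalizes to successive halving: rather than one cut from 1,000 to 10, prune the pool repeatedly while increasing the per-architecture budget each round.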
Evaluation Strategies
Reduced epochs: Train for 5-20% of full budget. Fast but rankings may shift for architectures that converge slowly.
Reduced data: Train on 10-25% of the dataset. Faster per epoch but may miss architectures that benefit from more data.
Weight sharing: Train a supernetwork containing all candidate architectures, then sample subnetworks. Extremely fast (single training run) but rankings are less reliable due to weight coupling.
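Weight sharing can be sketched as sampling paths through a single shared structure. The supernet layout and op names below are invented for illustration; a real supernetwork (e.g. in one-shot NAS) would hold trained weights for every candidate op.

```python
import random

random.seed(0)

# Hypothetical supernetwork: each layer offers several candidate ops,
# and all subnetworks reuse the same underlying (shared) weights.
SUPERNET = {
    "layer1": ["conv3x3", "conv5x5", "identity"],
    "layer2": ["conv3x3", "maxpool", "identity"],
    "layer3": ["conv3x3", "conv5x5", "maxpool"],
}

def sample_subnetwork(supernet):
    # One candidate architecture = one op choice per layer. Evaluating
    # it requires no extra training, since weights live in the supernet;
    # this is also why rankings suffer from weight coupling.
    return {layer: random.choice(ops) for layer, ops in supernet.items()}

arch = sample_subnetwork(SUPERNET)
```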
Trade-off: More aggressive proxies (fewer epochs, less data) are faster but have lower rank correlation with full training. Typical correlation: 0.7-0.9 for reduced epochs, 0.5-0.7 for weight sharing.
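The rank correlations quoted above are typically measured with Spearman's rho between proxy scores and full-training scores. A minimal tie-free implementation, for illustration:

```python
def spearman(xs, ys):
    # Spearman rank correlation between two score lists.
    # No tie handling; fine for illustrating proxy-vs-full ranking.
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Identical orderings give 1.0; fully reversed orderings give -1.0.
assert spearman([1, 2, 3, 4], [10, 20, 30, 40]) == 1.0
assert spearman([1, 2, 3, 4], [40, 30, 20, 10]) == -1.0
```

In practice one would compare proxy accuracies against full-training accuracies for a sample of architectures: a value near 0.9 means the proxy's top picks are almost certainly strong, while 0.5-0.7 means the top-k cut should be generous to avoid discarding good architectures.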