What is Neural Architecture Search (NAS)?
Neural Architecture Search (NAS) automates the design of neural network architectures by searching over a space of candidate models rather than relying on human experts to manually compose layers and connections. The approach rests on three fundamental pillars: the search space defines what architectures are possible (building blocks, connections, topology patterns), the search strategy explores this space (using reinforcement learning, evolutionary algorithms, Bayesian optimization, or differentiable methods), and the evaluation strategy estimates how good each candidate is (through early stopping, training on data subsets, weight sharing in supernets, or performance prediction from learning curves).
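These three pillars map directly onto code. The sketch below uses the simplest possible search strategy, random search, over a toy search space with a stand-in proxy evaluation; the dimensions, value ranges, and scoring function are illustrative assumptions, not taken from any particular system.

```python
import random

# Search space (pillar 1): what architectures are possible.
# These dimensions and value ranges are illustrative assumptions.
SEARCH_SPACE = {
    "depth": [8, 12, 16, 20],               # number of blocks
    "kernel_size": [3, 5, 7],               # convolution kernel per block
    "width_mult": [0.5, 0.75, 1.0, 1.25],   # channel width multiplier
}

def sample_architecture():
    """Search strategy (pillar 2): here, plain random sampling."""
    return {key: random.choice(values) for key, values in SEARCH_SPACE.items()}

def estimate_quality(arch):
    """Evaluation strategy (pillar 3): stand-in for a cheap proxy such as
    early stopping, a data subset, or a supernet lookup. This toy score
    just rewards moderate capacity with some noise; a real system would
    train and validate the candidate."""
    capacity = arch["depth"] * arch["width_mult"]
    return -abs(capacity - 15.0) + random.gauss(0.0, 0.5)

best_arch, best_score = None, float("-inf")
for _ in range(100):  # fixed search budget
    arch = sample_architecture()
    score = estimate_quality(arch)
    if score > best_score:
        best_arch, best_score = arch, score

print(best_arch, best_score)
```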
The original challenge was cost. Early reinforcement-learning-based searches, such as the one that produced NASNet, consumed roughly 1,800 GPU days for a single run. Modern methods using weight sharing and differentiable search reduced this to 1 to 4 GPU days on standard benchmarks. The real value emerges when NAS optimizes multiple objectives simultaneously: accuracy, inference latency on target devices, model size, and memory footprint. This transforms NAS from pure modeling into system design.
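Differentiable methods such as DARTS get their speedup by relaxing the discrete choice among candidate operations into a softmax-weighted mixture, so architecture parameters can be learned by gradient descent alongside the network weights. Here is a minimal PyTorch sketch of that mixed operation; the particular candidate ops are illustrative placeholders, but the softmax-over-alphas pattern is the core of the DARTS formulation.

```python
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """DARTS-style mixed operation: a softmax-weighted sum of candidates.

    The architecture parameters `alpha` are ordinary trainable tensors,
    which is what makes the search differentiable. The candidate ops
    below are illustrative placeholders.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 conv
            nn.Conv2d(channels, channels, 5, padding=2),  # 5x5 conv
            nn.Identity(),                                # skip connection
        ])
        # One architecture parameter per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = torch.softmax(self.alpha, dim=0)
        # Weighted sum over all candidates; after search, keep the op
        # with the largest weight to derive the discrete architecture.
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# Usage: alphas and conv weights train together via ordinary backprop.
x = torch.randn(1, 16, 32, 32)
out = MixedOp(16)(x)  # shape: (1, 16, 32, 32)
```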
In production, NAS becomes a model factory with hard constraints. Google's MnasNet search targeted Pixel phones with specific requirements: median on-device latency under 80 milliseconds, 95th-percentile latency under 120 milliseconds, and model size under 20 megabytes, all while maintaining top-1 accuracy above 75 percent on ImageNet-scale data. Meta's FBNet series used differentiable NAS for Instagram and augmented-reality workloads, achieving 1.5 to 1.6x speedups on mobile CPUs. Nvidia's hardware-aware NAS targets Jetson devices with constraints like 30 frames per second under a 10-watt power budget and sub-100-millisecond inference.
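To make the constraint handling concrete, MnasNet folds latency into a single scalar reward, accuracy × (latency/target)^w, that the search strategy maximizes. A small sketch follows; the exponent w = -0.07 is the soft-constraint value reported in the MnasNet paper, while the accuracy and latency numbers in the usage example are hypothetical.

```python
def mnasnet_style_reward(accuracy: float, latency_ms: float,
                         target_ms: float = 80.0, w: float = -0.07) -> float:
    """Latency-aware scalar objective in the spirit of MnasNet.

    With w < 0, candidates slower than the target are penalized and
    faster ones earn a mild bonus, so accuracy and speed trade off in
    one number the search can maximize. target_ms mirrors the 80 ms
    median-latency constraint above; w = -0.07 is the soft-constraint
    exponent reported in the MnasNet paper.
    """
    return accuracy * (latency_ms / target_ms) ** w

# Hypothetical candidates from one search round:
print(mnasnet_style_reward(accuracy=0.756, latency_ms=78.0))   # meets target
print(mnasnet_style_reward(accuracy=0.762, latency_ms=112.0))  # too slow, reward shrinks
```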
💡 Key Takeaways
• NAS automates neural network design through three pillars: search space (what is allowed), search strategy (how to explore), and evaluation strategy (how to measure quality)
• Cost reduction from roughly 1,800 GPU days (RL-based NASNet search) to 1 to 4 GPU days using weight sharing and differentiable methods like DARTS
• Multi-objective optimization targets accuracy, device latency, model size, and memory footprint simultaneously, making NAS a system design problem
• Production deployments impose hard constraints: MnasNet targeted sub-80ms median latency on Pixel phones with top-1 ImageNet accuracy above 75% and model size under 20MB
• Real speedups achieved: Google's MnasNet delivered 1.5x faster inference than MobileNetV2; Meta's FBNet achieved 1.5 to 1.6x speedups on mobile CPUs for Instagram workloads
📌 Examples
Google MnasNet: Device-aware search on Pixel phones achieved 70 to 100ms inference latency for image classification, a 1.5x speedup over MobileNetV2
Meta FBNet: Differentiable NAS with latency-aware objectives, delivering 1.5 to 1.6x speedups on mobile CPUs for Instagram and AR applications
Nvidia Jetson NAS: Hardware-aware search targeting 30 FPS under a 10-watt power budget with sub-100ms per-frame inference for vision models