
Device-Aware Latency Modeling for NAS

Device-aware NAS requires accurate latency prediction for candidate architectures without deploying every model to physical hardware during the search. The standard approach builds a latency lookup table (LUT) by measuring per-operator costs on target devices. For a mobile phone target, teams measure roughly 10,000 operator/shape pairs: convolution kernels (3x3, 5x5, 7x7), depthwise separable convolutions, pooling operations, activations, and batch normalization, across input shapes spanning 14x14 to 224x224 spatial dimensions and 16 to 512 channels. Each configuration runs 30 times with 10 warmup iterations discarded, recording median and 95th-percentile latency to account for thermal throttling, background processes, and operating system scheduling noise.

During search, the system estimates network latency by summing per-operator costs from the LUT, then applying corrections for the operator fusion patterns that modern compilers perform. For example, convolution followed by batch normalization and ReLU typically fuses into a single kernel, reducing latency by 15 to 30 percent compared to naive summation. More advanced approaches use meta-learned regressors such as those in FBNet-style NAS, which train a neural network or gradient boosted tree to predict total latency from architecture features (depth, width, operation mix, memory access patterns), achieving mean absolute percentage error around 8 to 12 percent.

The critical failure mode is measurement noise and device drift. A Google Pixel phone under active thermal management can show latency variance of plus or minus 20 percent for the same model. Production systems mitigate this by pinning inference threads to specific CPU cores, disabling dynamic frequency scaling during measurement, using dedicated measurement devices isolated from user workloads, and maintaining temperature-controlled environments. Device drift across markets creates another challenge: chip bins from different manufacturing batches can vary by 10 to 15 percent in performance. Teams refresh LUTs quarterly and validate on representative device samples from each target market. For distributed measurement, a farm of 32 phones per device type provides statistical confidence, with outlier rejection removing the top and bottom 10 percent of measurements before computing final statistics.
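The LUT-plus-fusion-correction estimator described above can be sketched as follows. This is a minimal illustration, not a production API: the table keys (operator type, spatial size, channels), the FUSION_PATTERNS dictionary, and the 30 percent discount are assumed placeholders, calibrated here to the Conv+BN+ReLU numbers in the Examples section.

```python
from typing import Dict, List, Tuple

# Hypothetical LUT entry: (op_type, spatial_size, channels) -> median latency (ms),
# populated offline from on-device measurements.
LatencyLUT = Dict[Tuple[str, int, int], float]

# Assumed fusion patterns and discounts (the text cites 15 to 30 percent);
# real factors would be calibrated against the target compiler/runtime.
FUSION_PATTERNS = {("conv3x3", "batchnorm", "relu"): 0.30}

def estimate_latency(ops: List[Tuple[str, int, int]], lut: LatencyLUT) -> float:
    """Sum per-operator LUT costs, then subtract fusion savings."""
    total = sum(lut[op] for op in ops)
    op_types = tuple(op[0] for op in ops)
    for pattern, discount in FUSION_PATTERNS.items():
        k = len(pattern)
        for i in range(len(op_types) - k + 1):
            if op_types[i:i + k] == pattern:
                fused_cost = sum(lut[ops[i + j]] for j in range(k))
                total -= discount * fused_cost  # fused kernel replaces the naive sum
    return total

# Using the Conv+BN+ReLU numbers from the Examples section (the shape key is illustrative):
lut = {("conv3x3", 56, 64): 12.3, ("batchnorm", 56, 64): 3.2, ("relu", 56, 64): 1.8}
ops = [("conv3x3", 56, 64), ("batchnorm", 56, 64), ("relu", 56, 64)]
print(f"{estimate_latency(ops, lut):.1f} ms")  # 17.3 ms naive sum -> ~12.1 ms after fusion
```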
💡 Key Takeaways
A latency lookup table (LUT) covers roughly 10,000 operator/shape pairs measured on the target device: 30 runs each with 10 warmup iterations discarded, recording median and 95th percentile (see the measurement sketch after this list)
Operator fusion corrections reduce naive summation by 15 to 30 percent: convolution plus batch normalization plus ReLU typically fuse into a single kernel
Meta-learned regressors, as used in FBNet-style NAS, achieve 8 to 12 percent mean absolute percentage error by training neural networks or gradient boosted trees on architecture features
Measurement noise from thermal throttling and background processes causes plus or minus 20 percent variance; mitigation includes CPU core pinning, disabled dynamic frequency scaling, and temperature-controlled environments
Device drift across chip manufacturing batches varies by 10 to 15 percent; production systems use 32-phone farms per device type with quarterly LUT refreshes and outlier rejection (top and bottom 10%)
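The per-configuration measurement protocol from the takeaways can be sketched as below. The run_inference callable is a placeholder for whatever on-device benchmarking hook a team actually uses; thread pinning, frequency locking, and thermal control happen outside this function. For simplicity, the 10 percent trim is applied per configuration here, whereas the text describes applying it across a 32-phone farm.

```python
import statistics
import time
from typing import Callable, Dict

def measure_operator(run_inference: Callable[[], None],
                     warmup: int = 10,
                     runs: int = 30,
                     trim_frac: float = 0.10) -> Dict[str, float]:
    """Benchmark one operator/shape configuration for a latency LUT entry."""
    # Warmup iterations are discarded so caches, JIT, and clocks settle first.
    for _ in range(warmup):
        run_inference()

    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        run_inference()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    # Outlier rejection: drop the top and bottom trim_frac of measurements.
    latencies_ms.sort()
    k = int(len(latencies_ms) * trim_frac)
    trimmed = latencies_ms[k:len(latencies_ms) - k]

    # Median captures typical latency; the 95th percentile captures tail
    # behavior under thermal throttling and scheduler noise.
    p95_index = min(len(trimmed) - 1, round(0.95 * (len(trimmed) - 1)))
    return {"median_ms": statistics.median(trimmed), "p95_ms": trimmed[p95_index]}
```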
📌 Examples
Google Pixel LUT: 10,000 entries covering Conv 3x3 to 7x7, depthwise separable, pooling, activations across 14x14 to 224x224 spatial dimensions and 16 to 512 channels
Fusion optimization: Conv 3x3 (12.3ms) + BN (3.2ms) + ReLU (1.8ms) = 17.3ms naive sum, but fused kernel runs in 12.1ms (30% reduction)
FBNet-style meta-learned latency predictor: trained on 5000 measured architectures, achieves 8 to 12% MAPE, and predicts latency in 50 microseconds versus 30 seconds for on-device measurement (see the predictor sketch below)
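A gradient-boosted-tree latency predictor of the kind cited in this example can be sketched with scikit-learn. The six-dimensional feature vector (standing in for depth, width, operation-mix fractions, and a memory-access proxy) and the synthetic training data below are placeholders, not the actual FBNet feature set or real measurements; with measured architectures, the cited 8 to 12 percent MAPE is the target, and microsecond-scale prediction is what makes such a model usable inside the search loop.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Placeholder dataset standing in for ~5000 architectures with measured latency:
# columns might encode depth, mean width, op-type fractions, and a memory proxy.
rng = np.random.default_rng(0)
X = rng.uniform(size=(5000, 6))
y = 20.0 + 80.0 * X[:, 0] + 10.0 * X[:, 3] + rng.normal(0.0, 2.0, size=5000)

model = GradientBoostingRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X[:4000], y[:4000])

pred = model.predict(X[4000:])
mape = float(np.mean(np.abs(pred - y[4000:]) / y[4000:])) * 100.0
# On real device measurements, 8 to 12% MAPE is the figure cited above;
# the synthetic data here only demonstrates the training/prediction flow.
print(f"Held-out MAPE: {mape:.1f}%")
```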