Computer Vision Systems • Image Classification at Scale
Critical Trade-offs: Model Choice, Serving Strategy, and Cost
Production image classification requires navigating fundamental trade-offs between accuracy, latency, cost, and operational complexity. Model architecture choice sets the foundation. Convolutional Neural Networks (CNNs) like ResNet and EfficientNet deliver strong throughput and tolerate smaller datasets, making them practical for teams with limited training data or compute. Vision Transformers (ViTs) require more pretraining data, often hundreds of millions of images, and substantially more compute during training, but they scale better with data and can outperform CNNs at very large scale. Teams choose based on available data size, latency budget, and training resources.
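One way to ground this choice is to benchmark candidate backbones against your own latency budget before committing. A minimal sketch using PyTorch and torchvision; the model pair, batch size, and resolution are illustrative stand-ins for your actual candidates:

```python
import time
import torch
from torchvision import models

# Candidate backbones of roughly comparable capacity: a CNN and a ViT.
candidates = {
    "resnet50": models.resnet50(weights=None),
    "vit_b_16": models.vit_b_16(weights=None),
}

batch = torch.randn(8, 3, 224, 224)  # illustrative batch size and resolution

for name, model in candidates.items():
    model.eval()
    with torch.inference_mode():
        for _ in range(3):            # warm-up passes
            model(batch)
        start = time.perf_counter()
        for _ in range(10):           # timed passes
            model(batch)
        ms = (time.perf_counter() - start) / 10 * 1000
    params = sum(p.numel() for p in model.parameters()) / 1e6
    print(f"{name}: {params:.1f}M params, {ms:.1f} ms/batch")
```

Parameter count and per-batch latency on your serving hardware, measured against your real input resolution, settle the architecture debate faster than leaderboard accuracy numbers.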
Synchronous versus asynchronous inference fundamentally changes system design and user experience. Synchronous inference keeps UX tight and enables immediate content moderation decisions, but demands higher peak capacity and careful batching to meet p99 latency targets like 100 ms; a synchronous system at 10,000 peak QPS must be provisioned for full load. Asynchronous processing lowers cost by running richer models over large batches, but introduces delays ranging from minutes to hours. Many production systems take a hybrid approach: a fast online stage with a small model that completes in 20 ms, followed by offline refinement with a larger model at 500 ms that improves accuracy by 3 to 5 percentage points.
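The hybrid pattern is simple to express: answer inline with the small model and enqueue the same image for offline refinement. A minimal sketch, with `fast_model` and `accurate_model` as hypothetical stubs standing in for real models:

```python
import queue
import threading

def fast_model(image: bytes) -> str:
    return "cat"        # stub: stands in for a ~20 ms small model

def accurate_model(image: bytes) -> str:
    return "tabby cat"  # stub: stands in for a ~500 ms larger model

results: dict[str, str] = {}
refine_queue: queue.Queue = queue.Queue()

def classify(image_id: str, image: bytes) -> str:
    """Online path: answer immediately with the small model,
    then hand the same image to the offline refinement stage."""
    results[image_id] = fast_model(image)   # provisional label, ~20 ms
    refine_queue.put((image_id, image))
    return results[image_id]

def refine_worker() -> None:
    """Offline path: runs the larger model and overwrites the
    provisional label; in production this loop would batch requests."""
    while True:
        image_id, image = refine_queue.get()
        results[image_id] = accurate_model(image)  # +3-5 pts accuracy
        refine_queue.task_done()

threading.Thread(target=refine_worker, daemon=True).start()
```

The caller gets an answer within the synchronous budget, while consumers that can tolerate minutes of delay read the refined label from `results` later.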
Precomputed versus on-demand inference presents a cache-freshness trade-off. Precomputing embeddings and labels cuts online latency to under 10 ms on cache hits, but creates staleness when models update: with weekly model releases, cached predictions can be up to 7 days old. On-demand inference ensures fresh predictions from the current model, but increases tail latency on cache misses from 10 ms to 80 ms and raises serving costs by 10x for workloads with low cache-hit rates. Hybrid systems serve precomputed results tagged with a model version and trigger background backfills on model updates, accepting temporary version mixing during the transition.
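A sketch of the versioned-cache pattern: predictions are stored with the model version that produced them, stale entries are served immediately while being queued for refresh, and misses pay the on-demand cost once. The cache interface, version tag, and `run_model` stub are all illustrative:

```python
from dataclasses import dataclass

MODEL_VERSION = "v42"  # illustrative tag, bumped on each weekly release

@dataclass
class CachedPrediction:
    label: str
    model_version: str

cache: dict[str, CachedPrediction] = {}
backfill_queue: list[str] = []          # consumed by a background job

def run_model(image: bytes) -> str:
    return "dog"  # stub: on-demand inference with the current model, ~80 ms

def predict(image_id: str, image: bytes) -> str:
    entry = cache.get(image_id)
    if entry is not None:
        if entry.model_version != MODEL_VERSION:
            # Serve the stale label now (version mixing is accepted);
            # schedule a refresh so the staleness is temporary.
            backfill_queue.append(image_id)
        return entry.label               # cache hit, <10 ms
    label = run_model(image)             # cache miss pays full latency once
    cache[image_id] = CachedPrediction(label, MODEL_VERSION)
    return label
```

The version check is what bounds staleness: without it, a cached label could outlive several model releases unnoticed.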
Accuracy versus cost and stability trade-offs define model selection. Knowledge distillation from a large teacher model to a smaller student, quantization from float32 to int8, and smaller backbones reduce serving cost by 2 to 10 times, but typically drop accuracy by 0.5 to 2 percentage points. For content moderation, where false negatives that let harmful content through can be extremely costly from legal and safety perspectives, teams accept higher compute costs for better recall, running ensembles that cost $50K per month instead of single models at $5K per month. For search personalization, where responsiveness dominates user satisfaction, a 50 ms latency reduction may be worth a 1-point accuracy drop.
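Of these compression levers, distillation is the most code-visible. A minimal sketch of the standard soft-target distillation loss (Hinton et al., 2015) in PyTorch; the temperature and blending weight are typical defaults, not tuned values:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 4.0,   # illustrative default
                      alpha: float = 0.7) -> torch.Tensor:
    """Blend KL divergence to the teacher's softened distribution
    with ordinary cross-entropy on the hard labels."""
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    # The T^2 factor keeps soft-target gradients on the same scale
    # as the hard-label term.
    kd = F.kl_div(soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```

Training the student against this loss, then quantizing it to int8, stacks the two cost reductions; each step costs a fraction of a point of accuracy, which is why the technique suits latency-sensitive rather than safety-critical workloads.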
💡 Key Takeaways
• CNNs deliver strong throughput on smaller datasets; ViTs need hundreds of millions of pretraining images but outperform at very large scale; the choice depends on data and compute budget
• Synchronous inference meets strict latency targets like 100 ms for UX but requires provisioning for peak load; async cuts cost 5 to 10x with large batches but adds minutes to hours of delay
• Precomputed cache hits serve in under 10 ms but can be up to 7 days stale with weekly model updates; on-demand guarantees freshness at 80 ms latency and 10x higher cost
• Distillation and quantization reduce serving cost 2 to 10x but drop accuracy 0.5 to 2 percentage points, worthwhile for latency-sensitive, non-critical applications
• Content moderation accepts higher cost, e.g. a $50K/month ensemble vs a $5K/month single model, for better recall, since false negatives have severe legal and safety consequences
• Search personalization may trade 1 point of accuracy for a 50 ms latency reduction when responsiveness matters more to user satisfaction than marginal relevance improvements
📌 Examples
Amazon product classification: a hybrid approach runs MobileNetV3 synchronously in 25 ms for instant results, then queues the image for async EfficientNet-B5 refinement that completes within 2 minutes, improving accuracy from 87% to 92% without blocking the upload flow
Google Photos model update: a weekly release creates a 7-day staleness window; a background backfill reprocesses 1 billion images in 48 hours at roughly 6,000 images/second on 4 GPUs, and cache version tags enable a gradual rollout
Meta content moderation: an ensemble of 3 models (ResNet50 + EfficientNet + ViT) costs $45K/month in GPU compute vs $5K/month for a single ResNet; the ensemble achieves 96% recall vs 91%, cutting the volume of missed harmful content by more than half