Computer Vision Systems › Image Preprocessing (Augmentation, Normalization) · Easy · ⏱️ ~3 min

Normalization and Input Standardization

Why Normalization Matters

Raw pixel values range from 0 to 255. Neural networks learn faster and more stably when inputs have zero mean and unit variance. Without normalization, gradients can explode or vanish, training becomes unstable, and convergence can take 2-10x longer.

Standard Normalization Approaches

Simple scaling: Divide pixels by 255 to get 0-1 range. Fast but suboptimal because the distribution is not centered.

Mean subtraction: Subtract dataset mean from each pixel. Centers the distribution around zero. Compute mean once on training data and apply to all images.

Full standardization: Subtract mean and divide by standard deviation. Produces zero mean and unit variance. This is the standard approach for pretrained models.
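The three approaches can be sketched with numpy as follows. The random image batch is a hypothetical stand-in for a real training set; the key point is that the mean and standard deviation are computed once, per channel, over the training data.

```python
import numpy as np

# Hypothetical batch of 8-bit RGB images (values 0-255), shape (N, H, W, C).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 32, 32, 3)).astype(np.float32)

# 1. Simple scaling: map to [0, 1]. Fast, but not zero-centered.
scaled = images / 255.0

# 2. Mean subtraction: center around zero using per-channel means
#    computed once on the training set.
channel_mean = images.mean(axis=(0, 1, 2))   # shape (3,)
centered = images - channel_mean

# 3. Full standardization: zero mean, unit variance per channel.
channel_std = images.std(axis=(0, 1, 2))     # shape (3,)
standardized = (images - channel_mean) / channel_std
```

In practice the statistics are computed on the training split only and then reused unchanged for validation, test, and serving.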

Pretrained Model Requirements

Pretrained models expect specific normalization. ImageNet models expect RGB values normalized with mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225]. Using wrong normalization with pretrained weights degrades accuracy by 10-30%.

Critical rule: Always use the same normalization for training and inference. If you train with ImageNet normalization, you must apply ImageNet normalization at serving time. Mismatches cause silent accuracy drops.
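One way to enforce this rule is to define the normalization once and call the same function from both the training and serving code paths. Below is a minimal numpy sketch using the ImageNet statistics from the text; it assumes the input is an HxWx3 RGB image already scaled to [0, 1].

```python
import numpy as np

# ImageNet statistics (RGB order), as published for torchvision models.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_imagenet(img):
    """Apply ImageNet normalization to an HxWx3 RGB image in [0, 1]."""
    return (img - IMAGENET_MEAN) / IMAGENET_STD

# Import and call this one function from BOTH the training pipeline
# and the serving path, so there is no train/serve normalization skew.
```

In PyTorch pipelines the equivalent step is typically `torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])`; the point is the same, that a single shared definition prevents silent mismatches.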

Channel Order and Color Space

RGB vs BGR: Some frameworks use RGB, others use BGR. OpenCV loads images in BGR by default. Mixing up channel order flips red and blue, causing accuracy drops of 5-15%.
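The fix is a single channel reversal. The numpy sketch below reverses the last axis, which is equivalent to OpenCV's `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` for a standard HxWx3 array:

```python
import numpy as np

def bgr_to_rgb(img):
    """Swap red and blue channels of an HxWx3 image (BGR <-> RGB)."""
    # Reversing the channel axis maps [B, G, R] -> [R, G, B].
    return img[..., ::-1]
```

Apply this immediately after loading with `cv2.imread`, before normalization, so every downstream stage sees a consistent channel order.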

Grayscale conversion: For models expecting single-channel input, convert using the luminance formula: Y = 0.299R + 0.587G + 0.114B. Simple averaging produces inferior results.
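The luminance formula is a weighted sum over the channel axis; the weights come straight from the formula above (ITU-R BT.601 luma coefficients) and sum to 1.0, so the output stays in the input's value range.

```python
import numpy as np

def rgb_to_gray(img):
    """Convert an HxWx3 RGB image to single-channel luminance (HxW)."""
    # Y = 0.299 R + 0.587 G + 0.114 B (weights sum to 1.0)
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return img @ weights  # weighted sum over the last (channel) axis
```

Plain averaging (`img.mean(axis=-1)`) weights all channels equally, which overweights blue relative to how the eye perceives brightness; the luma weights avoid that.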

💡 Key Takeaways
- Normalization speeds training 2-10x by preventing gradient explosion and vanishing
- Pretrained ImageNet models require specific normalization: mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
- Training and inference must use identical normalization; mismatches cause silent accuracy drops of 10-30%
- RGB vs BGR channel order matters; mixing them up drops accuracy 5-15%
📌 Interview Tips
1. Mention training-serving skew from normalization mismatch as a common production bug.
2. Explain that pretrained model documentation specifies required normalization; always check before fine-tuning.