Core Concept
Building content-based systems in production requires careful feature engineering, efficient indexing, and real-time profile updates. This card covers practical implementation patterns.
Content Feature Engineering
Text features: Use pretrained language models (BERT, sentence transformers) to embed titles, descriptions, and reviews. Typical dimension: 384-768. Compute once per item, store in feature store.
Categorical features: Embed categories, brands, and tags. Use learned embeddings (32-64 dimensions) trained on click data. Multiple categories become averaged or concatenated embeddings.
Image features: Use pretrained vision models (ResNet, CLIP) to extract visual embeddings from product images. Useful for fashion, furniture, and visual search.
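The three modalities above can be assembled into one item vector. A minimal sketch, using random arrays as stand-ins for real model outputs (the 384/32/512 dimensions are illustrative assumptions):

```python
import numpy as np

def build_item_vector(text_emb, category_embs, image_emb):
    """Combine per-modality embeddings into one item feature vector.

    Multiple category embeddings are averaged (per the pattern above);
    the modalities are then concatenated and unit-normalized so that
    dot products behave as cosine similarity.
    """
    cat_emb = np.mean(category_embs, axis=0)           # average multi-category
    vec = np.concatenate([text_emb, cat_emb, image_emb])
    return vec / np.linalg.norm(vec)

# Toy stand-ins for pretrained-model outputs (hypothetical dimensions)
text = np.random.rand(384).astype(np.float32)   # e.g. sentence-transformer output
cats = np.random.rand(2, 32).astype(np.float32)  # item belongs to two categories
img = np.random.rand(512).astype(np.float32)    # e.g. CLIP image embedding

item_vec = build_item_vector(text, cats, img)
print(item_vec.shape)  # (928,)
```

In production the random arrays would be replaced by batch-computed model embeddings read from the feature store.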
User Profile Updates
User profiles must update in near real-time. If a user clicks an item, that item's features should influence their profile within seconds. A common pattern: maintain a sliding window of recent interactions (e.g. the last 50 items) and compute the profile as a weighted average of their embeddings. Weight by recency: items from today get 2x the weight of items from last week.
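The sliding-window pattern above can be sketched in a few lines (the window size and the 2x-today weighting follow the text; the class name and API are illustrative):

```python
from collections import deque
import numpy as np

class UserProfile:
    """Sliding window of the last N item embeddings, recency-weighted.

    Items interacted with today (age < 1 day) get 2x the weight of
    older items, matching the weighting rule described above.
    """
    WINDOW = 50

    def __init__(self):
        self.items = deque(maxlen=self.WINDOW)  # (embedding, age_days) pairs

    def add(self, emb, age_days=0.0):
        self.items.append((emb, age_days))      # deque evicts oldest past WINDOW

    def vector(self):
        embs = np.array([e for e, _ in self.items])
        w = np.array([2.0 if age < 1.0 else 1.0 for _, age in self.items])
        return (w[:, None] * embs).sum(axis=0) / w.sum()  # weighted average

profile = UserProfile()
profile.add(np.array([1.0, 0.0]), age_days=0.0)  # today's click: weight 2
profile.add(np.array([0.0, 1.0]), age_days=5.0)  # last week: weight 1
print(profile.vector())  # [0.667, 0.333]
```

The update is an O(1) append plus an O(window) average, which is why profile writes stay cheap even at high interaction rates.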
Store profiles in a fast key-value store (Redis, Memcached). A profile update is a write on every interaction; a profile read happens on every recommendation request. Size per user: typically 512-2048 bytes for the embedding plus metadata.
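The byte-size estimate is easy to verify: a 384-dim float32 embedding serializes to exactly 1536 bytes. A hedged sketch of the packing step (the metadata fields are made up for illustration):

```python
import json
import numpy as np

def pack_profile(emb, meta):
    """Serialize an embedding as raw float32 bytes plus JSON metadata,
    suitable for storage in a key-value store such as Redis
    (e.g. two fields in a hash keyed by user id)."""
    return emb.astype(np.float32).tobytes(), json.dumps(meta).encode()

emb = np.zeros(384, dtype=np.float32)
blob, meta_blob = pack_profile(emb, {"updated_at": 1700000000, "n_events": 37})
print(len(blob))  # 1536 bytes: 384 dims x 4 bytes, inside the 512-2048 range

# Round-trip the embedding back from bytes
restored = np.frombuffer(blob, dtype=np.float32)
```

Raw bytes avoid the ~33% overhead of base64 or the much larger overhead of JSON-encoded float lists.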
💡 Key Insight: Content features are expensive to compute but cheap to serve. Precompute item embeddings in batch. User profile updates are cheap (averaging) but must be fast. Design your system around these asymmetries.
✓Offline: extract multi-modal features, train CF on interactions, build ANN indices with quantization (100M items at 256 dims drops from ~102 GB float32 to roughly 10 to 20 GB with product quantization), recompute daily with streaming hot updates
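The memory figures in the offline step follow from simple arithmetic (the 128-bytes-per-item PQ code size is an assumed configuration, e.g. 128 subquantizers at 1 byte each):

```python
n_items, dims = 100_000_000, 256

# Full-precision storage: 4 bytes per float32 component
float32_gb = n_items * dims * 4 / 1e9

# Product quantization: each vector compressed to a short code,
# here assumed to be 128 one-byte subquantizer codes per item
pq_bytes_per_item = 128
pq_gb = n_items * pq_bytes_per_item / 1e9

print(round(float32_gb, 1), round(pq_gb, 1))  # 102.4 12.8
```

Halving the code size (64 bytes/item) would land near 6.4 GB; the 10-20 GB range in the text corresponds to code sizes around 100-200 bytes per item.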
✓Online: construct the user profile as a recency-weighted sum with 7 to 14 day exponential decay and interaction-type weights, retrieve from multiple indices (CBF top 500 to 5,000, CF top 1,000 to 10,000) in 5 to 30ms P95 each
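The decay and merge steps can be sketched as follows (the half-life and per-event weights are illustrative, not prescribed by the text):

```python
def interaction_weight(age_days, half_life_days=7.0, event_weight=1.0):
    """Recency weight with exponential decay (7-14 day half-life assumed),
    scaled by an interaction-type weight (e.g. purchase > click)."""
    return event_weight * 0.5 ** (age_days / half_life_days)

# A click today vs. a purchase one half-life ago (weights are illustrative)
w_click = interaction_weight(0.0, event_weight=1.0)    # 1.0
w_purchase = interaction_weight(7.0, event_weight=3.0) # 3.0 * 0.5 = 1.5

# Merge candidates from the CBF and CF indices, deduplicating
# while preserving retrieval order
cbf_hits = ["i3", "i7", "i1"]
cf_hits = ["i7", "i9"]
merged = list(dict.fromkeys(cbf_hits + cf_hits))
print(merged)  # ['i3', 'i7', 'i1', 'i9']
```

In practice the per-index candidate lists would be thousands of items long and fetched concurrently to stay within the per-index latency budget.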
✓Re-rank 200 to 1,000 merged candidates with a learned ranker using similarity scores, recency, popularity, and diversity features in 50 to 150ms P95, then apply post-rank constraints for policy, safety, and deduplication
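A toy version of the re-rank plus post-rank stage, with a linear model standing in for the learned ranker (feature names and weights are illustrative):

```python
def rank_score(cand, weights):
    """Linear stand-in for a learned ranker over the features listed above."""
    return sum(weights[k] * cand[k] for k in weights)

weights = {"sim": 0.6, "recency": 0.2, "popularity": 0.15, "diversity": 0.05}
cands = [
    {"id": "a", "sim": 0.9, "recency": 0.1, "popularity": 0.5, "diversity": 0.3},
    {"id": "b", "sim": 0.7, "recency": 0.9, "popularity": 0.8, "diversity": 0.6},
]

# Rank by model score, then apply a post-rank constraint:
# drop items the user has already seen (deduplication)
ranked = sorted(cands, key=lambda c: rank_score(c, weights), reverse=True)
already_seen = {"a"}
final = [c for c in ranked if c["id"] not in already_seen]
print([c["id"] for c in final])  # ['b']
```

Keeping constraints as a separate pass after scoring makes policy and safety filters auditable independently of the model.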
✓Weighted blending learns context-conditional weights via calibration models refreshed weekly: Score = w_cf × s_cf + w_cb × s_cb + w_pop × s_pop, with higher content weight for new items and higher CF weight for established ones
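The blending formula with age-conditional weights can be sketched directly (the thresholds and weight values are illustrative; in production they come from the weekly-refreshed calibration models):

```python
def blend(s_cf, s_cb, s_pop, item_age_days):
    """Score = w_cf*s_cf + w_cb*s_cb + w_pop*s_pop, with weights
    conditioned on item age as described above."""
    if item_age_days < 7:            # new item: lean on content signal
        w_cf, w_cb, w_pop = 0.2, 0.6, 0.2
    else:                            # established item: lean on CF signal
        w_cf, w_cb, w_pop = 0.6, 0.2, 0.2
    return w_cf * s_cf + w_cb * s_cb + w_pop * s_pop

new_item = blend(s_cf=0.1, s_cb=0.8, s_pop=0.5, item_age_days=2)   # 0.60
old_item = blend(s_cf=0.9, s_cb=0.3, s_pop=0.5, item_age_days=30)  # 0.70
```

Conditioning on age is what lets cold-start items surface on content similarity before they accumulate enough interactions for CF.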
✓Evaluation combines offline metrics (Recall@k and NDCG@k on cold-start slices) with online A/B tests (click-through, watch time, conversion) plus guardrails (diversity, latency P95 and P99, policy violations)
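The two offline metrics above have standard definitions; a minimal binary-relevance implementation:

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of relevant items recovered in the top k."""
    return len(set(ranked[:k]) & relevant) / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Discounted cumulative gain in the top k, normalized by the
    ideal ordering (binary relevance)."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2)
                for i in range(min(len(relevant), k)))
    return dcg / ideal

ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
print(recall_at_k(ranked, relevant, 3))  # 1.0 (both relevant items in top 3)
print(ndcg_at_k(ranked, relevant, 3))    # < 1.0 ('c' at rank 3, not rank 2)
```

Computing these per cold-start slice (new items, new users) rather than only in aggregate is what exposes whether the content signal is actually compensating for sparse CF data.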