What is Embedding Generation and Why It Matters
WHY EMBEDDINGS EXIST
Raw inputs are hard to compare. How similar are "cheap flights to Paris" and "affordable plane tickets to France"? String matching fails: the sentences share almost no words. Embeddings solve this: both sentences map to nearby vectors, and vector distance measures semantic similarity.
The key property: similar inputs produce similar vectors. If you train embeddings on search clicks, queries that lead to the same results will cluster together even with different words.
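To make "vector distance measures similarity" concrete, here is a minimal cosine-similarity sketch. The 4-dimensional vectors are hypothetical toy values (a real model like Sentence-BERT produces 384+ dimensions), chosen only to illustrate that a paraphrase pair scores higher than an unrelated pair.

```python
import math

# Toy 4-dim embeddings -- hypothetical values for illustration only.
cheap_flights_paris = [0.81, 0.52, 0.10, 0.05]
affordable_tickets_france = [0.78, 0.58, 0.14, 0.02]
chocolate_cake_recipe = [0.05, 0.11, 0.90, 0.40]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

paraphrase_sim = cosine_similarity(cheap_flights_paris, affordable_tickets_france)
unrelated_sim = cosine_similarity(cheap_flights_paris, chocolate_cake_recipe)
print(paraphrase_sim, unrelated_sim)  # paraphrase pair scores much higher
```

The same comparison works unchanged on real model outputs; only the vectors get longer.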
TYPES OF EMBEDDINGS
Text embeddings: Neural networks (BERT, Sentence-BERT) encode sentences into 384-768 dim vectors. Inference: 10-50ms on GPU.
Image embeddings: CNNs or Vision Transformers encode images into 512-2048 dim vectors. Used for visual similarity search.
Graph embeddings: Encode user-item interactions into vectors. Capture collaborative signals (users who click similar items have similar embeddings).
THE EMBEDDING PIPELINE
Training: collect pairs of similar items (co-clicks, co-purchases) and train the model to pull their embeddings close together. Inference: encode new items, store the vectors, and use approximate nearest neighbor (ANN) search to retrieve similar ones.
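The two stages can be sketched end to end. Training here is a bare-bones gradient step that pulls each pair's embeddings together (real systems use contrastive losses with negative examples), and retrieval uses exact brute-force search standing in for an ANN index such as FAISS or HNSW. All data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Training: pull embeddings of co-clicked item pairs together ---
n_items, dim = 6, 4
emb = rng.normal(size=(n_items, dim))   # randomly initialized embedding table
pairs = [(0, 1), (2, 3), (4, 5)]        # synthetic "clicked together" pairs

for _ in range(200):
    for i, j in pairs:
        diff = emb[i] - emb[j]          # gradient of squared distance
        emb[i] -= 0.1 * diff            # move each item toward its partner
        emb[j] += 0.1 * diff

emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit vectors for cosine sim

# --- Inference: find the nearest neighbors of a query item ---
def nearest(query_idx, k=1):
    sims = emb @ emb[query_idx]         # cosine similarity to every stored item
    sims[query_idx] = -np.inf           # exclude the query itself
    return np.argsort(sims)[::-1][:k]   # indices of the top-k most similar

print(nearest(0))  # item 1, its training partner
```

Brute force is O(n) per query; ANN indexes trade a little recall for sublinear search, which is what makes this pipeline work at millions of items.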