Chunking Strategies: Fixed vs Semantic
Fixed-Length Chunking
This approach splits documents every N tokens regardless of content structure. For example, a 5,000-token legal contract becomes exactly 10 chunks of 500 tokens each. The implementation is trivial: tokenize the document, group the tokens into fixed-size arrays, and optionally add 10 to 30 percent overlap between adjacent chunks. The advantages are speed and predictability. At an ingestion throughput of 100 million log entries per day, fixed chunking processes documents in microseconds with zero parsing complexity, and token budgeting is trivial because every chunk has an identical size. However, you frequently cut across semantic boundaries: a table might be split so that its headers land in one chunk and its data rows in another, or a legal definition might be separated from the clause that references it.
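A minimal sketch of the fixed-length approach, assuming the document has already been tokenized into a list (the tokenizer itself and the `chunk_size`/`overlap` defaults are illustrative, not prescribed by the text):

```python
def fixed_chunks(tokens, chunk_size=500, overlap=50):
    """Split a token list into fixed-size chunks with optional overlap.

    Consecutive chunks share `overlap` tokens, so the window advances by
    chunk_size - overlap tokens each step.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a chunk reaches the end of the document, so we don't
        # emit a trailing fragment that is pure overlap.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With no overlap, a 5,000-token document and a 500-token chunk size yield exactly the 10 uniform chunks described above; adding overlap increases the chunk count accordingly.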
Semantic Chunking
This approach respects document structure by splitting on natural boundaries: section headers, paragraph breaks, or embedding-based topic shifts. A design doc with 5 sections becomes 5 chunks of varying size (200 to 1,200 tokens). Some systems use a small language model to embed consecutive paragraphs and flag a topic shift when the embedding distance between them is large. Semantic chunking typically improves answer quality by 5 to 15 percent in evaluations because chunks are self-contained and coherent: a compliance policy chunk will include both the rule and its exceptions. However, variable chunk sizes complicate context budgeting: you might plan for 20 chunks but fit only 12 because several are unusually large. Ingestion is also 50x to 100x slower due to parsing overhead.
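A simple structure-based variant of semantic chunking can be sketched as follows: split on paragraph breaks, then greedily merge consecutive paragraphs under a token budget. This is a hedged illustration only; it uses blank-line paragraph boundaries and a crude whitespace token count rather than a real tokenizer or an embedding model, and `max_tokens` is an assumed parameter:

```python
def semantic_chunks(text, max_tokens=1200):
    """Chunk text on paragraph boundaries, merging paragraphs greedily.

    Paragraphs are delimited by blank lines. Consecutive paragraphs are
    packed into one chunk until adding the next paragraph would exceed
    max_tokens, so chunk sizes vary with document structure.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current, current_len = [], [], 0
    for para in paragraphs:
        n = len(para.split())  # whitespace token count (a simplification)
        if current and current_len + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Note how this exhibits exactly the budgeting problem described above: chunk sizes depend on paragraph lengths, so the number of chunks per document is not known in advance. An embedding-based version would replace the token-budget test with a distance check between adjacent paragraph embeddings.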