Choosing Erasure Coding Schemes: k, p, and Stripe Geometry
Understanding k and p Parameters
An erasure coding scheme is defined by k (data fragments) and p (parity fragments). A k+p scheme stores k+p total fragments and tolerates losing any p of them. Storage overhead is (k+p)/k. A 10+4 scheme has 14/10 = 1.4x overhead. A 6+3 scheme has 9/6 = 1.5x overhead.
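The overhead arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function name storage_overhead is hypothetical and not taken from any particular storage system.

```python
from fractions import Fraction


def storage_overhead(k: int, p: int) -> Fraction:
    """Raw bytes stored per byte of user data for a k+p scheme: (k+p)/k."""
    if k <= 0 or p < 0:
        raise ValueError("k must be positive and p must be non-negative")
    return Fraction(k + p, k)


# The two schemes discussed in the text.
for k, p in [(10, 4), (6, 3)]:
    overhead = storage_overhead(k, p)
    print(f"{k}+{p}: {k + p} fragments, tolerates {p} losses, "
          f"{float(overhead):.2f}x overhead")
```

Using Fraction keeps the ratio exact, so schemes can be compared without floating-point rounding.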
Higher k means more efficient storage but requires more nodes for a single stripe, because each of the k+p fragments should sit on a different server. If you only have 14 servers, a 10+4 scheme requires every server to participate in every stripe. A 4+2 scheme on the same cluster can place each stripe on a different 6-server subset, improving fault isolation.
Stripe Size Geometry
Each fragment has a size, and the stripe size equals fragment size times k. A 10+4 scheme with 1MB fragments has 10MB stripes of data (14MB on disk once the four parity fragments are included). Objects smaller than 10MB waste space or require padding.
Larger fragments improve encoding efficiency (less overhead per byte) but increase minimum object size and repair bandwidth. Smaller fragments reduce waste for small objects but add per-fragment metadata overhead. Common production configurations use 1-4MB fragments.
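To make the padding cost concrete, the sketch below computes how many stripes an object occupies and how much padding it needs. It assumes every stripe is padded to a full k fragments (some systems shrink the last stripe instead) and treats 1MB as 2^20 bytes. The name stripe_layout is hypothetical.

```python
MIB = 1024 * 1024  # assumption: "1MB" in the text is treated as 2**20 bytes


def stripe_layout(object_bytes: int, k: int, fragment_bytes: int) -> dict:
    """Stripes and padding for an object stored in full, padded stripes."""
    stripe_bytes = k * fragment_bytes  # data capacity of one stripe
    # Integer ceiling division; even an empty object occupies one stripe here.
    stripes = max(1, (object_bytes + stripe_bytes - 1) // stripe_bytes)
    padding = stripes * stripe_bytes - object_bytes
    return {
        "stripe_bytes": stripe_bytes,
        "stripes": stripes,
        "padding_bytes": padding,
    }


# A 3 MiB object in a 10+4 scheme with 1 MiB fragments.
small = stripe_layout(3 * MIB, k=10, fragment_bytes=MIB)
print(small["stripes"], "stripe(s),", small["padding_bytes"] // MIB, "MiB padding")
```

Under these assumptions the 3 MiB object fills only part of one 10 MiB stripe, so most of that stripe is padding. This is the small-object waste described above.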
Matching Scheme to Workload
Archive storage: Maximize k for efficiency. 16+4 gives 1.25x overhead. Objects are large (gigabytes to terabytes), accessed rarely, and latency-insensitive. The higher repair bandwidth of a large k is acceptable because reads are infrequent, so repair traffic rarely competes with foreground reads.
Hot object storage: Balance k and p. 10+4 (1.4x overhead) or 8+4 (1.5x overhead) provide good efficiency with reasonable repair bandwidth. Objects range from megabytes to gigabytes with frequent reads.
Small cluster: Reduce k to match available nodes. 4+2 needs 6 nodes and 6+2 needs 8. Both tolerate only two failures, and 4+2 costs 1.5x overhead, but once the cluster has more nodes than k+p they avoid requiring all nodes for every stripe.
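The node-count constraint can be sketched as a simple filter over the schemes mentioned above. The candidate list and the name schemes_for_cluster are illustrative assumptions. The ranking is by storage overhead only, so the fault tolerance p still has to be weighed separately.

```python
from math import comb

# Candidate schemes mentioned in the workload notes above, as (k, p) pairs.
CANDIDATES = [(16, 4), (10, 4), (8, 4), (6, 2), (4, 2)]


def schemes_for_cluster(nodes: int, candidates=CANDIDATES):
    """Schemes whose k+p fragments fit on distinct nodes, lowest overhead first."""
    fitting = [(k, p) for k, p in candidates if k + p <= nodes]
    return sorted(fitting, key=lambda kp: (kp[0] + kp[1]) / kp[0])


for nodes in (6, 8, 14):
    for k, p in schemes_for_cluster(nodes):
        # Number of distinct node subsets a single stripe could be placed on.
        subsets = comb(nodes, k + p)
        print(f"{nodes} nodes: {k}+{p}, {(k + p) / k:.2f}x overhead, "
              f"{subsets} possible node subsets per stripe")
```

A subset count of 1 means every stripe must use every node. Larger counts show how much room there is to spread stripes across different node groups.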
Durability Target Calculation
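Given p failures tolerated and per-disk failure probability f, and assuming independent failures, durability is roughly 1 - C(k+p, p+1) * f^(p+1), where C(k+p, p+1) counts the ways to choose which p+1 fragments are lost. For 10+4 with a 1% disk failure probability: losing 5 or more of 14 fragments has probability roughly C(14, 5) * (0.01)^5 = 2002 * 10^-10, or about 2 * 10^-7. That is between six and seven nines, not eleven. An eleven-nines figure assumes lower disk failure rates and independent failures.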
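The estimate can be reproduced numerically. The sketch below assumes independent disk failures with a single probability f and no repair during the period considered. It computes both the leading-term approximation C(k+p, p+1) * f^(p+1) and the full binomial tail. The function names are hypothetical.

```python
from math import comb, log10


def loss_leading_term(k: int, p: int, f: float) -> float:
    """Approximate loss probability: C(k+p, p+1) * f^(p+1)."""
    return comb(k + p, p + 1) * f ** (p + 1)


def loss_exact(k: int, p: int, f: float) -> float:
    """Probability that more than p of k+p independent fragments fail."""
    n = k + p
    return sum(comb(n, i) * f**i * (1 - f) ** (n - i)
               for i in range(p + 1, n + 1))


def nines(loss_probability: float) -> float:
    """Durability expressed as a count of nines, e.g. 1e-6 -> 6.0."""
    return -log10(loss_probability)


approx = loss_leading_term(10, 4, 0.01)
exact = loss_exact(10, 4, 0.01)
print(f"leading term: {approx:.3e} ({nines(approx):.1f} nines)")
print(f"binomial tail: {exact:.3e} ({nines(exact):.1f} nines)")
```

The leading term is a union bound over all sets of p+1 fragments, so it slightly overstates the loss probability compared with the binomial tail. Both land between six and seven nines for 10+4 at f = 0.01.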