Compression Test
Random data is incompressible. That's not a metaphor — it's a theorem. If the Powerball lottery is genuinely random, bit-packing every draw and running gzip should barely shrink it at all. If there's a pattern, gzip finds it.
[Chart: bytes after each step · real draws vs crypto reference]
How it works
The encoding. Each draw collapses to 40 bits: five 7-bit whites (since 69 fits in 7 bits) plus a 5-bit Powerball (since 26 fits in 5). This is compact but not optimal — the true information content is closer to 28.12 bits per draw (log₂(C(69,5)) + log₂(26)). The extra ~12 bits per draw are packing redundancy that an ideal compressor could, in principle, squeeze out.
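The 40-bit layout above can be sketched as follows. This is a minimal illustration, not the original pipeline's code; the function name, argument order, and MSB-first byte layout are assumptions.

```javascript
// Hypothetical sketch of the 40-bit packing: five 7-bit whites followed by
// a 5-bit Powerball, written MSB-first into 5 bytes (5 × 8 = 40 bits).
function packDraw(whites, powerball) {
  let acc = 0n;
  for (const w of whites) acc = (acc << 7n) | BigInt(w); // 5 × 7 = 35 bits
  acc = (acc << 5n) | BigInt(powerball);                 // + 5 = 40 bits
  const bytes = new Uint8Array(5);
  for (let i = 4; i >= 0; i--) {
    bytes[i] = Number(acc & 0xffn); // peel off one byte at a time
    acc >>= 8n;
  }
  return bytes;
}
```

The Powerball always lands in the low 5 bits of the final byte, which makes the packing easy to spot-check.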
The compressor. gzip at its maximum compression level (level 9). DEFLATE operates on a 32 KB sliding window with LZ77 plus Huffman coding — it finds repeated byte sequences and encodes common symbols with shorter codes. Powerless against true randomness, devastating against patterns.
The null-hypothesis reference. To calibrate the real result, we synthesize the same number of “draws” using the OS cryptographic RNG (crypto.getRandomValues), bit-pack them identically, and gzip. The real-draw ratio should match the reference ratio to within noise. If it's meaningfully smaller, the real draws have compressible structure the synthetic stream doesn't.
How to read it. A gzip ratio near 1.00 means “as incompressible as random.” The theoretical minimum floor tells you how many bytes an ideal compressor would need; gzip rarely hits the floor exactly because it has its own framing overhead (~18 bytes) plus inherent coding inefficiency. Match that floor within a few percent and you're looking at the null hypothesis.
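The floor itself is a two-line computation — log₂(C(69,5)) + log₂(26), summed in log space to avoid overflow:

```javascript
// log2 of the binomial coefficient C(n, k), accumulated in log space.
function log2Choose(n, k) {
  let s = 0;
  for (let i = 0; i < k; i++) s += Math.log2(n - i) - Math.log2(i + 1);
  return s;
}

// Information content of one draw: which 5-of-69 combination, times which
// of 26 Powerballs. Comes out to ≈ 28.12 bits, as quoted above.
const bitsPerDraw = log2Choose(69, 5) + Math.log2(26);
```

Multiply by the draw count, divide by 8, and round up to get the floor in bytes for the whole dataset.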