Size: 13602 bytes
138 bytes for decompression code
64 bytes for Huffman tables
13400 bytes of compressed data stream
Time: Under 500 ms to decompress; as fast as the current code to query
I was not sure if the Huffman tables should count as part of the decompressor or the data, so I listed them separately. They are themselves compressed & contain, for both children of each non-leaf tree node, either a single 1 bit if the child is not a leaf, or a 0 bit followed by bits 4-0 of the leaf's value.
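To illustrate the table encoding just described, here is a minimal Python sketch for a single tree. It is not a transcription of the eZ80 routine. The names `bit_reader`, `read_tree` & `decode_symbol` are made up for this example. The sketch assumes that non-leaf nodes are listed in breadth-first order, that a 0 bit in the data selects the first child, and that bytes are consumed most significant bit first (which is how I read the bit-fetch routine in the listing).

```python
from collections import deque

def bit_reader(data):
    """Yield the bits of `data` one at a time, most significant bit first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1

def read_tree(bits):
    """Read one tree; return a list of (first_child, second_child) pairs.

    Each child is ('node', index) or ('leaf', value).
    Assumption: non-leaf nodes appear in breadth-first order.
    """
    nodes = []
    pending = deque([0])   # non-leaf nodes whose children are still unread
    next_index = 1         # index the next non-leaf child will receive
    while pending:
        pending.popleft()
        children = []
        for _ in range(2):
            if next(bits) == 1:
                # A single 1 bit: the child is another non-leaf node.
                children.append(('node', next_index))
                pending.append(next_index)
                next_index += 1
            else:
                # A 0 bit followed by bits 4-0 of the leaf's value.
                value = 0
                for _ in range(5):
                    value = (value << 1) | next(bits)
                children.append(('leaf', value))
        nodes.append(tuple(children))
    return nodes

def decode_symbol(nodes, bits):
    """Walk from the root, one data bit per step, until a leaf is reached."""
    index = 0
    while True:
        kind, value = nodes[index][next(bits)]
        if kind == 'leaf':
            return value
        index = value
```

For example, the 19 bits `000001 1 000010 000011` describe a root whose children are leaf 1 & a node holding leaves 2 & 3.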
The compressed data stream contains 26 level-1 blocks.
Each level-1 block contains 26 level-2 blocks unconditionally, since every letter can start a word.
Each level-2 block contains 26 level-3 blocks unconditionally, because this was cheaper.
Each level-3 block contains one or more Huffman-encoded differences, each of which is followed by a level-4 block unless the running total (which starts at 0x40) exceeds 0x5A.
Each level-4 block contains one or more Huffman-encoded differences, each of which is followed by a level-5 block unless the running total (which starts at 0x40) exceeds 0x5A.
Each level-5 block contains one or more Huffman-encoded letters, each of which is followed by a continuation bit (1 for non-final letters & 0 for the final letter).
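The block structure above can be sketched in Python with the Huffman layer stripped away, so that the stream is just a list of already-decoded integers. This is only an illustration under stated assumptions: `decode_tokens` & `encode_tokens` are hypothetical names, words are taken to be five uppercase letters (one per level), and the encoder's choice of terminating difference (the smallest one that pushes the total past 0x5A) is my own guess, not necessarily what the real encoder does.

```python
def decode_tokens(tokens):
    """Rebuild the word list from differences, letters & continuation bits."""
    it = iter(tokens)
    words = []
    for a in range(0x41, 0x5B):              # 26 level-1 blocks
        for b in range(0x41, 0x5B):          # 26 level-2 blocks in each
            c = 0x40                         # level-3 running total
            while True:
                c += next(it)
                if c > 0x5A:
                    break                    # total passed 'Z': block ends
                d = 0x40                     # level-4 running total
                while True:
                    d += next(it)
                    if d > 0x5A:
                        break
                    while True:              # level-5 letters
                        e = 0x40 | next(it)  # leaf value holds bits 4-0
                        words.append(bytes([a, b, c, d, e]).decode("ascii"))
                        if next(it) == 0:    # continuation bit 0: final letter
                            break
    return words

def encode_tokens(words):
    """Hypothetical inverse, used only to exercise decode_tokens."""
    tree = {}
    for w in sorted(set(words)):
        a, b, c, d, e = w.encode("ascii")
        tree.setdefault((a, b), {}).setdefault(c, {}).setdefault(d, []).append(e)
    tokens = []
    for a in range(0x41, 0x5B):
        for b in range(0x41, 0x5B):
            c_prev = 0x40
            for c, fourths in sorted(tree.get((a, b), {}).items()):
                tokens.append(c - c_prev)
                c_prev = c
                d_prev = 0x40
                for d, lasts in sorted(fourths.items()):
                    tokens.append(d - d_prev)
                    d_prev = d
                    for i, e in enumerate(lasts):
                        tokens.append(e & 0x1F)
                        tokens.append(1 if i < len(lasts) - 1 else 0)
                tokens.append(0x5B - d_prev)  # assumed terminator choice
            tokens.append(0x5B - c_prev)      # assumed terminator choice
    return tokens
```

With this terminator choice every token fits in 5 bits, & a level-2 block with no words costs a single difference of 27.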
Executing the decompressor 20 times takes between 6 & 7 seconds, so a single decompression takes between 0.3 & 0.35 seconds, which is under 500 ms. Checking whether a word is in the decompressed list can be done as in the current program, because the output is the plain list of words (with no null terminators).
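As a rough illustration of querying output in this format, here is a Python sketch of a lookup over unterminated fixed-width records. The current program's lookup is not reproduced here; `contains` is a hypothetical helper, & the five-byte uppercase record format is inferred from the five block levels & the 0x41-0x5A letter range.

```python
def contains(words_blob, word, width=5):
    """Return True if `word` is one of the fixed-width records in `words_blob`.

    Assumption: records are `width` uppercase ASCII bytes with no separators.
    """
    target = word.upper().encode("ascii")
    if len(target) != width:
        return False
    # Step by the record width so matches straddling two words are ignored.
    return any(words_blob[i:i + width] == target
               for i in range(0, len(words_blob), width))
```

Stepping by the record width matters: a plain substring search over the blob could match the tail of one word joined to the head of the next.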
Concatenate the following code block with the subsequent paste to get the full test program. It was written in eZ80 machine language.
# Entry point for decompressor (D1A881)
# The DE load needs 147 bytes at a 256-byte-aligned address
F3 0E01 110032D0 210BA9D1 D9 01819300
CDF1A8D1 79 380A 3E08
CDF1A8D1 17 30F9 0D
0C D9 12 13 D9 10E8
# This HL load sets the output location
AF CDFDA8D1 82 FE5B 3029 57 1E40
3E15 CDFDA8D1 83 FE5B 30E8 5F
3E2F CDFDA8D1 F640
70 23 71 23 72 23 73 23 77 23
CDF1A8D1 38E8 18DA
3E5B 0C B9 20C5 04 B8 20BF C9
# Fetch a bit from the compressed stream (D1A8F1)
D9 0D 2004 46 23 0E08 CB10 D9 C9
# Decode a Huffman-encoded letter (D1A8FD)
CDF1A8D1 8F D9 5F 1A D9 CB7F 20F3 C9
# Huffman table data (D1A90B)
Compressed data stream (D1A94B): https://pastebin.com/g7xYJPvT
EDIT: Improved data encoding (14536→13622)
EDIT: Improved Huffman table encoding & made slightly faster (13622→13602)