Update
I endeavored to add a timing consistency test to the array of unit/module tests for lwIP. The test is under
tests/tls_timing.
https://github.com/cagscalclabs/lwip-ce/actions
It checks all primitives and modules that must run in constant time. These are:
random_bytes, secure_compare_bytes, SHA-256 (& HMAC), AES-CBC, AES-GCM, HKDF, PBKDF2, and X25519. RSA is currently omitted while I figure out what test vectors it would need.
This test takes ~30 minutes to run via the cemu-autotester and would undoubtedly take hours to run on a calculator. That being said, I encourage people to run it on their calculators if they can spare the time. I want to see:
1. If it fails a specific test category for anyone
2. What the averages are for specific tests
For #2, build the test with the -DTLS_TIMING_VERBOSE compiler flag. That will make it print stability and differential data, which I can review to confirm whether our thresholds (particularly for differentials) are sane. If you run it that way, please post the verbose output to this thread, along with your hardware revision and OS version.
What do the two tests mean?
We test timing in two categories: stability, which is meant to capture jitter, and differential, which is meant to capture differences in timing based on a class of input.
For the stability test: Identical inputs are executed repeatedly. The primitive passes if the standard deviation of CPU cycle counts remains below 1% of the mean runtime. Any primitive exceeding this threshold fails.
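To make the pass condition concrete, here is a minimal host-side sketch of the 1% rule. It is not the code in tests/tls_timing: the function name stability_passes, the use of the population standard deviation, and the double-precision math are all my assumptions for illustration. It compares variances instead of taking a square root, which is equivalent for non-negative values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the actual test code): returns true when the
 * population standard deviation of the cycle counts is below 1% of their
 * mean. An empty sample set is treated as a failure. */
bool stability_passes(const uint32_t *cycles, size_t n)
{
    if (n == 0) {
        return false;
    }

    /* Mean cycle count across all runs of the identical input. */
    double mean = 0.0;
    for (size_t i = 0; i < n; i++) {
        mean += (double)cycles[i];
    }
    mean /= (double)n;

    /* Population variance of the cycle counts. */
    double var = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)cycles[i] - mean;
        var += d * d;
    }
    var /= (double)n;

    /* stddev < 0.01 * mean  is the same as  var < (0.01 * mean)^2,
     * so no sqrt() call is needed. */
    double limit = 0.01 * mean;
    return var < limit * limit;
}
```

With samples of 1000, 1005, 995, and 1000 cycles, the variance is 12.5 against a limit of 100, so the sketch reports a pass. Swapping in 1100 and 900 for the middle two values pushes the variance to 5000 and it reports a failure.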
For the differential test: Inputs are grouped into the following structural classes: best-case, worst-case, random, edge-case. Each class is executed multiple times. If a single class exhibits a statistically consistent runtime deviation from the others (>=75% of runs trending in the same direction beyond the configured cycle thresholds), the primitive fails. The current thresholds are printed in the job summary for the test on GitHub.
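A rough sketch of that decision rule, under stated assumptions: the name class_deviates, the idea of comparing each run against a single baseline value (for example, the mean of the other classes), and checking both the slower and faster directions are illustrative choices of mine, not necessarily how the real test is written. Only the 75% figure and the notion of a cycle threshold come from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the actual test code): flags an input class when
 * at least 75% of its runs land more than `threshold` cycles away from
 * `baseline`, all on the same side. */
bool class_deviates(const uint32_t *runs, size_t n,
                    uint32_t baseline, uint32_t threshold)
{
    size_t above = 0; /* runs slower than baseline + threshold */
    size_t below = 0; /* runs faster than baseline - threshold */

    for (size_t i = 0; i < n; i++) {
        /* Signed 64-bit difference avoids unsigned wraparound. */
        int64_t delta = (int64_t)runs[i] - (int64_t)baseline;
        if (delta > (int64_t)threshold) {
            above++;
        } else if (delta < -(int64_t)threshold) {
            below++;
        }
    }

    /* count / n >= 3/4  is the same as  4 * count >= 3 * n,
     * which keeps the check in integer arithmetic. */
    return n > 0 && (4 * above >= 3 * n || 4 * below >= 3 * n);
}
```

For a baseline of 1000 cycles and a threshold of 50, runs of 1100, 1100, 1100, and 1000 are flagged (3 of 4 are slower), while runs of 1100, 900, 1100, and 900 are not, since the deviations split evenly between directions and look like jitter instead of a consistent bias.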