Whoa, those VRAM findings are definitely interesting.
Just to get one thing out of the way: during your testing, you always used the same base address, right? RAM at 0xA0000000 ... 0xA1FFFFFF is not cached, while 0x80000000 ... 0x81FFFFFF is cached. You can access the VRAM through either 0xA0000000 or 0x80000000, but obviously only the latter will be cached. If you switch between the two at random, of course the results won't be consistent, but I'm sure you took care of that.
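To make the aliasing concrete, here is a minimal sketch of switching between the cached and uncached views of the same memory. It assumes the usual SH-4 fixed mappings (P1 cached, P2 uncached, physical address in the low 29 bits); the helper names are made up for illustration and whether the cache is actually enabled for P1 depends on how the OS set things up.

```c
#include <stdint.h>

/* Assumption: standard SH-4 fixed mappings.
   P1 (0x80000000..0x9FFFFFFF) goes through the cache,
   P2 (0xA0000000..0xBFFFFFFF) bypasses it.
   Both alias the same physical address, held in the low 29 bits. */
#define PHYS_MASK 0x1FFFFFFFu
#define AREA_MASK 0xE0000000u
#define P1_BASE   0x80000000u
#define P2_BASE   0xA0000000u

/* Hypothetical helpers: return the cached / uncached alias of an address. */
static uint32_t to_cached(uint32_t addr)
{
    return (addr & PHYS_MASK) | P1_BASE;
}

static uint32_t to_uncached(uint32_t addr)
{
    return (addr & PHYS_MASK) | P2_BASE;
}

/* Nonzero if the address lies in the cached (P1) window. */
static int is_cached_alias(uint32_t addr)
{
    return (addr & AREA_MASK) == P1_BASE;
}
```

For benchmarking, the idea would be to pass every VRAM pointer through one of these helpers once and stick with that choice for the whole run, so that both timing passes hit the same alias.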
AHelper looked into the MMU, but his calc died in the process. DMA only working from lower RAM addresses could be explained by the DMA engine having fewer address lines, or something like that.
The bootloader performs some unexplained writes to certain addresses, which may or may not have to do with this; who knows.
The slowness could be due to a different cache policy, or it could be because while the add-in stack is virtualized, the VRAM is not (but then I would expect the VRAM writes to be faster due to not having to be address-translated, not the other way around...).
I think that for overclocking you'd need to dive into the Ptune2 source code, yes. Nobody ever got around to documenting that nicely. This page:
http://prizm.cemetech.net/index.php/CPU_Clocks documents the various clocks (at least some of them?) but doesn't describe how to change them, and the page it links to, Clock Pulse Generator, is incomplete.
Ideally you'd also save the previous clocks and restore them on exit (so that if the calculator was already overclocked, or underclocked, the previous state is restored).
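The save-and-restore idea could be sketched roughly as follows. This is only the pattern, not working overclocking code: the `saved_reg_t` type and both functions are hypothetical, the register is assumed to be 32 bits wide, and real clock changes almost certainly need extra steps (see the Ptune2 source for what the hardware actually expects).

```c
#include <stdint.h>

/* Hypothetical snapshot of one clock-related register.
   Which registers matter on the Prizm, and their widths,
   are assumptions; take the real ones from the Ptune2 source. */
typedef struct {
    volatile uint32_t *reg;   /* where the register lives */
    uint32_t           saved; /* value found when the add-in started */
} saved_reg_t;

/* Remember the current value before touching the register. */
static void clock_save(saved_reg_t *s, volatile uint32_t *reg)
{
    s->reg   = reg;
    s->saved = *reg;
}

/* Write back whatever was there on entry, whether that was the
   stock setting or a previous overclock/underclock. */
static void clock_restore(const saved_reg_t *s)
{
    *s->reg = s->saved;
}
```

The add-in would call `clock_save` once per register before changing anything, and `clock_restore` on every exit path, so a calculator that was already overclocked or underclocked ends up back in that same state.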
About the TMU, the OS uses it in at least one syscall:
http://prizm.cemetech.net/index.php/OS_InnerWait_ms
I'd be careful about assuming the OS never uses something. Here are some of the things people sometimes forget to take into account, and which can't be tested in the emulator:
- USB connection (including the mere act of plugging in a USB cable while the add-in is executing, plus everything that can happen from there, including entering mass storage mode and the other less used modes). The popup might appear only when GetKey is called, but I'm not sure.
- 3-pin connection (I believe that the Prizm can be configured to automatically accept file transfers and the like, which also cuts off add-in execution rather haphazardly). Again, maybe this is only when GetKey is used.
- Going into standby (keep in mind that the auto-power-off timer exists; it might run only when GetKey is called, but better safe than sorry). This is interesting because it involves writing a lot of data to flash, and you definitely don't want to mess with the stack of those routines while they run, or with the RS memory in general; that's how I bricked my first Prizm main board. Also, the X, Y and IL memory contents are lost, and RS is used to hold the code for resuming from standby.
Lots of things in the OS appear to revolve around GetKey (
http://prizm.cemetech.net/index.php/GetKey ), which means you can more or less get the OS out of your way if you don't call it. But other syscalls, timers and interrupts are still a thing. It's kind of hard to be sure one has full control over the program execution at any point.
Sigh, things could be so much easier and more exciting if the bootloader were in a mask ROM or at least in a write-protected sector. We would be able to write and ship less-than-bug-free code, and even if it somehow ended up damaging the OS, it would always be possible to recover...