Several years ago I enabled CE C programming via your browser in SourceCoder, and somewhat earlier than that enabled Z80 assembly via Spasm running on your device. This is all great for most users, but I'm generally displeased with needing to run the C compiler for CE programming on our server, since that means more infrastructure to maintain. I've long wished for a way to run the compiler directly in users' browsers instead.

Recently, Whitequark's "clang as a 25MB pure function" got me thinking about running a CE C compiler in the browser again, since the C compiler we use to target the CE these days is clang. I then realized the greater annoyance for the CE C tooling: although clang generates assembly from C code, fasmg is still used to assemble and link programs. So even given a version of clang that can run in your browser, you still need an x86 machine to build a program, because fasmg only runs on x86 (it's implemented in x86 assembly).

Given the limitations of fasmg and a complete lack of interest in attempting to reimplement fasmg in a portable way, my thoughts turned to emulation as a solution. Many desktop users these days will already be using fasmg via emulation (that is, Mac users with ARM CPUs running the x86 binary via "Rosetta") so it's hardly new ground, but performance may be somewhat lacking. I further recalled learning of v86, a full x86 PC emulator in the browser, some time ago and figured it would make a decent base for CE C tooling entirely in the browser.

Leveraging my previous experience using Buildroot to create Linux-based computing appliances, experience building LLVM for various needs and the previously-noted compiler server that currently sits behind SourceCoder, I present a proof-of-concept CE C compiler that runs entirely in your browser:

https://tari.github.io/csdk-appliance/


The way this works is that I've built a minimal 32-bit x86 Linux system that contains ez80-clang and the other LLVM tools used by the CE tooling, the CE-specific tools such as convbin and cedev-config, and fasmg. This is all packed into a read-only squashfs root filesystem to minimize the size of the system image, which currently weighs in at about 60MB (2MB for the kernel, and 58MB for the squashfs). Once the VM image has been downloaded, I've measured that it boots up and is ready to build things in about 6 seconds, which seems good. Actual build performance is less good but probably tolerable: the "Hello World" sample compiles in about 11 seconds on my machine.

To ask for builds to run and get output, a fairly simple Python application communicates with the web page over an emulated serial connection. It handles requests from the web page to start a build in a chosen directory, and streams the output back (so you can see error messages and the like). Source code and other files are exchanged through 9pfs over virtio, so files live entirely in the web page and the virtual machine reads and writes them via the 9P protocol.
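As a rough sketch of what that guest-side agent could look like (the `BUILD` request format, the serial device path, and the use of `make` here are all illustrative assumptions, not the actual protocol):

```python
import subprocess

def handle_requests(rx, tx, build_cmd=("make",)):
    """Read newline-delimited requests from rx; stream build output to tx.

    A request of the form "BUILD <dir>" runs the build command in that
    directory, streams its combined stdout/stderr back as it appears,
    and finishes with an "EXIT <code>" line.
    """
    for line in rx:
        request = line.decode().strip()
        if not request.startswith("BUILD "):
            tx.write(b"ERR unknown request\n")
            continue
        build_dir = request[len("BUILD "):]
        proc = subprocess.Popen(
            list(build_cmd), cwd=build_dir,
            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        )
        for out in proc.stdout:
            tx.write(out)  # stream output line by line, not all at once
        proc.wait()
        tx.write(f"EXIT {proc.returncode}\n".encode())

if __name__ == "__main__":
    # On the appliance this would open the emulated serial port instead.
    with open("/dev/ttyS0", "rb", buffering=0) as rx, \
         open("/dev/ttyS0", "wb", buffering=0) as tx:
        handle_requests(rx, tx)
```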

This isn't totally useful yet since I haven't implemented a way to download the resulting program binary when you run a build, but that would be relatively straightforward. I'm more interested in integrating it into SourceCoder so our compiler server can be decommissioned, but haven't prioritized any work on that and expect it to be somewhat annoying to do because the tooling to build our client-side applications here on Cemetech is quite dated (much of it predates Webpack!).

It would also be nice to see if the performance can be improved, but I suspect most of the slow build time is just because emulating x86 in webassembly is never going to be very fast (without heroic optimizations, anyway). It might be that giving the VM more RAM or structuring the filesystem image differently (maybe using EROFS rather than squashfs?) could help save on time spent decompressing data on disk, though a quick experiment I've already done didn't seem promising. Another very dumb option would be to do distributed compilation, since v86 doesn't support multiprocessor VMs but I could boot multiple VMs in separate web workers and distribute parts of the build between them. I rather suspect most speed improvements would need to trade off against either download size or memory requirements (the VM currently gets 128MB of RAM, and more than that feels rather large), however.
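The distribution idea is just classic build parallelism: compile translation units independently, then link once at the end. A minimal sketch of the scheduling, with local threads standing in for per-web-worker VMs and a dummy compile step in place of a real VM round-trip:

```python
from concurrent.futures import ThreadPoolExecutor

def compile_unit(source):
    # Stand-in for shipping one translation unit to a VM and getting an
    # object file back; here it just pretends to produce one.
    return source.replace(".c", ".obj")

def build(sources, n_workers=4):
    # Fan the translation units out across workers (each worker would be
    # its own v86 VM), then do the single link step once all results are in.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        objects = list(pool.map(compile_unit, sources))
    return objects  # a real build would now hand these to the linker
```

The link step is the serialization point, so this only pays off when a project has enough independent translation units to keep the workers busy.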
This is amazing work. On my machine/browser, the same hello world program takes 7 seconds to compile.

Once there is a button to download the binary, and once all of this is packaged into a single .html file (base64-encoded squashfs inline with the js?, resulting in around 80MB total), this could be a really nice stand-alone application: just open the .html file in your browser, copy in your source code, and compile.
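The inlining step could look something like this sketch (the template string and placeholder name are made up; note base64 inflates the payload by about 4/3, which is roughly where the 80MB figure comes from):

```python
import base64

# Hypothetical template; the real page's JS would decode the string back
# into bytes and hand it to v86 as the disk image.
TEMPLATE = '<script>const IMAGE = atob("{IMAGE_B64}");</script>'

def embed_image(image_bytes, template):
    """Return HTML with the disk image inlined as a base64 string."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return template.replace("{IMAGE_B64}", b64)
```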
(Sorry if this is a little off topic)
If we are talking about upgrading source coder then consider these:

  • Adding app signing to the monochrome calcs' z80 editor
  • Adding Axe Tokens to the program editor
  • (sorry if this is a bit much but) adding a disassembly exporter
Tari wrote:
It might be that giving the VM more RAM or structuring the filesystem image differently (maybe using EROFS rather than squashfs?) could help save on time spent decompressing data on disk, though a quick experiment I've already done didn't seem promising.


Unless your VM has a shortage of RAM (and it is thrashing as a result; see footnote), it is unlikely that the speed of data decompression is the cause of slowness. On a modern processor, 60 Mbytes can be decompressed in about 0.2-0.5 seconds including overhead.

Out of the box Mksquashfs defaults favour compression over read speed and EROFS defaults favour read speed over compression, but the difference is insignificant because the overall cost of decompression here is less than 5% of the time.

FYI Mksquashfs by default uses the gzip algorithm, whose kernel decompressor runs at about 300 Mbytes/sec on a modern processor. You can use the lz4 algorithm, which doesn't compress as well but decompresses at about 1000 Mbytes/sec. Lastly, the xz algorithm compresses well, but it only decompresses at about 100 Mbytes/sec. The best overall algorithm is zstd, which compresses well (worse than xz but better than the others) and decompresses well (worse than lz4/lzo but better than the others) at about 950 Mbytes/sec.
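The ratio-versus-speed tradeoff is easy to demonstrate even with just Python's standard library (zstd and lz4 aren't in the stdlib, so this only compares the gzip and xz algorithms, and the absolute numbers are nothing like the kernel-side figures above):

```python
import lzma
import time
import zlib

# Repetitive, compressible test data standing in for a root filesystem.
data = b"a typical compressible payload of repeated text " * 20000

for name, comp, decomp in [
    ("zlib (gzip algorithm)", zlib.compress, zlib.decompress),
    ("lzma (xz algorithm)  ", lzma.compress, lzma.decompress),
]:
    packed = comp(data)
    start = time.perf_counter()
    out = decomp(packed)
    elapsed = time.perf_counter() - start
    assert out == data  # sanity-check the round trip
    print(f"{name}: ratio {len(data) / len(packed):.0f}x, "
          f"decompressed in {elapsed * 1000:.2f} ms")
```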

To use zstd compression:

% mksquashfs directory image -comp zstd

[Footnote] In general, the amount of RAM the VM has should be enough to fit all of the program executable and data sections into memory (what is called the working set). If it does, the programs will run without constantly demand-loading pages from disk. If the amount of RAM is too small to contain the working set, then pages will be constantly dropped from the cache and reloaded again (and again). If this happens excessively, the system will thrash, doing nothing but constantly reloading pages from disk.
Also, would it not be less work and faster to just rewrite fasmg, or just port spasm-ng to JavaScript?
phil_lougher wrote:
On a modern processor 60Mbytes can be decompressed in about 0.2 - 0.5 seconds including overhead.

...

To use zstd compression:

I'm already using zstd, but haven't spent any effort to do real performance measurements. Since this is sitting behind two layers of interpreters/emulation (v86 emulation and wasm) I expect decompression to be noticeably slower than native code, but zstd is probably still the best option.

My theory with testing EROFS was that the overhead of calling into the hypervisor might be significant so allowing it to do larger block reads might help, but that resulted in a significantly larger image with no noticeable performance effect.

I also tried doubling the RAM, which didn't seem to improve anything either. That's not too surprising: even though the full process image of clang is probably around 50MB (based on the combined size of libclang and libLLVM), I suspect much of the library code isn't actually used so never gets loaded anyway.

ti_kid wrote:
Also would it not be less work and faster to just rewrite fasmg or just port spasm-ng to javascript
No. Writing an assembler would be a lot more work than the couple weekends this took, and then I'd be on the hook to maintain it. Much easier to emulate software that somebody else maintains. spasm-ng also isn't relevant, since the CE tooling uses fasmg as a linker which isn't possible with spasm.
Wait, I'm not sure if this would be faster, but why not just emulate DOS and the 8086 instruction set instead of all of x86?
  