As some of you know, I've been working on a linker for the eZ80 model calcs, whose source code you can find on GitHub. Right now, the linker only supports parsing Zilog's object files; it can't actually link yet. However, since I can fully parse object files, it's time to write the actual linking code. There's also been a request to port the linker to C. Since Zilog's own compiler is Windows-only, I don't see the advantage, but it's something I'm considering.

There are three main goals here: First, to allow LibLoad libraries to be written in C. Second, to allow the production of relocatable programs. This would allow programs to be relocated into flash for execution. It would also allow writing programs larger than 64 K; the on-calc loader would resolve the references between subprograms automatically (think Lua). Finally, to allow the production of apps. (I still haven't sent that email. There will be no patches until TI gets a chance to respond.)

EDIT: Apparently, Mateo thought I was re-doing the calculator-side stuff. Not at all; the current calculator-side stuff is fine. But the current system of building libraries has some convoluted hacks in it, and a hack is also used to allow C and assembly programs to consume LibLoad libraries. The first goal of the new linker is to simplify the linking process and provide more flexibility by having the linker natively support our formats. With a linker that natively supports our formats, we can also consider implementing shells (maybe DCSE, Kerm :p) that support relocatable programs and programs larger than 64 K.

As I understand it, the linking process is conceptually simple: Load the object files, collate the sections from individual object files, order the sections, fix their final addresses, resolve intramodule relocations, resolve intermodule relocations, and package the data into the proper output format. If the desired output itself should be relocatable, then the resolved references will be relative.
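Roughly, I imagine the top level of the linker looking something like the sketch below. Every type and method name in it is a placeholder I made up for illustration; none of this exists in the repo yet.
Code:
// Placeholder sketch of the linking pipeline; none of these names exist yet.
var modules = inputPaths.Select(ObjectModule.Load).ToList(); // load the object files
var sections = CollateSections(modules);             // merge same-named sections across modules
OrderAndAssignAddresses(sections);                   // order sections and fix final addresses/offsets
foreach (var module in modules)
    ResolveIntramoduleRelocations(module, sections); // references within one module
ResolveIntermoduleRelocations(modules, sections);    // references between modules
outputFormat.Write(sections, outputPath);            // package into 8xp/app/library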

The IEEE-695 OMF format Zilog uses is very flexible. In fact, it's far more flexible than is useful to us. Mercifully, although the OMF format allows data records to occur out-of-order, it appears that Zilog doesn't do that. If it did, I'd have to build a database of data records before I could collate them. But cheating and just assuming records come in order is much easier, so I plan on cheating.

I'm going to have to study the LibLoad format in more detail so I can generate it. I also don't feel like figuring out the details of Zilog's .LIB format, because it looks like the .LIB generator has its own special hacked-together code for generating OMF files, and it's ugly. So I'd like to just parse the object files from a library, link them myself (and skip that awkward Spasm step), and generate my own dummy format for use with my linker so it can add the correct LibLoad information to programs using LibLoad. Essentially, generating a library would be exactly like generating a regular program, except that in addition to a .8xv LibLoad appvar, you also get a library file for use with my linker.

I'm told that jacobly knows how the current library building system works. Mateo seemed to indicate that the .LIB file was critical to programs using LibLoad (something about generating the right jump table for the program using the library), but since I'm writing my own linker, I don't see why it wouldn't be desirable to use my own format. Maybe I don't understand things well enough.

Comments and guidance would be helpful.
I've started a sort of v2 of the OMF parser that's simplified to deal with Zilog's stuff specifically. Additionally, I've abstracted OMF out into a generic object module class, to allow support for input formats from other tool chains.
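To give an idea of what that abstraction looks like, here's a rough sketch. The member names are illustrative only, not the actual class:
Code:
// Hypothetical shape of the generic object module class; the real member
// names differ. An OMF-specific parser populates one of these, and other
// tool chains' formats could get their own parsers later.
abstract class ObjectModule
{
    public string Name;                     // module name from the object file
    public List<Section> Sections;          // code/data sections and their contents
    public List<Symbol> PublicSymbols;      // symbols this module exports
    public List<Symbol> ExternalSymbols;    // symbols this module imports
    public List<Relocation> Relocations;    // patch points to fix up at link time
}

class OmfObjectModule : ObjectModule
{
    // Parses one of Zilog's IEEE-695 OMF files into the generic representation.
}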

Zilog's compiler's use of relocation expressions has some idiosyncrasies. Most of the time, it outputs a 24-bit address as three 8-bit expressions, which produce the three bytes one-by-one using a shift-and-mask pattern. However, it also sometimes produces a single 24-bit expression. Strangely, Zilog specifies the file as being in big-endian mode, which means that the 24-bit expression also has its endianness reversed!
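To make the shift-and-mask pattern concrete, here's roughly what the three 8-bit expressions evaluate to for a made-up address (this is my paraphrase in C#, not Zilog's actual expression encoding):
Code:
// Shift-and-mask pattern: a 24-bit address emitted one byte at a time.
int address = 0xD1A881;                       // made-up final address of some symbol
byte low  = (byte)( address        & 0xFF);   // 0x81: first  8-bit expression
byte mid  = (byte)((address >>  8) & 0xFF);   // 0xA8: second 8-bit expression
byte high = (byte)((address >> 16) & 0xFF);   // 0xD1: third  8-bit expression

// The occasional single 24-bit expression is written in the file's declared
// big-endian byte order, so (as I read it) the linker has to swap the bytes
// back before dropping the value into the little-endian eZ80 image:
int swapped = (low << 16) | (mid << 8) | high; // 0x81A8D1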

Also, I've found that Zilog rarely, but nevertheless sometimes, uses more than just basic addition in its relocation expressions, which means that the object file class actually has to support real relocation expressions. Unfortunately, this also means that any relocatable output formats the linker supports will need their own special code to unobfuscate Zilog's expressions. I'll work on seeing how I can make that less awkward, though.
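For a rough idea of what "real relocation expressions" means for the object model, something like the following hypothetical expression tree would do; the names are mine, not Zilog's or the linker's:
Code:
// Hypothetical expression-tree representation for relocations; these names
// are mine. A plain "symbol + offset" fixup is just a small tree, but the
// rarer expressions need shifts, masks, and so on.
abstract class RelocExpression { }

class SymbolRef   : RelocExpression { public string SymbolName; }
class SectionBase : RelocExpression { public string SectionName; }
class Constant    : RelocExpression { public int Value; }
class BinaryOp    : RelocExpression
{
    public string Operator;               // "+", "-", ">>", "&", ...
    public RelocExpression Left, Right;
}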

Currently, the linker can load and parse an OMF file, but it doesn't yet support relocations. Once I'm happy with how relocations are supported, I'll have to start on the actual linking phase, the first task of which is computing final addresses/offsets for the sections collated from different object files. And, of course, I'll have to figure out how the parser gets the information on where to place sections.
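The address-assignment step itself should be simple enough; conceptually it's just sequential placement, something like this sketch (placeholder types, and the ordering policy has to come from the command line):
Code:
// Sketch of sequential section placement (placeholder types). Each collated
// section starts where the previous one ended; for a relocatable output,
// these would be offsets rather than absolute addresses.
int address = baseAddress;                 // e.g. userMem for a static 8xp
foreach (var section in orderedSections)
{
    section.BaseAddress = address;
    address += section.Length;             // the next section follows immediately
}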
It turns out that Zilog's .lib format isn't completely, inexcusably awful. A .lib file contains a bunch of OMF subfiles, which are stored verbatim. The .lib file header is a wrapper that looks like an OMF file header, but is actually just a C-style struct with some funny padding bytes every now and then. Because the header is really a struct with only 100 slots, a .lib file is limited to 100 input object files. But the struct format is still an awful thing to put in an OMF file.

The .NET platform's library of basic data structures is decent enough to make the actual linking logic fairly easy to write. It's probably algorithmically slow, but I suspect we won't ever have object files large enough for that to be a problem. With that settled, I've found that before I can continue work on the linking logic, I have to process input arguments.

The current first goal for output support is writing LibLoad libraries in C. Currently, this can only be done in assembly, because the programmer has to tag every relocation manually. Additionally, the assembly has to be done with Spasm, because it does some evil macro stuff to generate the relocations list at assembly time.

To simplify the process of creating LibLoad libraries, I've come up with the idea of a LibLoad Stub file, which just associates the external names of library functions with their internal function names and their external function numbers. The stub file is like a header file, basically. It essentially performs the same function as the function() macro used in the Spasm build process, except that no macros are needed, because the linker generates the function table data based on the stub file. Moreover, the linker extracts relocation information from Zilog's OMF binaries instead of needing relocations to be tagged manually. Thus, the linker supports both C and assembly equally well. The old system with awkward Spasm macros can still be used, but if you use Zilog's assembler, you won't have to remember to tag every relocation.
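I haven't nailed down the stub file syntax yet, but conceptually each entry is just a triple, and parsing it should be trivial. Something like this (both the line format and the code are hypothetical):
Code:
// Hypothetical stub entry: the linker would read lines of the form
//     external_name  internal_name  function_number
// and build the library's function table from them. Syntax is not final.
class StubEntry
{
    public string ExternalName;    // the name C/asm programs link against
    public string InternalName;    // the symbol inside the library's object files
    public int FunctionNumber;     // index into the LibLoad jump table

    public static StubEntry Parse(string line)
    {
        var parts = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return new StubEntry
        {
            ExternalName = parts[0],
            InternalName = parts[1],
            FunctionNumber = int.Parse(parts[2]),
        };
    }
}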

So here's what I'm going with:
Code:
ezcalclink [options] [input files]
Options:
  --8xp, --app, --makelib [<stub file name> <.lib file name>]
    Specifies output mode: statically linked 8xp, TI app (if TI ever stops being obstinate about that), or LibLoad library.
    For LibLoad libraries, specifies the name of the stub file for a library, and the name of the .lib file to produce
  --outfile <name>
    Specifies output file name
  --static <address> <section name>[,<section name 2>[,<section name 3>[,&c.]]]
    Specifies that <section name> should appear at <address>.
    If more than one section name is given, then each additional section is located sequentially.
  --relocatable <section name>[,<section name 2>[,<section name 3>[,&c.]]]
    Specifies that <section name> is relocatable.
    If more than one section name is given, then each additional section is located sequentially.
    If more than one block of relocatable sections is given, it's an error.

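For example, a statically linked program and a LibLoad library might be built something like this (the file names, section names, and address are made up for illustration):
Code:
ezcalclink --8xp --static 0xD1A881 CODE,DATA --outfile DEMO.8xp main.obj gfx.obj
ezcalclink --makelib graphics.stub graphics.lib --relocatable CODE,DATA --outfile GRAPHICS.8xv draw.obj text.obj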

So I'm currently working on input argument parsing. Hopefully, then I can finally get back to the linking logic.
Quote:
It's probably algorithmically slow, but I suspect we won't ever have object files large enough for that to be a problem.

Hmm. Forgive my potentially unwanted interference, especially as I'm largely an outsider to the TI-eZ80 development community, but I'd like to tell a related story from the TI community, where a pretty similar line of thinking yielded a suboptimal outcome :)

About a decade ago, in the "new", optimizing linker for the TI-68k development environment, all data structures were O(n) linked lists instead of O(log n) trees or O(1) amortized hash tables. The main reasons the linker's makers gave for that choice were coding simplicity and the low enough complexity of the usual programs.
As a matter of fact, from a user's POV, the efficiency of linked lists remained acceptable, even on then-contemporary computers, for the usual <64 KB programs: C compile time is usually larger than link time, which is usually under a second anyway, so shaving several hundred ms off the link time wouldn't make builds that much faster. The data structure choice was therefore arguably suboptimal, but quite far from irresponsible, for the linker's main use case.
However, the story was quite different for "large" programs made from "many" sections and symbols. PpHd reported insane times - a dozen minutes! - linking the full version of PedroM, with PpHd's math stuff, which was several hundred KB. While attempting to optimize the program (reordering sections, cutting ranges - from a user's POV, the set of optimizations is nifty), the linker was essentially spending all its time traversing its linked lists. Obviously, a 2016 i7-6700HQ or i7-6700K would yield better wall-clock times than PpHd's computer, which already wasn't run-of-the-mill back then - but still well into the minutes range, which is not really acceptable :)
A more complex version of christop's Punix, or FlashApps (had TI-68k FlashApp support ever been added to the linker; it never was, due to the abysmally low number of current and potential users), would have hit the same issue.

And yet, that linker is written in C, without external dependencies, rather than in:
* one of the higher-level languages available at the time that don't usually go through AoT compilation to native code, mainly Java and the .NET family - neither of which has a lightweight execution environment preinstalled on (any, most) of the platforms targeted by the linker;
* or, obviously, the 2010s languages compiled to native code but aimed at better safety: mainly Go, Swift (still not suitably portable), and Rust - which were not even in the works at the time, at least the latter two.

Switching to an RB-tree for the data structure(s) where it mattered is on GCC4TI's todo/wish list, and I had mostly chosen a GPLv2'ed RB-tree implementation for that purpose - IIRC, the Linux kernel's. I didn't go further, because the todo/wish list contained, and partially still contains, other higher-profile (and easier) bugfixes and improvements, and also (mainly?), shortly thereafter, I became the libti*/gfm/tilp maintainer. Those were not really my favorite projects to work on; I'd have found it more fun to work on GCC4TI, whose code base I knew better anyway. However, libti*/gfm/tilp are useful to more people than a narrow set of TI-68k programmers, and remain useful to date...


TL;DR: I think you should attempt to avoid data structures (and programming languages + execution environments, but you've already partially covered that by presumably not using a scripting, dynamically typed, non-JIT'ed language on top of the .NET platform) which could predictably hamper the making of large programs for the TI-eZ80 series, such as third-party OSes or FlashApps (not that TI will let us make them officially, but it's not like we should pay attention to their wishes) :)
"The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming."
---Donald Knuth

I imagine that the C code you're talking about has lots of loops that look like this:
Code:
for (nodeStruct *thingy = first; thingy != NULL; thingy = thingy->next) { do_stuff(thingy); }
It's so easy to write a lot of code like that. And naturally, refactoring that to use trees would be a major PITA.

But I'm the poster child for abstraction, and I don't have to write my own data structures. .NET's Dictionary class is hash-based. So maybe it'll eat 50 MB of RAM to link a 64 K program. I'd prefer to spend my time debugging a linker, not reinventing the same wheel that's been invented a dozen times before.
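For example, the entire global symbol table can just be a Dictionary keyed by name (Symbol and the function name here are hypothetical):
Code:
// The entire global symbol table, courtesy of the standard library.
var symbols = new Dictionary<string, Symbol>();   // symbol name -> defining symbol

// Resolving an external reference is an amortized O(1) lookup:
Symbol target;
if (!symbols.TryGetValue("lib_draw_sprite", out target))   // hypothetical name
    throw new Exception("Unresolved external: lib_draw_sprite");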

Maybe, once it's working well, a C++ port would be in order. At the moment, Zilog's compiler is already Windows-only, and Windows comes with the .NET runtime (unless you're using something terribly outdated). I happen to be more familiar with C# than with The Mother of All Memory Leaks. So tell you what: I'll do a C++ port if LLVM ever manages to generate valid eZ80 code. In the meantime, I'm going to focus on writing a program that does a task whose purpose most programmers can't even recall.

EDIT: I didn't mean for that to sound flame-y. My point is simply that I feel it's too early to worry about optimization, and that with good structure, refactoring code is easy.
  