I believe that I have finally figured out the deal with relocation. :)

So, the structure of libraries will look like this: (Small sample library)

NOTE: As of right now, this will only support libraries written in SPASM. However, since ZDS also spits out the assembly source, you could technically use C code in a library as well, I guess.


Code:
 libname("SAMPLELIB")
 libvers(1)
 numreloc(2)
 
 lib_begin()

function("CE_set8bpp")
_set8bpp:
 ld a,$27
_:
 ld (mpLcdCtrl),a
 ret
function("CE_set16bpp")
_set16bpp:
 ld a,$2D
 jr -_

function("CE_setHLTo259")
_setHLTo259:
relocate()
 call ldhl
relocate()
 jp ldhl

ldhl:
 ld hl,259
 ret

 lib_end()


function(string) is part of a collection of macros that builds a jump table file that looks like this:


Code:
 db "SAMPLELIB",0
 db 1
_CE_set8bpp:
 jp 0
_CE_set16bpp:
 jp 7
_CE_setHLTo259:
 jp 11


libname("SAMPLELIB") <- Sets the name of the library
libvers(1) <- Sets the version of the library

This jump table simply gives ZDS something static to link against. The code block above will be placed at the start of the program. This saves us from doing countless relocations when a library's functions are called from many places. Not only that, but it means the C program (technically) doesn't need a relocation table of its own.

Now, ZDS cannot simply use these functions, as they have not been prototyped. This header file fixes that:


Code:
// Include the assembly file directly into it
#pragma asm ".include \"SAMPLELIB.inc\""

/**************************************************
 Sets the LCD mode to 8bpp.
**************************************************/
void CE_set8bpp(void);

/**************************************************
 Sets the LCD mode to (Default) 16bpp.
**************************************************/
void CE_set16bpp(void);


Now we can use these functions properly. However, there needs to be a simple kernel linker on-calc in order to do this. The jump table holds the offsets of the functions from the start of the library (technically), and all that needs to be done is add the pointer of the library to the offsets. Boom, the C program is working properly, and has entry points. Yay.
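Here is a minimal C sketch of that fix-up step. It assumes each jump table entry is a 4-byte eZ80 ADL-mode `jp` (opcode `0xC3` followed by a 24-bit little-endian operand) and that the loader already knows where the table starts in the program. The name `patch_jump_table` and the helper functions are made up for illustration; they are not part of the actual loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JP_OPCODE 0xC3u  /* eZ80 "jp nn" opcode */
#define JP_SIZE   4u     /* opcode + 24-bit operand in ADL mode (assumed) */

/* Read a 24-bit little-endian value. */
static uint32_t read24(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/* Write a 24-bit little-endian value. */
static void write24(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
}

/* Hypothetical loader step: turn each "jp offset" into "jp lib_base + offset". */
void patch_jump_table(uint8_t *table, size_t entries, uint32_t lib_base)
{
    for (size_t i = 0; i < entries; i++) {
        uint8_t *entry = table + i * JP_SIZE;
        assert(entry[0] == JP_OPCODE);  /* sanity check: really a jp */
        write24(entry + 1, (lib_base + read24(entry + 1)) & 0xFFFFFFu);
    }
}
```

With the sample table above (`jp 0`, `jp 7`, `jp 11`) and a library copied to some address B, the entries become `jp B`, `jp B+7`, and `jp B+11`.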

However, there is still the issue of relocations within the library itself. If you take a look back at the first code block, you may see this:


Code:
function("CE_setHLTo259")
_setHLTo259:
relocate()
 call ldhl
relocate()
 jp ldhl

ldhl:
 ld hl,259
 ret


Silly, I know. But it gets the point across. The absolute address has to be resolved. So, the numreloc() macro that you may have seen earlier needs to be set to the number of relocate() macros in the library. (It generates an error if the number you supplied doesn't match the number it counted, and then tells you how many it counted.) It creates a relocation table at the start of the library, which stores the location of each address that needs updating. In this case there are two entries of 3 bytes each, so the relocation table would be 6 bytes long. We don't have to worry about much else, because pointers and all the rest work exactly the same way.
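To make the relocation step concrete, here is a rough C sketch of how a loader could apply such a table. The exact table format is an assumption: each entry is taken to be a 24-bit little-endian offset (from the start of the library code) of an address operand that was assembled as if the library started at 0. The function name `apply_relocations` is hypothetical. In the sample library, going by ADL-mode instruction sizes, `call ldhl` starts at offset 11 and `jp ldhl` at offset 15, so the operands sit at offsets 12 and 16, and `ldhl` itself is at offset 19.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 24-bit little-endian value. */
static uint32_t read24(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/* Write a 24-bit little-endian value. */
static void write24(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
}

/*
 * code        : library code after it has been copied to RAM
 * code_size   : size of that code in bytes
 * reloc_table : `count` 3-byte entries (assumed format, see above)
 * base        : address the code was copied to
 */
void apply_relocations(uint8_t *code, size_t code_size,
                       const uint8_t *reloc_table, size_t count,
                       uint32_t base)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t where = read24(reloc_table + 3 * i);
        assert((size_t)where + 3 <= code_size);  /* operand must lie inside the code */
        write24(code + where, (base + read24(code + where)) & 0xFFFFFFu);
    }
}
```

Note that the table is only read, never modified, so nothing here requires it to live in RAM alongside the code.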

Now, we are almost there. What if LibraryA wants to use functions in LibraryB and vice versa? Well, it can simply use exactly the same jump table file that ZDS uses! The on-calc loader will be able to handle it in exactly the same way; it just needs to make sure to load each library only once, probably by setting a flag and updating a pointer.
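One way to picture the "flag and pointer" idea is the C sketch below. Everything in it (`struct library`, `MAX_DEPS`, `load_once`) is made up for illustration and only models where each library would be placed, not the actual copying. The point it tries to show is that setting the flag before walking the dependencies keeps two co-dependent libraries from loading each other forever.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPS 4  /* arbitrary limit for this sketch */

struct library {
    const char *name;
    uint32_t size;                   /* code size in bytes */
    struct library *deps[MAX_DEPS];  /* libraries this one calls into */
    size_t dep_count;
    int loaded;                      /* the "flag" */
    uint32_t base;                   /* the "pointer": where the code was placed */
};

/*
 * Place `lib` and everything it depends on exactly once.
 * `next_free` is the next unused RAM address; the updated value is returned.
 */
uint32_t load_once(struct library *lib, uint32_t next_free)
{
    if (lib->loaded)
        return next_free;            /* already placed: callers reuse lib->base */

    lib->loaded = 1;                 /* set first so A <-> B terminates */
    lib->base = next_free;
    next_free += lib->size;

    for (size_t i = 0; i < lib->dep_count; i++)
        next_free = load_once(lib->deps[i], next_free);

    return next_free;
}
```

If LibraryA (100 bytes) and LibraryB (50 bytes) depend on each other, loading A places A first and B right after it, and a later request for B changes nothing.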

The best part is that the macros and things have already been written, I just have to figure out some kinks in the loader. Right now I plan to do things in this order:


Code:
Library and Program both exist in Archive

Load C program to UserMem
Check if dependencies exist
Load dependencies after program in RAM
Update program entry points
Update library dependencies
Resolve inner library addresses.
Enjoy.


And thus I solved the problem that you all wanted solved. It works quite well, in my opinion. Could it use some fine-tuning? Sure. Including only the functions that the program actually uses from the library sounds like an important issue. Technically though, that is really easy to fix at some point too, as it just requires removing the unused jump instructions from the include file. Please let me know what you guys think! :) (And before you ask, yes, it easily supports multiple libraries in this way. :))

Edit: In this way, I can also make sure that the relocation table of the library doesn't have to be copied to RAM. This should help save a bit of space in some instances. :)
I have to say that I'm not a fan of this approach. Indirecting all library calls through a jump table adds unnecessary time and space at runtime (notice how the jumptable must be replicated two times?). The time issue is particularly relevant on the ez80, because it is a pipelined architecture, and two consecutive jumps will stall the pipeline. We should strongly consider making use of an indirect jump and a little arithmetic anywhere we can't get away with a jr. On a more modern architecture the indirect jump might be problematic because it would be trickier to predict branching well, but since the ez80 makes no attempt at branch prediction, I suspect it will result in fewer pipeline stalls than two consecutive jps (I'm equating call with jp here, since they use the same sort of control transfer).

Also, can we all agree up front that whatever format is agreed upon should be stored in appvars and not programs or apps?

[edit]
Also, ideally we want to do this without unarchiving, yes?
elfprince13 wrote:
(notice how the jumptable must be replicated two times?)

No, this is not the case. The jump table is not used by the library at all, it merely provides a way to statically link to ZDS.

elfprince13 wrote:
Also, ideally we want to do this without unarchiving, yes?

This has already been stated; libraries and programs will exist in the archive, and then be copied to RAM.

elfprince13 wrote:
Indirecting all library calls through a jump table adds unnecessary time and space at runtime

This is a much better approach than constructing a relocation table for both the program and libraries. NOTE: Also, ZDS does not support outputting a relocation table. This is perhaps the most sane solution.

elfprince13 wrote:
The time issue is particularly relevant on the ez80, because it is a pipelined architecture, and two consecutive jumps will stall the pipeline. We should strongly consider making use of an indirect jump and a little arithmetic anywhere we can't get away with a jr. On a more modern architecture the indirect jump might be problematic because it would be trickier to predict branching well, but since the ez80 makes no attempt at branch prediction, I suspect it will result in fewer pipeline stalls than two consecutive jps.

Quote:
When a control transfer takes place, the Program Counter (PC) does not progress sequentially.
Therefore, the pipeline must be flushed. All prefetched values are ignored. Control
transfer can occur because of an interrupt or during execution of a Jump (JP), CALL,
Return (RET), Restart (RST), or similar instruction. After the control transfer instruction
is executed, the pipeline must start over to fetch the next operand.

A control transfer is going to take place anyway. An absolute jump will not cause too much of a time difference, especially since arguments of functions have to be extracted from the stack anyway.

elfprince13 wrote:
Also, can we all agree up front that whatever format is agreed upon should be stored in appvars and not programs or apps?

This has also been established in previous posts.
MateoConLechuga wrote:
elfprince13 wrote:
(notice how the jumptable must be replicated two times?)

No, this is not the case. The jump table is not used by the library at all, it merely provides a way to statically link to ZDS.

Ah - then I'm misunderstanding how you're using the relocate macro.


MateoConLechuga wrote:
elfprince13 wrote:
Also, ideally we want to do this without unarchiving, yes?

This has already been stated; libraries and programs will exist in the archive, and then be copied to RAM.

This is what I'm saying not to do: if we're working on relocation, the obvious endgoal is to execute from archive, unless we're concerned with address space isolation.

Quote:
elfprince13 wrote:
The time issue is particularly relevant on the ez80, because it is a pipelined architecture, and two consecutive jumps will stall the pipeline. We should strongly consider making use of an indirect jump and a little arithmetic anywhere we can't get away with a jr. On a more modern architecture the indirect jump might be problematic because it would be trickier to predict branching well, but since the ez80 makes no attempt at branch prediction, I suspect it will result in fewer pipeline stalls than two consecutive jps.

Quote:
When a control transfer takes place, the Program Counter (PC) does not progress sequentially.
Therefore, the pipeline must be flushed. All prefetched values are ignored. Control
transfer can occur because of an interrupt or during execution of a Jump (JP), CALL,
Return (RET), Restart (RST), or similar instruction. After the control transfer instruction
is executed, the pipeline must start over to fetch the next operand.

A control transfer is going to take place anyway. An absolute jump will not cause too much of a time difference, especially since arguments of functions have to be extracted from the stack anyway.

push pc/jp (hl) vs call IMM -> jp IMM is the difference of one pipeline flush or two pipeline flushes. This is a lot of stalling.
FWIW, by now, the TI-68k/AMS series has basically lived without execution from Flash for a couple decades. Several important routines leverage Self-Modifying Code, the most widely used one is probably the standard grayscale routine in GCC4TI and its dead ancestor.
Lionel Debroux wrote:
FWIW, by now, the TI-68k/AMS series has basically lived without execution from Flash for a couple decades. Several important routines leverage Self-Modifying Code, the most widely used one is probably the standard grayscale routine in GCC4TI and its dead ancestor.

In addition, execution from flash is impossible, because writes to flash are locked, so the code could not be relocated in place. It has also been stated that, due to flash wait states, it is a better idea to copy code to RAM for execution. The time difference between a pipeline flush and a lot of wait states is significant.

elfprince13 wrote:
Ah - then I'm misunderstanding how you're using the relocate macro.

Oh, sorry, I guess that is confusing :) I'm not using the relocate macro at all; I made my own macros, which parse the library and spit out the include file for it.
Ah, sorry, still thinking in 84+ land where flash execution is desirable. I still recommend that we consider the indirect jump approach instead of the double-jump approach, but you could profile this to see how the pipelining actually plays out.
Note: This isn't really too crucial, but I thought I would let people know that ZDS is really annoying when it comes to linking together C and assembly.

For instance, it can't do this properly:

Code:
 db "SAMPLELIB",0
 db 1
_CE_set8bpp:
 jp 0
_CE_set16bpp:
 jp 7
_CE_setHLTo259:
 jp 11


So a workaround is this:

Code:
 db "SAMPLELIB",0
 db 1
_CE_set8bpp equ $
 jp 0
_CE_set16bpp equ $
 jp 7
_CE_setHLTo259 equ $
 jp 11
I was speaking with Runer today over IRC, and I believe a much better solution is as follows: rather than storing the absolute location of the function in the library include file, it is stored in a vector table inside of the lib. This requires a jump table (in the program, due to ZDS limitations), and a vector table (in the library). This will ensure heightened cross-compatibility across library versions.
Why do you need a jump table in the program, instead of using an indirect jump via the vector table?
elfprince13 wrote:
Why do you need a jump table in the program, instead of using an indirect jump via the vector table?

Because ZDS resolves all addresses as being absolute. The jump table can be placed at a known absolute address, while the vector table cannot be. The jump table is located at the start of the program(ish), and thus ZDS knows where to go for those calls. The jump table just contains offsets into the vector table; the loader looks up each vector table entry and adds it to the location of the library in RAM. So, it is exactly like it was before, but now functions can be changed, moved around, and such without the need to update the version number, which should only be changed when adding new functions or something.

NOTE: When a library is copied from the archive, the relocation table, header, and vector table remain in the archive. Only the code for the library itself is extracted.
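As a rough illustration of that two-step lookup, here is a C sketch. The layout is an assumption: each jump table entry is a 4-byte `jp` whose operand is a byte offset into the vector table, and each vector table entry is a 24-bit little-endian offset of a function from the start of the library code. The name `resolve_via_vector_table` is hypothetical. The vector table is only read, which fits with it staying in the archive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 24-bit little-endian value. */
static uint32_t read24(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/* Write a 24-bit little-endian value. */
static void write24(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
}

/*
 * jump_table   : `entries` 4-byte jp instructions inside the program (RAM)
 * vector_table : the library's vector table (read-only, may stay in archive)
 * vector_size  : size of the vector table in bytes
 * code_base    : address the library code was copied to
 */
void resolve_via_vector_table(uint8_t *jump_table, size_t entries,
                              const uint8_t *vector_table, size_t vector_size,
                              uint32_t code_base)
{
    for (size_t i = 0; i < entries; i++) {
        uint8_t *operand = jump_table + 4 * i + 1;   /* skip the jp opcode */
        uint32_t slot = read24(operand);             /* offset into vector table */
        assert((size_t)slot + 3 <= vector_size);
        uint32_t func_offset = read24(vector_table + slot);
        write24(operand, (code_base + func_offset) & 0xFFFFFFu);
    }
}
```

Because the program only stores slot offsets (0, 3, 6, ...), a newer build of the library can move its functions around and the same program still resolves correctly, as long as the order of the vector table is kept.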

Let me tell you, pulling this stuff off is utter madness. Currently you can load any assembly or C program from RAM or the archive; the loader determines what it is and whether it needs to extract shared libraries, relocates the library's absolute addresses, and then runs the program. The only thing I really have left is to figure out how to link libraries together, which is no easy task, especially when the libraries are co-dependent.

For anyone interested, here is the format for libraries now:

Code:
#include "ti84pce.inc"
#include "RelocationMacros.inc"

 libname("SAMPLELIB")         ; Name of library
 libvers(1)            ; Version information
 
 numreloc(1)            ; Number of relocations
 numberoffunctions(3)         ; Number of functions
 
 function(CE_set8bpp,"CE_set8bpp")   ; This is the library's vector table
 function(CE_set16bpp,"CE_set16bpp")
 function(CE_DispHL259,"CE_DispHL259")
 
 lib_begin()            ; Start of library code
 ;#include "lib"         ; Dependent libraries will go here
 
CE_set8bpp:
 ...
 ret
CE_set16bpp:
 ...
 ret
 
CE_DispHL259:
relocate()
 jp relocationTest

relocationTest:
 ...
 ret
 
 lib_end()            ; End of library code

Spiffy, eh? :)
Quote:
Because ZDS resolves all addresses as being absolute. The jump table can be placed at a known absolute address, while the vector table cannot be. The jump table is located at the start of the program(ish), and thus ZDS knows where to go for those calls. The jump table just contains offsets into the vector table; the loader looks up each vector table entry and adds it to the location of the library in RAM. So, it is exactly like it was before, but now functions can be changed, moved around, and such without the need to update the version number, which should only be changed when adding new functions or something.

The whole point of making use of jp (hl) is that the address stored in HL doesn't need to be known at compile time...
elfprince13 wrote:
The whole point of making use of jp (hl) is that the address stored in HL doesn't need to be known at compile time...

What address stored in hl? ZDS makes it very clear what a function call is; it is literally a bunch of pushes to the stack for arguments, followed by the call itself, and then a whole bunch of pops to deallocate the stack.

EDIT: As for the shell, here is what functionality currently looks like (I still have to implement the pink flow section, but everything else is up and running). [RUN] in this case means the input program can be in RAM or Archive.

MateoConLechuga wrote:
elfprince13 wrote:
The whole point of making use of jp (hl) is that the address stored in HL doesn't need to be known at compile time...

What address stored in hl? ZDS makes it very clear what a function call is; it is literally a bunch of pushes to the stack for arguments, followed by the call itself, and then a whole bunch of pops to deallocate the stack.

Have you worked with dlopen / dlsym type dynamic linking in C before?
elfprince13 wrote:
Have you worked with dlopen / dlsym type dynamic linking in C before?

No, but I'm not quite sure what you are getting at. Something about Lazy and Now resolution? It needs to be Now resolution because of the limited space for metadata. The on-calc linker is written in assembly. Here are some more things:

1) ZDS can't output relocation tables in the final binary. Hence the need for the jump table.
2) Libraries are relocated to the end of the program.
I was trying to give you a touch-point for the way you could do this in assembly.


Code:

    ld hl,retloc
    push hl
    ld ix,(vector_table_entry)
    jp (ix)
retloc:
    ;stuff

vector_table_entry:
    .dl func_def        ; 24-bit pointer: ld ix,(nn) reads 3 bytes in ADL mode

is equivalent to

Code:

    call jump_table_entry
    ; stuff
jump_table_entry:
    jp func_def


What I'm saying is...you should profile these against each other to see how the pipelining works out. Obviously the former is more verbose, but it also avoids a secondary pipeline flush, so it may be worthwhile.

Though I'm still not convinced we can't generate relocation tables and do this properly using

Code:
#pragma asm



[edit 2]
Wait a minute, according to the docs, ZDS explicitly supports emitting relocatable .obj files. Why not just implement our own linker from .obj to .8xv? We can use their compiler and assembler, and just skip the linker/locator step and defer location in our own linker implementation.
elfprince13 wrote:
I was trying to give you a touch-point for the way you could do this in assembly.


Code:

    ld hl,retloc
    push hl
    ld ix,(vector_table_entry)
    jp (ix)
retloc:
    ;stuff

vector_table_entry:
    .dl func_def        ; 24-bit pointer: ld ix,(nn) reads 3 bytes in ADL mode

is equivalent to

Code:

    call jump_table_entry
    ; stuff
jump_table_entry:
    jp func_def


What I'm saying is...you should profile these against each other to see how the pipelining works out. Obviously the former is more verbose, but it also avoids a secondary pipeline flush, so it may be worthwhile.

Two problems with this:
1) We CANNOT change what happens when ZDS encounters a function call.
2) Because of (1), there are two control transfers in both cases. The pipeline is therefore flushed twice in both, and the first one takes more bytes to do it, and thus is inherently slower.

elfprince13 wrote:
Though I'm still not convinced we can't generate relocation tables and do this properly using
Code:
#pragma asm

Could we manually generate our own? Sure. But that's meaningless when it comes to actual development.

elfprince13 wrote:
Wait a minute, according to the docs, ZDS explicitly supports emitting relocatable .obj files. Why not just implement our own linker from .obj to .8xv? We can use their compiler and assembler, and just skip the linker/locator step and defer location in our own linker implementation.

Sure, you could. This increases the size of programs, for which there is limited space, and is simply not worth it in order to save a single jump instruction.
MateoConLechuga wrote:
Sure, you could. This increases the size of programs, for which there is limited space, and is simply not worth it in order to save a single jump instruction.

Okay, this is the relevant thing now, since there's no reason to try to abuse the internals of ZDS with vector table schemes when we can do this.

(A) RAM (consumed by jump table) is more precious than Archive (consumed by relocation tables).
(B) This approach lets us selectively choose which functions from a library should actually be loaded into RAM, and which can be left alone.
(C) Optimizing function invocation is always worthwhile. Functions *will* get called in tight loops / recursions.
elfprince13 wrote:
(A) RAM (consumed by jump table) is more precious than Archive (consumed by relocation tables).

False. Size is only important when you remember programs can be at most 64 KB. RAM/Archive doesn't matter. A relocation table will far exceed the size of the jump table, as we can specify which functions are needed. We don't have to include the entire jump table for a library. Say we use function foo() twice. It would need two entries in a relocation table, which would be 6 or more bytes depending on how much metadata you wanted. Or you could use function foo() 1000 times, and only need 4 bytes allocated with a jump table.
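The arithmetic behind that comparison can be sketched in a few lines of C. The constants are assumptions taken from the numbers above: 3 bytes per relocation entry (one 24-bit address per call site, no extra metadata) and 4 bytes per jump table entry (one ADL-mode `jp`). The function names are made up for illustration.

```c
#include <stddef.h>

#define RELOC_ENTRY_BYTES 3u  /* one 24-bit address per call site (assumed, no metadata) */
#define JP_ENTRY_BYTES    4u  /* one "jp nn" per distinct library function (assumed) */

/* A relocation table grows with the number of call sites. */
size_t reloc_table_bytes(size_t call_sites)
{
    return call_sites * RELOC_ENTRY_BYTES;
}

/* A jump table grows with the number of distinct functions used. */
size_t jump_table_bytes(size_t distinct_functions)
{
    return distinct_functions * JP_ENTRY_BYTES;
}
```

Under these assumptions, two calls to foo() cost 6 bytes of relocation entries against 4 bytes of jump table, and 1000 calls cost 3000 bytes against the same 4. The only case where the relocation entry is smaller is a function that is called exactly once (3 bytes against 4).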

elfprince13 wrote:
(B) This approach lets us selectively choose which functions from a library should actually be loaded into RAM, and which can be left alone.

This is false. Functions cannot be loaded individually due to relative addressing.

elfprince13 wrote:
(C) Optimizing function invocation is always worthwhile. Functions *will* get called in tight loops / recursions.

I agree. However, as it stands, a single jump will not make any difference. If you want something fast, write it in assembly. The time required for stack manipulations (of which there are a ton) is orders of magnitude more important.
I posted that on TI-Planet and CodeWalrus a few days ago, but I think it's particularly fitting to post it in this topic, so here it is.

Lately, I've been working a bit on integrating Mateo's original techniques for getting C to compile for the CE devices. After cleaning [many] things up from ZDS (most notably the makefile), and integrating all kinds of nice things together, I have a working online C "IDE" :)

Features include:
  • State saving when you build (via local storage mainly),
  • Proper C syntax highlighting
  • Code folding, Naive autocompletion, Auto indenting and other CodeMirror goodies.
  • Inlined warnings and errors from ZDS
  • Direct .8xp creation and download
Screenshots: (the one on the left is a bit outdated)


It will be publicly released when it's a bit more tested, but I'll wait for Mateo to finish his include files, which will be much nicer than what the IDE is currently using :P

What's happening behind the scenes is: Page <=> AJAX <=> PHP+Bash workers/dispatchers <=> ZDS CLI tools (via the Makefile) run with wine + ConvHex.
Of course, the wine bottle is isolated from the rest of the actual Linux FS, and it's running from a restricted user account (no, you can't #include "/etc/passwd" ;)). Chrooting may take place if needed; that will be determined by further tests...

I believe most parts (i.e., things not directly tied to TI-Planet's architecture) will become open-source, since developer-oriented tools should always be released that way.

Via: https://tiplanet.org/forum/viewtopic.php?f=10&t=17279
  