I have quite a few questions about how C compilers work in general (mostly targeting SDCC and GCC) that I think merit their own thread.

- How do linker files work? What exactly do they do? How do you specify them and set them up in GCC and/or SDCC?

- What is the point of crt0.s? I saw SDCC and GCC have one.

- How can you specify the beginning address from which code starts? Like .org $9D95 in general-assemblers for z80 assembly?

- How can you specify links to syscalls that an OS already has?

- A bit unrelated, but what is a .a static library file?

- What are the purposes of .o files in SDCC and/or GCC?

- (SDCC only) how can you specify a .C file to be compiled into intel hex format (.hex) so it can be used for something else in a z80 toolchain (like convert it to a certain program format)? I tried doing something like:


Code:
SDCC -c source.c -o source.hex -intel


It made literally ALL of the other files except the .hex. Also, how do I tell SDCC to compile to the C99 standard?

TIA.
Ashbad wrote:
- What are the purposes of .o files in SDCC and/or GCC?
- How do linker files work? What exactly do they do? How do you specify them and set them up in GCC and/or SDCC?
- How can you specify the beginning address from which code starts? Like .org $9D95 in general-assemblers for z80 assembly?
- How can you specify links to syscalls that an OS already has?
- What is the point of crt0.s? I saw SDCC and GCC have one.

These are all closely related. Put simply, the compiler generates an intermediate object file (.o) that contains the generated code along with other useful information, such as symbols and relocations. The linker then takes one or more object files plus a linker 'file' (more usually called a 'linker script') and produces a file that can be executed directly on the target system. Using the Prizm GCC toolchain as an example, let's make a file nothing.c:

Code:
const int foo[] = {0, 1, 2, 3};

int main(int argc, char **argv) {
    volatile int i = foo[2];
    // Do-nothing loop
    for (i = 0; i < 4096; i++);
    return 0;
}


Start by compiling to an intermediate file (-c to compile only, no linking), and we'll have the compiler put it in nothing.o:

Code:
$ sh3eb-elf-gcc -c -mb -m4a-nofpu -o nothing.o nothing.c

We can examine the contents of the output file (which is ELF for this particular configuration of the compiler) with objdump (we'll use -hs to show the file headers and contents of sections):

Code:
$ sh3eb-elf-objdump -hs nothing.o
nothing.o:     file format elf32-sh

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000050  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000084  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000084  2**0
                  ALLOC
  3 .rodata       00000010  00000000  00000000  00000084  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .comment      00000012  00000000  00000000  00000094  2**0
                  CONTENTS, READONLY
Contents of section .text:
 0000 7ff461f3 71cc114e 61f371cc 115dd10f  ..a.q..Na.q..]..
 0010 521261f3 71cc112f 61f371cc e200112f  R.a.q../a.q..../
 0020 a0080009 61f371cc 511f6213 720161f3  ....a.q.Q.b.r.a.
 0030 71cc112f 61f371cc 521f9106 32178bf1  q../a.q.R...2...
 0040 e1006013 7f0c000b 00090fff 00000000  ..`.............
Contents of section .rodata:
 0000 00000000 00000001 00000002 00000003  ................
Contents of section .comment:
 0000 00474343 3a202847 4e552920 342e362e  .GCC: (GNU) 4.6.
 0010 3000                                 0.

We see five sections in this file. .text contains executable code; .data contains writable data whose initial values are set up at runtime; .rodata is like .data but is never written to; .bss is scratch space which will be initialized to zero at runtime; and .comment is an informative section which tells anything reading this ELF file that it was generated by GCC 4.6.0 (in this case, anyway; it can contain anything the compiler wants to include). The second line of each section's description shows which flags are set on that section. We can see that .text is flagged CODE and READONLY, for example, and both it and .data are flagged LOAD, meaning they'll be loaded straight into memory before execution begins.
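
To make the section names concrete, here is a small sketch of where typical C objects usually end up. The names are made up for illustration, and the exact placement depends on the compiler, target and options, so treat the comments as the common case rather than a guarantee.

Code:
```c
/* Initialized and never written: typically placed in .rodata. */
const int lookup[4] = {0, 1, 2, 3};

/* Initialized and writable: typically placed in .data. */
int counter = 5;

/* No initializer: typically placed in .bss and zeroed before main runs. */
int scratch[16];

/* Machine code for functions goes in .text. */
int sum_sections(void) {
    return lookup[3] + counter + scratch[0];
}
```

Compiling a file like this with -c and running objdump -h on the result should show non-zero sizes for all four sections.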

The VMA and LMA columns in objdump's output indicate where in memory each section is expected to be loaded. Since we only had gcc compile this for us and didn't link, all the section addresses are at their defaults, 0. We can see what will change with linking by looking at the relocations:

Code:
$ sh3eb-elf-objdump -r nothing.o
nothing.o:     file format elf32-sh

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE
0000004c R_SH_DIR32        _foo

This tells us that at .text+0x4c there is a 32-bit value that holds the address of the symbol _foo (that is, our const array). With that information, when we link and the absolute addresses of sections change, the linker can patch the value at that location so the code points to the right place and runs correctly. Looking at the dump of .text above, we see that the value currently stored there is 0x00000000.
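
What the linker does with a record like that can be sketched in a few lines of C. This is a simplified model of a direct 32-bit relocation, not the real ld code: apply_dir32 is a hypothetical helper, and treating the bytes already stored at the location as an addend is an assumption made for the sketch.

Code:
```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value (the sh3eb target is big-endian). */
static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write a 32-bit big-endian value. */
static void write_be32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: patch a direct 32-bit reference at `offset`
 * within a section so it holds the symbol's final address. Whatever
 * was stored there already is treated as an addend (assumption). */
void apply_dir32(uint8_t *section, size_t offset, uint32_t symbol_addr) {
    uint32_t addend = read_be32(section + offset);
    write_be32(section + offset, symbol_addr + addend);
}
```

For our file the call would look like apply_dir32(text, 0x4c, address_of_foo), where the address of _foo is only known once the linker has decided where .rodata goes.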

Now let's link this into a binary that we could run on the Prizm. Here's the linker script for reference:

Code:
OUTPUT_FORMAT(binary)
OUTPUT_ARCH(sh3)
 
/* Entry point.  Not really important here, since doing binary output */
ENTRY(initialize)
 
MEMORY
{
        /* Loads code at 300000, skips g3a header */
        rom (rx) : o = 0x00300000, l = 512k
        ram (rwx) : o = 0x08100004, l = 64k  /* pretty safe guess */
}
 
SECTIONS
{
        /* Code, in ROM */
        .text : {
                *(.pretext)     /* init stuff */
                *(.text)
                *(.text.*)
        } > rom
       
        /* Read-only data, in ROM */
        .rodata : {
                *(.rodata)
                *(.rodata.*)
        } > rom
       
        /* RW initialized data, VMA in RAM but LMA in ROM */
        .data : {
                _bdata = . ;
                *(.data)
                *(.data.*);
                _edata = . ;
        } >ram AT>rom
       
        /* Uninitialized data (fill with 0), in RAM */
        .bss : {
                _bbss = . ;
                *(.bss) *(COMMON);
                _ebss = . ;
        } >ram
}

Run the linker:

Code:
$ sh3eb-elf-ld -T prizm.ld -o nothing.bin nothing.o
bin\sh3eb-elf-ld.bfd.exe: warning: cannot find entry symbol initialize; defaulting to 0000000000300000

We should fix up that warning. Looking at the linker script, we see that the entry point (where execution begins) is set to the symbol 'initialize' (line 5). We didn't see such a symbol in the ELF file the compiler gave us, though. That's where crt0 comes in. crt0 is the runtime initialization code. It takes care of setting up memory and the machine's state in general, so the C code can take over with the machine in the state it expects. I won't waste more space by pasting the whole of our Prizm crt0 here, but it defines a .pretext section and a few other symbols, including initialize. When it finishes initializing things (copying the original contents of .data into RAM, zeroing out .bss, and so on), it calls our C code's main function and handles cleaning up the environment when that function returns.
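
The memory setup part can be sketched in C. This is not the actual Prizm crt0 (which is assembly), only a model of the two steps named above. init_memory and its parameters are hypothetical; on real hardware the pointers would come from linker-script symbols such as the _bdata, _edata, _bbss and _ebss defined in the script above.

Code:
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical startup helper, modeled on what a crt0 does before main.
 * The addresses are plain parameters here so the logic can run anywhere. */
void init_memory(const unsigned char *data_load, /* .data contents in ROM (LMA) */
                 unsigned char *data_start,      /* where code expects .data (VMA) */
                 unsigned char *data_end,
                 unsigned char *bss_start,
                 unsigned char *bss_end) {
    /* Copy the initial values of .data from ROM into RAM. */
    memcpy(data_start, data_load, (size_t)(data_end - data_start));
    /* Zero .bss so uninitialized globals start at 0. */
    memset(bss_start, 0, (size_t)(bss_end - bss_start));
}
```

After a step like this, a real crt0 would call main and deal with whatever main returns.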

Looking back at the linker script, it says that the .text section in the output file should consist of the .pretext and .text sections from the inputs, in that order. Due to how crt0 is set up, that puts initialize at the very beginning. The output .text and .rodata sections are to be placed in the memory region rom (> rom), defined towards the top of the file as a 512k chunk of read-only (and executable) memory at 0x00300000. .data gets its VMA in ram (>ram) but its LMA in rom (AT>rom), meaning the initial contents are stored in rom, but the running code expects to find them in ram (so crt0 is expected to copy them into RAM beforehand).
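
The address assignment can be modeled roughly as follows. The origins come from the MEMORY block of the script; struct layout and place_sections are made-up names, and alignment padding is ignored, so this is a simplification of what ld really does.

Code:
```c
#include <stdint.h>

/* Origins taken from the MEMORY block of the linker script. */
#define ROM_ORIGIN 0x00300000u
#define RAM_ORIGIN 0x08100004u

/* Hypothetical structure holding the addresses the linker would assign. */
struct layout {
    uint32_t text_addr;   /* VMA == LMA, in rom */
    uint32_t rodata_addr; /* VMA == LMA, in rom */
    uint32_t data_lma;    /* where the initial bytes are stored (rom) */
    uint32_t data_vma;    /* where running code looks for them (ram) */
    uint32_t bss_addr;    /* in ram, after .data */
};

/* Assign addresses in script order, ignoring alignment (simplification). */
struct layout place_sections(uint32_t text_size, uint32_t rodata_size,
                             uint32_t data_size) {
    struct layout l;
    l.text_addr   = ROM_ORIGIN;
    l.rodata_addr = l.text_addr + text_size;
    l.data_lma    = l.rodata_addr + rodata_size;
    l.data_vma    = RAM_ORIGIN;
    l.bss_addr    = l.data_vma + data_size;
    return l;
}
```

Note how .data is the only section with two different addresses: one for where its bytes are stored and one for where the code reads them.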

If you were to build crt0 into an object file in the same way we did our .c file earlier and add it as an argument to ld, the output would include all the code (nothing.c and the requisite crt0). I leave that as an exercise.

This post is getting pretty long, so I won't do this, but I highly recommend linking again but outputting another ELF file instead of the straight binary we usually output with prizm.ld (just change the first line to read OUTPUT_FORMAT(sh-elf)), then checking it out with objdump, allowing you to see how the section addresses have changed. You may also want to try objdump -d (d for disassemble) to have a look at the machine code it generated. Consider it another exercise.

When linking against OS libraries, you either have a known address they can be found at, or (more usually) the system provides a dynamic linker which handles all the relocations at runtime, after loading your shared libraries.
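
For the known-address case, the usual C idiom is to cast the address to a function pointer. The sketch below uses a made-up signature and a stand-in routine so it can run on a PC; on a real target the address would come from the OS documentation, and the calling convention has to match what the OS expects.

Code:
```c
#include <stdint.h>

/* Hypothetical signature of an OS routine taking and returning an int. */
typedef int (*os_routine_t)(int);

/* Call a routine that lives at a fixed, known address. Converting an
 * integer to a function pointer is implementation-defined, but it is the
 * usual idiom on embedded targets. */
int call_at(uintptr_t address, int arg) {
    os_routine_t routine = (os_routine_t)address;
    return routine(arg);
}

/* Stand-in for an OS routine so the sketch can be run on a PC. */
int fake_os_double(int x) {
    return 2 * x;
}
```

On the PC, call_at((uintptr_t)&fake_os_double, 21) goes through the same path a fixed-address call would take on the device.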

Ashbad wrote:
- A bit unrelated, but what is a .a static library file?
Just a file containing one or more packed object files which the linker can pull out and embed in your output file as needed.
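
With the GNU tools, a .a file is an ar archive, which starts with a fixed 8-byte signature. As a small illustration, here is a sketch that checks a buffer for that signature. looks_like_ar_archive is a made-up helper, and it only inspects the signature, not the member headers that follow.

Code:
```c
#include <stddef.h>
#include <string.h>

/* A Unix ar archive starts with this 8-byte signature. */
#define AR_MAGIC "!<arch>\n"
#define AR_MAGIC_LEN 8

/* Return 1 if the buffer looks like the start of a .a file, else 0. */
int looks_like_ar_archive(const unsigned char *buf, size_t len) {
    return len >= AR_MAGIC_LEN && memcmp(buf, AR_MAGIC, AR_MAGIC_LEN) == 0;
}
```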

Ashbad wrote:
- (SDCC only) how can you specify a .C file to be compiled into intel hex format (.hex) so it can be used for something else in a z80 toolchain (like convert it to a certain program format)? I tried doing something like:


Code:
SDCC -c source.c -o source.hex -intel


It made literally ALL of the other files except the .hex. Also, how do I tell SDCC to compile to the C99 standard?
I suggest you RTFM. If I had to hazard a guess, it doesn't work because sdcc doesn't support what you're trying to make it do, but I haven't actually looked at the manual.
Ashbad wrote:
- How do linker files work? What exactly do they do? How do you specify them and set them up in GCC and/or SDCC?


Magic.

Be more explicit: which linker files are you talking about? Intermediate objects (.o or .obj usually)? Shared libraries (.so or .dll)? Static libraries (.a)? Linker scripts?

Assuming you mean intermediate objects, for GCC you just give it a -c option. For very small projects it isn't needed (couple of files). The point of them is to break compiling up into chunks to avoid re-compiling everything when you change one file.

The format of them varies wildly and is compiler+linker specific. The high level overview is that the linker hooks up all the references and such. Declare an "extern int foo" in a header, and the linker is the one that makes everyone who references foo refer to the same foo. When you use shared libraries (such as the C standard library), the linker is the one responsible for creating the appropriate load information so the OS knows what libraries are needed when the executable is run. The linker is also responsible for creating the executable: putting all the code in the right spot, data in the right spot, etc...
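
To make the extern case concrete, here is a minimal sketch. It is squeezed into one file so it runs as is; the comments mark what would normally be split across a header and two .c files. foo.h, foo.c and read_foo are made-up names.

Code:
```c
/* What would normally live in a shared header, e.g. foo.h (hypothetical). */
extern int foo;          /* declaration: "foo exists somewhere" */
int read_foo(void);

/* What would normally live in one .c file, e.g. foo.c (hypothetical). */
int foo = 42;            /* the single definition every reference binds to */

/* What would normally live in another .c file that includes foo.h. */
int read_foo(void) {
    return foo;          /* resolved to the definition above */
}
```

When the pieces are in separate files, each one compiles to its own object file with an unresolved reference to foo, and the linker fills in the address once it has seen the definition.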

Quote:
- What is the point of crt0.s? I saw SDCC and GCC have one.


http://en.wikipedia.org/wiki/Crt0

Quote:
- How can you specify the beginning address from which code starts? Like .org $9D95 in general-assemblers for z80 assembly?


Depends on the platform, but usually you can't. Why would you want to? Also note that usually executables are relocatable - the same exe isn't necessarily loaded to the same address. The entry point is relative, not absolute.

For super in depth detail of one of the more common executable formats (ELF), see here: http://www.skyfree.org/linux/references/ELF_Format.pdf

Quote:
- How can you specify links to syscalls that an OS already has?


Depends on the platform and OS. On a PC you link against the shared library provided by the OS. If you are actually trying to create that shared library, you use assembly to raise a software interrupt (or a dedicated syscall instruction), which is then handled by the kernel.

Quote:
- A bit unrelated, but what is a .a static library file?


Shared libraries are libraries that need to be present at runtime. Static libraries are libraries where the code in the library is actually inserted into the executable itself.

Quote:
- What are the purposes of .o files in SDCC and/or GCC?


See above where I talk about intermediate linker objects (in response to your first question).
Wow, that's really helpful, Tari! (and kllrnohj) I appreciate all of that. I understand most of it for now, and I'll try to make sense of the rest later today. :)
SDCC doesn't use .o files for z80 CPUs, it uses .rel files. This was done in the 3.0 release. '-c' means to make a .rel of the source, like
Code:
sdcc -mz80 --std-sdcc99 -c [input.c]
The output will be input.rel

Quote:
How can you specify the beginning address from which code starts? Like .org $9D95 in general-assemblers for z80 assembly?
SDCC uses the parameter '--code-loc [address]'. The address can be given in decimal (like 16465); as far as I recall, the SDCC manual also allows hexadecimal (like 0x8000).

Quote:
(SDCC only) how can you specify a .C file to be compiled into intel hex format (.hex) so it can be used for something else in a z80 toolchain (like convert it to a certain program format)?


Code:
sdcc --std-sdcc99 -mz80 --opt-code-size [--no-std-crt0] [--code-loc (deci.)] --data-loc 0 -o main.ihx main.c
--no-std-crt0 means that you must give the compiler your own crt0.rel, assembled like this:

Code:
sdasz80 -p -g -o crt0.rel crt0.s
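
If you end up writing your own converter for the .ihx output, the record format is simple: a colon, then a byte count, a 16-bit address, a record type, the data bytes and a checksum, all as hex digits. Here is a minimal sketch of the checksum and of formatting one data record. ihex_checksum and ihex_data_record are made-up helper names, not part of SDCC.

Code:
```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Checksum of an Intel HEX record: two's complement of the low byte of
 * the sum of all record bytes (count, address, type, data). */
uint8_t ihex_checksum(const uint8_t *bytes, size_t len) {
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + bytes[i]);
    }
    return (uint8_t)(0x100 - sum);
}

/* Format one data record (type 00) into `out`, which must hold at least
 * 2 * count + 12 characters. Returns the number of characters written. */
int ihex_data_record(char *out, uint16_t address,
                     const uint8_t *data, uint8_t count) {
    uint8_t raw[4 + 255];
    size_t total = (size_t)count + 4;
    raw[0] = count;
    raw[1] = (uint8_t)(address >> 8);
    raw[2] = (uint8_t)(address & 0xFF);
    raw[3] = 0x00; /* record type: data */
    for (size_t i = 0; i < count; i++) {
        raw[4 + i] = data[i];
    }
    int n = sprintf(out, ":");
    for (size_t i = 0; i < total; i++) {
        n += sprintf(out + n, "%02X", raw[i]);
    }
    n += sprintf(out + n, "%02X", ihex_checksum(raw, total));
    return n;
}
```

Reading a file back is the same thing in reverse: parse the hex digits of each line, check that all bytes including the checksum sum to zero modulo 256, and copy the data bytes to the given address.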

If you want to look at GlassOS's build system, I can post some before the beta

Lastly,
Quote:
What is the point of crt0.s? I saw SDCC and GCC have one.

SDCC needs a crt0 in order to know where to place the code sections, which is done with .org statements. For TIOS, the .org $9D95 and the .db statements would go in the crt0. (crt0 ~ asm header)
  