Introduction:

To get started in SH3 programming, a background in z80 or other low level programming is helpful, but it's not necessary. Also, the information contained herein is accurate to the best of my knowledge at the time of this posting, but may not be in the future. Be sure to double check other documentation if something doesn't seem to work.

Also, for future reference, the [current] standard documentation can be found at these links:

SH3 Hex equates
SH3 Assembly language documentation
7705 SH3 processor documentation

In order to understand SH3 hex, you need to understand some basic facts about the processor itself. The SuperH 3 processor is what's known as a 32 bit RISC processor. "32 bit" means that the processor likes to operate on 32 bit chunks of data. A bit is a binary 1 or 0 in memory, and 8 bits are grouped to form a "byte." Two bytes are grouped to form something called a "Word," which has 16 bits. Two words (4 bytes or 32 bits) are grouped together into a larger unit known as a "Longword" in SH3 Assembly.

RISC stands for Reduced Instruction Set Computing and refers to the design philosophy of the processor. Instead of providing a lot of complex and slow instructions, the designers of the chip decided to go for a smaller instruction set which could be processed faster. Unlike many other 32 bit chips, which have 32 bit instructions, each SH3 instruction is specified by a string of 16 bits, which increases the number of instructions per kilobyte of code. This has the side effect of making each instruction very fast (on the order of 2-4 times faster than similar instructions on other chips at the same clock frequency). For example, the instruction to load the value 4 into the R1 register is 1110000100000100. If this seems a bit impractical to program in, it is. No one wants to type in 16 1's and 0's every time they do anything. If one did, even a small program would look like this:
Quote:

11011000 00000110 11011001 00000110 11011010 00000111 10011011 00010000 10011100 00010000 10011101 00001101 00101010 11000001 00101001 10110001 00101000 11010001 00000000 00001001 00000000 00001001 00000000 00001011 00000000 00001001 11111111 11111111 11111111 10000000 11111111 11111111 11111111 10000100 11111111 11111111 11111111 10000110 00000000 00001001 00000011 00000000 01011010 01111010 10100101 01100101


If you were wondering, that particular code overclocks the processor. But instead of binary, something called hexadecimal is used. As it turns out, every sequence of four bits can be specified by a single hexadecimal digit (0-9 or A-F).



When you convert the previous binary to hex, it becomes much more manageable:

Quote:
D806 D906 DA07 9B10 9C10 9D0D 2AC1 29B1 28D1 0009 0009 000B 0009 FFFF FF80 FFFF FF84 FFFF FF86 0009 0300 5A7A A565


Which would you rather use?

In any case, hex editors, such as my personal recommendation, HxD, are readily available, while binary editors are not.
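If you'd like to check a binary-to-hex conversion like the one above without doing it by hand, here is a minimal Python sketch. The function name binary_to_hex_words is made up for this illustration, and it assumes the input is a whole number of 16 bit words.

```python
def binary_to_hex_words(bits: str) -> str:
    """Convert whitespace-separated binary digits into 16-bit hex words."""
    digits = "".join(bits.split())          # drop the spaces between bytes
    if len(digits) % 16 != 0:
        raise ValueError("expected a whole number of 16-bit words")
    words = []
    for i in range(0, len(digits), 16):
        # Each group of 16 bits becomes four hexadecimal digits.
        words.append(f"{int(digits[i:i + 16], 2):04X}")
    return " ".join(words)


sample = "11011000 00000110 11011001 00000110 00000000 00001001"
print(binary_to_hex_words(sample))   # D806 D906 0009
```

Every four bits collapse into one hex digit, which is why the hex listing is a quarter of the length.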

Each instruction operates on what are known as registers. There are 16 general purpose registers in the processor, labeled R0 through R15. The first 8, R0 through R7, are switched depending on which privilege mode the processor is operating in at the moment. The last 8, R8 through R15, are the same no matter what mode the processor is in. Of these registers, only two have any uses other than to hold data: R0 and R15. R0 is automatically accessed by some instructions and R15 is used as a stack pointer during exception handling. If there are interrupts present or the processor throws code errors, then R15 will probably be overwritten. There are also two other types of registers, called system registers and control registers. These have their uses, but they can have specific methods for access, and writing to them may cause unintended effects if not done carefully. We'll deal with specific registers as we come to them, but understand that any register that is not in the range R0 through R15 should not be used for holding data.

Now, to program in SH3 at present, you need to modify an existing add-in program, three of which are available here. I recommend the Conversion add-in because it is the easiest to modify. Remember to fill all of the code after address 7000 (except for the last four bytes) with repeating sequences of 00 09. If you use HxD, you can write the code in another document and copy/paste it into the add-in. After that, you must re-compute the security checksums, but that is beyond the scope of this article and will soon be rendered unnecessary.
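The 00 09 filling step can also be scripted instead of done by hand in a hex editor. The following is only a sketch: the name nop_fill is hypothetical, it assumes "address 7000" means hex offset 0x7000 from the start of the add-in file, and it does not touch the security checksums.

```python
NOP = bytes([0x00, 0x09])   # the 00 09 filler pair


def nop_fill(image: bytearray, start: int = 0x7000, keep_tail: int = 4) -> None:
    """Overwrite image[start:-keep_tail] in place with repeating 00 09 pairs."""
    end = len(image) - keep_tail
    if start % 2 or start > end:
        raise ValueError("start must be even and inside the image")
    length = end - start
    # Build enough filler, then trim it so the image keeps its exact size.
    image[start:end] = (NOP * (length // 2 + 1))[:length]


# Tiny stand-in for an add-in file, just to show the effect.
demo = bytearray(b"\xAA" * 12)
nop_fill(demo, start=4)
print(demo.hex(" "))   # aa aa aa aa 00 09 00 09 aa aa aa aa
```

For a real add-in you would read the file into a bytearray, call nop_fill on it with the default start, paste in your own code, and write the result back out.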

Programming:

Since all of the basics are out of the way, we can begin with the programming.

The first thing almost every program needs is to be set to an initial state. This is known as data loading and is very easy on most processors. Unfortunately, it can be a royal pain on the SH3 because of its RISC design. When you need to load a byte into a register Rn, you'd write Enii, where n is the number of the register in hex and ii is the byte of immediate data. For example, EA04 would load register R10 with 0x04. But, wait, R10 is a 32 bit register, right? It can't hold only 8 bits. Well, that's right, but the processor does something known as sign extension, where the byte is extended to 32 bits before being placed in the register. The highest bit of the byte, bit 7, holds the sign of the number and is copied into every higher bit of the register (bits 8 through 31). In this case, the sign bit is 0, so the result is positive: 0x00000004. If the byte were FF, then the register would hold 0xFFFFFFFF, which is the two's complement representation of -1.
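To make the Enii encoding and the sign extension concrete, here is a small Python sketch. The function names encode_mov_imm and sign_extend_byte are invented for this example; they only model the behavior described above and are not part of any toolchain.

```python
def encode_mov_imm(n: int, imm: int) -> str:
    """Build the Enii opcode that loads the byte imm into register Rn."""
    if not 0 <= n <= 15:
        raise ValueError("register number must be 0..15")
    if not 0 <= imm <= 0xFF:
        raise ValueError("immediate must fit in one byte")
    return f"{0xE000 | (n << 8) | imm:04X}"


def sign_extend_byte(imm: int) -> int:
    """Return the 32-bit register contents after loading the byte imm."""
    value = imm & 0xFF
    if value & 0x80:            # bit 7 set: fill bits 8..31 with ones
        value |= 0xFFFFFF00
    return value


print(encode_mov_imm(10, 0x04))            # EA04
print(f"{sign_extend_byte(0x04):08X}")     # 00000004
print(f"{sign_extend_byte(0xFF):08X}")     # FFFFFFFF
```

Any byte from 80 through FF therefore ends up as a negative 32 bit number, which matters if you were hoping to load a plain 0x000000FF.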

If you want to load a larger chunk of data, the method is more complicated. For loading words, the instruction is 9ndd. For longwords, the instruction is Dndd. The dd byte refers to a displacement, which is multiplied by 2 for words and by 4 for longwords. The product is then added to the program counter and the word or longword starting at that location is placed in Rn. Note that the program counter used here is the address of the instruction plus 4 and, for longwords, its lowest two bits are cleared so that the data always sits on a 4 byte boundary. If the instruction is loading a word, then it will sign extend the word as with loading bytes. Here's some example code:


Code:

---------------------------
| Address  | Instruction  |
| 0000     | DF01         |
| 0002     | 0009         |
| 0004     | 0009         |
| 0006     | 0009         |
| 0008     | FFFF         |
| 000A     | FF80         |
---------------------------


This will take the longword FFFF FF80 starting at address 0008 (0000 + 4 + 01*4) and place it in R15.
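Working out where a 9ndd or Dndd instruction actually reads from is easy to get wrong, so here is a hedged Python sketch of the address calculation. It assumes, per my reading of the SH3 documentation, that the program counter value used is the instruction's own address plus 4 and that longword loads clear its lowest two bits; the function name pc_relative_address is hypothetical.

```python
def pc_relative_address(instr_addr: int, disp: int, size: int) -> int:
    """Address read by 9ndd (size=2) or Dndd (size=4) located at instr_addr."""
    if size not in (2, 4):
        raise ValueError("size must be 2 (word) or 4 (longword)")
    pc = instr_addr + 4         # assumed: PC is already 4 bytes past the instruction
    if size == 4:
        pc &= ~0x3              # assumed: longword loads round PC down to a multiple of 4
    return pc + disp * size


# A Dndd with dd = 01 sitting at address 0000:
print(f"{pc_relative_address(0x0000, 0x01, 4):04X}")   # 0008
```

If a constant doesn't load as expected, comparing the address from this calculation with where the data really sits in the file is a quick first check.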

This only allows you to load constants from nearby locations within the program. It's also important to be able to import data from other places in the calculator. This is handled by another set of instructions, 6nm0, 6nm1, and 6nm2. These instructions move a byte, word, and longword, respectively, from the address contained in Rm to Rn. Rm is called a pointer because it "points" to the data. In reality, if you were to use this for its obvious purpose of importing sets of contiguous data, it'd be less efficient than another routine, but we'll get to that more efficient (and more obfuscated) method later.

So, let's write a program to import some data about the processor from the FRQCR register, which controls the processor clocks.


Code:

0009     // This just increments the program counter without doing anything.
D803     // Load the address of FRQCR (FFFF FF80)
6980     // Load R9 with the first byte of FRQCR
7801     // Increment the pointer R8
6A80     // Load R10 with the second byte of FRQCR
000B     // Return from the program. Always include this and always have 00 09 after it.
0009
0009
FFFF
FF80


If you couldn't tell, this program looks up the address of the FRQCR register (which is a 16 bit control register, hence the two byte imports) and then uses that address as a pointer in the other commands. Now, you could move a whole word into R9 with a single 6981 and be done faster, but that's beside the point. As you can see, importing data is a pain, but it's possible.
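Since reading raw hex gets tiring quickly, here is a rough Python sketch of a decoder for just the opcode patterns introduced so far (Enii, 9ndd, Dndd, 7nii, 6nm0 through 6nm2, 0009 and 000B). The function decode and its output wording are my own invention, not official mnemonics, and data words such as FFFF FF80 must be left out because the decoder can't tell them apart from code.

```python
def decode(word: int) -> str:
    """Describe one 16-bit opcode using only the patterns from this post."""
    n = (word >> 8) & 0xF          # destination register field
    m = (word >> 4) & 0xF          # source/pointer register field
    low = word & 0xFF              # immediate or displacement byte
    top = word >> 12               # leading hex digit selects the instruction
    if word == 0x0009:
        return "nop"
    if word == 0x000B:
        return "return"
    if top == 0xE:
        return f"R{n} = byte 0x{low:02X} (sign extended)"
    if top == 0x9:
        return f"R{n} = word at PC + 0x{low:02X}*2"
    if top == 0xD:
        return f"R{n} = longword at PC + 0x{low:02X}*4"
    if top == 0x7:
        return f"R{n} += 0x{low:02X} (sign extended)"
    if top == 0x6 and (word & 0xF) in (0, 1, 2):
        size = ("byte", "word", "longword")[word & 0xF]
        return f"R{n} = {size} at address in R{m}"
    return "unknown (not covered here)"


# Code words of the FRQCR example, without the trailing data.
program = [0x0009, 0xD803, 0x6980, 0x7801, 0x6A80, 0x000B, 0x0009]
for offset, word in enumerate(program):
    print(f"{offset * 2:04X}  {word:04X}  {decode(word)}")
```

Running it prints one line per instruction with its offset, which makes it easier to spot a wrong register nibble before the code goes into an add-in.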

More later, particularly on arithmetic and branch commands...
So, based on that excellent introduction, a few questions:
1) What is 0009? Why is it after the DF01 and before the FFFF \ FF80?
2) 7nDD is therefore addition of DD to register n? Is there no explicit inc/dec?
3) Whoops, I see that 0009 is just a nop. Why the nops in that first chunk of code then, as per (1)?

This reminds me a lot of the (RISC processor) MIPS that I learned to program in my Computer Architecture class.
There are a lot of grammatical errors in there... <.<

Anyway,

1) The NOPs are used as spacing. The Dndd command doesn't jump anywhere; it reads the longword at the location @($dd*4+PC) and stores it to Rn, where PC is already 4 bytes past the Dndd instruction itself (rounded down to a multiple of 4). In this case, $dd=1, so the CPU looks 1*4 bytes beyond that point, which takes it past the 00 09 padding and straight to FFFF FF80.

2) Well, there are four or five different ways to increment the pointer, the most common of which would be 6nm4, 6nm5, and 6nm6. However, they're kind of odd in that how much the pointer is incremented depends on what kind of data you're moving.
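As a rough illustration of how those three instructions treat the pointer, here is a Python sketch that models them on a plain byte string. The name load_post_increment is hypothetical, and the sketch assumes big-endian memory and that the pointer advances by 1, 2, or 4 bytes for 6nm4, 6nm5, and 6nm6 respectively.

```python
def load_post_increment(memory: bytes, pointer: int, opcode_low: int):
    """Model 6nm4/6nm5/6nm6: return (sign-extended value, updated pointer)."""
    size = {4: 1, 5: 2, 6: 4}[opcode_low]       # byte, word, longword
    raw = int.from_bytes(memory[pointer:pointer + size], "big")
    if size < 4 and raw & (1 << (8 * size - 1)):
        # Top bit of the loaded data is set: fill the upper register bits with ones.
        raw |= 0xFFFFFFFF & ~((1 << (8 * size)) - 1)
    return raw, pointer + size


data = bytes([0x03, 0x00, 0x5A, 0x7A])
value, ptr = load_post_increment(data, 0, 5)    # word load, as with 6nm5
print(f"{value:08X} {ptr}")                     # 00000300 2
```

Calling it again with the returned pointer walks through the data, which is the whole point of these instructions: no separate 7n01 is needed between loads.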
Fascinating, thanks for sharing. I look forward to your further segments of this! Do you have any other larger chunks of code? Does SH3 not have mnemonics for the opcodes and operands?
That example I posted in binary is the largest piece of working code I've written so far, though I do have larger routines that I'm currently working on. Mnemonics do exist though, as can be seen from the SH3 Hex equates link I posted under the standard documentation. The problem is that I've found no [working] SH3 assembler, so mnemonics are pretty much useless except as pseudo-code.
Qwerty.55 wrote:
That example I posted in binary is the largest piece of working code I've written so far, though I do have larger routines that I'm currently working on. Mnemonics do exist though, as can be seen from the SH3 Hex equates link I posted under the standard documentation. The problem is that I've found no [working] SH3 assembler, so mnemonics are pretty much useless except as pseudo-code.
Well, then an SH3 assembler is in order! Weren't you working on such a project recently, or was that a disassembler?
KermMartian wrote:
Does SH3 not have mnemonics for the opcodes and operands?

Your comment led me to actually look into the architecture, and I was pleasantly surprised to see it's a Renesas creation- I've gotten plenty of experience working with the RX core in the last week or so, and the SH3 looks like the core they place between R8C and RX, in that it's 16-bit where the R8C is 8-bit and RX is 32.

As all the Renesas cores are somewhat similar in feature set as far as I can tell (RX, for example, adds a FPU and 64-bit DSP accumulator as standard), I'm not terribly surprised to find that gcc supports SH3 as a target, and I assume (not having verified, but my memory wants to say so) that IAR's compiler also supports it and is available in a limited free version.

Qwerty.55 wrote:
The problem is that I've found no [working] SH3 assembler, so mnemonics are pretty much useless except as pseudo-code.

A quick look at the Renesas web site shows that HEW supports SH3, and I've now gone and verified that IAR also supports SH3, which formerly went by the name R16C, it seems. Neither of those seem to have particularly good free versions, but if gcc supports it, a properly configured toolchain will include a working assembler (gas) as well as the compiler, linker, etc.


Aside: why is it that it seems everyone who's a bit newer here treats assembly as only hex (or should I say, 'HEX') and ignores any higher level? I understand it a bit, but working in machine code is, IMHO, a gigantic waste of time.
The Tari wrote:
Aside: why is it that it seems everyone who's a bit newer here treats assembly as only hex (or should I say, 'HEX') and ignores any higher level? I understand it a bit, but working in machine code is, IMHO, a gigantic waste of time.
I cannot agree with you more strongly. I've been railing against this since Scout, Xeda, and quite a few other people seem convinced that they must deal in hex (HEX? Hex? hEx?) opcode equivalents instead of using the nice easy semi-English mnemonics. Superb call on those tools, Tari; I hope the Prizm hackers will give them a look-see.
Actually, now that you say "semi-English", I guess it somewhat starts to make sense. If you notice, Scout and a few others among the hEx group tend to not have English as their first language. So maybe it's just easier for them to use something they know (letters and numbers) rather than some shortened down English mnemonics.
_player1537 wrote:
Actually, now that you say "semi-English", I guess it somewhat starts to make sense. If you notice, Scout and a few others among the hEx group tend to not have English as their first language. So maybe it's just easier for them to use something they know (letters and numbers) rather than some shortened down English mnemonics.
I see your point, and it's certainly a good one, but if that's the case, it's trivial to write a new tasm80.tab (or equivalent file) that instead uses mnemonics more understandable in one's native language.
But, at that, no one has done that yet. And I honestly didn't even know/remember that tasm80.tab held the mnemonics :/ I'd be curious if releasing such a .tab file would cause some to switch from writing hex opcodes to using mnemonics.
The Tari wrote:

A quick look at the Renesas web site shows that HEW supports SH3, and I've now gone and verified that IAR also supports SH3, which formerly went by the name R16C, it seems. Neither of those seem to have particularly good free versions, but if gcc supports it, a properly configured toolchain will include a working assembler (gas) as well as the compiler, linker, etc.


Aside: why is it that it seems everyone who's a bit newer here treats assembly as only hex (or should I say, 'HEX') and ignores any higher level? I understand it a bit, but working in machine code is, IMHO, a gigantic waste of time.


I actually have HEW on my system, along with another SH3 compiler.

Unfortunately, I can't figure out how to use them...

In any case, having to compile the code in inline assembly, disassemble my own code and only then get to the part at which I can actually insert it into a modified add-in seems like a waste of time to me. I'm comfortable with Hex. Whether English or Hex, they're both foreign languages to my thought processes (although English is indeed my first language). Basically, Hex makes up for my lack of computer competency. If you prefer Assembly mnemonics, that's fine, but I'm personally more productive if I don't have to fight with a computer about syntax errors. Xeda has her own reasons for preferring hex. She can work with numbers better than symbolic words. Either way, it's about picking the language that suits the programmer.

By the way, judging by the speed I've seen out of Casio's stuff, I'm not inclined to use a C compiler for much of my work.
Qwerty.55 wrote:
In any case, having to compile the code in inline assembly, disassemble my own code and only then get to the part at which I can actually insert it into a modified add-in seems like a waste of time to me.

Fair enough, but I still feel like it ends up being a huge waste of time. I'm not very familiar with the Renesas toolchain packaged with HEW, but I do know my way around gcc. Here's a bit of source and makefile associated with one of my RX projects, which would probably translate with little work to target SH3 (this is actually the bootstrap for the C environment, but exactly what it does isn't important):

Code:
[=== reset.asm ===]
_PowerON_Reset:
    mvtc #_ustack, USP /* User stack pointer */
    mvtc #_istack, ISP /* Interrupt stack pointer */
    mvtc #_rvectors_start, intb /* Interrupt vectors */
    mvtc #100h, fpsw
/* Copy data segment into RAM */
    mov #_mdata, r2
    mov #_data, r1
    mov #_edata, r3
    sub r1, r3
    smovf
...

[=== makefile ===]
reset.o: reset.asm
   rx-elf-gcc -Wa -nostdinc -g2 -g -mlittle-endian-data -gdwarf2 -c -x assembler-with-cpp -o reset.o reset.asm

From there, it's probably just a matter of the correct linker script and maybe a single custom program to pull out your code and stuff it into whatever container the calculator expects it to be in.
I'd volunteer to help with setting that up, but I don't know anything about what the Prizm expects- if you can point me at some documentation, I could probably figure it out.

Qwerty.55 wrote:
By the way, judging by the speed I've seen out of Casio's stuff, I'm not inclined to use a C compiler for much of my work.

What makes you say that? I know from experience that compilers targeting Renesas chips do a pretty good job of generating fast code, and about the only optimization I've spotted while running through disassemblies involve violating the ABI in order to avoid dumping parameters to stack. Compared to, say, PIC14 or PIC16, the SH3 is extremely friendly towards the usual C implementation methods.
I think there's a lot going on under the hood that you don't know about, because I very much doubt that Casio's programmers or whatever toolchain they used is/are really that incompetent.
Let's put it this way: 28 KB header for add-ins, about a third of which is blank space. Thousands upon thousands of bytes that do absolutely nothing in their flipbook files. A 58 MHz processor that runs equivalent BASIC programs as fast as a 15 MHz z80. A mouse cursor that moves slower than any other mouse cursor I've ever seen. Sure, they're great at math, but that's about it from what I can tell.

Of course, the calculator might be computing solutions to the Riemann Hypothesis and I just don't realize it, but I very much doubt it. There is without a doubt a lot going on under the hood of the Prizm. I understand that from the virtual memory disassemblies, which reveal just how many system calls are being made by every program. But the processor is more than fast enough to handle any reasonable routine given the output of the device.
Fair enough.
Despite whatever terrible performance Casio's software has, though, your code shouldn't be that bad with any commonly available toolchain.
Just as an example of a simple optimization, many of the times Casio's code branches, the branch is followed in the pipeline by a NOP, not the compatible instruction immediately preceding it. That would be an incredibly simple optimization, yet it's often not implemented. Just an aside.
Qwerty.55 wrote:
Just as an example of a simple optimization, many of the times Casio's code branches, the branch is followed in the pipeline by a NOP, not the compatible instruction immediately preceding it. That would be an incredibly simple optimization, yet it's often not implemented. Just an aside.
Maybe their C compiler doesn't support optimizations like that?
Possibly, but considering all the insane assembly optimization tips that good modern assemblers perform, I'd think that unlikely.
KermMartian wrote:
Possibly, but considering all the insane assembly optimization tips that good modern assemblers perform, I'd think that unlikely.
That assumes they are using a good modern assembler. Look at TI: there are tons of space and speed optimizations they could have done but didn't, so I doubt Casio would be any more careful.
Actually, as far as localized optimizations, the TI-OS isn't that badly written. My issue with it (I'm sure BrandonW would have a more precise or possibly different viewpoint) is that I think from a 5,280-foot view, some of their design decisions are real head-scratchers, and seem unnecessarily convoluted and complex. Of course, there are the depressingly-hilarious BCalls added in the later versions that Brandon uncovered, but that's a different matter.
  