- Assembling eZ80 code with flat assembler g
- 13 Jul 2017 03:13:37 am
- Last edited by jacobly on 27 Sep 2021 02:52:11 pm; edited 3 times in total
After encountering numerous bugs in spasm and simply not being able to fix them all, I found out that the creator of flat assembler (fasm), an x86 assembler written in x86 assembly, started a new project called flat assembler g (fasmg), which is a generic assembler written in x86 assembly.
Many commands useful for writing assembly-like programs are built into fasmg, but the target-specific instructions and directives themselves are not. These can be implemented with macros; for example, a nop macro might look like:
Code:
macro nop
db 0
end macro
This causes any line of code consisting of nop to be assembled to a single zero byte.
I have implemented ez80.inc, which defines macros for all the ez80 instructions and some ez80-specific directives; tiformat.inc, which allows outputting files in 8xp and similar formats; and ti84pceg.inc, a wrapper around spasm's ti84pce.inc. You can get the latest version of these files from GitHub, and you can get fasmg near the bottom of this page.
The main advantage of fasmg over spasm is a plethora of extra features, such as virtual output areas, loop commands, and forward-referenceable variables. The main disadvantage is that fasmg is an order of magnitude slower than spasm at parsing instructions, because each one has to go through an interpreted macro. For example, my test involving all 4296 ez80 instructions takes 0.1 seconds to assemble with fasmg, but 0.01 seconds with spasm. Also, fasmg does not support listing out of the box.
You can look at the hello world test to get started.
Some differences from spasm to keep in mind
In spasm, preprocessor commands start with #, but fasmg does not have a preprocessor, so the corresponding commands are not prefixed with #.
In spasm, directives start with a dot. In fasmg, directives omit the dot, because identifiers starting with a dot are used for local labels, which are used like this:
Code:
PutStrLen:
ld b,(hl)
.loop:
inc hl
ld a,(hl)
call _PutC
djnz .loop
ret
This example creates a label which can be referenced with PutStrLen.loop anywhere, and with .loop anywhere between PutStrLen: and the next non-local label definition.
Be careful reusing instruction/register names as other symbols in fasmg: things like add = 5 and ld: won't compile because they look like instructions, and a = 42 will hide the symbol a so that you can't use that case variant to refer to the a register anymore.
While spasm will sometimes silently drop forward references, fasmg allows you to forward reference anything that is assigned exactly once (including labels, which can't be defined multiple times). For anything else, you get the most recently assigned value, or an error if it has not been assigned yet at that point.
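A minimal sketch of those forward-reference rules, using only fasmg built-ins (no ez80.inc needed). The symbol names once and counter are made up for illustration:
Code:
db once ; fine as a forward reference, because once is assigned exactly once (below)
once = 5
counter = 1
db counter ; emits 1, the value most recently assigned above this line
counter = 2
db counter ; emits 2
; A "db counter" placed before "counter = 1" would be an error instead,
; because counter is assigned more than once and has no value yet at that point.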
Fasmg does not support inline macros; instead, you can call a macro that defines a result:
Code:
#define sum(a,b) a+b
ld a,sum(1,2)
becomes
Code:
macro sum r,a,b
r = a + b
end macro
sum temp,1,2
ld a,temp
Fasmg does not parse backslash escapes in strings. Instead, define them outside a string using numbers or equates. To include the quote delimiter character (" or ', whichever one you happened to use; they are interchangeable) inside a string, simply double it (e.g. db 9,"a'b""c",'d"e''f',13,10,0 produces the C string "\ta\'b\"cd\"e\'f\r\n").
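As a small sketch of the same idea with named values instead of raw numbers (tab and newline are hypothetical names chosen here, not something defined by fasmg or ez80.inc):
Code:
tab = 9
newline = 10
db tab, "it's ""quoted""", newline, 0 ; bytes: 09, then it's "quoted", then 0A, 00
db 'it''s "quoted"', 0 ; the same text, using ' as the delimiter instead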
In fasmg, a line that ends with \ continues onto the next line. Unlike spasm, \ cannot be used to put multiple lines into one line (though, as usual, you could always just write a macro to support that if you wanted it).
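For instance, a long data definition can be split like this (a trivial sketch; the backslash has to be the last thing on the line):
Code:
db 1, 2, 3, \
4, 5, 6 ; assembled as the single line: db 1, 2, 3, 4, 5, 6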
Spasm strictly evaluates things from left-to-right with signed 32-bit integer arithmetic, but fasmg has a slightly unusual order of operations on linear polynomials with infinite precision 2-adic integer coefficients.
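A few checks that illustrate the difference, as I understand the precedence levels listed in the quick reference at the end of this post. These use only built-in fasmg commands, and each assert is expected to pass:
Code:
assert 1 + 2 shl 3 = 17 ; shl binds tighter than +, so this is 1 + (2 shl 3); strict left-to-right would give 24
assert 2 * 3 and 1 = 2 ; and binds tighter than *, so this is 2 * (3 and 1); strict left-to-right would give 0
assert 1 shl 100 shr 100 = 1 ; intermediate results do not wrap around at 32 bits
assert (-1) and 0FFh = 0FFh ; negative numbers behave as if sign-extended indefinitely
When in doubt, parentheses remove any dependence on the precedence rules.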
Porting spasm commands to fasmg
Code:
; Each group lists the spasm form(s) first, followed by the fasmg equivalent.
#include "file"
include "file"

#import "file"
file "file"

#comment
#end comment
if 0
end if

#define x y
#defcont z
define x y \
z

#if x
#elif y
#else
#endif
if x
else if y
else
end if

#ifdef x
#endif
if defined x
; note that this will still run even if x is defined (once) later, so do not use for include guard
end if

#ifndef x
#endif
if ~defined x
; note that this will not run even if x is defined (once) later, so do not use for include guard
end if

#undef x
#undefine x
while defined x
restore x
end while

#macro x(y,z)
#endmacro
macro x y,z
end macro

.addinstr ...
; You have to write a macro that parses the arguments and emits the correct code.

.block x
.fill x
rb x

.byte x
.db x
db x

.word x
.dw x
dw x

.long x
.dl x
dl x

.echo "str"
display "str",10

.echo 123
repeat 1,x:123
display `x,10
end repeat

.end
; Ignored in spasm anyway

.error "string ",5
repeat 1,x:5
err 'string ',`x
end repeat

.fill x,y
db x dup y

.list
.nolist
; fasmg does not support listing out of the box, but surprisingly it could be done with macros.

.org userMem-2
.db tExtTok,tAsm84CeCmp
; This is done automatically in tiformat.inc for executable files.

.org x
org x

.seek x
.db y
store y : byte at x

x .equ y
x = y
Extra features in ez80.inc
jq works just like jr/jp, except that it outputs nothing if you are jumping to the address immediately following the instruction, a jr if the target is in range, and a jp otherwise. Because fasmg is a multi-pass assembler, it should generally find the optimal solution to this (e.g. if converting a jp to a jr brings another jp into jr range). Please don't use PC-relative targets (e.g. jr $+3) around a jq, as you may not actually know what size the jq will end up being!
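A rough sketch of the three outcomes, going only by the description above. It assumes ez80.inc has already been included, and the labels start, skip, and far_away are hypothetical:
Code:
start:
jq skip ; the target is the very next address, so nothing should be emitted
skip:
jq nz, start ; short backwards jump, so a jr should be chosen
jq far_away ; more than 127 bytes ahead, so a jp should be chosen
rb 200
far_away:
ret
Because the size of each jq depends on where its target ends up, refer to positions near a jq by label rather than by offsets from $.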
A contiguous sequence of push/pop instructions can be coalesced to a single instruction with each register a separate argument, for example:
Code:
push af, bc, de, hl
call routine
pop hl, de, bc, af
ret
is the same as
Code:
push af
push bc
push de
push hl
call routine
pop hl
pop de
pop bc
pop af
ret
Instructions that take a bit index as an argument accept numbers outside the range 0-7 in certain cases. For example, bit 8,hl is the same as bit 0,h and bit -1,(ix-10) is the same as bit 7,(ix-11). Note that things that don't correspond to a single instruction like bit 12,(hl) and bit 16,hl won't assemble.
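My reading of the memory-operand example, which is not necessarily how ez80.inc computes it, is that the low three bits of the index select the bit and the rest of the index, divided by 8, is added to the displacement. The arithmetic for bit -1,(ix-10) can be checked with plain fasmg expressions. The names index, disp, and low3 are made up here, and deliberately avoid register names such as b and d:
Code:
index = -1
disp = -10
low3 = index and 7 ; bit number within the selected byte
assert low3 = 7
assert disp + (index - low3) / 8 = -11 ; so the operand becomes bit 7,(ix-11)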
ZDS pseudo instructions are currently supported, but may be moved to a separate file in the future.
FASMG Quick Reference
Code:
Each line is a precedence level. Except where specified otherwise, operators within a precedence level are evaluated left to right.
Logical expressions:
~ unary logical not, arguments can also be
& | binary logical conjunction and disjunction
All the operators below this line can only have basic expressions as arguments
= < > <= >= <> eq eqtype relativeto comparison, <> is not equal, eq is like = but returns false for uncomparable operands, eqtype returns true if both operands are the same (algebraic, string, float) type, relativeto returns true if both operands are comparable (differ by a constant)
defined used true if the basic expression to the right is entirely defined, true if the variable name to the right is used
Basic expressions:
lengthof elementsof float trunc length of a string in characters, length of a poly in terms, int to float, float to int
sizeof elementof scaleof metadataof rtl size associated with a label, poly op idx
element scale metadata idx op poly
not bsf bsr complement, index of lowest set bit, index of highest set bit
shl shr bswap shifts, byte swap (second arg is the size of the value to swap, in bytes)
xor and or bitwise
mod remainder
* /
+ - includes unary!
string converts a number to a string (strings are usually converted to numbers implicitly, or explicitly with a unary + operator)
Builtin symbolic variables:
<name>#<name> ; concatenates two names into a single identifier, but each side may get replaced individually if it matches a parameter name
$ ; current pc
$$ ; base of the current output section (address passed to last org)
$@ ; address after last non-reserved data
$% ; offset within output file, does not work with tiformat.inc
$%% ; offset after last non-reserved data within output file, does not work with tiformat.inc
%t ; assembly timestamp
__file__ ; current file
__line__ ; current line
Commands:
org <basic expr> ; start a new output area to appear next in the file and assembled starting at address <basic expr>
section <basic expr> ; same as org but do not output reserved bytes, does not work with tiformat.inc
virtual [at <basic expr>] ; start a new output area that does not get output to the file
end virtual ; restore the previous output area
<name>:: ; creates an area label that references the current output area
load <name>[ : <size>] from [<area label> : ]<addr> ; loads <size> (defaults to sizeof <addr>) outputted bytes from address <addr> in output area <area label> (defaults to current output area) and store in variable <name>
store <name>[ : <size>] at [<area label> : ]<addr> ; stores the value of <name> as <size> (defaults to sizeof <addr>) bytes at address <addr> in output area <area label> (defaults to current output area)
db <basic expr 1>[, <basic expr 2>...] ; define 1-byte values
rb <basic expr> ; reserve <basic expr> 1-byte locations
dw <basic expr 1>[, <basic expr 2>...] ; define 2-byte values
rw <basic expr> ; reserve <basic expr> 2-byte locations
dl <basic expr 1>[, <basic expr 2>...] ; define 3-byte values
rl <basic expr> ; reserve <basic expr> 3-byte locations
dd <basic expr 1>[, <basic expr 2>...] ; define 4-byte values
rd <basic expr> ; reserve <basic expr> 4-byte locations
emit|dbx <size> : <basic expr 1>[, <basic expr 2>...] ; define <size>-byte values
; If any reserve or definition is preceded by a name with no colon, that name gets defined as a label to the first item, with a sizeof the item size
; Inside any definition:
<basic expr 1> dup <basic expr 2> ; repeats <basic expr 2> <basic expr 1> times, to include more than one value in the repetition, enclose them with <>
<name> equ <anything> ; define an arbitrary text substitution for a symbol, remembering the previous value
<name> reequ <anything> ; same as equ but discards the previous value
define <name> <anything> ; same as equ, but <anything> is not checked for recursive substitutions until use
redefine <name> <anything> ; same as define but discards the previous value
<name> = <basic expr> ; assign a value to a symbol, discarding current value
<name> =: <basic expr> ; assign a value to a symbol, remembering the current value
<name> := <basic expr> ; assign a value to a constant symbol only once, attempts to redefine will error, therefore it can always be forward referenced
<name>: ; like <name> := $
label <name>[ : <size>][ at <addr>] ; defines a constant symbol with <size> size at address <addr> (defaults to $)
restore <name 1>[, <name 2>...] ; restore the previously remembered value for each symbol
; namespaces can be created by assigning anything to any symbol
namespace <name 1> ; switches to, but does not create, the namespace associated with the symbol <name 1>
; all new symbols <name 2> defined in here can be referenced outside the namespace block with <name 1>.<name 2>
end namespace
macro <name> [<param 1>[, <param 2>...]] ; defines a macro, remembering the current contents
; macro body, parameters get substituted with their values every time the macro is executed
end macro
purge <name> ; restores the previously remembered contents of a macro
struc <name> [(<label name>) ][<param 1>[, <param 2>...]] ; defines a labeled macro, remembering the current contents
; macro body, parameters get substituted with their values every time the macro is executed, Both . and <label name> refer to the label
end struc
restruc <name> ; restores the previously remembered contents of a labeled macro
; macro and struc args can be suffixed with * to mean required, : <basic expr> to give a default value if not specified, and & on the last argument means it takes on the value of all the remaining arguments
local <name 1>[, <name 2>...] ; makes the specified symbols local to the current macro or struc invocation
esc macro ... ; exactly like macro, but does not require an extra end macro to end the enclosing macro
esc end macro ; exactly like end macro, but does not close the enclosing macro even if there was no opening macro
[else ]if <cond expr>
; run these commands if <cond expr> is true, with else only if the previous command was false or didn't match
[else ]match <pattern>, <anything>
; run these commands if <pattern> matches <anything>, with else only if the previous command was false or didn't match
[else ]rmatch <pattern>, <anything>
; run these commands if <pattern> matches <anything>, discarding context, with else only if the previous if was false or match didn't match
else
; run these commands if previous if was false or match didn't match
end if|match|rmatch ; ends if, match, or rmatch, use whichever was used last (before the optional else)
while <cond expr>
; run these commands while <cond expr> is true
end while
repeat <basic expr>[, <name 1> : <basic expr 1>[, <name 2> : <basic expr 2>...]]
; run these commands <basic expr> times, symbols start at their initial value and go up by one each iteration
end repeat
iterate|irp <name>[, <first>[, <second>...]]
; run these commands once for each listed value, with <name> replaced by that value
end iterate|irp
irpv <name 1>, <name 2>
; run these commands with name 1 equal to each remembered value of name 2, starting from the oldest, only works with define/equ
end irpv
; Inside all looping commands:
% ; current iteration starting from 1
%% ; total iterations
indx <basic expr> ; switches to a different iteration index
break ; break out of loop
include '<file>' ; assembles <file> at the current location in the source
file '<file>'[ : <start>[, <size>]] ; outputs <size> (defaults to entire file) bytes from a file starting at byte <start> (defaults to beginning)
display <basic expr 1>[, <basic expr 2>...] ; outputs strings as strings and numbers as characters to stdout
eval <basic expr 1>[, <basic expr 2>...] ; same syntax as display, evaluates the concatenation of all the arguments as assembly code at the current location in the source
err <basic expr 1>[, <basic expr 2>...] ; same syntax as display, displays a custom error that causes assembly to fail if this is the last pass
assert <cond expr> ; causes assembly to fail if <cond expr> is false on the last pass
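To tie a few of these commands together, here is a short sketch that uses only fasmg built-ins. The names squares, i, first, and value are arbitrary choices for this example:
Code:
; table of squares: i counts 0..15 while % would count 1..16
squares:
repeat 16, i:0
db i*i
end repeat
; a virtual area assembles data that never reaches the output file
virtual
db 'abc'
load first : byte from $$ ; read back the first byte assembled in this area
end virtual
assert first = 61h ; 61h is the character code of 'a'
; iterate substitutes each listed value for its parameter in turn
iterate value, 1, 2, 3
db value
end iterate
The virtual/load pair is the usual way to compute something from assembled bytes without emitting them, which is one of the features spasm has no direct counterpart for.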
Please ask any questions in this thread about things that don't work as you expect, or about how best to do some specific idiom, or port some snippet of spasm code, since many of fasmg's most powerful features are not very obvious, and certainly very different from other assemblers. I will update this post with things as I think of them or as they come up.