- hg development thread
- 05 Oct 2025 04:36:34 pm
- Last edited by euphory on 07 Oct 2025 06:13:50 pm; edited 1 time in total
This thread will describe the development of the hg scripting language and interpreter for the CE.
Following on from here, I now have most of the interpreter's basic tasks working, with properly functioning functions coming up fast.
The memory is structured as follows:
The C API includes a function hg_Init(void *arena, size_t arena_size) that sets up a caller-provided arena for the entire hg environment. This memory is then managed internally and never grows or shrinks. The arena is passed in like so:
Code:
void *arena = malloc(ARENA_SIZE);
hg_Init(arena, ARENA_SIZE);
At the end, this memory can be freed. Internally, it is managed in two structures: the heap, which is formatted as a singly-linked list starting at the start of the arena, and the stack, which grows downwards from the top of the arena. All objects (of type hg_expr_t) live in the heap, with the exception of the singletons hg_err and hg_nil, which are statically allocated (because otherwise they might be garbage collected, which would be a disaster). The stack contains bindings, each of which is simply two pointers: one to the name of the variable, and the other to its value.
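To make the layout above a bit more concrete, here is a rough desktop-C sketch of an arena with a heap at the bottom and a binding stack growing down from the top. It is not hg's actual code: every sketch_ name is hypothetical, and the heap is reduced to a plain bump pointer rather than the singly-linked list hg uses. The only details taken from the post are that the stack starts at the top of the arena and that a binding is two pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical binding record: the post only says a binding is two pointers. */
typedef struct {
    const void *name;   /* points at the variable's name */
    void *value;        /* points at the variable's value */
} sketch_binding_t;

/* Hypothetical bookkeeping; hg's real internal state is not shown in the post. */
typedef struct {
    unsigned char *heap_top;      /* first free byte above the heap */
    sketch_binding_t *stack_top;  /* lowest binding currently on the stack */
} sketch_env_t;

/* Split a caller-provided arena: heap at the bottom, stack at the top. */
static void sketch_init(sketch_env_t *env, void *arena, size_t arena_size) {
    unsigned char *base = arena;
    size_t slack;
    assert(env != NULL && arena != NULL);
    /* Round the top down so binding slots are aligned on a desktop target. */
    slack = (size_t)((uintptr_t)(base + arena_size) % _Alignof(sketch_binding_t));
    assert(arena_size >= slack);
    env->heap_top = base;
    env->stack_top = (sketch_binding_t *)(void *)(base + arena_size - slack);
}

/* Bump-allocate n bytes from the bottom; NULL if it would hit the stack. */
static void *sketch_heap_alloc(sketch_env_t *env, size_t n) {
    void *p = env->heap_top;
    if ((size_t)((unsigned char *)env->stack_top - env->heap_top) < n)
        return NULL;
    env->heap_top += n;
    return p;
}

/* Push one binding; returns 0 on success, -1 if the stack would hit the heap. */
static int sketch_push_binding(sketch_env_t *env, const void *name, void *value) {
    if ((size_t)((unsigned char *)env->stack_top - env->heap_top)
            < sizeof(sketch_binding_t))
        return -1;
    env->stack_top--;               /* the stack grows downwards */
    env->stack_top->name = name;
    env->stack_top->value = value;
    return 0;
}
```

Because both regions share one fixed block, running out of memory is simply the point where the two pointers would cross, which is why both functions check the gap before moving.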
The hg_expr_t is defined like so:
Code:
typedef struct hg_expr_t {
    union {
        uint24_t tag;             /* type tag lives in the top bits */
        struct hg_expr_t *ptr;    /* raw pointer for cons cells */
    } car;
    union {
        void *ptr;
        struct hg_expr_t *(*fn)(struct hg_expr_t *, struct hg_expr_t *);
        int24_t num;
        char chars[3];            /* three characters of a string cell */
    } cdr;
} hg_expr_t;
Tags are defined by bits 0-3 of the car, counting from the most significant bit (that is, the top hex digit of the 24-bit value). This works because on the CE, every writeable address that could really be part of the hg environment has a pointer that starts with $d0-$d6. So if the car begins with something other than 1101 = $d, we can restore it to a pointer just by changing those first few bits; they store no necessary information. A cons cell is any hg_expr_t whose car is a raw pointer, so we can just check whether the tag starts with $d.
Strings' tags start with $c, and their cars are also pointers: thus, string pointer dereferences and stores are preceded by flipping bit 3. Strings themselves are stored in the heap as a series of string cells, each of which contains 3 chars in the cdr; if none of those is a string-terminating null, the car points to the next cell in the string.
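The tag check and the bit flip described above can be sketched in portable C as follows. This is only an illustration under stated assumptions: a 24-bit CE address is modelled as the low 24 bits of a uint32_t, and every sketch_ and SKETCH_ name is made up for the example rather than taken from hg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 24-bit car value held in the low 24 bits of a uint32_t (sketch only). */
typedef uint32_t sketch_car_t;

#define SKETCH_TAG_SHIFT 20u                        /* top hex digit of 24 bits */
#define SKETCH_TAG_CONS  0xDu                       /* raw pointer, $d0xxxx-$d6xxxx */
#define SKETCH_TAG_STR   0xCu                       /* string cell */
#define SKETCH_STR_FLIP  (1u << SKETCH_TAG_SHIFT)   /* the bit where $d and $c differ */

/* Extract the four tag bits. */
static unsigned sketch_tag(sketch_car_t car) {
    return (car >> SKETCH_TAG_SHIFT) & 0xFu;
}

/* A cons cell's car is an untouched pointer, so its tag is $d. */
static bool sketch_is_cons(sketch_car_t car) {
    return sketch_tag(car) == SKETCH_TAG_CONS;
}

static bool sketch_is_string(sketch_car_t car) {
    return sketch_tag(car) == SKETCH_TAG_STR;
}

/* Turn a raw pointer into a string-tagged car by flipping one bit. */
static sketch_car_t sketch_string_car_from_ptr(sketch_car_t ptr) {
    assert(sketch_is_cons(ptr));    /* must start out as a raw $d pointer */
    return ptr ^ SKETCH_STR_FLIP;
}

/* Recover the raw pointer from a string-tagged car by flipping it back. */
static sketch_car_t sketch_string_car_to_ptr(sketch_car_t car) {
    assert(sketch_is_string(car));
    return car ^ SKETCH_STR_FLIP;
}
```

For example, the address $d1a2b3 would be stored in a string cell as $c1a2b3 and flipped back to $d1a2b3 before being dereferenced.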
Both of these solutions are inspired by the fe interpreter, but this platform-specific assumption in storing strings allows for a 50% increase in memory efficiency. (fe's assumption that all cells are aligned to four bytes, which it uses to unite the tags and pointers in the cars, is not really viable with six-byte-wide CE cells.) So far, the binary is under 4 kB.
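For anyone who wants to see the string-cell chaining spelled out, here is a small portable sketch of it. It is not hg's implementation: the cell type uses an ordinary next pointer in place of the tagged 24-bit car, cells come from a caller-supplied array instead of the hg heap, and all sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified string cell: a plain pointer stands in for the tagged car. */
typedef struct sketch_cell {
    struct sketch_cell *next;   /* next cell, or NULL once a null byte is stored */
    char chars[3];              /* three characters per cell, as in the post */
} sketch_cell_t;

/* Store s into consecutive cells; returns cells used, or 0 if ncells is too few. */
static size_t sketch_string_write(const char *s, sketch_cell_t *cells, size_t ncells) {
    size_t used = 0;
    assert(s != NULL && cells != NULL);
    while (used < ncells) {
        sketch_cell_t *c = &cells[used++];
        int done = 0;
        c->next = NULL;
        for (int i = 0; i < 3; i++) {
            c->chars[i] = done ? '\0' : *s;   /* pad with nulls after the end */
            if (*s == '\0') done = 1; else s++;
        }
        if (done) return used;                /* terminator stored, chain ends */
        if (used < ncells) c->next = &cells[used];
    }
    return 0;                                 /* ran out of cells */
}

/* Copy a chained string into buf; returns its length, or (size_t)-1 on failure. */
static size_t sketch_string_read(const sketch_cell_t *cell, char *buf, size_t cap) {
    size_t n = 0;
    assert(buf != NULL);
    while (cell != NULL) {
        for (int i = 0; i < 3; i++) {
            if (n >= cap) return (size_t)-1;  /* buf too small */
            buf[n] = cell->chars[i];
            if (cell->chars[i] == '\0') return n;
            n++;
        }
        cell = cell->next;                    /* no null in this cell: follow the link */
    }
    return (size_t)-1;                        /* chain ended without a terminator */
}
```

With this scheme "hello" takes two cells ("hel", then "lo" plus the terminator), and a string whose length is an exact multiple of three needs one extra cell just to hold the null.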
Next up: conditionals!
Git repo: https://tangled.org/@euphory.gay/hg


