Introduction
For the last year, I have been working on Claw, a development system for embedded devices.
It has a very small RAM footprint and can run on systems with as few as 2 KB of RAM.
The VM's own memory footprint is tiny, so most of the RAM remains available to applications.
Claw will probably support multitasking in its operating system releases, and will also support CEFS (the Claw embedded read-only file system) and FAT32 for loading (and storing) files.
The files are streamed and can therefore be as large as 4 GB on 32-bit VMs or 64 KB (per executable) on 16-bit systems. 64-bit systems are supported as well, though 4- and 8-bit systems are not.
Claw features an assembly language and a high-level language that compile to CXE executables, which Claw can execute on any platform, as Claw executables are binary compatible!
So you focus on your code, and write and compile it once. Claw will deal with libraries and device-specific hardware! This freedom is of course still limited by hardware factors (e.g. a program needs a screen to display graphics, and processor speed and RAM size still matter).

Memory
Claw is entirely stack-based and uses three main stacks. The first is the call stack, which stores the caller's address every time you call another function. As Claw has file system support built in, you can of course call functions from libraries dynamically. Dynamic linking is performed by Claw automatically, so you don't need to worry about library versions. The next stack is the work stack. It stores the function arguments, is used for arithmetic and other instructions, and can be accessed at arbitrary positions. The third stack is the array pool, which programs use to allocate blocks of memory for storing data. New memory can be allocated at any time but, as with any stack, only the most recently allocated array can be deallocated. There cannot be holes in the memory map.
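
To make the array pool's "only the last array can be deallocated" rule concrete, here is a minimal Python sketch. It is not Claw's implementation: the class name ArrayPool, the methods alloc and free_last, and the 64-byte pool size are all made up for illustration.

```python
class ArrayPool:
    """Hypothetical LIFO array pool: arrays sit back to back in one buffer."""

    def __init__(self, size):
        self.memory = bytearray(size)  # fixed backing store, sized up front
        self.blocks = []               # (offset, length) of each live array
        self.top = 0                   # offset of the first free byte

    def alloc(self, length):
        """Reserve `length` bytes on top of the pool and return their offset."""
        if self.top + length > len(self.memory):
            raise MemoryError("array pool exhausted")
        offset = self.top
        self.blocks.append((offset, length))
        self.top += length
        return offset

    def free_last(self):
        """Release the most recently allocated array; nothing else can be freed."""
        if not self.blocks:
            raise IndexError("array pool is empty")
        offset, _length = self.blocks.pop()
        self.top = offset


pool = ArrayPool(64)
a = pool.alloc(16)
b = pool.alloc(8)
pool.free_last()   # releases b, the newest array
c = pool.alloc(4)  # lands where b used to start, so no hole appears
```

Because only the newest array can go away, the free space is always one contiguous region above `top`, which is why the memory map never gets holes.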

Supported platforms
The architectures and systems that I will port Claw to include, but are not limited to, 32- and 64-bit Intel and AMD computers running Windows, Linux or BSD; AVR (especially Arduino Uno and Mega); MIPS (routers); ARM and Xtensa (ESP8266). Somebody might want to do a port to the TI-84+CE, and I might do a port to the TI-84+ if I find time.

Online IDE
An online IDE with an online compiler is also planned, as well as a standalone, offline compiler and assembler that run on Windows, Linux, and BSD. The assembler also works on Mac OS X.

Open source
Claw is an open source project and is on GitHub: https://github.com/muessigb
The chosen license is as permissive as it gets: The "new" / "revised" 3-clause BSD license.
This means Claw can be used free of charge, and can be modified and used in closed source and commercial projects, as long as the name is not misused and I get credited appropriately.

Development status
I have completed most of the assembler so far and a good part of the VM. The compiler is just a draft so far, but will be started soon. Most parts of Claw that are not yet made are already documented and planned but just not yet written.

Have fun!
This looks promising. I will wait until tests/demos are released.
This sounds awesome. But I'm a little perplexed by the choice of memory model. Most languages/VM's go for the simple one call/scoped data stack (per thread) and one heap for non-scoped data. It seems like splitting return pointers, scoped data, and arguments (which are really just scoped data) across three stacks would be inefficient, especially if the executable were to be compiled rather than interpreted. It also forces you to pick fixed sizes for each stack ahead of time, which makes it more likely that any single stack could overflow. And while a heap is a more complicated data structure than a stack and is more expensive to maintain, being able to allocate non-scoped data seems essential for any mildly advanced program.
Runer112 wrote:
This sounds awesome. But I'm a little perplexed by the choice of memory model. Most languages/VM's go for the simple one call/scoped data stack (per thread) and one heap for non-scoped data. It seems like splitting return pointers, scoped data, and arguments (which are really just scoped data) across three stacks would be inefficient, especially if the executable were to be compiled rather than interpreted. It also forces you to pick fixed sizes for each stack ahead of time, which makes it more likely that any single stack could overflow. And while a heap is a more complicated data structure than a stack and is more expensive to maintain, being able to allocate non-scoped data seems essential for any mildly advanced program.

While you are right about this, the array pool is shared across the individual processes and might be reimplemented with block-based rather than stack-based allocation. All stacks in Claw except for the call stack can and shall be accessed arbitrarily, so every element is accessible. Claw also allows selecting an entry either by absolute index from the bottom or by relative index from the top. It also allows resolving a relative index to an absolute one, so that exact entry can be accessed again later.
As for stack sizes, they are defined when the virtual machine is compiled, not at runtime.
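
As a rough illustration of the two addressing modes, here is a small Python sketch. The names WorkStack, resolve, get_abs and get_rel are invented for this example and are not Claw instructions. It also assumes that relative index 0 means the top entry.

```python
class WorkStack:
    """Hypothetical work stack with absolute and top-relative indexing."""

    def __init__(self):
        self.items = []

    def push(self, value):
        self.items.append(value)

    def resolve(self, relative):
        """Turn a top-relative index (0 = top) into an absolute index from the bottom."""
        if not 0 <= relative < len(self.items):
            raise IndexError("relative index out of range")
        return len(self.items) - 1 - relative

    def get_abs(self, index):
        return self.items[index]

    def get_rel(self, relative):
        return self.items[self.resolve(relative)]


stack = WorkStack()
for value in (10, 20, 30):
    stack.push(value)

slot = stack.resolve(1)  # the entry holding 20, one below the top
stack.push(40)           # relative indices shift, the absolute one does not
```

After the extra push, `stack.get_rel(1)` refers to a different entry, while `stack.get_abs(slot)` still reaches the original one. That is the point of resolving a relative index early.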
Bump, so I have a question.

So, what should the default maximum recursion depth be?


Also, I worked some more on the assembler, which is really starting to take shape.
I'm concerned about a language imposing a maximum depth on recursion, other than what is permitted by the available stack space (unless that's what you mean)? The necessary maximum (or minimum, I suppose) recursion depth would vary widely by application, especially since in most languages with function calls, the size of each stack frame depends on the number and types of arguments to the recursive function.
KermMartian wrote:
I'm concerned about a language imposing a maximum depth on recursion, other than what is permitted by the available stack space (unless that's what you mean)? The necessary maximum (or minimum, I suppose) recursion depth would vary widely by application, especially since in most languages with function calls, the size of each stack frame depends on the number and types of arguments to the recursive function.

Well, there has to be a guaranteed minimum call stack size so programs don't crash on lower-end platforms. The call stack is separate from the main stack, and both can and will be sized per platform. This means there will be a guaranteed minimum configuration, and everything beyond that can be probed, which allows a program to use all the platform-specific features.
I fixed a ton of bugs since the last post and added quite a few new features.
Also I finished the mathematical part of the preprocessor.

So I need your help now, by testing the robustness of the preprocessor.

The latest snapshot can be downloaded here:
http://claw.bmuessigb.eu/dl/

It can be run on Linux by:
mono ./csm.exe ./Sample01.csm
or on Windows:
csm .\Sample01.csm

Please try breaking it in a mathematically and syntactically valid fashion. I would really appreciate that.

Have fun!
I got this error:
Code:
Intentional Error: Unsolved: -4, Parentheses: -4, Solved: -4
on Line 25, in File ".\Sample01.csm"
seanlego23 wrote:
I got this error:

Code:

Intentional Error: Unsolved: -4, Parentheses: -4, Solved: -4
 on Line 25, in File ".\Sample01.csm"


That's great!
As you might see, it's an intentional error, meaning it was triggered deliberately by the code being assembled.
What would be great is if you would modify the equation in #def sign to anything you like, as long as the entire equation is enclosed in parentheses. If your mathematically correct equation breaks, please tell me. It will give you the result of the equation in the Unsolved field of the error.
If you could test it for stability, this would really help :)
I FINISHED THE PREPROCESSOR! :D
It can do expressions, hexadecimal escapes, defines, undefines, nested ifs, ifdefs, elseifs, custom errors, file includes and more.
Also the error handling is totally finished and displays nicely formatted errors.
Feel free to test and download the version from here:
http://claw.muessigb.net/dl/clawsemble-16_aug_09-a.zip
I have finished the instruction list. A lot was already there but I have finally completed it.
Can you PLEASE have a look if I am missing anything and maybe suggest some improvements?

Here is the full instruction list:
https://docs.google.com/spreadsheets/d/1Tfi8z7maQLM3RUHfQpnS50gAkGyfiizkQ54TTZGl9e4/edit?usp=sharing


Also, here's a little Clawsemble preprocessor overview for all of you:

Let's start with the conditionals.
They can be nested as deeply as desired, and each if can have an unlimited number of elseifs, elseifdefs and elseifndefs.
There can only be one else, which is optional and has to be the last element in the if block.
#if (expression) processes the enclosed code only if the condition is met, i.e. the number or string inside the parentheses is non-zero
#ifdef definition is similar to the above, but the condition is met if a variable with that name is defined
#ifndef definition as above, but the condition is met if the variable is not defined
#elseif (expression) is used to provide an alternative if the previous condition(s) is/are not met
#elseifdef definition as above, but the condition is met if a variable with that name is defined
#elseifndef definition as above, but the condition is met if the variable is not defined
#else is used if the condition is not met and the elseifs don't meet their conditions either
#endif is used to close the if block
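
To show how nesting and the "first matching branch wins" rule fit together, here is a simplified Python sketch. It is not the Clawsemble source. The function name select_lines and the placeholder code lines are made up, and it only handles the def-based directives, leaving out #if (expression) and #elseif (expression).

```python
def select_lines(lines, defined):
    """Return the lines that survive #ifdef-style conditionals (simplified)."""
    out = []
    frames = []  # one [parent_active, branch_taken, active] entry per open block
    for line in lines:
        parts = line.split()
        word = parts[0] if parts else ""
        arg = parts[1] if len(parts) > 1 else None
        active = frames[-1][2] if frames else True
        if word in ("#ifdef", "#ifndef"):
            cond = (arg in defined) == (word == "#ifdef")
            take = active and cond
            frames.append([active, take, take])
        elif word in ("#elseifdef", "#elseifndef", "#elifdef", "#elifndef"):
            frame = frames[-1]
            cond = (arg in defined) == word.endswith("ifdef")
            # Only the first branch whose condition holds is processed.
            take = frame[0] and not frame[1] and cond
            frame[1] = frame[1] or take
            frame[2] = take
        elif word == "#else":
            frame = frames[-1]
            frame[2] = frame[0] and not frame[1]
            frame[1] = True
        elif word == "#endif":
            frames.pop()
        elif active:
            out.append(line)
    if frames:
        raise SyntaxError("missing #endif")
    return out


source = [
    "#ifdef AVR",
    "avr code",
    "#ifndef DEBUG",
    "avr release code",
    "#else",
    "avr debug code",
    "#endif",
    "#elseifdef ARM",
    "arm code",
    "#else",
    "generic code",
    "#endif",
    "common code",
]

arm_build = select_lines(source, {"ARM"})
avr_debug_build = select_lines(source, {"AVR", "DEBUG"})
```

Each open block remembers whether its parent is active and whether one of its branches has already been taken, which is all that is needed for arbitrary nesting.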

Expressions are simple.
All C arithmetic operators are supported and use the same symbols, except for modulo, which is now //.
Numbers can be written with a +/- prefix to show whether they are positive or negative. Negative numbers do
not need to be enclosed in parentheses. You can easily write something like: (-1 + -2 -3 - -4) and you can
even omit the spaces and Clawsemble will still be able to deal with it correctly.
Using the negative sign to negate an expression is NOT supported though. Use *-1 to do so instead.
Expressions should always be enclosed in one set of parentheses ()
Hexadecimal numbers are prefixed with a $ (e.g. $ABAB), and characters are prefixed with a % (e.g. %d).
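
The signed-number rule can be sketched in a few lines of Python. This is only an illustration under assumptions, not how Clawsemble is actually written: the names tokenize and evaluate are invented, only +, -, * and // are handled, % characters are left out, and // uses Python's remainder semantics for negative operands.

```python
import re

TOKEN = re.compile(r"\s*(//|[-+*()]|\$[0-9A-Fa-f]+|\d+)")


def parse_number(tok):
    return int(tok[1:], 16) if tok.startswith("$") else int(tok)


def tokenize(text):
    tokens, pos, expect_operand = [], 0, True
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at column {pos}")
        tok, pos = match.group(1), match.end()
        if tok in ("+", "-") and expect_operand:
            # A sign where an operand is expected belongs to the next literal.
            literal = TOKEN.match(text, pos)
            if not literal or literal.group(1)[0] not in "$0123456789":
                raise SyntaxError("a sign must be followed by a number")
            pos = literal.end()
            value = parse_number(literal.group(1))
            tokens.append(-value if tok == "-" else value)
            expect_operand = False
        elif tok[0] in "$0123456789":
            tokens.append(parse_number(tok))
            expect_operand = False
        else:
            tokens.append(tok)
            expect_operand = tok != ")"
    return tokens


def evaluate(text):
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def primary():
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of expression")
        tok = take()
        if isinstance(tok, int):
            return tok
        if tok == "(":
            value = additive()
            if peek() != ")":
                raise SyntaxError("missing closing parenthesis")
            take()
            return value
        raise SyntaxError(f"unexpected token {tok!r}")

    def multiplicative():
        value = primary()
        while peek() in ("*", "//"):
            if take() == "*":
                value *= primary()
            else:
                value %= primary()  # '//' is modulo in this dialect
        return value

    def additive():
        value = multiplicative()
        while peek() in ("+", "-"):
            if take() == "+":
                value += multiplicative()
            else:
                value -= multiplicative()
        return value

    result = additive()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return result
```

With this approach `evaluate("(-1 + -2 -3 - -4)")` and `evaluate("(-1+-2-3--4)")` give the same value, while `evaluate("(-(1 + 2))")` is rejected, because a sign is only ever attached to a number literal.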

Multiple files work too.
Just use #include "filename.csm", to include other source files. They are just inserted into your main program at the position
of the #include statement and have access to the same #defines as your main program.

You can define variables.
Use #define MY_VAR 1234, #define YOUR_VAR %d, #define HEXVAR $DEADBEEF or #define ANOTHER_VAR (123 + $1242 * 123) to define your own variables
that you can check later on with the #if statement and its companions. #undefine is used to delete a variable from memory again.
Variables can also be redefined at any point by simply using #define again.
Be careful with the names, as they are case-sensitive. Also, all exact whole-word occurrences of the variable name in the program are
automatically replaced by the value that the variable has at that point in time. If the variable changes later during assembly,
text that has already been replaced won't change again.
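
Here is a small Python sketch of that replacement behaviour. It is an approximation, not the real preprocessor: the function names substitute and preprocess are invented, `push` is just a placeholder word rather than a Claw mnemonic, and skipping names directly after $ or % is my own assumption.

```python
import re

# A name is a whole identifier that does not directly follow '$', '%' or a word character.
NAME = re.compile(r"(?<![$%\w])[A-Za-z_]\w*")


def substitute(line, defines):
    """Replace exact, case-sensitive whole-word occurrences of defined names."""
    return NAME.sub(lambda m: defines.get(m.group(0), m.group(0)), line)


def preprocess(lines):
    defines, out = {}, []
    for line in lines:
        parts = line.split(None, 2)
        if parts and parts[0] in ("#define", "#def"):
            defines[parts[1]] = parts[2] if len(parts) > 2 else ""
        elif parts and parts[0] == "#undefine":
            defines.pop(parts[1], None)
        else:
            # Substitution uses whatever value the name has right now.
            out.append(substitute(line, defines))
    return out


result = preprocess([
    "#define SIZE 16",
    "push SIZE",
    "push size",
    "push SIZES",
    "#define SIZE 32",
    "push SIZE",
    "#undefine SIZE",
    "push SIZE",
])
```

The first `push SIZE` keeps the value 16 even though SIZE is redefined afterwards, and neither `size` nor `SIZES` is touched.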

Custom errors are easy to use.
Let's say you have some conditions that have to be met for a program to be assembled correctly, and you would like to protect
the user from such accidents. Then you can simply use the #error statement to abort further assembly of the program.
You can even display an error message, the result of an expression or the contents of a variable by passing it after the #error statement:
e.g.: #error ("The architecture " + ARCH + " is unfortunately not supported at the moment!")

Make it short.
Many shorthand forms are available to save you typing:
#error, #err
#define, #def
#include, #inc
#elseif, #elif
#elseifdef, #elifdef
#elseifndef, #elifndef

Comment your code.
You can use the ; character to comment out lines if you don't want them to be parsed.
Everything after the start of a comment until a newline character is ignored and will not be processed.
You can put the comment at the beginning or the end of a line. Multiline comments are not supported.
A comment can be disabled by putting a \ in front of the semicolon, like this: \;
The \ character can also be used to split a single line of code across multiple lines:
e.g.: #define MY_COMPLEX_EXPRESSION \
(123 + 124 + 23230 + 12040 + ONE_OF_MY_VARS + 12332 * 124224 - 20 + \
1402 + $BEEF)
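
The comment and line-continuation rules could be handled roughly like this in Python. This is a sketch under assumptions: the function names strip_comment and logical_lines are made up, comments are removed before continuations are joined, and empty lines are dropped. The real assembler may order these steps differently.

```python
def strip_comment(line):
    """Cut an unescaped ';' comment; turn an escaped semicolon into a literal ';'."""
    out = []
    i = 0
    while i < len(line):
        if line.startswith("\\;", i):
            out.append(";")  # escaped semicolon is kept as text
            i += 2
        elif line[i] == ";":
            break            # unescaped semicolon starts a comment
        else:
            out.append(line[i])
            i += 1
    return "".join(out).rstrip()


def logical_lines(source):
    """Join physical lines ending in a backslash into one logical line."""
    result, pending = [], ""
    for raw in source.splitlines():
        line = strip_comment(raw)
        if line.endswith("\\"):
            pending += line[:-1]
            continue
        if pending + line:
            result.append(pending + line)
        pending = ""
    if pending:
        result.append(pending)
    return result


source = "\n".join([
    "#define BIG \\",
    "(1 + 2 + \\",
    "$BEEF) ; trailing comment",
    "; whole-line comment",
    "text \\; kept",
])
joined = logical_lines(source)
```

Here `joined` contains the #define as a single line, the whole-line comment is gone, and the escaped semicolon survives as plain text.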


This tutorial will be finished later on, when more of the assembler is done and ready for primetime.
To be honest, I don't really buy this anymore; it sounds like a massive case of reinventing the wheel. I'd rather work toward an LLVM backend.
oldmud0 wrote:
To be honest, I don't really buy this anymore; it sounds like a massive case of reinventing the wheel. I'd rather work toward an LLVM backend.

I have zero interest in using LLVM at all. We want to build the VM, compiler and assembler ourselves so we can improve all the aspects that we want to design differently from how things are often done. The project is useful for resource-constrained platforms and our own hardware projects. But in the end it's our decision how we do it. The compiler makes use of a parser generator, by the way.
I have finally finished Claw's instruction set :)

You can find a list with all the instructions and their descriptions and parameters here:
https://github.com/muessigb/Clawsemble/wiki/Instruction-set
This looks like assembly. But it looks like you did a lot of work so I congratulate you. What's the difference between this and assembly?
seanlego23 wrote:
This looks like assembly. But it looks like you did a lot of work so I congratulate you. What's the difference between this and assembly?

Thanks, I have worked on this list for basically a year now and it is finally finished :)

The basic difference between the assembly language I developed and one like the Z80's is that I am using a lot of macro instructions. This means one instruction does a lot of work, to speed up the overall performance of programs. Let's say you want to get the larger of two numbers from the stack: Claw can do that in one MAX instruction, while a Z80 would take several. Claw is supposed to run on a lot of devices, most of which are embedded ones that have very little RAM and often don't allow executing from RAM at all. To keep the speed of my virtual CPU up, and to compensate for the I/O and processing overhead, these macro instructions do things as fast as the CPU on the device can. Moving as much work as possible into the VM regains speed that would otherwise be lost (e.g. by emulating a foreign RISC CPU). I am also building on the fact that most embedded devices have quite a bit of non-volatile memory, so they can hold these macro instructions and code size is less of a concern. Of course I am not trying to bloat it either. This vital balance took me some time to achieve.
Soon my assembler for this custom language will be done, and then I will resume work on the virtual machine, which will also be ported to the 84+ and possibly its color successor.
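
To give a feel for why a macro instruction helps an interpreted VM, here is a toy Python stack machine. Only MAX is taken from the instruction set mentioned above. OVER, LT, JZ, SWAP and DROP are hypothetical primitives invented for the comparison, and the whole thing is a sketch, not Claw's VM.

```python
def run(program, stack):
    """Run a tiny stack program; return (stack, number of dispatched instructions)."""
    pc = steps = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        steps += 1  # every dispatch costs interpreter overhead
        if op == "MAX":
            b, a = stack.pop(), stack.pop()
            stack.append(a if a > b else b)
        elif op == "OVER":
            stack.append(stack[-2])
        elif op == "LT":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a < b else 0)
        elif op == "JZ":
            if stack.pop() == 0:
                pc = args[0]
        elif op == "SWAP":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DROP":
            stack.pop()
        else:
            raise ValueError(f"unknown instruction {op}")
    return stack, steps


macro = [("MAX",)]
primitive = [("OVER",), ("OVER",), ("LT",), ("JZ", 5), ("SWAP",), ("DROP",)]

macro_result = run(macro, [3, 7])
primitive_result = run(primitive, [3, 7])
```

Both programs leave the larger number on the stack, but the macro version needs one dispatch where the primitive version needs five or six. On an interpreter, the fetch and decode work per instruction is exactly the overhead that macro instructions avoid.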
Is it, or will it be in the future, able to be used in a GUI?
seanlego23 wrote:
Is it, or will it be in the future, able to be used in a GUI?

Unfortunately, I am not sure which part of Claw you are referring to, as Claw has two parts, similar to Java.
First, there is the actual writing and compiling of code and then there is the Virtual Machine that runs the code.

If you were referring to writing Code with a GUI, that is something I might do: write a little IDE for writing Claw code.
About the VM, no, it will be able to display graphics and everything the supported platforms can do, but Claw will not display windows. I am probably going to use an SFML window manager to display debug information and allow tuning stuff in the PC VM, but apart from that, there won't be UI on the VM part.
Hooray! The assembler is finally completed :)

If anybody would test it, I would be really glad. Messing around with it would help quite a lot.
I will write some more extensive tutorials soon :)


The assembler can be downloaded here. This file (even though it says .exe) runs fine on Linux, Mac and Windows.
It does, however, require Mono on all three platforms (on Windows, the MS .NET Framework should work well too).
http://data.bmuessig.eu/Projects/Clawsemble/alpha-160923a/csm.exe
http://data.bmuessig.eu/Projects/Clawsemble/alpha-160923a/Sample.csm
  