Well, so we've had a chat on SAX today.

I need a routine which gets a byte from a specified variable and outputs it to decimal format.

I came up with:

Code:
.PROGRAM
GetCalc(Ans)->P
GetCalc("var/THETA/")->O
{P+O}->Ans


Now I need to have the correct prefix in Ans prepended to the var name. Here are tokens, and based on another reference which I have on paper, I can find out which to use. I'd prefer using the column for "4", however that would require some masking to get the correct prefixes - could anyone help me specifically on that? I know what masking is, but how would I mask the first byte in Ans?
*bump* Sorry for being impatient, but I just got a frame of time where I really have time and will to program, and I'd like to get as much done as possible. Could anyone please give me some quick advice?
The first three sections of the User Variables lesson from asm in 28 days will probably be helpful. It describes how a name for a variable of most useful types should be formatted.
Well, since I first posted this, many things happened, including the decision that I am writing the program, that I only needed an Axe lib for, fully in Axe now. I figured this Basic - Asm interaction would be harder to do than all of it in Axe right away. So here we go.

Do you have any suggestions on how to make the user select a file (multiple files? Oncalc .zip?) to compress?
Rather than writing a UI, which could be rather cumbersome, designing the program like a command-line tool is probably easiest. A relatively simple and common format would be a space-separated string (in Ans), with the first "argument" being the desired name of the compressed output variable and all following "arguments" being the names of variables to compress into the single output.
Runer112 wrote:
Rather than writing a UI, which could be rather cumbersome, designing the program like a command-line tool is probably easiest. A relatively simple and common format would be a space-separated string (in Ans), with the first "argument" being the desired name of the compressed output variable and all following "arguments" being the names of variables to compress into the single output.
I would like to have the user choose the variable type via some sort of UI (Not necessarily graphical), as that would prevent some hassle with input sanitisation and processing...
But it may be just my opinion, as I am very inexperienced.
I assure you that the amount of logic needed to deal with types in string input is much simpler/smaller than the logic needed to make a UI listing all available variables of all available types.

And if you go with this method, for cleanliness, I would suggest doing away with the prefix type byte thing that was discussed over IRC. The prefix byte can be relatively easily inferred from the first byte of the variable name. If you planned to support variable types not expressible with the BASIC token set (such as appvars), you'll probably still have to require a made-up prefix token to signal the type, but you could make it a much more sensible token rather than whatever token value happens to map to $15 when ANDed with $3F.
I decided to implement a sorting algorithm before I do the main program - for two reasons. First, I'll need one in the project, and I'll need greater control over it than what the Axe command offers. Second, it will probably be a good exercise before I'll start for real.

Now, I consider to implement either Selection Sort (primary choice), Shaker Sort (secondary choice) or Bubble Sort (tertiary choice). I prefer the first of them due to its ease of implementation and I believe it is the fastest of the three. The Shaker Sort would be a backdrop, in case I fail to implement Selection Sort. And finally, if that fails too, I would implement the easiest but slowest of the three.

For my use case, I will need them on a list of maximally 256 elements, I expect the distribution to be really uneven and random. For every operation of the sorting algorithm, I'll need to perform the same on another set of data (I need to sort a table with two columns, essentially), so every iteration will take longer than usual. This would also be a big argument for in-place algorithms.
What algorithm would you recommend for this (small - medium set of data, lots of processing per iteration)?
Nik wrote:
I decided to implement a sorting algorithm before I do the main program - for two reasons. First, I'll need one in the project, and I'll need greater control over it than what the Axe command offers. Second, it will probably be a good exercise before I'll start for real.

Now, I consider to implement either Selection Sort (primary choice), Shaker Sort (secondary choice) or Bubble Sort (tertiary choice). I prefer the first of them due to its ease of implementation and I believe it is the fastest of the three. The Shaker Sort would be a backdrop, in case I fail to implement Selection Sort. And finally, if that fails too, I would implement the easiest but slowest of the three.

For my use case, I will need them on a list of maximally 256 elements, I expect the distribution to be really uneven and random. For every operation of the sorting algorithm, I'll need to perform the same on another set of data (I need to sort a table with two columns, essentially), so every iteration will take longer than usual. This would also be a big argument for in-place algorithms.
What algorithm would you recommend for this (small - medium set of data, lots of processing per iteration)?

You may want to have a look at a golfing contest that took place a few months ago here on Cemetech. People were asked to create an integer sorting program in pure basic that did not use the SortA() and SortD() commands. I believe some entries actually managed to be faster than the OS`s routine, and that was pure basic. Some people implemented proper algorithms, and others kind of made one up to fit the specific case (which was randInt(1,256,256))
EDIT: here is a link
Humm, I don't think the actual code is in the topic you linked... Do you know where I could find it?
The numbers I see certainly look interesting.

Edit: I found at least this topic. I couldn't find the rest of them. Although lirtosiast seems to have had the most efficient sorting routine, I am not sure if the implemented algorithm suits my needs, so I'd like to have a look at the other routines too...
I made some big breakthroughs with the algorithm, but I encountered a problem which seems impossible... Sad

So, to generate the Huffman tree, I am using a set of interlocked pointers, which can be resolved into a string representing the preorder traversal of the tree. In the case of Huffman trees, this preorder traversal suffices to reconstruct the whole tree later on. However, to resolve the pointers to a string, I need a final instance of data somewhere, and this is where problems arise.
Normally I would have a single byte of data, which is 1) fixed length and 2) easy to operate on even in large lists. But, which resolving the pointers to a tree, I need variable length strings. This would be a small problem too, if not for the fact that all strings together take up up to 512 bytes, but any of the substrings could hold any portion of those 512 bytes. Shortcuts which make the strings of equal length won't do here... Do you have any suggestions?


Edit: Okay, nevermind that. Some pondering and "simulations" involving "programming with pencil and paper" and I think I came up with a concept to circumvent that issue. Now I need to come up with a concept to do it efficiently.
Is there a way to have normal variables (A - theta) but with custom names? I am really confused from my program because the variable names are so undescriptive, I had to write them down on a piece of paper...

Also, can I store things to a two byte variable this way? I sometimes need a two byte counter for quick access but I am using safeRAM areas for this currently.

And lastly, while we are at this, where are those variables saved? In some safeRAM area or do they use normal RAM during program execution?
Okay, I think I am mostly through with generating Huffman trees - but I am stuck with debugging... :/

Could someone take a look at this? (Note: Comments for a line are always before that line. Just mentioning because there are different preferences to that, and its confusing. Note 2: Deliberately removed all apostrophes (') from the comments because they made the syntax highlighter think they started strings, completely screwing it up.)


Code:
.HUFFMAN
ClrHome
"appvHUFFTEMP"->Str0
.Initialize L4 as an area for variables
Fill(L4,256,0)
."Stack" for storing the trees elements - explained later in detail
L4->°STACK
."List" for temporary storage while generating the tree - explained later in detail
L4+2->°LIST
.Pointer to the var to be compressed
L4+4->°FILE
.Size of the var to be compressed
L4+6->°SIZE
.A variable designating the number of still valid entries in the list
L4+8->°COUNTER
.Pointer to the end of the stack
L4+10->°OFFSET

.Generates appvar for the list and stack
If GetCalc(Str0,3584)->STACK
  .STACK is 2048 bytes big, list is 1536 bytes big
  {L4}^r+2048->LIST
  Fill(STACK,3584,0)
  .L1 will be used for temporary processing later
  Fill(L1,512,0)
  .Get the prefixed variable name through input, process for GetCalc(
  input->A
  Copy(A,L1,length(A))
  .The "*10^n" "E" designating hex numbers, not the letter "E"
  {L1} and E3F->{L1}
  GetCalc(L1)->FILE

  If {FILE-2}^r->SIZE
    ...
    Now the LIST structure needs to be explained.
    It is a "table" or a list of entries of six bytes each. It will medium-term hold every element of the Huffman tree, basically the list from which the tree is constructed, the one which is sorted.
    It stores the characters number of occurrences in bytes 0 and 1 of each list entry, one part of the data in bytes 2 and 3 (This is either a character or a pointer to the stack) and the other part in bytes 4 and 5 (This is either 65535 (impossible for pointers) designating that bytes 2 and 3 are a char, or another pointer to the stack.) If it is not 65535, we know that the list entry is a node which points to the stack, if it is, we know it is a leaf which holds a character.
    Note: A character in our case is a byte. Just makes sense using bytes here.
    ...
    .Now, this loop counts every "character" and stores their frequency in the "list" at bytes 0 and 1. The list can then be accessed as follows: "Character" 0000 0000 is entry one, character 0000 0001 is entry two, and so forth.
    For(A,0,SIZE-1)
      {6*{FILE+A}+LIST}^r++
    End

    .This loop checks every character whether it occurs at least once. If so, it is properly initialized, if not, it is skipped.
    For(A,0,255)
      .Caching the long pointer for saving memory
      6*A+LIST->P
      .If the characters frequency at {P}^r is greater than 0, it occurs in the file
      If {P}^r
        .Store the character in the following "slot", so it can be checked even after the list was sorted
        A->{P+2}^r
        .The character, or leaf, "marker"
        65535->{P+4}^r
        .Update number of list entries
        COUNTER++
      End
    End

    .Finish initialization by sorting the array. This is needed because the loop only sorts a part of the array, for optimization. But before the loop starts, the entries can be scattered all over the list, so this is required to get them all to the top.
    SORT(256)

    .This is the main loop which iterates as long as there is more than one entry in the list left.
    While 1
      .Now, this was more complicated (at least for me). It adds the lists top (Least occurring) element (4 bytes if skipping the frequency, which will not be needed) to the stack and updates the stacks pointer.
      Copy(LIST+2,OFFSET++-1*4+STACK,4)
      .Same as above, but for the second list entry. At this point, the lists two top entries are safely stored in the stack and the list is ready to be manipulated.
      Copy(LIST+8,OFFSET++-1*4+STACK,4)
      .Start generating a node for the tree, which will have the two least occurring elements as children. Add the two childrens frequencies and save to the first entry (overwriting it).
      {LIST}^r+{LIST+6}^r->{LIST}^r
      .Store the pointer to the first child (in the stack now)
      OFFSET-2->{LIST+2}^r
      .Store the pointer to the second child (in the stack now)
      OFFSET-1->{LIST+4}^r
      .Now the first list entry is holding the new node, the second one is cleared so that it does not hold the stacked entry anymore.
      Fill(LIST+6,6,0)
      .Sort only the part of the list which still holds entries
      SORT(COUNTER-1)
    EndIf COUNTER--
    .Irrelevant debug code removed
  .This statement belongs to "If {FILE-2}^r->SIZE", i.e. executed if the size was 0.
  Else
    Disp "ERR:SIZE"
  End
.This statement belongs to "If GetCalc(Str0,3584)->STACK", i.e. executed if the appvar failed to create.
Else
  Disp "ERR:MEM"
End
Return

Lbl SORT
.Selection sort implementation, works correctly, not worth commenting further
For(A,0,r1-1)
  A->C
  For(B,A,r1)
    If {6*B+LIST}^r>{6*C+LIST}^r
      B->C
    End
  End
  Exch(6*A+LIST,6*C+LIST,6)
End


So, this code should be generating a Huffman tree in the area {STACK}*, but for some reason it a) is not the correct tree if reconstructing the broken entries, and b) It contains weird entries like "43xx something" or both children are 0, which are both impossible.

*The encoding works as follows:
1) The first pointer element is still in {LIST+2}. It can be used as the tree's root.
2) An "Element" are always 2 + 2 = 4 bytes.
3) The first two bytes are either a character or a pointer.
4) If the first two bytes are a character, the second two bytes are 65535, else they are another pointer.
5) A pointer, if multiplied by 4 and added to STACK points to another element.

From this, we can reconstruct a tree by starting from the root element in {LIST+2} and then jumping from one pointer to the next. An element containing pointers is a node, an element containing 65535 in the second two bytes is a leaf, and its value is in the first two bytes.
*bump* I would ask if some Axe expert and/or someone familiar with Huffman Coding could help me review this code - I just can't seem to find why it messes up. Please post if I did not explain the code enough.

Code:
If GetCalc(Str0,3584)->STACK
  .STACK is 2048 bytes big, list is 1536 bytes big
  {L4}^r+2048->LIST

Not a bug, but can be optimized by removing {L₄}ʳ since that value is still in Axe's "ans."


Code:
Copy(A,L₁,length(A))

When you use input, it returns a pointer to a TI String, which means that instead of length(A) you should use {A-2}ʳ. For the same reason, you need to add the 0 yourself with and 0→{{A-2}ʳ+L₁}.


Code:
{6*{FILE+A}+LIST}ʳ++

Always put constants on the right of operators when you can (like {{FILE+A}*6+LIST}ʳ++), it significantly speeds things up.

Code:
6*A+LIST→3

Ditto.


Code:
SORT(COUNTER-1)
EndIf COUNTER--

First you can do use COUNTER-- the first time and COUNTER the second time. Second, this Ends[/mono] the loop whenever COUNTER is [i]non-zero. I think you meant End!If COUNTER.
jacobly wrote:

Code:
If GetCalc(Str0,3584)->STACK
  .STACK is 2048 bytes big, list is 1536 bytes big
  {L4}^r+2048->LIST

Not a bug, but can be optimized by removing {L₄}ʳ since that value is still in Axe's "ans."


Code:
Copy(A,L₁,length(A))

When you use input, it returns a pointer to a TI String, which means that instead of length(A) you should use {A-2}ʳ. For the same reason, you need to add the 0 yourself with and 0→{{A-2}ʳ+L₁}.


Code:
{6*{FILE+A}+LIST}ʳ++

Always put constants on the right of operators when you can (like {{FILE+A}*6+LIST}ʳ++), it significantly speeds things up.

Code:
6*A+LIST→3

Ditto.


Code:
SORT(COUNTER-1)
EndIf COUNTER--

First you can do use COUNTER-- the first time and COUNTER the second time. Second, this Ends[/mono] the loop whenever COUNTER is [i]non-zero. I think you meant End!If COUNTER.


1) - Thanks!
2) - I remember reading the exact opposite somewhere, at least for additions. Thanks a lot for correcting.
3) Oww, that should be EndIf CONSTANT--=1. Also I did move things around since posting the code, indeed moved the -- up and indeed tried removing the =1, which solved one bug but introduced another.
It added one of the missing chracters to the tree but now I have an unnecessary spam entry in the stack after running. Obviously, as I added an iteration.

The weird thing is, there are some stack entries which appear to be pointers (Non-65535-suffixed) but hold zero in both bytes. And one entry which is 43xxx/0 which I have no idea altogether why it is there. And one character is missing in the stack at all.

My test var is usually DStr1 which holds chracters A-E in a random quantity of 1 - 10. I construct the correct tree by hand at first and then run the program and compare. And it's completely wrong.

I suspect I am updating the number of characters or so in a wrong order or sort the list in a wrong boundary. But I couldn't fix it with tweaking the offsets or the order of operations.
I made some progress and basically got it to work, but I found a major optimization in my algorithm.

The problem is, it would require a fully dynamic array with delete functionality for any entry, and any entry could be of a variable number of bytes. Is that possible?
Could anyone maybe point me at some system/algorithm for that?
You might use another table/list that points to the start of the data. Thus, when deleting an entry, you delete that element, and from the main list, you shift what comes after.
PT_ wrote:
You might use another table/list that points to the start of the data. Thus, when deleting an entry, you delete that element, and from the main list, you shift what comes after.


I see, I already imagined something of the sort. An issue, though, is that I'd need to update all the pointers after the shift, which is rather inelegant.

I have considered another approach: In a secondary set of data, every element would be preceded by its size. The primary set would contain only a number, which is not a pointer, but rather the index of the element, so if I want to address the 3. element it would be 2, if I needed the 4. it would be 3 and so on. Then, a subroutine could fetch said element from the secondary set of data by skipping the bytes of the first, then the second element and so on. This still requires me to update the "pointers" after deleting an element, but it would be a fixed increment/decrement of 1.

And ideal alternative, which I am trying to work out at the moment, would of course be to avoid an array at all and perform my data manipulation "in situ", but I doubt there is a way.

What do you think?
Well, that could work as well. However, this takes a lot of time, since you need to skip a lot of entries the right entry, while in my example, you only need to fetch it once.
  
Register to Join the Conversation
Have your own thoughts to add to this or any other topic? Want to ask a question, offer a suggestion, share your own programs and projects, upload a file to the file archives, get help with calculator and computer programming, or simply chat with like-minded coders and tech and calculator enthusiasts via the site-wide AJAX SAX widget? Registration for a free Cemetech account only takes a minute.

» Go to Registration page
Page 1 of 1
» All times are UTC - 5 Hours
 
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum

 

Advertisement