for the past few days i've been writing a text interface tokeniser|detokeniser for BASIC|Axe|Grammer|etc 8x programs because i wanted something that could easily be used in scripts, and i have it to the point of being fairly functional:



at the moment, i'm trying to implement reading file input from stdin, but am not sure how to go about it. when reading from a flat file, i just malloced a flat buffer of the file's size and used that for my reading and processing of things. i tried mallocing a small chunk, scanning through stdin and copying over everything (reallocing to larger chunk size when necessary), and breaking the loop when either i reached an EOF or hit the max allowed size, but it doesn't appear to be working, and i wanted to get a second opinion before spending more time on this.

oh, and thanks to merthsoft and everyone else who helped in making TokenIDE's xmls. they've been a big help as a reference =)

EDIT: huh, i hadn't even noticed that elfprince was already working on something very similar. regardless, i think i'll go ahead and finish this for my own benefit (and for anyone who may want a standalone C executable using only standard libs)
Out of curiosity, what distinguishes this from TokenIDE or SourceCoder 3? I hear the point about being useable in scripts, but I can't say I can think of a lot of scripts that need to tokenize or detokenize TI-formatted files.

Quote:
at the moment, i'm trying to implement reading file input from stdin, but am not sure how to go about it. when reading from a flat file, i just malloced a flat buffer of the file's size and used that for my reading and processing of things. i tried mallocing a small chunk, scanning through stdin and copying over everything (reallocing to larger chunk size when necessary), and breaking the loop when either i reached an EOF or hit the max allowed size, but it doesn't appear to be working, and i wanted to get a second opinion before spending more time on this.
You could even mmap() the file, which is a pretty clean way of doing things. Or maintain a buffer, and just refill the buffer whenever you're within x bytes of exhausting it.
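For the stdin case specifically, mmap() generally won't work when the input is a pipe or a terminal, so the growing-buffer approach is the safer bet. Here is a minimal sketch of it, assuming a hypothetical helper named read_all and an arbitrary 4096-byte starting size (neither comes from the actual project). It reads with fread() and doubles the buffer with realloc() whenever it fills up, stopping at EOF or when the caller's size limit is exceeded.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original project): read everything
 * from `in` into a heap buffer, doubling the buffer when it fills.
 * Returns NULL on allocation failure, read error, or if the input is
 * larger than max_size. On success the caller must free() the result. */
static unsigned char *read_all(FILE *in, size_t max_size, size_t *out_len)
{
    size_t cap = 4096;            /* arbitrary starting size */
    size_t len = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Ask for as many bytes as there is free room in the buffer. */
        size_t n = fread(buf + len, 1, cap - len, in);
        len += n;
        if (n == 0)               /* EOF or error, sorted out below */
            break;
        if (len > max_size) {     /* input exceeds the allowed size */
            free(buf);
            return NULL;
        }
        if (len == cap) {         /* buffer is full: grow it */
            size_t new_cap = cap * 2;
            unsigned char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {    /* realloc failed, old block still valid */
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
    }

    if (ferror(in)) {             /* distinguish a read error from EOF */
        free(buf);
        return NULL;
    }
    *out_len = len;
    return buf;
}
```

Calling read_all(stdin, limit, &len) then gives the same kind of flat buffer as the malloc-the-file-size path. A common bug in hand-rolled versions of this loop is assigning the result of realloc() straight back to the buffer pointer, which leaks the old block on failure, or continuing to write through a pointer saved before the realloc().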
Cool, this looks pretty similar to my command line project. What sorts of lexing algorithms are you using?
my code is pretty simple. it first reads all the text into a constant buffer. it initialises a cursor pointer to point to the head of that buffer and then searches through the token set the user selected, trying to find the longest string, beginning at the cursor, that is also a string associated with a token in that set. when it finds one, it jumps the cursor ahead by strlen(matched string). i defined the additional sets (Axe, Grammer, etc) as deltas from the default BASIC set, so that matches in them override matches in the default set. if no match is found in the deltas, it searches in the BASIC set instead.
the matched strings, and their associated token bytes (the second is defined as 0xFF if it's a one byte token, as no token has that as its second byte) are stored into a linked list that is walked through afterwards to get the result values. the actual matching code is here.
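to make the description above a bit more concrete, here's a rough sketch of that kind of longest-match loop. it is not the linked code: the names (struct token, longest_match, match_token, tokenise) are made up for illustration, the output goes into a plain byte array instead of a linked list to keep it short, and any token bytes you feed it in a test table are up to you rather than real TI values.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative token entry: text plus up to two bytes.
 * b2 == 0xFF marks a one-byte token, as described above. */
struct token {
    const char *text;
    unsigned char b1, b2;
};

/* Return the entry in `set` whose text is the longest prefix of
 * `cursor`, or NULL if nothing in the set matches. */
static const struct token *longest_match(const struct token *set, size_t n,
                                         const char *cursor)
{
    const struct token *best = NULL;
    size_t best_len = 0;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(set[i].text);
        if (len > best_len && strncmp(cursor, set[i].text, len) == 0) {
            best = &set[i];
            best_len = len;
        }
    }
    return best;
}

/* Try the delta set first so its matches override the base set;
 * fall back to the base set only when the delta has no match.
 * Note this means a short delta match beats a longer base match. */
static const struct token *match_token(const struct token *delta, size_t nd,
                                       const struct token *base, size_t nb,
                                       const char *cursor)
{
    const struct token *t = longest_match(delta, nd, cursor);
    return t != NULL ? t : longest_match(base, nb, cursor);
}

/* Tokenise the NUL-terminated string `src` into `out`.
 * Returns the number of bytes written, or (size_t)-1 if some text
 * matches no token or the output buffer is too small. */
static size_t tokenise(const char *src,
                       const struct token *delta, size_t nd,
                       const struct token *base, size_t nb,
                       unsigned char *out, size_t out_cap)
{
    size_t n = 0;
    const char *cursor = src;
    while (*cursor != '\0') {
        const struct token *t = match_token(delta, nd, base, nb, cursor);
        if (t == NULL || n + 2 > out_cap)
            return (size_t)-1;
        out[n++] = t->b1;
        if (t->b2 != 0xFF)              /* two-byte token */
            out[n++] = t->b2;
        cursor += strlen(t->text);      /* jump past the matched string */
    }
    return n;
}
```

the linear scan over the whole set at every cursor position is the simplest way to do it; a trie or a table sorted by length would be faster, but for calculator-sized programs it probably doesn't matter.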

KermMartian wrote:
Out of curiosity, what distinguishes this from TokenIDE or SourceCoder 3? I hear the point about being useable in scripts, but I can't say I can think of a lot of scripts that need to tokenize or detokenize TI-formatted files.


i started making this for several reasons:
*i wanted to have a development environment which integrated well with linux (TokenIDE's GUI is a bit clunky under mono and doesn't blend with the rest of the system), and this seemed like the first step towards doing so.

*i like text interfaces for simple tasks like processing text files.

*it seemed like it could be useful for quickly grepping values in really long programs i was writing without first having to launch a gui of some sort.

*i was bored.

*i thought someone else may benefit from it.

in that order :P
  