Kerm, I do indeed have my W|A AppID (which reminds me, I should really start playing with the python binding now).

The general consensus seems to be that XML is the way forward, so I now have a few questions.

- What are the 245 tokens I need?
- Should I use Shaun's TokenIDE XML tokens list? (If not, should I make my own?)
If you just use Shaun's TokenIDE XML, then you don't need to worry about limiting yourself to a subset of the tokens. Just pull in the whole XML file and call it a day.
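Pulling in the whole file really is just a couple of lines with ElementTree (a minimal sketch, assuming Tokens.xml sits in your working directory):

Code:
import xml.etree.ElementTree as ET

# parse the entire TokenIDE token sheet in one go
root = ET.parse("Tokens.xml").getroot()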
[UPDATE #6 - 14/08/13] - Adventures In XML

Wow. There has most definitely been a lack of progress, and for that, I apologise profusely. I have decided today to pick up the baton once more and return to developing WAti.

After lots of advice from lots of knowledgeable Cemetechians, I have scrapped the idea of using dictionaries to detokenize the calculator's output and have instead opted for XML. Luckily for me, merthsoft has taken a lot of hard work out of the process and has created an XML sheet with all of the tokens on it (for use with TokenIDE). Now all that's left for me to do is to parse it.

In terms of parsing, I will be using ElementTree. I am now armed with some good documentation in the form of the Python docs (I thank Tari for this useful link), so I am ready to get going with this detokenizer function for the Python bridge.
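To give an idea of where I'm starting, here's a minimal sketch of the kind of walk I have in mind (assuming merthsoft's attribute layout, with byte values like "$BB" and the http://merthsoft.com/Tokens namespace):

Code:
import xml.etree.ElementTree as ET

NS = "{http://merthsoft.com/Tokens}"
root = ET.parse("Tokens.xml").getroot()

# walk the top-level <Token> elements and print byte -> string
for token in root.findall(NS + "Token"):
    byte = int(token.attrib['byte'][1:], 16)   # "$BB" -> 0xBB
    if 'string' in token.attrib:               # prefix bytes have no string
        print("0x%02X -> %s" % (byte, token.attrib['string']))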

If anyone can recommend any more tutorials/documentation for the said libraries, do not hesitate to post.

Keep your eyes peeled for further progress.
[UPDATE #7 - 8/9/13] - Slowly But Surely...

Nearly a month ago I said I was experimenting with XML; a month later, I still am. But hey, that's how it goes. I'm having a few issues with my XML parser due to the namespace on Tokens.xml, but I'm on my way to getting it sorted.

In rather more positive news, GUI development has started today, and progress is good. Very good, in fact. The GUI is functional and gives the user a search interface and a very special new feature...

elfprince13 wrote:
Will you provide the ability to cut and paste formulas to and from the homescreen?


I have decided to add elfprince's idea to give the user more mathematical freedom; what has enabled me to do this is the TokenIDE XML sheet, because it lists all of the tokens on the TI-83+/84+ platform.

Screenshots: [GUI screenshots]

Btw, how does "WAti" look if you change it to "Wαti"? (That's an alpha - 0x80 - instead of 0x41 for "A".)
Not sure, I'll have a look. Do bear in mind that I write it as WAti because WolframAlpha is written with two capitalized letters. I'll definitely have a look though.
I like Elfprince's suggestion; it looks less like you accidentally held [SHIFT] for too long. Plus, of course, alpha for Alpha. :) And nice job on the homescreen progress. Don't worry about the namespace in the XML; you can basically ignore that as far as your XML-parsing code is concerned.
I think he was running into issues where the XML library he was using was buckling because of the namespace. I think Python (or whatever he's using in Python) requires it to be specified if it's part of the file.
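For example, ElementTree folds the namespace URI into every tag name when it parses a file with a default namespace, so a bare findall comes back empty (a quick illustration against Tokens.xml):

Code:
import xml.etree.ElementTree as ET

root = ET.parse("Tokens.xml").getroot()

print(len(root.findall("Token")))                               # 0 - no match without the namespace
print(len(root.findall("{http://merthsoft.com/Tokens}Token")))  # finds the top-level tokens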
[UPDATE #8 - 11/9/13] - W|A In The Terminal

A pretty self-explanatory update today. A while ago I mentioned that I had got my W|A AppID - and that was it. I had previously done nothing with it; however, I'm very pleased to say that I've now got the Python binding working, and generally everything is running quite smoothly.

I must say, though, there seems to be a crippling lack of documentation for the said Python binding, but through perseverance and playing in the terminal, we have lift-off! :)

Screenshot of the day: [terminal screenshot of a W|A query]

Not particularly interesting, but it proves W|A is now working in Python. All that's left to do now is get the detokenizer working, do some data transmission and reception from calc to PC, and polish the GUI up a bit.
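
For the curious, the query code itself is only a few lines. A sketch along these lines (assuming the wolframalpha package from PyPI; the AppID is a placeholder, and attribute names may vary between versions of the binding):

Code:
import wolframalpha

# placeholder AppID - substitute your own from the W|A developer site
client = wolframalpha.Client("XXXXXX-XXXXXXXXXX")

res = client.query("integrate x^2")
for pod in res.pods:
    print(pod.title)
    print(pod.text)   # assumes each pod has a plaintext subpod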

Exciting times! :)

Quote:
I think he was running into issues where the XML library he was using was buckling because of the namespace. I think Python (or whatever he's using in Python) requires it to be specified if it's part of the file.


merth, you're right. I just need to find a way of specifying it in the parser, and I'll be ready to go with the detokenizer routine.
Here's my code for parsing Merth's tokens file with ElementTree:


Code:
#!/usr/bin/env python 
import xml.etree.ElementTree as ET 
import sys

def get_byte(attrib):
   # "$BB" -> 0xBB
   return int(attrib['byte'][1:],16)

def concatenate_bytes(tokbytes):
   # fold a list of bytes into a single integer, e.g. [0xBB, 0xB0] -> 0xBBB0
   ret = 0
   mpow = len(tokbytes)-1
   for i,byte in enumerate(tokbytes):
      ret += byte * 256**(mpow-i)
   return ret

def cleanup_chars(string):
   # escape backslashes and non-ASCII characters so they survive in the generated grammar
   trouble = dict((i, repr(c.encode('utf-8'))[1:-1]) for i, c in enumerate(string) if ord(c) >= 128 or c == "\\")
   if trouble:
      string = "".join([c if i not in trouble else trouble[i] for i,c in enumerate(string)])
   return string

def emit_token(string,tokbytes,raw_mode=False,rootattrs=None):
   # special-case the newline and empty (EOF) tokens when generating the flex grammar
   if string == r'\n' and not raw_mode:
      string = r'\n|\r\n?'
      tlen=1.5
      quotes = False
   elif string == "" and not raw_mode:
      string = "<<EOF>>"
      quotes = False
      tlen = 0
   else:
      quotes = True
      tlen = len(string)
      string = cleanup_chars(string)
   string = "".join([i for i in ['"',string.replace('"',r'\"'),'"'] if quotes or i!='"'])
   return (tlen,string,tokbytes,rootattrs) if raw_mode else ((tlen,'%s\t{\treturn 0x%X;\t}' % (string, concatenate_bytes(tokbytes))))
      
def add_all_tokens(down_from,tokens,byte_prefix,raw_mode=False):
   # depth-first walk of the token tree, accumulating multi-byte prefixes as we descend
   for token in down_from.findall("{http://merthsoft.com/Tokens}Token"):
      bp=byte_prefix+[get_byte(token.attrib)]
      if 'string' in token.attrib:
         tokens.append(emit_token(token.attrib['string'],bp,raw_mode=raw_mode,rootattrs=token.attrib))
      for alt in token.findall("{http://merthsoft.com/Tokens}Alt"):
         tokens.append(emit_token(alt.attrib['string'],bp,raw_mode=raw_mode,rootattrs=token.attrib))
      tokens = add_all_tokens(token,tokens,bp,raw_mode=raw_mode)
   return tokens

def getET(filename):   
   ET.register_namespace("","http://merthsoft.com/Tokens") 
   return ET.parse(filename).getroot()

fname = sys.argv[1]   # path to Tokens.xml, given on the command line
root = getET(fname)
tokens = add_all_tokens(root,[],[],raw_mode=True)


I also have some other routines for classifying them, but that's probably less relevant for you.
[UPDATE #9 - 16/10/13] - Namespaces & CalcEncoder

WAti is not dead! I made the best of a 'free evening' and developed a better understanding of XML parsing in Python.

As you're probably aware, I had been struggling to register namespaces with Tokens.xml. It turned out the code was as simple as:


Code:
ET.register_namespace("","http://merthsoft.com/Tokens")


Many thanks go to elfprince for posting the extract of code above. It's definitely given me a push in the right direction and helped me understand the code in the context of my program and Tokens.xml. I won't be copying all of the code, but it serves as an excellent source of reference and inspiration.

The major stumbling block of this project has been detokenization. To get around this, I plan to create a Python module for use in my program (and indeed anybody else's) to detokenize/tokenize the streams coming from the calculator and the streams going to it from the computer.

The general structure of the module will be:

Code:
imports
preliminary XML stuff (defining namespaces, making a tree etc)
detokenizer routine
tokenizer routine
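
Fleshed out just slightly, that skeleton might look like this (module and function names here are placeholders, not a final API):

Code:
# titokens.py - placeholder module name

# imports
import xml.etree.ElementTree as ET

# preliminary XML stuff: namespace and tree
NS = "{http://merthsoft.com/Tokens}"
ET.register_namespace("", "http://merthsoft.com/Tokens")
root = ET.parse("Tokens.xml").getroot()

# detokenizer routine
def detokenize(token_bytes):
    """Turn a stream of token bytes from the calc into plain text."""
    raise NotImplementedError   # lookup-table walk goes here

# tokenizer routine
def tokenize(text):
    """Turn plain text into token bytes for the calc."""
    raise NotImplementedError   # longest-match against the token strings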


I may consider adding a 'Token Dump' method, but I'm not sure yet. I think it will be best to get the basic functionality in there first.

The idea is that this module will be independent of gcnskel.py; as well as functioning in my program, it will serve as a community detokenizer/tokenizer Python module, free to use in any project.
Keep up the good work. It sounds like you're stuck on the detokenizer/tokenizer module, but that it's mostly a problem of not knowing where to start? You need to iterate through the XML, creating a table of string->token byte mappings and another table of token byte->string mappings.
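Something like this would do it - a sketch assuming merthsoft's Tokens.xml layout, with two-byte tokens nested under their prefix byte (and ignoring the <Alt> spellings for brevity):

Code:
import xml.etree.ElementTree as ET

NS = "{http://merthsoft.com/Tokens}"

def build_tables(node, prefix=()):
    """Recursively collect string<->bytes mappings; byte keys are tuples of ints."""
    str_to_bytes, bytes_to_str = {}, {}
    for token in node.findall(NS + "Token"):
        tokbytes = prefix + (int(token.attrib['byte'][1:], 16),)   # "$BB" -> 0xBB
        if 'string' in token.attrib:
            str_to_bytes[token.attrib['string']] = tokbytes
            bytes_to_str[tokbytes] = token.attrib['string']
        # recurse into nested <Token> elements (two-byte tokens)
        sub_s2b, sub_b2s = build_tables(token, tokbytes)
        str_to_bytes.update(sub_s2b)
        bytes_to_str.update(sub_b2s)
    return str_to_bytes, bytes_to_str

root = ET.parse("Tokens.xml").getroot()
str_to_bytes, bytes_to_str = build_tables(root)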
You're absolutely welcome to just cannibalize my whole tokenization routine if you want to (as long as you give appropriate credit). I have that whole file on my GitHub, iirc. The part I posted is just the XML parsing part, but I also have the tools for tokenization pretty well developed.
Well, thank you very much, elfprince. Your code has been very useful in teaching me how ElementTree actually works. I will be sure to credit you in the source code even if I don't use the code, because you have been a tremendous help to the project, as has Kerm.
https://github.com/elfprince13/TITokens/blob/master/titablegen.py

You can probably exclude the token classification stuff and the Luddite stuff, since you don't need syntax highlighting and aren't interfacing with Komodo Edit.

Also, I don't know if you want to have to mix C and Python, but if you don't, the grammar I generate for flex should give you a good start on writing Python code to interface with Plex instead, which is a rough Python equivalent (though it has slightly different pattern matching; once you see how my code works, you should be able to figure out a translation if you go that route).

[edit]

The coolest thing, if you end up using Plex, would be if you modify titablegen to be able to output the grammar in either Flex or Plex format. If you do, fork me on GitHub, and I'll be happy to incorporate that into the master version of titokens.
elfprince, that does sound good, but for simplicity's sake, and given that I've only just got to grips with XML parsing, I'll pass for now, though that's not to say I won't include it. :)
*BUMP!

[UPDATE #10 - 22/10/13] - Dumping The Tokens

I managed to find some free time this evening and decided to use my newfound XML parsing knowledge to create a little program that iterates over all of the tokens, including the double-byte ones, and puts them into a nicely formatted list. You can find the result on Pastebin.

The <$xx> indicates the first byte of a double-byte token (such as the lowercase letters).

My hope is to apply these skills to create a couple of functions to go inside a module to tokenize and detokenize. Since tokenizing is the opposite of detokenizing, it should be pretty simple to pull off (or so I hope).
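
To make "the opposite" concrete, here's a rough sketch of the pair of routines sitting on top of string->bytes and bytes->string lookup tables (the table layout - dicts keyed on tuples of ints - is my own placeholder, not a final design):

Code:
def detokenize(data, bytes_to_str):
    """data is a sequence of ints from the calc; returns plain text."""
    out, i = [], 0
    while i < len(data):
        pair = tuple(data[i:i+2])
        if len(pair) == 2 and pair in bytes_to_str:   # try two-byte tokens first
            out.append(bytes_to_str[pair])
            i += 2
        else:
            out.append(bytes_to_str.get((data[i],), '?'))
            i += 1
    return "".join(out)

def tokenize(text, str_to_bytes):
    """Greedy longest-match tokenizer; returns a list of ints."""
    out, i = [], 0
    maxlen = max(len(s) for s in str_to_bytes)
    while i < len(text):
        for length in range(min(maxlen, len(text) - i), 0, -1):
            if text[i:i+length] in str_to_bytes:
                out.extend(str_to_bytes[text[i:i+length]])
                i += length
                break
        else:
            i += 1   # character we can't tokenize; skip it
    return out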
I think that in order to get maximum size optimization, you should use hardware (which I am in the process of making) to send an actual query to Wolfram, and then display the actual result through a parser on the calculator.
Halterer wrote:
I think that in order to get maximum size optimization, you should use hardware (which I am in the process of making) to send an actual query to Wolfram, and then display the actual result through a parser on the calculator.


Whilst it's an interesting idea, WAti will be using CALCnet and CALCnet hubs, simply because most users will have a calculator with DCS7, a means of linking to cN, and a computer to link the calculator to. Furthermore, all of the parsing will be done on the computer, so the calculator program will actually be relatively small.

In terms of this hardware you are planning to construct, which microchip/developer platform will you be using?
ElectronicsGeek wrote:
Whilst it's an interesting idea, WAti will be using CALCnet and CALCnet hubs, simply because most users will have a calculator with DCS7, a means of linking to cN, and a computer to link the calculator to. Furthermore, all of the parsing will be done on the computer, so the calculator program will actually be relatively small.
I think your design makes a lot of sense in terms of the limitations of the platform and minimizing what people need to build/buy/download/install in order to use WAti.

Quote:
In terms of this hardware you are planning to construct, which microchip/developer platform will you be using?
I'm equally curious about details.
  