(Our sheets have been mentioned and discussed in plenty of places over the past year, but only recently did I realize that they don't have a dedicated thread. So, here we are.)
The token sheets are XML files containing copious technical information about every token on the 8X calcs. The sheets were derived from and inspired by TokenIDE's sheets, but are intended to be a more generic reference tool for tokenization and other tasks.
Unlike programs written on a computer, which are made of characters, TI-BASIC programs are composed of tokens: indivisible chunks of bytes that correspond to the many commands you see on-calc. If your cursor "skips" over it in the editor, it's a token. Tokenization is the process of converting characters (which are easy to write on your computer) into tokens (which the calculator understands). Detokenization is the reverse process of rendering all the bytes in a TI-BASIC program as text people can read.
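To make the byte-to-text direction concrete, here is a minimal detokenization sketch. The byte values follow the widely documented 83-series assignments (e.g. `Disp` is a single byte), but the tiny table below is for illustration only; the real sheets cover every token on every model.

```python
# Minimal detokenization: map each token byte back to its display name.
# This toy table holds just enough tokens to render one statement.
TOKENS = {
    b"\xde": 'Disp ',  # the Disp command (one byte, despite its length)
    b"\x2a": '"',
    b"\x48": 'H',
    b"\x49": 'I',
}

def detokenize(data: bytes) -> str:
    """Render a stream of token bytes as readable text."""
    out = []
    i = 0
    while i < len(data):
        # Real token tables also contain two-byte tokens (e.g. those
        # behind the 0xBB or 0xEF prefix bytes); a full implementation
        # checks for a prefix byte before doing a one-byte lookup.
        out.append(TOKENS[data[i:i + 1]])
        i += 1
    return "".join(out)

print(detokenize(b"\xde\x2a\x48\x49\x2a"))  # Disp "HI"
```

Note how five bytes round-trip to a statement that a person would type as nine characters; that asymmetry is exactly why the sheets record an accessible name per token.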
Both of these tasks require complete knowledge of the token set, across time and across every language you wish to support, so that future token changes don't break existing programs. Furthermore, tokenization requires additional care to deal with ambiguous and unintended parses. The sheets' comprehensive token information and reference [de]tokenization implementations (discussed below) are thorough and carefully constructed to support a variety of use cases.
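The usual way to resolve ambiguous parses is greedy longest-match tokenization: at each position, prefer the longest token name that matches, so that `sin(` becomes one token rather than four. The sketch below illustrates the idea; the single-letter entries are hypothetical placeholders (on a real calculator, lowercase letters are two-byte tokens), so don't read the byte values as authoritative.

```python
# Greedy longest-match tokenization over a toy name table.
NAMES = {
    "sin(": b"\xc2",  # the single sin( token
    "(": b"\x10",
    "X": b"\x58",
    ")": b"\x11",
    # Hypothetical entries to make the ambiguity concrete:
    "s": b"\x00",
    "i": b"\x01",
    "n": b"\x02",
}

def tokenize(text: str) -> bytes:
    out = bytearray()
    i = 0
    while i < len(text):
        # Prefer the longest match so "sin(" wins over "s".
        match = max(
            (name for name in NAMES if text.startswith(name, i)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"no token matches at position {i}: {text[i:]!r}")
        out += NAMES[match]
        i += len(match)
    return bytes(out)

print(tokenize("sin(X)").hex())  # c25811, not 000102105811
```

Longest-match is only the simplest policy; handling *unintended* parses (where the greedy answer is not what the author meant) is part of the extra care the sheets' reference implementations take.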
Every token's history is kept in one file, and changes like token addition, removal, and renaming are provided in an easily parsable way. This allows applications to use just one sheet while targeting anything from the earliest 82 to the latest CE. Each token tag includes:
- A full linear history
- An accessible detokenization name (how do people generally type this token on a keyboard?)
- Other extant variants, which are names recognized by other tools
- Display information that captures how the token is rendered on a calculator (both in TI-Font bytes and as Unicode approximations for convenience)
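To show how those four pieces of information might hang together, here is a sketch that parses a token entry with Python's standard `xml.etree`. The tag and attribute names in the snippet are invented for illustration and do not reproduce the real sheet schema; consult the actual sheets for that.

```python
import xml.etree.ElementTree as ET

# Hypothetical token entry loosely modeled on the bullets above:
# a version history, an accessible name, a variant, and display info.
SNIPPET = """\
<token value="$AD">
  <version since="TI-82">
    <lang code="en">
      <name accessible="getKey">getKey</name>
      <variant>GetKey</variant>
      <display unicode="getKey"/>
    </lang>
  </version>
</token>
"""

token = ET.fromstring(SNIPPET)
name = token.find("./version/lang/name")
print(token.get("value"), name.get("accessible"))  # $AD getKey
```

Because every token's history lives in one entry, a consumer can pick the right name for its target model and language with a single query like the `find` above.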
We highly encourage you to use the sheets, as we strive for them to be a valuable and powerful standard reference for the 82/83/84-series calculators. We've provided several straightforward ways to use them in your next project:
- Clone the main branch directly (for easy scripting using the provided Python scripts, which we guarantee will provide a stable API even if the sheets change1).
- Clone the built branch, which contains validated copies of the sheets, as well as the same data in JSON format. We also produce a TokenIDE-compatible sheet for the latest version of every calculator model. Adding this branch as a git submodule is recommended for using the sheets in any long-term project.
- For Python projects which just need [de]tokenization, tivars_lib_py's tokenizer is extremely robust and has first-class support for these sheets.
- For Rust projects, the titokens crate has a complete sheet parser and a working detokenizer, but its tokenizer is still half-baked.
Errors, issues, and suggestions are more than welcome in this thread and can also be directed to GitHub or our Discord server. This project is nearly two years old, and we continue to include more information as it becomes available. Our next big projects with the sheets are adding translations and developing a new sheet for font information; we would love your help!
This project had many contributors:
- Every contributor to the original TokenIDE sheet (Kerm, merth, and tifreak were listed on the version we used).
- iPhoenix and kg583, who populated the sheets and provided a reference implementation and core ideas for sheet layout (most of which have survived several revisions).
- Tari, who helped us cleanly incorporate more data.
- LogicalJoe, who uncovered and fixed many token history omissions.
- The rest of the Toolkit crew (particularly Adriweb and womp), who helped spot errors and contributed to discussions that helped shape the final form of these sheets.
1This post marks sheet version 1.0; we are extremely unlikely to change the format again in a backward-incompatible way.