Speed optimization in TI-BASIC can be fun. What are your favorite tricks for speeding up BASIC programs?

I'll start this off with two things I and runer112 noticed a few months ago:

* When doing multiplications, save up to 1.4 ms by putting the number with fewer significant digits on the right. That is, π2 rather than 2π.

* Usually, ² and ³ are faster than ^, but don't use the ³ function with complex numbers; it's just as slow as ^3. Use AnsAns² instead.
Bit manipulation is often used to make operations in assembly blazing fast. In TI-BASIC, the bits of a number can't be manipulated separately. Often, then, TI-BASIC programmers instead use lists of booleans.


Code:
{1,1,0,1,1,1,0,0,1}→L₁
{0,1,1,1,0,0,1,0,0}→L₂
not(L₁             # {0,0,1,0,0,0,1,1,0}
L₁(5               # 1
L₁ and L₂          # {0,1,0,1,0,0,0,0,0}
L₁ or L₂           # {1,1,1,1,1,1,1,0,1}
L₁ and not(L₂      # {1,0,0,0,1,1,0,0,1}
max(L₁ and not(L₂  # 1
L₁ xor L₂          # {1,0,1,0,1,1,1,0,1}


But there's a faster way. Instead of using list positions 1 through 9 to store the bits, use the first 9 primes, and multiply them all together into a squarefree integer. Here are the corresponding operations with squarefree-integer lists:


Code:
{2,3,5,7,11,13,17,19,23}→L₆   # first 9 primes
prod(L₆^L₁→A         # 2*3*7*11*13*23
prod(L₆^L₂→B         # 3*5*7*17
223092870/A          # 5*17*19
not(remainder(A,7    # 1
gcd(A,B              # 3*7
lcm(A,B              # 2*3*...*17*23
A/gcd(A,B            # 2*11*13*23
remainder(B,A        # some nonzero value
gcd(A,B):A/AnsB/Ans  # 2*5*11*13*17*23


Note especially the speed of the asymmetric set difference "remainder(B,A", which takes about 4 ms compared to over 20 for the corresponding boolean-list operations. But all of the operations are fast: gcd( and lcm( take 8 ms at most, which is significantly faster than using boolean-lists.

9 bits is well within the "integer" range of TI's floats: the largest number, representing a list of ones, is 223092870 ~= 2*10^8. This method of storing bit lists as squarefree integers can be extended to 11 bits by just using more primes (2*...*31 is 200560490130, which has the maximum of 12 digits). Beyond that, lists of bits can be split up into nice chunks of 8 or 9, stored as elements in a list, and the gcd( or whatever operation can be vectorized over the list.

Will update with full explanation and timings.


EDIT: Looks like gcd( is much slower than I thought. Worst case is around 65 ms, so it isn't a viable alternative. Only remainder( (5 ms or so worst case) seems useful out of these operations.
lirtosiast wrote:
In TI-BASIC, the bits of a number can't be manipulated separately.


They can, of course. It's just a roundabout way.

Here are some common bitwise operators using real variables (yes, I know I'm somewhat cheating):


Code:
L = bit length

BIT X OF A (0-indexed, starting from right, though easily modifiable)
int(2fPart(.5A/2^X

TURN ON BIT X OF A (...)
A+2^Xnot(int(2fPart(.5A/2^X→A

TURN OFF BIT X OF A (...)
A-2^Xint(2fPart(.5A/2^X→A

FLIP BIT X OF A (...)
A+2^I(1-2int(2fPart(.5A/2^X→A

SHIFT A LEFT BY X
2A→A

SHIFT A RIGHT BY X
int(.5A→A

NOT(A)
2^L-1-A

A AND B
sum(seq(2^Nint(2fPart(.5A/2^N))int(2fPart(.5B/2^N)),N,0,L-1

A OR B
sum(seq(2^Nmax(int(2fPart(.5A/2^N)),int(2fPart(.5B/2^N))),N,0,L-1

A XOR B
sum(seq(2^NfPart(.5int(2fPart(.5A/2^N))+.5int(2fPart(.5B/2^N)))2,N,0,L-1


Of course, if values are known, say for expressions with L, you can replace them accordingly (i.e. NOTing a 9-bit number would be equivalent to 2^9-1-A, or simply 511-A).

I am also sure that faster/smaller/simpler equivalents could be found without resorting to lists or prime products. Finding this will be left as an exercise to the reader.
JWinslow23: I wrote this piece of micro-optimized code a while ago to take the bitwise AND of two integers:

Quote:
Factoring out 2^-13 from the list helps with speed. Where E=2^-13 and L₂=2^(13-cumsum(binomcdf(31,0))), this is as fast as the matrix method:

Code:
//and (0.242 seconds)
sum(L₁round(min(fPart(AEL₂),fPart(BEL₂)),0))


The routine, as well as the ones for OR and XOR, can be sped up significantly more by just using boolean lists instead. Boolean lists are always faster than reals; the only drawback is the lack of shifting, which rarely applies.

I'm sad that my prime products idea didn't work... if only TI had made gcd( and lcm( a reasonable speed...
Wow. Thank you for that! That's actually pretty useful!

By the way, I had used some bit operations in a game I made called Bitty Bird (a 4x3 pixel Flappy Bird clone) (also, wow, just realized it's a triple-entendre), and I had used a real number as my "gameboard', as opposed to using a list, matrix, string, or worse yet, *shudders* ...using the graph as memory Razz

Quote:
...the only drawback is the lack of shifting, which rarely applies.

What are you talking about, Willis? TI-BASIC is Turing-complete! Anything computable is possible!


Code:
LEFT LOGICAL SHIFT
augment(ΔList(cumSum(L₁)),{0→L₁

LEFT ROTATE
augment(ΔList(cumSum(L₁)),{L₁(1→L₁

RIGHT LOGICAL SHIFT
dim(L₁)-1→dim(L₁
augment({0},L₁→L₁

ALTERNATE RIGHT LOGICAL SHIFT
augment({0},seq(L₁(A),A,1,dim(L₁)-1→L₁

RIGHT ROTATE
augment({L₁(dim(L₁))},seq(L₁(A),A,1,dim(L₁)-1→L₁

Those are the best ways I can think of without having a degree in mathematics Razz . It'd be interesting to have a full documentation of all of these methods and more, and their speeds and sizes.
Not of myself but RIGHT LOGICAL SHIFT is equal to

Code:
augment({0},DeltaList(cumSum(L1)-L1->L1
Quote:

Quote:
...the only drawback is the lack of shifting, which rarely applies.

What are you talking about, Willis? TI-BASIC is Turing-complete! Anything computable is possible!


Okay, lack of *efficient* shifting. This is, after all, the speed optimization thread.

PT_: That looks familiar. Maybe I wrote it, or maybe it's one of those routines that's discovered independently multiple times. In any case, I think it's faster than JWinslow's method since seq( has a couple of ms of overhead for each iteration.
You may have heard that finance and window variables are faster than real variables. This is true; they are faster both to access and to store to, because they're stored in fixed locations in memory.

All system variables take the same time to access AFAICT. However, they take different times to store to, because some things are done when some variables are stored to. For example, P/Y is automatically copied to C/Y. When Xmin, Xmax, DeltaX are stored to, another of them is updated so that Xmin+94DeltaX = Xmax. In addition, for some reason Xscl and Yscl are faster to store to than |N, by about the delay of one bcall.


Code:
Time for 1->[var]:

Variable     | Store time (ms)

Xscl, Yscl   | 2.20
|N,I%,PV,FV  | 2.30
C/Y          | 2.35
[recursiven] | 2.50
P/Y          | 2.65
X            | 2.80
Xmin         | ~7?
DeltaX       | ~6?
Xmax         | ~7?

Loop overhead has *not* been subtracted from the times. Note that Xscl and Yscl are very slightly faster than |N (but cause the graph screen to be redrawn the next time it is displayed), C/Y is very slightly slower, P/Y is slightly slower (enough to make a difference in a very tight loop) and Xmin, Xmax, DeltaX are much slower.

On the other hand, this can be taken advantage of. If for some reason you need to modify one copy of a variable while keeping another the same, store to P/Y rather than two different variables to save about 2 ms.

On the topic of non-finance variables, they of course get slower the farther back they are in the VAT.
Here are my times at the CE, measured with my own TIMER program:

Code:
[recursiven]:   1.7ms
|N:       1.5ms
I%:       1.6ms
PV:       1.7ms
PMT:      1.6ms
FV:       1.6ms
P/Y:      2.1ms
C/Y:      1.7ms
A:        1.9ms

where A already exists, as the second TIOS variable (without L1-L6) in the VAT.
Most of the time I took the average of more times testing, because it always gives me another result.

~ Done with CEmu, don't know if that makes some difference ~
  
Register to Join the Conversation
Have your own thoughts to add to this or any other topic? Want to ask a question, offer a suggestion, share your own programs and projects, upload a file to the file archives, get help with calculator and computer programming, or simply chat with like-minded coders and tech and calculator enthusiasts via the site-wide AJAX SAX widget? Registration for a free Cemetech account only takes a minute.

» Go to Registration page
Page 1 of 1
» All times are UTC - 5 Hours
 
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum

 

Advertisement