Hey there, it's ya gender non-specific diminutive Zeda, here, and today we'll be looking at the Fisher-Yates shuffle algorithm and just how freaking efficient it can be for shuffling a list. For reference, it takes one second to shuffle a 999-element list at 6MHz, and if that ain't the way your deity intended it, I don't know what is.

First, how do we shuffle L1 in BASIC?

Code:
``` rand(dim(L1->L2 SortA(L2,L1 ```

This is a super clever algorithm, but slow as heck as the lists get bigger. Plus, it uses an extra list of the same size, wasting precious RAM. So how does the Fisher-Yates algorithm work? You start at the last element. Randomly choose an element up to and including the current element and swap them. Now move down one element and repeat (so now the last element is off limits, then the last two, et cetera). Repeat this until there is one element left.

This is easy to perform in-place, and it performs n-1 swaps, making it significantly faster than the BASIC algorithm above. In fact, let's implement it in BASIC:

Code:
``` dim(L1->N For(K,N,2,-1 randInt(1,K->A L1(K->B L1(A->L1(K B->L1(A End ```

This takes approximately 37.5 seconds to sort a 999 element list. I don't even have the RAM needed to test the regular method, but extrapolating, it would take the "normal" method approximately 73 seconds for 999 elements. So basically, the Fisher-Yates algorithm is actually faster even in TI-BASIC (after about 400 elements, though).

So without further ado, the assembly code!

Code:
``` ;Randomizes a TI-list in Ans _RclAns= 4AD7h seed1  = \$80F8 seed2  = \$80FC seed1_0=seed1 seed1_1=seed1+2 seed2_0=seed2 seed2_1=seed2+2 #define bcall(x) rst 28h \ .dw x .db \$BB,\$6D .org \$9D95 ; Put it into 15MHz mode if possible!   in a,(2)   add a,a   sbc a,a   out (20h),a ; Initialize the random seed   ld hl,seed1   ld b,7   ld a,r _:   xor (hl)   ld (hl),a   inc hl   djnz -_   or 99   or (hl)   ld (hl),a   ; Locate Ans, verify that it is a list or complex list   bcall(_RclAns)   ex de,hl   ld c,(hl)   inc hl   ld b,(hl)   inc hl   ld (list_base),hl   dec a   jr z,+_   sub 12   ret nz   dec a _: ;A is 0 if a real list, -1 if complex ;HL points to the first element ;BC is the number of elements   and \$29     ;make it either NOP or ADD HL,HL   ld (get_complex_element),a   sub 29h   sbc a,a ;FF if real, 00 if complex   cpl   and 9   add a,9   ld (element_size),a shuffle_loop:   push bc   push bc   call rand   pop bc   ex de,hl   call mul16   dec bc   ;swap elements DE and BC     call get_element   push hl   ld d,b   ld e,c   call get_element   pop de   call swap_elements     pop bc   dec bc   ld a,c   dec a   jr nz,shuffle_loop   inc b   dec b   jr nz,shuffle_loop   ret swap_elements: ;HL and DE point to the elements element_size = \$+2   ld bc,255 _:   ld a,(de)   ldi   dec hl   ld (hl),a   inc hl   djnz -_   ret get_element: ;Input: ;   DE is the element to locate ;Output: ;   HL points to the element   ld l,e   ld h,d   add hl,hl   add hl,hl   add hl,hl   add hl,de get_complex_element:   nop list_base = \$+1   ld de,0   add hl,de   ret rand: ;Tested and passes all CAcert tests ;Uses a very simple 32-bit LCG and 32-bit LFSR ;it has a period of 18,446,744,069,414,584,320 ;roughly 18.4 quintillion. ;LFSR taps: 0,2,6,7  = 11000101 ;291cc ;Thanks to Runer112 for his help on optimizing the LCG and suggesting to try the much simpler LCG. On their own, the two are terrible, but together they are great.     ld hl,(seed1)     ld de,(seed1+2)     ld b,h     ld c,l     add hl,hl \ rl e \ rl d     add hl,hl \ rl e \ rl d     inc l     add hl,bc     ld (seed1_0),hl     ld hl,(seed1_1)     adc hl,de     ld (seed1_1),hl     ex de,hl ;;lfsr     ld hl,(seed2)     ld bc,(seed2+2)     add hl,hl \ rl c \ rl b     ld (seed2_1),bc     sbc a,a     and %11000101     xor l     ld l,a     ld (seed2_0),hl     ex de,hl     add hl,bc     ret mul16: ;BC*DE   ld hl,0   ld a,16 mul16_loop:   add hl,hl   rl e   rl d   jr nc,+_   add hl,bc   jr nc,+_   inc de _:   dec a   jr nz,mul16_loop   ret ```

It isn't perfect, but it is pretty good and importantly, it is fast! The biggest problem is in the random number generator, but even that is still pretty good for this application.

I might make a version for the ez80 calcs, but if somebody else wants to, it should be easy to adapt (and I think it'd be cool)!

Well done! I'm all for beating TI at their own game.
It really goes to show how inefficient the O(n^2) "modified selection sort" of SortA( is that the BASIC version can spend something like 98.7% of its time in calls to rand and interpreter overhead and still be twice as fast as SortA(.
squishy wrote:
Well done! I'm all for beating TI at their own game.

The bar isn't very high

Before anyone complains, yes I know these errors are because of tolerances and rounding errors.

Register to Join the Conversation
Have your own thoughts to add to this or any other topic? Want to ask a question, offer a suggestion, share your own programs and projects, upload a file to the file archives, get help with calculator and computer programming, or simply chat with like-minded coders and tech and calculator enthusiasts via the site-wide AJAX SAX widget? Registration for a free Cemetech account only takes a minute.

»
» All times are GMT - 5 Hours

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum