Here's my code to sort a list of 256 numbers known to be between 1 and 1000000 in 11 seconds. The range can be extended arbitrarily, but it won't be efficient if many elements are bunched up in a few buckets, because the insertion sort then has to do most of the work.

It's bucket sort followed by insertion sort. It's still significantly slower than SortA(, which takes only 3 seconds despite using the slower selection sort algorithm. That's the power of assembly: much less overhead.

**Code:**
```
//L2 is end indices for each of 256 buckets
//L3 is which bucket the corresponding number in L1 goes in
//L4 is the bucketed (almost-sorted) copy of L1
42->rand
randInt(1,1000000,256->L1
L1->L5
SortA(L5 //sorted list in L5
startTmr->T
not(L1->L2 //all zeros
L2->L4 //initialize L4
1+int((L1-1)*256/|E6->L3 //now all entries of L3 are integers between 1 and 256 (the L1-1 keeps 1000000 out of a nonexistent bucket 257)
Disp checkTmr(T
//bucket sort, 5 seconds:
For([recursiven],1,256
1+L2(L3([recursiven]->L2(L3([recursiven]
End
//L2 holds the size of each bucket
cumSum(L2->L2
//now L2 holds the end index of each bucket, so we can put the numbers in the buckets:
For([recursiven],1,256
L3([recursiven]->I%
L1([recursiven]->L4(L2(I%
L2(I%)-1->L2(I%
End
//the almost-sorted list is in L4
Disp checkTmr(T
L4->L1
//insertion sort, 4 seconds:
For([recursiven],2,dim(L4
//n-1 elements have been sorted already
//example n=4: {4 6 8 2 10 ...}
[recursiven]->I%
While L1(I%)<L1(max(1,I%-1) //prevent error at 1. Never true with I%=1
L1(I%)->PV //swap L1(I%) and L1(I%-1)
L1(I%-1->L1(I%
PV->L1(I%-1
I%-1->I%
End
End
Disp checkTmr(T
Disp min(L5=L1 //1 if the result matches SortA(
```

By the way, [recursiven] is a variable in Seq mode that, like all system variables, is slightly faster to access than capital-letter variables, but like capital-letter variables can be used in For( loops.
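
For anyone who wants to follow the algorithm without a calculator, here is a rough Python sketch of the same three passes: count the bucket sizes, place each number at the end of its bucket, then finish with insertion sort. It is a translation for illustration, not the program I timed. The function name `bucket_then_insertion_sort` and its parameters `lo`, `hi` and `buckets` are made up for this sketch, the lists are 0-based instead of 1-based, and the bucket index uses integer arithmetic so that the largest value lands in the last bucket.

```python
import random


def bucket_then_insertion_sort(values, lo=1, hi=1_000_000, buckets=256):
    """Bucket pass followed by insertion sort (illustrative sketch only)."""
    n = len(values)

    # Bucket index of each value, 0-based (this plays the role of L3).
    idx = [(v - lo) * buckets // (hi - lo + 1) for v in values]

    # Count how many values fall in each bucket (this plays the role of L2).
    ends = [0] * buckets
    for b in idx:
        ends[b] += 1

    # Running total, like cumSum(: ends[b] is now one past the last slot of bucket b.
    for b in range(1, buckets):
        ends[b] += ends[b - 1]

    # Drop each value into the last free slot of its bucket (this plays the role of L4).
    out = [0] * n
    for v, b in zip(values, idx):
        ends[b] -= 1
        out[ends[b]] = v

    # The list is now sorted between buckets but not within them,
    # so insertion sort only has to move elements short distances.
    for i in range(1, n):
        j = i
        while j > 0 and out[j] < out[j - 1]:
            out[j], out[j - 1] = out[j - 1], out[j]
            j -= 1
    return out


if __name__ == "__main__":
    random.seed(42)
    data = [random.randint(1, 1_000_000) for _ in range(256)]
    print(bucket_then_insertion_sort(data) == sorted(data))
```

With 256 numbers spread evenly over 256 buckets, most buckets hold only one or two values, which is why the insertion sort pass stays cheap. If the numbers cluster in a few buckets, that last loop approaches its quadratic worst case.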