I looked at DCS's iFastCopy routine in Calcsys, and I think it could be optimized further. Here's my code:
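For one thing, there are no pushes or pops. For another, each byte is output as soon as the LCD reports that it is ready. And outi (16 T-states) is faster than loading (hl) into a and then outputting a to port $11 (7 + 11 = 18 T-states), and it also increments hl and decrements b for free.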
Code:
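    di                      ; interrupts off: the routine uses the shadow AF register
fastloop1:
    in a,($10)              ; read the LCD status port
    rla                     ; bit 7 (busy flag) -> carry
    jr c,fastloop1          ; wait until the LCD is ready
    ld a,$80
    out ($10),a             ; set the LCD row to 0 (assumes row auto-increment mode)
    ld hl,plotsscreen-11    ; biased so the first add hl,de lands on plotsscreen
    ld de,11                ; outi adds 1 more, giving 12 bytes per screen row
    ld c,$11                ; LCD data port, used by outi
    ld a,$20                ; command for column 0
fastloop2:
    ex af,af'               ; park the column command in the shadow A
fastloop3:
    in a,($10)
    rla
    jr c,fastloop3          ; wait until the LCD is ready
    ex af,af'               ; get the column command back
    out ($10),a             ; select the column
    ex af,af'               ; park it again while A is used for polling
    ld b,64                 ; 64 rows per column
fastloop4:
    add hl,de               ; step to the same column in the next row
fastloop5:
    in a,($10)
    rla
    jr c,fastloop5          ; wait until the LCD is ready
    outi                    ; out (c),(hl) / inc hl / dec b
    jr nz,fastloop4         ; repeat until b = 0
    ex af,af'               ; get the column command back
    dec h
    dec h
    dec h                   ; hl -= 768 (64 rows * 12 bytes): back to the top row
    inc l                   ; next byte column (assumes l does not wrap)
    inc a                   ; next column command
    cp $2C                  ; done once past the last column ($2B)
    jr nz,fastloop2
    ei
    ret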
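
To put rough numbers on the outi point, here is a side-by-side sketch of the two ways to write one byte per loop pass. The labels byteloop_a and byteloop_b are made up for illustration, and version (a) is a generic load-and-output loop, not DCS's actual code. The LCD busy wait and the add hl,de stride are left out because they cost the same either way, so neither fragment is safe to run against the LCD as written. T-state counts are the standard Z80 figures.

```
; Sketch only: hypothetical labels, not DCS's actual code.
; Both fragments assume HL points at the next byte and B holds the byte count.

; (a) load-and-output form
byteloop_a:
    ld   a,(hl)         ;  7 T-states: fetch the byte
    out  ($11),a        ; 11 T-states: write it to the LCD data port
    inc  hl             ;  6 T-states: advance the pointer
    djnz byteloop_a     ; 13 T-states when the branch is taken
                        ; total: 37 T-states per byte

; (b) outi form, which additionally assumes C = $11
byteloop_b:
    outi                ; 16 T-states: out (c),(hl), then inc hl and dec b
    jr   nz,byteloop_b  ; 12 T-states when the branch is taken
                        ; total: 28 T-states per byte
```

Under those assumptions the outi form saves 9 T-states per byte, or about 6,900 T-states over a full 768-byte screen copy, provided the time spent waiting on the LCD busy flag does not absorb the difference.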