EDIT 2015-05-31:
The lz4 flag -Sx has been replaced by --no-frame-crc in recent versions. This post has been updated to reflect that change.
If you are still using an older version of lz4, use -Sx instead of --no-frame-crc.


This code can decompress data compressed with LZ4.
For example, I was able to compress my assembly 2048 game from 4923 bytes to 3754 bytes (the 3754 bytes include the decompression code). Most of that is sprite data being compressed of course, but it was also able to compress some of the actual code. Likely it was compressing LCD control code, since that can be a lot of the same instructions over and over. Decompression speed is fantastic, as this algorithm was designed for speed. The majority of the program is now getting decompressed at start and it still starts up quickly.
EDIT: Another interesting tidbit, calcuzap compressed from around 14,000 bytes to just under 5000 when I compress the .8xp. Obviously that's not executable, but it'd be possible to make a program to wrap it. You could allocate memory with the OS like normal or just copy the output code to another RAM page while maintaining addresses and just hope the program doesn't use that RAM page for anything :P

Once you have lz4 installed, you can compress your data using something like this:

Code:
lz4 -9 --no-frame-crc <sourcefile> <outfile>

Then include the resulting binary file in your program.

The --no-frame-crc flag tells it not to output stream checksums, as the decoder can't handle them. You must use this flag.

The -9 flag is to improve compression ratios slightly.
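
If you want to sanity-check a compressed file on the PC before including it, here is a minimal Python sketch (not part of the original routine). It assumes the standard LZ4 frame layout: a 4-byte magic number followed by a FLG byte whose bits mark optional fields. The decoder below skips a fixed 7 bytes of header, so any optional header field would throw it off. The function name `check_lz4_header` is made up for illustration.

```python
import struct

LZ4_MAGIC = 0x184D2204  # frame magic number, stored little-endian in the file


def check_lz4_header(data):
    """Return a list of reasons the file may not suit the Z80 routine (empty if none found)."""
    if len(data) < 7:
        return ["file is shorter than the 7-byte header"]
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic != LZ4_MAGIC:
        return ["not an LZ4 frame (wrong magic number)"]

    problems = []
    flg = data[4]  # FLG byte: optional-feature bits
    if flg & 0x10:
        problems.append("block checksums enabled")
    if flg & 0x08:
        problems.append("content size present (header longer than 7 bytes)")
    if flg & 0x04:
        problems.append("content checksum enabled (the post says to pass --no-frame-crc)")
    if flg & 0x01:
        problems.append("dictionary ID present (header longer than 7 bytes)")
    return problems
```

Usage would be something like `check_lz4_header(open("outfile", "rb").read())`, where `outfile` is whatever you passed to lz4. It does not verify the header checksum byte.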

Not all data is compressible of course. LZ4 will have worse ratios than gzip, but its primary advantage is that it's very fast and simple to decode.

This code has absolutely no error handling (handling of malformed/corrupt data). It assumes that your data is not corrupt. I could add error handling but I'm not really sure what I'd do to actually handle the error... If you'd like error handling, feel free to modify the code for that. The code is also unoptimized so that it's easier to understand; feel free to optimize it!

I think that this can handle up to 64K of data, but I'm not entirely sure. Realistically, if you can fit both the input and output data into the 64K address space of the Z80 you're probably fine.

The code currently targets the Z80 with Brass. I'll update this for eZ80 once I learn it & have the time (unless someone else does it first). It should only take a few minor edits to support data larger than 64K.

Also, I haven't really commented it that well, but I may go in later and comment a bit better. If you also read through these two documents, you may be able to get an understanding of it.


Please pardon my writing, as it's probably as detailed as I normally write for these sorts of things. I'm typing this at 5 AM, so I'm just trying to get this out there in a form that makes some sort of sense.


Code:
.module LZ4
;HL - Input buffer
;DE - Output buffer
DecompressLZ4Data:
;Skip header data
    ld bc,7
    add hl,bc
_decompBlocksLp:
    call _DecodeLZ4Block
    jr c,_decompBlocksLp
    ret


;Decode a block
;Returns with C if more blocks to decode, NC if end of data
_DecodeLZ4Block:
    ;Block size
    ld c,(hl) \ inc hl
    ld b,(hl) \ inc hl

    ;Return if length == 0 (EOF)
    ld a,b
    or c
    ret z

    inc hl
    ld a,(hl) \ inc hl ;If high bit == 1, uncompressed, else compressed
    rla                ;High bit -> carry (ld does not change flags)
    jr nc,{+}
    ;Not compressed, do a data copy
    ldir
    scf
    ret
+:
    ;Compressed, run decompression
    call _DecompressLZ4Block
    scf
    ret


;HL - Input buffer
;DE - Output buffer
;BC - Block length
_DecompressLZ4Block:
    push hl
    add hl,bc
    ex (sp),hl ; Stack = address directly after data end

_decompressLp:
    ld a,(hl) \ inc hl ;Sequence token
    push af
    ;===Decompress Literals===
    ;High 4 bits -> Low 4 bits
    rra
    rra
    rra
    rra
    call _ReadByteExtensionsIfNeeded ;BC = num literals
    ;If length is 0, no copying
    ld a,b
    or c
    jr z,{+}
    ldir
    +:
    ;===
    pop af

;If we've processed the input length, return
    pop bc
    or a
    sbc hl,bc
    add hl,bc
    ret z
    push bc

    ;===Decompress Matches===
    ld c,(hl) \ inc hl
    ld b,(hl) \ inc hl
    push bc ;Store offset from output

    call _ReadByteExtensionsIfNeeded ;BC = match length
    ;Add 4 because min is 4
    inc bc \ inc bc \ inc bc \ inc bc

    ex (sp),hl   ;HL = offset from output, (SP) = input buffer
    push de
    ex de,hl     ;HL = output, DE = offset
    or a
    sbc hl,de    ;HL = match start
    pop de       ;DE = output
    ldir
    pop hl       ;HL = input buffer
    ;===
    jr _decompressLp

;If A = 15, read & add byte extensions
;Otherwise, BC = A
_ReadByteExtensionsIfNeeded:
    and 0Fh
    ld b,0
    ld c,a
    cp 15
    ret nz
    -:  ld a,(hl) \ inc hl
        cp 255
        jr nz,{+}
        add a,c
        jr nc,$+3
        inc b
        ld c,a
        jr {-}
    +:
    add a,c
    jr nc,$+3
    inc b
    ld c,a
    ret
.endmodule
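
To check the routine's output on a PC, here is a rough Python model of the same steps: skip the 7-byte header, walk the blocks, and decode each token's literals and match. This is a sketch under assumptions, not a translation anyone has verified against the assembly. The names (`read_length`, `decode_block`, `decode_lz4_frame`) are made up, it reads the full 4-byte block size where the Z80 code only keeps the low 16 bits, and like the original it has no error handling.

```python
def read_length(data, pos, nibble):
    """Model of _ReadByteExtensionsIfNeeded: a nibble of 15 is extended by following bytes."""
    length = nibble
    if nibble == 15:
        while True:
            extra = data[pos]
            pos += 1
            length += extra
            if extra != 255:  # a byte below 255 ends the extension
                break
    return length, pos


def decode_block(data, pos, end, out):
    """Decode one compressed block occupying data[pos:end], appending to out."""
    while True:
        token = data[pos]
        pos += 1
        # High nibble: number of literal bytes to copy straight through
        lit_len, pos = read_length(data, pos, token >> 4)
        out += data[pos:pos + lit_len]
        pos += lit_len
        if pos >= end:  # the last sequence has literals only (the asm tests for equality)
            return
        # 2-byte little-endian offset back into the output
        offset = data[pos] | (data[pos + 1] << 8)
        pos += 2
        # Low nibble: match length, stored minus 4
        match_len, pos = read_length(data, pos, token & 0x0F)
        match_len += 4
        start = len(out) - offset
        for i in range(match_len):  # byte by byte, since matches may overlap the output
            out.append(out[start + i])


def decode_lz4_frame(data):
    """Decode a frame with a 7-byte header and no block checksums."""
    pos = 7
    out = bytearray()
    while True:
        size = int.from_bytes(data[pos:pos + 4], "little")
        pos += 4
        if size == 0:  # end mark
            return bytes(out)
        length = size & 0x7FFFFFFF
        if size & 0x80000000:  # high bit set: block is stored uncompressed
            out += data[pos:pos + length]
        else:
            decode_block(data, pos, pos + length, out)
        pos += length
```

Comparing `decode_lz4_frame(open("outfile", "rb").read())` against the original source file is a cheap way to confirm a file is in the shape the calculator routine expects.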
I've never played much with compression outside RLE, but I'll read about LZ4. Though I think someone should make it a summer (or afternoon) project to get some z80 assembly syntax highlighting into these code boxes (NanoWar did a great job with that over at RevSoft). Every ex af,af' and apostrophe in a comment causes long blocks of green text :P
Well done, Unknownloner! I find LZ4 very useful in cases when decrunching speed matters. If you're looking for the best compression ratio, then better stick with APack or some other LZ77/Huffman stuff.

I had another Z80 LZ4 decompressor (made by someone from the ZX Spectrum scene), but can't find it now in the garbage pile that is my hard drive. I'll post it here if I manage to dig it up.
Yeah, speed was definitely one of the factors in picking lz4 since I'm primarily using it to decompress data when my programs start up. Slow start times are something I want to avoid if possible.

Also, if anyone is looking for alternatives to this, check out this topic for a similar compression / decompression algorithm
http://www.cemetech.net/forum/viewtopic.php?t=11292
The decompression code is considerably smaller and it yields similar compression ratios given the same data.
The compressor is somewhat slower.
I don't know if the decompression is faster, slower, or nearly equivalent.
I prefer lz4 because it's pretty easy to install with a system package manager, but take your pick based on where your priorities are.
Maybe someone can make this a DoorsCS shell extension?
I found the "other" Z80 LZ4 decompressor implementation I was talking about.
For those who are curious, here it is:
http://www.worldofspectrum.org/forums/discussion/comment/728223/
  