People have mentioned here and there that it'd be neat to display JPEGs on the TI-84+CSE. Honestly I think trying to support the whole JPEG format, or even a subset, is too much effort, so I've been looking into making my own image encoder and a decoder to go with it.

I've got an encoder and decoder working on the computer. I haven't really optimized it for decoding on-calc yet; right now I'm just working on getting it to work properly. It encodes images in the RGB color space rather than YCbCr (which is apparently optimal for this kind of thing; I might look into that later), and it stores 8-8-8 color values.

The encoder follows the basic process of jpeg:
Input data -> Discrete Cosine Transform -> Quantization -> RLE encoding -> Huffman encoding.
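For concreteness, the first two stages of that pipeline can be sketched like this. This is illustrative Python only, not the actual encoder; it uses the standard orthonormal DCT-II on an 8x8 block followed by table-based quantization, which is how JPEG's transform stage works in general:

```python
import math

def dct_1d(vec):
    """Type-II DCT of an 8-element sequence (orthonormal scaling)."""
    n = len(vec)
    out = []
    for k in range(n):
        s = sum(vec[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """2-D DCT of an 8x8 block: 1-D DCT on every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def quantize(coeffs, qtable):
    """Divide each coefficient by the matching table entry and round.
    This is the lossy step: bigger table entries throw away more detail."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qtable)]
```

A flat block of identical pixels ends up with all its energy in the top-left (DC) coefficient, which is exactly what makes the later RLE + Huffman stages effective.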

I'll post some more info, maybe an executable, once I've done some more work on it. I'm going to see if I can improve the quality to file size ratio tomorrow I think.

Here are a few pictures with file sizes, all 320x240 pixels. Q goes from 0 to 100 and controls quality. Aside from the few sample images below, a full album with a few higher-quality encodings (up to Q = 100) can be found here: http://imgur.com/a/B4zQA

EDIT:
Right now I'm using the exact same quantization tables as libjpeg. If anyone knows of some better tables / better scaling techniques for them, let me know Smile

EDIT2: New code is now on GitHub
https://github.com/unknownloner/ImageCompress

Source


Q = 0, 11065 bytes


Q = 10, 24227 bytes


Q = 15, 30674 bytes


Q = 25, 40560 bytes
Very impressive! Nice work on this; it looks to me like even at a Q of 25 it is very near the original image, except for some slight edge pixelization. At Q = 100 it is just like the original! I know that I have stated the obvious, but great work nonetheless! As for your question, it may not be what you want, or even closely related to what you are asking, but libpng has a nice compression ratio.
MateoConLechuga wrote:
As for your question [...] libpng has a nice compression ratio.

Nah, what I'm looking for has nothing to do with PNG
Ah, okay. Knew I was way off. Smile
Very cool stuff, UnknownLoner, and definitely something that would be a boon to TI-84+CSE programmers and users. I feel that your Q value is a bit unintuitive at the moment, because it looks like a Q of 25 with your algorithm is equivalent to a Q of about 90 with JPEG encoding. Yes, YCbCr is better for JPEG, because you can remove more information with less visual compromise as far as the human eye can tell. To quote a StackExchange commenter:
ysap wrote:
the eye is more sensitive to changes in luminance (Y, brightness) than to changes in chroma (Cb, Cr, color). Thus, it is possible to erase some chroma information while retaining image quality.
In other words, it's better to have some of your colors be slightly wrong than to have blocks where the pixels are the wrong luminance/brightness. Good luck on this; I can't wait to see how it progresses. Since you haven't written the on-calc ASM decoder yet, you probably have no idea how big that routine will be, correct?
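As a side note, the usual RGB-to-YCbCr step that makes this possible is a fixed linear transform. This sketch uses the standard full-range JFIF/BT.601 weights (not anything from the encoder in this thread):

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range JFIF RGB -> YCbCr conversion (BT.601 weights).
    Y is brightness; Cb/Cr are color offsets centered on 128."""
    y  =       0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```

Any gray pixel (r = g = b) maps to Cb = Cr = 128, so the chroma planes of mostly-gray images are nearly constant and compress extremely well.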
Right, I have no idea how big the routine will be. Now that I have something actually working, my focus is going to be on improving the format and finding the best layout for easy on-calc display, as well as making the encoder more efficient, because right now it's really slow and it doesn't need to be.

Also, I agree the Q value is entirely unintuitive, but you can blame libjpeg for that. It's based on this section of code: http://hastebin.com/iyalehagab.cpp (see jpeg_set_quality at the bottom for the entry point).
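For anyone who doesn't want to read the linked C, the libjpeg scheme boils down to mapping Q onto a percentage scale factor and then scaling a base table with it. A rough Python rendition (the clamp range of 1..255 matches baseline JPEG; libjpeg also clamps Q itself to 1..100):

```python
def quality_scaling(quality):
    """libjpeg-style mapping from Q (1..100) to a percentage scale factor.
    Q=50 means 100% (use the base table as-is); lower Q grows fast."""
    quality = max(1, min(100, quality))
    if quality < 50:
        return 5000 // quality
    return 200 - quality * 2

def scale_quant_table(base_table, quality):
    """Scale each base table entry, clamping to the baseline range 1..255."""
    scale = quality_scaling(quality)
    return [max(1, min(255, (q * scale + 50) // 100)) for q in base_table]
```

Note how at Q = 100 the scale factor is 0, so every entry clamps to 1 and quantization barely discards anything, which is why those files balloon in size.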
The reason a lower Q value looks better with my algorithm might be a result of a few things:
1. Right now I'm encoding as RGB instead of YCbCr, which could affect it somehow
2. Because I'm not encoding in YCbCr, I'm not doing (and can't do) any chroma subsampling
3. Low resolution, so differences might be a little harder to see

It might be worth pointing out, though, that saying "Q of about 90 with JPEG encoding" is meaningless without knowing which program is actually encoding the JPEG, because every JPEG encoder has its own quantization tables and methods for scaling them based on the value of Q. That said, here's the same image encoded with Q = 25 in GIMP. If I had Photoshop I'd post an encoding with Q = 25 in Photoshop, and it'd probably look noticeably better (as well as being a larger file).


Other possibilities for quality settings would be to allow the user to specify a target file size, or to provide their own quantization matrix. The target file size option seems like the more usable of the two. I can also create a GUI that lets you mess with the settings interactively if you want to just scroll through qualities until you find one you like.
Hmm, my intuition on how that image would look at low JPEG quality was all wrong, then. I think that targeting a specific size rather than a somewhat-arbitrary quality factor makes the most sense for calculator programming, where the programmer probably has a maximum allowable size for art assets in mind.
I've done a complete rewrite of the encoder, based on the old one, but this time I've written it in Scala. This has the interesting advantage that I was able to trivially multi-thread the entire encoding process up to, but not including, the Huffman compression (which is already incredibly fast anyway). This, combined with some use of LUTs, has made the encoder very fast; fast enough that I was able to write the function mentioned earlier to target a specific size. The code guarantees that, while the result may not be exactly the target size, it will never be larger.

To give some hard numbers, all the images below were encoded on my machine in about 1.5 seconds total (which includes multiple re-encodes to reach the target file size). I could speed this up a bit more by reducing the accuracy when reaching the target file size, and that could be a command line flag.
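The actual Scala code isn't shown here, but one way to get the "never larger than the target" guarantee with repeated re-encodes is a binary search over the quality setting, keeping the best result that still fits. A Python sketch of that idea (assuming encoded size grows roughly monotonically with quality):

```python
def encode_to_target(encode, target_size, q_min=0, q_max=100):
    """Binary-search the quality setting so the encoded output never
    exceeds target_size. `encode(q)` is any function returning the
    encoded bytes for quality q (hypothetical stand-in here).
    Returns the highest-quality encoding that fits, or None."""
    best = None
    lo, hi = q_min, q_max
    while lo <= hi:
        mid = (lo + hi) // 2
        data = encode(mid)
        if len(data) <= target_size:
            best = data          # fits: try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1         # too big: lower the quality
    return best
```

This does O(log n) full re-encodes, which lines up with the "multiple re-encodes" mentioned above; loosening the search (stopping early) would trade accuracy for speed exactly as described.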

I've also switched to converting RGB to YCbCr before the compression, and I've changed my RLE encoding to only encode runs of zeroes, which ends up giving better compression.

All this has resulted in a lot more bang for your byte in terms of quality, so here are a few pics (same source image as before for a fair comparison). Again, these are all 320x240 resolution (full calc resolution).

8K: (looks better than the old Q=0 @ 11K)


12K:


16K (1 page):


24K:


32K (2 pages):
Very impressive results! Smile So your RLE encoding only encodes runs of zeroes? I am wondering how this is able to compress better; of course, I really have no idea the awesomeness of what I am seeing. Smile
There's rarely a run of non-zero values, and I'm encoding 16-bit values, so if I only encode runs of zeroes then a run takes 3 bytes (0, 0, count) and all other values take 2 bytes (msb, lsb). It also makes decoding simpler. If I were to run-length encode all numbers, a run would take 4 bytes (run-signal, msb, lsb, count).
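That byte layout can be sketched directly. This Python version is my own illustration of the scheme described above, with one added assumption: runs longer than 255 zeroes get split into multiple (0, 0, count) triples, since count is a single byte:

```python
def rle_encode(values):
    """Encode 16-bit values: a run of zeroes becomes (0, 0, count);
    any other value is stored big-endian as (msb, lsb)."""
    out = bytearray()
    i = 0
    while i < len(values):
        if values[i] == 0:
            run = 0
            while i < len(values) and values[i] == 0 and run < 255:
                run += 1
                i += 1
            out += bytes((0, 0, run))    # zero-run marker + length
        else:
            v = values[i] & 0xFFFF
            out += bytes((v >> 8, v & 0xFF))
            i += 1
    return bytes(out)

def rle_decode(data):
    """Inverse of rle_encode: (0, 0, n) expands to n zeroes."""
    out = []
    i = 0
    while i < len(data):
        hi, lo = data[i], data[i + 1]
        if hi == 0 and lo == 0:
            out += [0] * data[i + 2]
            i += 3
        else:
            out.append((hi << 8) | lo)
            i += 2
    return out
```

The nice property is that a literal value with a zero high byte (like 0x0005) is still unambiguous, because a zero value only ever appears behind the (0, 0) run marker.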
Ah, that makes sense then! Thanks for the explanation. Smile Wish I could put something more interesting here to say, but oh well. Smile Nice work!
Code now up on github if anyone is interested. https://github.com/unknownloner/ImageCompress
Figured I'd dump my brainstorming for an on-calc decoder implementation. This has quite a few details left out (such as the specifics of performing the inverse DCT and converting back to RGB), and may be a little disjointed since it's a bit stream-of-thought. Somebody may find it interesting though, or spot some flaws that should be corrected (or areas for improvement).

The following has been copy-pasted from the comments in my code file (which currently lacks any actual code)

Planning:
Image data goes through the following process to get encoded:
1. Read image data to 8x8 blocks of pixels
2. For each block:
- Convert from RGB to YCbCr
- Apply Discrete Cosine Transform
- RLE encode block, with YCbCr channels interleaved
3. Huffman encode the output RLE data

When stored to a file, the data has the following format:
64 bytes: Luminance quantization table
64 bytes: Chrominance quantization table
1 byte: Number of 8x8 blocks across the X axis
2 bytes: Size of huffman tree
N bytes: huffman tree
3 bytes: Size of data when decompressed. (May be irrelevant, if so remove from format)
? bytes: huffman data
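Walking that layout in order gives a straightforward header parser. This Python sketch follows the field list above exactly; the one assumption I've added is that the multi-byte size fields are little-endian (as is usual on Z80), since the post doesn't say:

```python
def parse_header(data):
    """Split a file laid out per the format above into its fields.
    Multi-byte sizes are assumed little-endian (not specified)."""
    lum_table   = data[0:64]                            # luminance quant table
    chrom_table = data[64:128]                          # chrominance quant table
    blocks_x    = data[128]                             # 8x8 blocks across X
    tree_size   = int.from_bytes(data[129:131], "little")
    tree        = data[131:131 + tree_size]             # serialized Huffman tree
    off = 131 + tree_size
    decomp_size = int.from_bytes(data[off:off + 3], "little")
    huffman_data = data[off + 3:]                       # rest is the bit stream
    return lum_table, chrom_table, blocks_x, tree, decomp_size, huffman_data
```

Everything up to the Huffman data is fixed-offset except the tree, so an on-calc decoder can locate the payload with a single 16-bit add.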


Therefore to display the image we should do the following:
1. Set the LCD window's Y values to cover the screen vertically. They will stay that way for the rest of the decoding process
2. Read luminance and chrominance table, store in fixed spot in memory
3. Read number of 8x8 blocks, store in memory. This will be used for wrapping to the next line later
4. Copy the huffman tree to a fixed location in memory
5. Display blocks

Displaying a block has the following requirements (starting from the end result and working backwards):
- We require the RGB data of each pixel in 5-6-5 format
- This RGB data must be converted from the YCbCr data
- To get the YCbCr data, we need to apply the inverse discrete cosine transform to the pixel data
- To apply the inverse DCT, we need to read the DCT values for every pixel in the 8x8 image block
- To read a DCT value, we need to read a value from the RLE decompression stream
- The RLE decompression stream should return a value and decrement a counter if in the middle of a run, or read a new value from the huffman stream
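The inverse DCT step in that chain is just the forward transform run backwards. As a reference-only sketch (the real routine would be Z80 ASM, and dequantization plus the level shift back to pixel range happen around this step):

```python
import math

def idct_1d(vec):
    """Inverse of the orthonormal type-II DCT on an 8-element sequence."""
    n = len(vec)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            s += scale * vec[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(s)
    return out

def idct_2d(block):
    """2-D inverse DCT of an 8x8 block: rows first, then columns."""
    rows = [idct_1d(r) for r in block]
    cols = [idct_1d(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

A DC-only block comes back as a flat field of pixels, which is why heavily quantized images degrade into visible 8x8 squares.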

Reading from the huffman stream requires walking the tree node by node. This means treating the data as a bit stream.
Each tree node has the following format:
- Leaf or Branch (1 = Leaf, 0 = Branch), 1 byte
+Branch:
- 2 bytes = size of left node's data (offset to right node's data)
- Left node data
- Right node data
+Leaf:
- Value, 1 byte
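To sanity-check that node format, here's a Python sketch of both sides: serializing a tree into those bytes and walking the serialized form bit by bit. The tuple representation, little-endian offset, and the "0 bit goes left" convention are my assumptions for illustration:

```python
def serialize(node):
    """Flatten a tree of ('leaf', value) / ('branch', left, right) tuples
    into the byte layout above: flag, then (for branches) a 2-byte
    left-subtree size followed by the left and right node data."""
    if node[0] == "leaf":
        return bytes([1, node[1]])
    left = serialize(node[1])
    right = serialize(node[2])
    return bytes([0]) + len(left).to_bytes(2, "little") + left + right

def decode_symbol(tree, bits, pos):
    """Walk the serialized tree one bit at a time (0 = left, 1 = right,
    assumed convention); returns (leaf value, new bit position)."""
    i = 0
    while tree[i] == 0:                            # branch node
        left_size = int.from_bytes(tree[i + 1:i + 3], "little")
        if bits[pos] == 0:
            i += 3                                 # descend into left child
        else:
            i += 3 + left_size                     # skip left, go right
        pos += 1
    return tree[i + 1], pos                        # leaf: its value byte
```

The point of storing the left subtree's size is visible in `decode_symbol`: taking the right branch is a single pointer add, so no recursion or node structures are needed on-calc.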

A basic bit stream is simple:
- 1 byte stores the current byte whose bits are being read. It's shifted left to read the next bit, shifting the bit into the carry flag
- Another byte stores a counter which counts down to 0. Once 0, it's reset to 8 and another byte is read from the data
++ This has a special case. For images larger than 16384 bytes, the bit stream needs to be aware of memory pages: it should automatically map in the next page of image data when it reaches the end of the current page. I'm unsure on the specifics of this, and it may be one of the last things I implement, since the other parts of the code don't need to be aware of it happening.
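Ignoring the paging special case, the bit stream described above maps to something this small. Python sketch only; the Z80 version would shift the byte into the carry flag instead of masking:

```python
class BitStream:
    """MSB-first bit reader: a current byte shifted left one bit at a
    time, plus a countdown that triggers a refill when it hits 0."""
    def __init__(self, data):
        self.data = data
        self.pos = 0       # index of the next byte to load
        self.cur = 0       # byte currently being shifted out
        self.count = 0     # bits remaining in `cur`

    def read_bit(self):
        if self.count == 0:            # counter hit 0: load the next byte
            self.cur = self.data[self.pos]
            self.pos += 1
            self.count = 8
        bit = (self.cur >> 7) & 1      # top bit (the Z80 carry-flag bit)
        self.cur = (self.cur << 1) & 0xFF
        self.count -= 1
        return bit
```

Page awareness would live entirely inside the refill branch (checking `pos` against the 16384-byte page boundary), which is why the rest of the decoder never needs to know about it.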

This is a pretty abstract overview of what needs to be done to decode an image,
but it may help to follow it when writing the first working version,
and then combine steps later to optimize.
  