I've spent a lot of time refactoring the program to use static memory. I've made a huge amount of progress and all the caching and image data is now stored in user memory. Frankly the whole refactor hasn't been very painful and I think I can resolve the visual bugs without too much trouble.

I've managed memory well enough that even 16bpp images can now be scaled up to 400%, rather than the previous measly 125%.

For some reason I forgot that graphx limits sprites to 255x255, so I couldn't test native full-screen sprites. I did test 255x191 and I'm currently getting 17fps. I'll have to split each gif frame in half, just like I did way back in HDPIC v1. I expect I'll get somewhere around 13fps at the full 320x240 resolution. Hopefully I can find some trick to squeeze out more fps, but the code is already pretty minimal.
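For anyone curious what the split looks like, here's a rough desktop C sketch. The post doesn't say which way the frame is cut, so the left/right split below is my assumption (it keeps both halves under the 255-pixel limit in both dimensions, since 320 wide would exceed it either way); `split_frame` is a hypothetical name, and the real splitting happens in the converter.

```c
#include <stdint.h>
#include <string.h>

#define WIDTH  320
#define HALF_W (WIDTH / 2)
#define HEIGHT 240

/* Split one 320x240 8bpp frame into left and right 160x240 halves,
   so each half fits under graphx's 255x255 sprite size limit. */
static void split_frame(const uint8_t *frame,
                        uint8_t *left, uint8_t *right)
{
    for (int y = 0; y < HEIGHT; y++) {
        /* copy one row's worth of each half */
        memcpy(left  + y * HALF_W, frame + y * WIDTH,          HALF_W);
        memcpy(right + y * HALF_W, frame + y * WIDTH + HALF_W, HALF_W);
    }
}
```

The calculator then just draws the two halves side by side at x=0 and x=160.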

I forgot to mention that switching to static memory has also fixed a crash when playing long GIFs. I used to cache all the pointers to image data on the heap, which filled up at around 150 frames and crashed immediately. With static memory it's been way more stable.
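The fix boils down to moving the pointer cache from `malloc`'d heap space into a fixed static array that can fail gracefully when full. A minimal sketch, assuming a hypothetical `MAX_FRAMES` capacity and `cache_frame` name (the real limit depends on how user memory is laid out on the CE):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; the real limit depends on the CE's
   user-memory layout. */
#define MAX_FRAMES 2048

/* Frame-pointer cache in static storage instead of the heap, so a
   long GIF hits a clean "cache full" error instead of a crash. */
static uint8_t *frame_cache[MAX_FRAMES];
static size_t frame_count = 0;

/* Returns 0 on success, -1 once the cache is full. */
static int cache_frame(uint8_t *frame_data)
{
    if (frame_count >= MAX_FRAMES)
        return -1;
    frame_cache[frame_count++] = frame_data;
    return 0;
}
```

The nice part is that the caller can check the return value and stop loading frames instead of the heap silently running out partway through a GIF.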

The calculator doesn't have enough Archive to store the entirety of Bad Apple (when will CEmu support the 12mb mod? 🥷), and there are many missing in-between frames. But when there are enough consecutive frames it looks really good! This was limited to 10fps at half resolution.



I tried some optimizations on the GIF loop and improved it very slightly (mostly imperceptible). I tried adding blitting to see if it would improve the perceived smoothness of GIFs but it added way more overhead for barely an improvement. Here the white calc is blitting and the red one isn't.



Todo:
- About those missing frames: occasionally Convimg will output an invalid file where the internal name doesn't match the file name. This happens about 1 in 40 times. I'll have to see if an update fixes this, but I may need to make an issue. (Mateo already fixed this 5 days ago 😄)
- I need to edit the converter to split each full-res image in half.
- I'd like to store each half of the same frame in the same appvar. It looks like Convimg supports this thankfully.
- Eventually switch to GraphY and see if it impacts performance too much. (Maybe it can make blitting worth it?)
- Update the delete function so it can properly delete GIFs.
I didn't expect to revisit this so soon, but I've hit a problem with zx0 compression: it's just too darn slow to compress a long GIF. I never noticed the speed problem with normal pictures since each one only uses 12 images of less than 4kb apiece. With each gif frame now being ~30kb, however, compression takes 1 minute per frame (single-threaded), and that's only at 255x191 resolution.

Compression Tests

For context, here's some compression comparisons of two different GIFs (multi-threaded): Bad Apple and Rick Astley.

Notes:
- Bad Apple: 2,100 frames, greyscale. Tested on a 6-core desktop Xeon CPU from 2019.
- Rick Astley: 45 frames, color. Tested on a 4-core laptop i7 CPU from 2019.

zx0:
- Bad Apple took 5 hours. Final size: 6.81mb
- Rick Astley took 20 minutes. Final size: 1.25mb

zx7:
- Bad Apple: 25 minutes. Final size: 7.39mb (8% worse than zx0)
- Rick Astley: 2 minutes. Final size: 1.28mb (2% worse than zx0)

lz4:
- Bad Apple: (untested)
- Rick Astley: 1 minute. Final size: 1.62mb (30% worse than zx0)


It's clear zx7 and lz4 have a HUGE advantage when it comes to compression time. However, that doesn't mean much if they're slower to decompress, so I went full data-analysis mode and timed every gif frame.

Decompression Tests

lz4 doesn't have a decompress function in the toolchain so it couldn't be tested.


(First 300 frames of Bad Apple and all 45 frames of Rick Astley. Click for full size)

This just reinforced that zx0 is still the king of decompression. It averaged 17.3 fps for Bad Apple and 7.6 fps for Rick Astley, making it 14% faster than zx7 for Bad Apple and 29% faster for Rick Astley. That makes a huge difference, especially at the low frame rates we're dealing with.

The FPS tracks with how much new data a frame contains. Frames with more new data are slower to decompress and display.

Conclusion

Even with zx0, I'm only getting 7.6 fps for a full color GIF, which is well below my target of 10fps. And I'm currently only at 255x191, which is about 37% less data than the full 320x240 I'm aiming for. This means I'm going to have to take a step back and re-evaluate how GIFs work.
  1. I could reduce frame quality further. The more data that's shared between frames, the higher the FPS.
  2. Even though Mateo doesn't recommend lz4, I'm curious what its decompression speed is. The poor compression ratio might still be "good enough" if the decompression speed is significantly better.
  3. I could reduce GIFs resolution back to half-res and test different methods of scaling sprites like using ScaleSprite() this time.
  4. I could reduce GIF resolution to the highest possible while still maintaining 10fps.

I'm curious which option you guys like best!
How about just skipping more frames.
I'm not skipping any frames currently. Skipping frames does technically improve fps, but only by sacrificing slow frames and simply not displaying them, which would just make GIFs jittery. The frames most likely to be skipped are the ones with large scene changes; if those get skipped, you get severe ghosting artifacts like in the Bad Apple gif above.

On a different note, a potential solution for slow compression time is to use a different method made by emmanuel-marty. It would require someone to actually implement their method into Convimg though. I made an issue about this here: https://github.com/mateoconlechuga/convimg/issues/110
OK there's a multitude of solutions I'll be trying.
  1. Remove transparency. This improved performance by ~50%.
  2. Keep GIF frames at 160x120. This will keep (de)compression time as low as possible.
  3. I really didn't want to do this, but I don't see any other option: make my own graphx library.
My custom library would drop the flexibility that graphx provides, but would significantly improve performance. Many aspects of pictures and GIFs are known and constant; for example, GIF frames will always be 160x120 and never need to be clipped. This means that instead of calling ` gfx_ScaledSprite_NoClip(sprite,0,0,2,2); ` I could do something more like ` hdx_DoubleScaleSprite_NoClip(sprite); `, which would always double the size of a sprite and display it at 0,0.
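In plain C, the specialized routine boils down to something like the sketch below. This is only an illustration of the idea on a desktop buffer; the real function is ez80 assembly writing to VRAM, and the destination layout here is just an assumption of a flat 320-wide 8bpp buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SRC_W 160
#define SRC_H 120
#define DST_W 320

/* C sketch of hdx_DoubleScaleSprite_NoClip: double a 160x120 8bpp
   sprite to 320x240 at (0,0), with no clipping checks. Each source
   pixel is written twice per row, then the finished doubled row is
   duplicated, so every source pixel covers a 2x2 block. */
static void double_scale_noclip(const uint8_t *src, uint8_t *dst)
{
    for (int y = 0; y < SRC_H; y++) {
        uint8_t *row = dst + (size_t)(2 * y) * DST_W;
        for (int x = 0; x < SRC_W; x++) {
            row[2 * x]     = src[x];
            row[2 * x + 1] = src[x];
        }
        /* duplicate the finished doubled row */
        memcpy(row + DST_W, row, DST_W);
        src += SRC_W;
    }
}
```

Because the size, position, and clipping behavior are baked in, all the per-call bounds math that a general-purpose `gfx_ScaledSprite_NoClip` has to do simply disappears.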

I was avoiding this because although I've done some ARM assembly in the past, I've never touched ez80. With the help of Gemini (ChatGPT and Claude were not much help), I've created the function I mentioned above. This worked great! The new function is nearly twice as fast as the graphx function. This has gotten GIFs back up to 20fps even in full screen! Code: https://pastebin.com/pPZY0bxH

You can see what each optimization did. Original: 7.7 fps


Remove Transparency: 12.3 fps

(The magenta is where transparency would be).

Custom Library: 19.8 fps


So... I guess I'll be making my own library. We'll see how this goes!
Actually this wasn't that hard for the most part! I simply copied graphx and removed everything I didn't need. The library is called HDLib and gets automatically picked up by LibLoad. It's only 330 bytes.
https://github.com/TheLastMillennial/toolchain/tree/master/src%2Fhdlib

Currently the functions are still row-major which means the diagonal line is present. I'll try to adjust the functions to be column-major to avoid this.
For those who haven't seen already, I'm running a contest to squeeze every bit of performance out of my custom drawing routines: https://www.cemetech.net/forum/viewtopic.php?t=20837

Today I worked on rewriting the converter from scratch. I just used an LLM to make a basic cross-platform UI with dummy functions. Then I went in and ported all the old code to the new UI! I'm still using C# but the UI is Avalonia so it will work on Windows, Mac, & Linux. The code is also much more reliable and maintainable now!

With the new UI I've added a new setting for GIFs that lets you adjust how many bits each color channel will be limited to. This makes pixel colors more similar to each other which improves compression and playback speed.
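Limiting each channel to N bits is essentially rounding away the low bits. A sketch of one way to do it (the function name is hypothetical, and whether the converter replicates bits like this is my assumption; replication keeps pure white at 0xFF instead of darkening it):

```c
#include <stdint.h>

/* Quantize one 8-bit channel down to `bits` significant bits
   (bits in 1..8). The kept high bits are replicated into the low
   bits so the output still spans the full 0..255 range. */
static uint8_t quantize_channel(uint8_t v, int bits)
{
    if (bits >= 8)
        return v;
    uint8_t q = v & (uint8_t)(0xFF << (8 - bits)); /* keep top bits */
    uint8_t out = q;
    for (int shift = bits; shift < 8; shift += bits)
        out |= q >> shift;                         /* replicate down */
    return out;
}
```

With fewer distinct channel values, nearby pixels collapse to identical bytes, which is exactly what gives zx0 longer matches.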

Here's the old vs the new! (click for full size)

Are gifs now supported in the current build of the converter, or would I have to build it myself?
claculator wrote:
Are gifs now supported in the current build of the converter, or would I have to build it myself?

The code is getting more to a release point so I've built a preview you can try out. This is still alpha software so it may not be compatible with future builds.
https://1drv.ms/u/c/b27ced2546bad95f/IQAzyUdCBbO0QatzF03qrQm1ARx0o4xGCCGeFkL3az8Pfm0?e=caFhNx

One problem I'm trying to solve is the file limit on the TI-84 Plus CE. Previously I was able to fit about 10,000 files on the calculator before it used up all the RAM just keeping track of the files in Archive. This is only really a problem when sending long gifs like Bad Apple, which is over 2,100 files. Each file is only 800 bytes, so making a file for each frame is super inefficient. I'd like to try putting multiple frames in one appvar.

Convimg does support putting multiple frames in one appvar, but the problem is that I don't know how well a frame will compress before I create the Convimg configuration file. If I assume worst-case compression (19kb per frame), one appvar could only store 3 frames. That would already reduce Bad Apple's appvar count to 700, which is a huge improvement, but it could go much further: at 800 bytes a frame, I could fit ~80 frames per appvar, reducing the total to just ~30 appvars!
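The packing arithmetic is simple enough to sketch. The 64,000-byte capacity below is my assumption (the real usable appvar size on the CE is a bit under 64 KB), and the function names are made up for illustration:

```c
/* Hypothetical usable appvar capacity in bytes; deliberately a bit
   conservative relative to the CE's real limit. */
#define APPVAR_CAPACITY 64000

/* Frames that fit in one appvar, given an assumed worst-case
   compressed size per frame. */
static int frames_per_appvar(int frame_bytes)
{
    return APPVAR_CAPACITY / frame_bytes;
}

/* Total appvars a GIF needs (ceiling division). */
static int appvars_needed(int total_frames, int frame_bytes)
{
    int per = frames_per_appvar(frame_bytes);
    return (total_frames + per - 1) / per;
}
```

Under these assumptions, 19kb worst-case frames give 3 frames per appvar (700 appvars for Bad Apple's 2,100 frames), while 800-byte frames give 80 per appvar and under 30 appvars total.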

I may start with assuming worst-case compression just to get the 3x improvement. However, if there's a better way to estimate the size of each frame, I'd be interested to hear it!
Have convimg output to a flat binary for the entire output, and then use the 8xv-split feature in convbin.
Idea: Use lossy compression for gifs.
ti_kid wrote:
Idea: Use lossy compression for gifs.

Color quantization is my primary lossy method. I think I'm using the basic median cut method. Edit: I'm now using the much superior k-means clustering method.
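For anyone unfamiliar with k-means palette fitting, here's a tiny sketch of the idea: repeatedly assign every pixel to its nearest palette entry, then move each entry to the average of its assigned pixels. This is a toy version with naive even-spaced seeding; the converter's actual implementation is in C# and almost certainly seeds more carefully (e.g. k-means++).

```c
#include <string.h>

#define KMEANS_ITERS 10
#define MAX_K 256

typedef struct { double r, g, b; } Color;

/* Squared Euclidean distance in RGB space. */
static double dist2(Color a, Color b)
{
    double dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
    return dr * dr + dg * dg + db * db;
}

/* Index of the centroid nearest to p. */
static int nearest(Color p, const Color *cent, int k)
{
    int best = 0;
    double bd = dist2(p, cent[0]);
    for (int i = 1; i < k; i++) {
        double d = dist2(p, cent[i]);
        if (d < bd) { bd = d; best = i; }
    }
    return best;
}

/* Toy k-means palette fit: n input pixels in, k centroids out. */
static void kmeans_palette(const Color *px, int n, Color *cent, int k)
{
    for (int i = 0; i < k; i++)
        cent[i] = px[(long)i * n / k];   /* naive even-spaced seeding */
    for (int it = 0; it < KMEANS_ITERS; it++) {
        Color sum[MAX_K];
        int cnt[MAX_K];
        memset(sum, 0, sizeof sum);
        memset(cnt, 0, sizeof cnt);
        for (int j = 0; j < n; j++) {    /* assignment step */
            int c = nearest(px[j], cent, k);
            sum[c].r += px[j].r;
            sum[c].g += px[j].g;
            sum[c].b += px[j].b;
            cnt[c]++;
        }
        for (int i = 0; i < k; i++)      /* update step */
            if (cnt[i]) {
                cent[i].r = sum[i].r / cnt[i];
                cent[i].g = sum[i].g / cnt[i];
                cent[i].b = sum[i].b / cnt[i];
            }
    }
}
```

Compared to median cut, the centroids end up minimizing actual squared color error rather than just splitting the color box, which is why it tends to look noticeably better at the same palette size.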

This site explains some methods of reducing file size: https://fastmakergif.com/blog/gif-compression-techniques

I'm already doing most of the tricks: color quantization/substitution, overlay frames, and temporal compression. I can't do spatial compression since my custom function only allows one resolution, but most gifs wouldn't benefit much from it anyway.
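As a rough illustration of the temporal/substitution idea: if a value in the current frame is close enough to the previous frame's value, reuse the old value exactly, so consecutive frames share long identical byte runs that a zx0-style compressor turns into cheap matches. This per-byte threshold is purely my sketch; a real converter would likely use a perceptual comparison in RGB space instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Temporal substitution sketch: any byte within `tol` of the
   previous frame's byte is snapped to the previous value, creating
   longer identical runs between consecutive frames. */
static void temporal_substitute(const uint8_t *prev, uint8_t *cur,
                                size_t n, int tol)
{
    for (size_t i = 0; i < n; i++) {
        int d = (int)cur[i] - (int)prev[i];
        if (d < 0)
            d = -d;
        if (d <= tol)
            cur[i] = prev[i];
    }
}
```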

The gif format uses LZW, which is very basic compression. I've replaced that with the more efficient zx0 compression.
  