Ironically, if the compiler supported C99, it would be even easier for it to generate correct and efficient code. It's really hard to put const on structs in C89, because anything in a braced initializer must be a compile-time constant, and the first dotted assignment implicitly initializes the other elements. I took a lot of care putting const where I could, though, and used nested scopes to help the compiler out as much as possible.
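
A rough sketch of one way to get a const struct out of runtime values in C89, as described above. The `Vec2` type and the `makeVec2` and `midpointSum` names are hypothetical and not taken from the project; the idea is only that an automatic struct may be initialized from a single expression of the same struct type, so a helper function can stand in for a braced initializer.

```c
/* Hypothetical 2D point type, for illustration only. */
typedef struct {
    int x;
    int y;
} Vec2;

/* Builds a Vec2 from runtime values. C89 lets an automatic struct be
   initialized from one expression of the same struct type, even though
   a braced initializer list may only hold constant expressions. */
static Vec2 makeVec2(int x, int y)
{
    Vec2 v;
    v.x = x;
    v.y = y;
    return v;
}

/* Made-up task: sum of the coordinates of the midpoint of a and b. */
int midpointSum(int ax, int ay, int bx, int by)
{
    int result;
    {
        /* Nested scope: mid is const and only lives where it is needed. */
        const Vec2 mid = makeVec2((ax + bx) / 2, (ay + by) / 2);
        result = mid.x + mid.y;
    }
    return result;
}
```

Whether this actually helps a given compiler's code generation is not something the sketch can show; it only demonstrates that the pattern is legal C89.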

Once I finish sorting out the bugs with the rendering code, I'm definitely planning to factor my data structures out into a separate library which could be incorporated into your CE libraries project.

I've actually already switched everything to *either* int or float (depending on context), with the exception of my hashing routines, which need 32 bits, but aren't in use now anyway, so shouldn't be having an effect on the performance.

The other thing it could be besides overflow might be truncation from float to int where a fractional result is being discarded to 0, but I've gone over the code pretty carefully. I also should have seen that on the PC version, so I'm skeptical (and really, I don't know what could be overflowing - 24 bits should be more than enough for products of screen coordinates).

To really boost the performance, I'm definitely going to need to switch from float to some kind of 16.8 representation, but I'm not yet sure what that's going to look like, and I always prefer to optimize for correctness before speed Wink
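
For anyone wondering what a 16.8 representation might look like, here is a minimal sketch under some assumptions: the `fix16_8` type and the `fix*` helpers are hypothetical names, and the code uses C99's `<stdint.h>` and a 64-bit intermediate so it behaves the same on a PC. On the eZ80 the value would fit a 24-bit int, but the intermediate product would need a wider type.

```c
#include <stdint.h>

/* Hypothetical 16.8 fixed-point type: 16 integer bits, 8 fractional bits.
   Stored in 32 bits here for portability. */
typedef int32_t fix16_8;

#define FIX_FRAC_BITS 8
#define FIX_ONE (1 << FIX_FRAC_BITS)

/* Converts an integer to fixed point by scaling by 2^8. */
fix16_8 fixFromInt(int n) { return (fix16_8)n * FIX_ONE; }

/* Drops the fractional part, truncating toward zero like a float-to-int cast. */
int fixToInt(fix16_8 a) { return (int)(a / FIX_ONE); }

/* Multiplies two fixed-point values. The raw product carries 16
   fractional bits, so it is widened first and then scaled back down. */
fix16_8 fixMul(fix16_8 a, fix16_8 b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (fix16_8)(wide / FIX_ONE);
}

/* Divides a by b. The numerator is pre-scaled so the quotient keeps
   8 fractional bits. The caller must ensure b is not zero. */
fix16_8 fixDiv(fix16_8 a, fix16_8 b)
{
    return (fix16_8)(((int64_t)a * FIX_ONE) / b);
}
```

Division by `FIX_ONE` is used instead of a right shift because shifting a negative value right is implementation-defined in C.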

On _JError vs exit(int): I personally prefer to use informative error messages where possible, but it would be really nice for shells, especially those supporting C programs (though really we shouldn't treat assembly programs differently), to add a custom error handler that generates program status codes from OS errors.
I updated the latest SDK to have a couple of C debugging features that you can use with CEmu. Here's the list of them:

https://www.cemetech.net/forum/viewtopic.php?p=245150#245150

Just be sure to #include <debug.h> at the top of your program. Hope this helps! Smile
Alright, after doing a fair amount of fiddling today, I've discovered the first problem:

One of my sort functions has been inverted!
elfprince13 wrote:
Alright, after doing a fair amount of fiddling today, I've discovered the first problem:

One of my sort functions has been inverted!

Whelp, that is very odd. Hopefully it is nothing too major. Anyway, I added the radio button to switch from the console to stderr output. The console was made a lot faster too, so it shouldn't hang anymore regardless, but you never know. Good luck! Smile
So, I've narrowed it down a bit more. Something really bizarre is going on, and it may be a compiler bug, because I'm seriously at a loss to explain how it's happening.


Code:
int pointerDiff(const size_t *p1, const size_t *p2){
   const ptrdiff_t delta = p1 - p2;
   dPrintf(("pd delta: %p - %p = " PDF "\n", p1, p2, delta));
   return delta ? (delta < 0 ? -1 : 1) : 0;
}

/* Elsewhere, in another function */
   dPrintf(("%p - %p = " PDF " ?= (-1 * " PDF ")\n",(void*)&data[2],(void*)&data[3],(&data[2] - &data[3]),(&data[3] - &data[2])));
   dPrintf(("\t ?= %d ?= (-1 * %d)\n",pointerDiff(&data[2], &data[3]),pointerDiff(&data[3], &data[2])));


On the PC, this outputs:

Code:
0x1002003a0 - 0x1002003a8 = -1 ?= (-1 * 1)
pd delta: 0x1002003a0 - 0x1002003a8 = -1
pd delta: 0x1002003a8 - 0x1002003a0 = 1
    ?= -1 ?= (-1 * 1)


In CEmu, this outputs:

Code:
d0323d - d03240 = -1 ?= (-1 * 1)
pd delta: d03240 - d0323d = 1
pd delta: d0323d - d03240 = -1
    ?= 1 ?= (-1 * 1)


Notice specifically the last line of each log.

I'm still parsing through the generated code here:

Code:
;   36
;   37   int pointerDiff(const size_t *p1, const size_t *p2){
_pointerDiff:
   LD   HL,-9
   CALL   __frameset
;   38      const ptrdiff_t delta = p1 - p2;
   LD   BC,(IX+9)
   LD   HL,(IX+6)
   OR   A,A
   SBC   HL,BC
   LD   BC,3
   CALL   __idivs
   LD   (IX+-3),HL
;   39      dPrintf(("pd delta: %p - %p = " PDF "\n", p1, p2, delta));
   LD   BC,HL
   PUSH   BC
   LD   BC,(IX+9)
   PUSH   BC
   LD   BC,(IX+6)
   PUSH   BC
   LD   BC,L__2
   PUSH   BC
   CALL   _dPrintfImpl
   POP   BC
   POP   BC
   POP   BC
   POP   BC
;   40      return delta ? (delta < 0 ? -1 : 1) : 0;
   LD   HL,(IX+-3)
   LD   BC,0
   OR   A,A
   SBC   HL,BC
   JR   Z,L_8
   LD   HL,(IX+-3)
   OR   A,A
   SBC   HL,BC
   JR   NC,L_4
   LD   BC,16777215
   LD   (IX+-6),BC
   JR   L_5
L_4:
   LD   BC,1
   LD   (IX+-6),BC
L_5:
   LD   BC,(IX+-6)
   LD   (IX+-9),BC
   JR   L_9
L_8:
   LD   (IX+-9),BC
L_9:
   LD   HL,(IX+-9)
;   41   }
   LD   SP,IX
   POP   IX
   RET


[edit]

I changed the implementation slightly,

Code:
int pointerDiff(const size_t *p1, const size_t *p2){
   const ptrdiff_t delta = p1 - p2;
   dPrintf(("pd delta: %p - %p = " PDF "\n", p1, p2, delta));
   if(!delta){
      return 0;
   } else if (delta > 0){
      return 1;
   } else {
      return -1;
   }
}

and now get this ASM version instead, but with the same bug.

Code:
;   36
;   37   int pointerDiff(const size_t *p1, const size_t *p2){
_pointerDiff:
   LD   HL,-3
   CALL   __frameset
;   38      const ptrdiff_t delta = p1 - p2;
   LD   BC,(IX+9)
   LD   HL,(IX+6)
   OR   A,A
   SBC   HL,BC
   LD   BC,3
   CALL   __idivs
   LD   (IX+-3),HL
;   39      dPrintf(("pd delta: %p - %p = " PDF "\n", p1, p2, delta));
   LD   BC,HL
   PUSH   BC
   LD   BC,(IX+9)
   PUSH   BC
   LD   BC,(IX+6)
   PUSH   BC
   LD   BC,L__2
   PUSH   BC
   CALL   _dPrintfImpl
   POP   BC
   POP   BC
   POP   BC
   POP   BC
;   40      if(!delta){
   LD   HL,(IX+-3)
   LD   BC,0
   OR   A,A
   SBC   HL,BC
   JR   NZ,L_5
   OR   A,A
;   41         return 0;
   SBC   HL,HL
   JR   L_6
;   42      } else if (delta > 0){
L_5:
   LD   BC,(IX+-3)
   OR   A,A
   SBC   HL,HL
   OR   A,A
   SBC   HL,BC
   JR   NC,L_3
;   43         return 1;
   LD   HL,1
   JR   L_6
;   44      } else {
L_3:
;   45         return -1;
   LD   HL,16777215
;   46      }
;   47   }
L_6:
   LD   SP,IX
   POP   IX
   RET


Anybody see what could be the problem here? I tried following it through in the emulator, but hit a snag with CEmu deciding to continue randomly when I ask it to step.

[edit 2]
I can confirm that the error is happening entirely within the pointerDiff function, and not through any interaction with return types or something else weird. I used:

Code:
int pointerDiff(const size_t *p1, const size_t *p2){
   const ptrdiff_t delta = p1 - p2;
   const int ret = delta ? (delta < 0 ? -1 : 1) : 0;
   dPrintf(("pd delta: %p - %p = " PDF " ?sign= %d\n", p1, p2, delta, ret));
   return ret;
}

to get the return value internally, and confirmed that the bug happens here.
Well... ptrdiff_t is typedef'd as an unsigned int in stddef.h, for some reason.
So obviously, the if (<0) part will never be reached.

Use int, for now, I guess, since it's 24-bit anyway (but probably you'd need much less).

But anyway, the toolchain should be updated to fix that typedef to a signed int.
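
A small sketch of the failure mode described above, assuming nothing about the real stddef.h beyond what this post says. `brokenSign` and `pointerCompare` are made-up names: the first mimics a difference stored in an unsigned type, and the second is one way to write the three-way comparison so that it does not depend on the signedness of the difference type at all.

```c
#include <stddef.h>

/* Mimics the broken header: the difference lives in an unsigned type,
   so "delta < 0" can never be true and -1 is never returned. */
int brokenSign(int a, int b)
{
    const unsigned int delta = (unsigned int)a - (unsigned int)b; /* wraps */
    return delta ? (delta < 0 ? -1 : 1) : 0;
}

/* Three-way comparison that compares the pointers directly instead of
   testing the sign of (p1 - p2). As with pointer subtraction, both
   pointers are assumed to point into the same array. */
int pointerCompare(const size_t *p1, const size_t *p2)
{
    return (p1 > p2) - (p1 < p2);
}
```

With this, `brokenSign(2, 3)` yields 1 rather than -1, which matches the inverted result seen in the CEmu log, while `pointerCompare` gives -1, 0 or 1 regardless of how ptrdiff_t is defined.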
Oh, hmm, let me fix that header and see what happens.

[edit]

lololololol. I'd hereby like to point out that by writing standards compliant C, the only bug in my 2200 line program when recompiled from a working version on x86_64 OS X to eZ80 TI-OS was the result of a bug in a standard library header.



[edit 2]



Next step: switch to 8bpp mode for faster rendering. After that, we'll see about optimizing the math, which I'm not particularly looking forward to.
Congratulations, and nice work! The speed indeed leaves rather much to be desired, but the fact that it works at all is a great step forward. So in the end, how many compiler bugs did you end up finding? For the math, it sounds like you need a nice fast fixed-point math library, which is something I'm going to need for Graph3DE, too. Smile
Great work elfprince! Good to see it is working; even with such high-level code. (Linked lists, stacks and fast sorting? That's nice to finally be able to do on a calculator easily). Making a statically linkable fixed point library would be very handy, and could probably even be written in C and still be pretty fast. I think some of the main slowdown issues are 1), no 8bpp mode Razz and 2), Having to calculate each pixel position before plotting. Other than that; cool beans Smile

In addition, it would be nice to add some algos to the dynamic/static libraries so other people can easily implement these things. Thanks for your advice and help debugging a lot of the toolchain!
KermMartian wrote:
Congratulations, and nice work! The speed indeed leaves rather much to be desired, but the fact that it works at all is a great step forward. So in the end, how many compiler bugs did you end up finding? For the math, it sounds like you need a nice fast fixed-point math library, which is something I'm going to need for Graph3DE, too. Smile


No bugs in the generated code (besides one rather bizarre incident that we've been unable to replicate, involving malloc apparently reporting success allocating variables beyond the end of the allotted heap), but there are a number of buffer overflow issues in the compiler itself, an inexplicable error with calling a function that had a const return type, and some undiagnosed "internal errors" with macro expansion that resulted in rewrites. Plus the actual error in the standard library described in the previous post. However, you can see the trail of destruction I left for the maintainers of CEmu and the SDK at these links?

As far as fixed point arithmetic goes, the Zilog compiler supports a "fract" keyword on int types to implement fixed point arithmetic, but this is highly nonstandard, and apparently has some weird semantics regarding the position of the binary point. I'll probably investigate it further before considering rolling my own, so stand by for updates on that front. I'll definitely look into graphics improvements first though.


MateoConLechuga wrote:
Great work elfprince! Good to see it is working; even with such high-level code. (Linked lists, stacks and fast sorting?

I didn't do anything special with stacks, but I did find an open-source red-black tree implementation which I spruced up a bit to use as a backing data structure for both ordered maps (i.e. dictionaries) and sets, which is super handy.

Quote:
Making a statically linkable fixed point library would be very handy, and could probably even be written in C and still be pretty fast.

See my remarks to Kerm on that score Smile

Quote:
I think some of the main slowdown issues are 1), no 8bpp mode Razz

Yeah, and we need a good optimized memset to go with 8bpp mode.

Quote:
pixel position before plotting

Say what now? Plotting horizontal chunks of polygon ("scan lines") is only limited by having to do z-fighting tests where they are co-linear (which crops up for lines, though it can probably be optimized a bit more there as well).
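
To make the scan line point concrete, here is a hedged sketch of filling one horizontal span in an 8bpp buffer with a single memset call. The `screen` buffer, the 320x240 dimensions, and the `fillSpan` and `pixelAt` names are assumptions for illustration; this is not the renderer's actual code and it leaves out the z-fighting tests entirely.

```c
#include <string.h>

#define SCREEN_W 320
#define SCREEN_H 240

/* Hypothetical off-screen 8bpp buffer: one palette index per pixel. */
static unsigned char screen[SCREEN_W * SCREEN_H];

/* Fills pixels x0..x1 (inclusive) on row y with a palette index,
   clipping to the buffer. Returns the number of pixels written. */
int fillSpan(int y, int x0, int x1, unsigned char color)
{
    if (y < 0 || y >= SCREEN_H) return 0;
    if (x0 > x1) { int t = x0; x0 = x1; x1 = t; }
    if (x1 < 0 || x0 >= SCREEN_W) return 0;
    if (x0 < 0) x0 = 0;
    if (x1 >= SCREEN_W) x1 = SCREEN_W - 1;
    memset(&screen[y * SCREEN_W + x0], color, (size_t)(x1 - x0 + 1));
    return x1 - x0 + 1;
}

/* Reads back one pixel; the caller must pass in-range coordinates. */
unsigned char pixelAt(int x, int y) { return screen[y * SCREEN_W + x]; }
```

In 8bpp mode each pixel is a single byte, so a whole span costs one memset, which is why a fast memset matters so much here.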

Quote:
In addition, it would be nice to add some algos to the dynamic/static libraries so other people can easily implement these things.

Yeah, let's have a discussion about organization and maintenance of a standard data-structures library. I need to organize my utility code and fork it off into its own repo first, but I'd definitely like to do that.


Quote:
Thanks for your advice and help debugging a lot of the toolchain!

Smile
Alright, here we go. I
  1. Improved the z-fighting heuristics to decrease the number of 1-pixel fills required.
  2. Switched to 8bpp palette rendering, using LDraw colors. This system can definitely be made more efficient than it is now since the active/complement color calculations are a bit hairy at present.

Alright, so, I implemented real rotation matrices instead of the hacky projection I was using, so that I could start implementing interactive rotation, and ran into some numerical stability issues. I cleaned them up, except for one:



n.b.:
Code:
-Wfloat-equal
is your best friend.
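
What that warning usually pushes you toward is a tolerance-based comparison instead of `==` on floats. A minimal sketch, assuming a relative tolerance is appropriate for the values involved; `nearlyEqual` and `absf` are hypothetical helper names, not functions from the project.

```c
/* Absolute value without pulling in the math library. */
static float absf(float x) { return x < 0.0f ? -x : x; }

/* Treats a and b as equal when they differ by no more than eps scaled
   by the larger magnitude, with a floor of 1.0 on the scale so that
   values near zero still compare sensibly. */
int nearlyEqual(float a, float b, float eps)
{
    float scale = absf(a) > absf(b) ? absf(a) : absf(b);
    if (scale < 1.0f) scale = 1.0f;
    return absf(a - b) <= eps * scale;
}
```

The right value for `eps` depends on how much error the preceding arithmetic accumulates, so it has to be chosen per call site.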

[edit]

The interactive rotation is pretty nice though:
KermMartian wrote:
For the math, it sounds like you need a nice fast fixed-point math library, which is something I'm going to need for Graph3DE, too. Smile

In case you guys weren't aware, ZDS already supports this with its extension of standard C with the 'fract' data type, which "supports fixed-point fractional numbers". You can find out more about it in the documentation; there's quite a bit of information Wink Anywho, how goes continued progress? Smile
Yes, fract has been discussed in this topic Wink


At this point, I've started a rewrite of the active edge list algorithm, because there were a number of numerical instabilities related to sub-pixel rendering accuracy. The new version will only support convex polygons instead of arbitrary polygons, but should be much more numerically stable.
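
Since the rewrite only accepts convex polygons, a cheap guard might be useful while testing. This is a sketch only, assuming integer screen-space vertices given in order and a simple (non-self-intersecting) polygon; `isConvex` is a hypothetical name and is not part of the renderer.

```c
/* Returns 1 if the polygon with n vertices (xs[i], ys[i]), listed in
   order, turns the same way at every corner, and 0 otherwise.
   Collinear corners are tolerated; a fully degenerate polygon is not. */
int isConvex(const int *xs, const int *ys, int n)
{
    int i;
    int sign = 0;
    if (n < 3) return 0;
    for (i = 0; i < n; i++) {
        int j = (i + 1) % n;
        int k = (i + 2) % n;
        /* z component of the cross product of edge i->j with edge j->k */
        long cross = (long)(xs[j] - xs[i]) * (ys[k] - ys[j])
                   - (long)(ys[j] - ys[i]) * (xs[k] - xs[j]);
        if (cross != 0) {
            int s = cross > 0 ? 1 : -1;
            if (sign == 0) sign = s;
            else if (s != sign) return 0;
        }
    }
    return sign != 0;
}
```

Because it uses only integer arithmetic, the check itself has none of the numerical stability problems mentioned above.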
KermMartian wrote:
For the math, it sounds like you need a nice fast fixed-point math library, which is something I'm going to need for Graph3DE, too. Smile


I have some assembly fixed-point routines... including matrix multiplication specifically for 3D rotation...

* HactarCE hides
How goes progress on this? Smile You may be interested in the new memset_fast function defined in the tice.h header, which performs quite substantially faster than the memset present in the bootcode.
  