Superclipping. Needs more research.

I was recently following the progress of OpenLara on the 3DO, and several people trying the alpha demo discovered sudden, rare hiccups/locks in the rendering. The original developer realized this was coming from the use of the superclipping feature, and that reminded me of my recent discovery that feedback CEL textures with the LRFORM flag will lock if superclipping is enabled on them while they cross certain edges of the screen. However, the developer of the port says the textures are regular 4-bit palettized ones and not in the 16-bit framebuffer format. And the hiccups are rare there; if the textures were LRFORM, almost every frame would be locked to 1 FPS. Something is going wrong with the rendering of a few CELs. As far as I know, invalid CELs or other bugs can lock the rendering forever, and there is a timeout of exactly 1 second (it can also be changed through a function, but we don't need to): if something goes truly wrong and a DrawCel doesn't return within that time, the rendering is terminated.

I might just need to run some tests later with 4bpp palettized textures, a bunch of them at various shapes on the screen, I don't know. It's strange that it happens in OpenLara too: in my experience it's a bug with framebuffer-format textures, and I have never personally seen it with linear types of textures. Then again, I haven't built a more complex 3D engine that could trigger it; most of my experiments are pretty basic.

Still, I did some other interesting experiments, like hacking my old feedback cube to move around and demonstrate the bug. Or zooming two pictures: neither of them points to the framebuffer, but one of them is stored in the zig-zag format of the framebuffer (I assume it's loaded in regular RAM, although I haven't checked if LoadCel does something weird) and it still triggers the bug. I used a tool called BMPto3DOCEL, which its author made to read a BMP and write out a CEL file, but it doesn't have enough options and always saves a 16-bit CEL in framebuffer format. I hacked that tool to produce a linear format, as I thought the framebuffer format shouldn't be used (or at least doesn't make sense) for regular textures, only when you point to a backbuffer in VRAM to do feedback effects. But I also noticed something interesting about performance.


The orange troll picture is stored in linear 16-bit format. The white cat picture is in the 16-bit video RAM structure format (LRFORM); it does not point at a backbuffer, it is just the data loaded from a CEL file. The original BMPto3DOCEL stores images in that format, instead of the linear format I prefer and get from my hacked BMPto3DOCELfixed (I'll post these tools somewhere or create my own, but for now this hacked version is available in our Discord, in the Graphics subsection, as a pinned message).
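To make the difference between the two layouts concrete, here is a minimal Python sketch. It rests on an assumption that is not taken from BMPto3DOCEL or verified here: that the video RAM structure interleaves each pair of scanlines, so pixel (x, y) and pixel (x, y+1) sit next to each other, while the linear format stores whole rows one after another. The function names are made up for illustration.

```python
def linear_to_interleaved(pixels, width, height):
    """Reorder row-major 16-bit pixels into an ASSUMED line-pair interleaved layout."""
    if height % 2 != 0:
        raise ValueError("height must be even to pair up scanlines")
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    out = []
    for y in range(0, height, 2):
        for x in range(width):
            out.append(pixels[y * width + x])        # pixel from the upper line
            out.append(pixels[(y + 1) * width + x])  # pixel from the lower line
    return out


def interleaved_to_linear(data, width, height):
    """Inverse of linear_to_interleaved, under the same assumed layout."""
    if height % 2 != 0:
        raise ValueError("height must be even to pair up scanlines")
    if len(data) != width * height:
        raise ValueError("pixel count does not match width * height")
    out = [0] * (width * height)
    i = 0
    for y in range(0, height, 2):
        for x in range(width):
            out[y * width + x] = data[i]
            out[(y + 1) * width + x] = data[i + 1]
            i += 2
    return out


# A tiny 4x2 image: first row 0..3, second row 10..13.
linear = [0, 1, 2, 3, 10, 11, 12, 13]
print(linear_to_interleaved(linear, 4, 2))  # [0, 10, 1, 11, 2, 12, 3, 13]
```

Both layouts hold exactly the same number of pixels; only the order in memory changes, which is why a speed difference between them is not obvious.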

I demonstrate how the white cat texture with superclipping enabled will lock when it reaches the edge of the screen. That varies too: sometimes, depending on how it rotates or which of the 4 corners it reaches, it doesn't lock. This is totally related to the superclipping direction: depending on the bitmap vectors and the edge of the screen, the hardware decides whether to terminate a bitmap line or not.
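The general idea of terminating a bitmap line early can be modelled in a few lines of Python. This is only a conceptual sketch and not how the CEL engine actually works: the rule "stop once the row is off-screen and still heading away from the screen" is my simplification, texels are treated as points, and every name here is hypothetical. It does not attempt to explain the lock itself.

```python
SCREEN_W, SCREEN_H = 320, 240


def moving_away(pos, step, limit):
    """True if pos is outside [0, limit) and the step can never bring it back."""
    return (pos < 0 and step <= 0) or (pos >= limit and step >= 0)


def texels_processed(x, y, dx, dy, count, early_out):
    """Count how many texels of one bitmap row are visited.

    (x, y) is the screen position of the first texel, (dx, dy) is the
    screen-space step per texel, and count is the row length in texels.
    """
    processed = 0
    for _ in range(count):
        if early_out and (moving_away(x, dx, SCREEN_W) or moving_away(y, dy, SCREEN_H)):
            break  # the rest of the row can never become visible
        processed += 1
        x += dx
        y += dy
    return processed


# A 256-texel row zoomed 8x, starting near the right edge and moving right.
print(texels_processed(300, 100, 8, 0, 256, early_out=False))  # 256
print(texels_processed(300, 100, 8, 0, 256, early_out=True))   # 3
```

In this model the decision depends on both the direction of the row vector and which screen edge is involved, which fits the observation that the behaviour changes with rotation and with the corner the texture reaches.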

Later on I also show what it means for performance to not have superclipping when your sprites zoom to an enormous size that extends far outside the screen: it goes from 7-12 fps without it to over 100 fps with it. In my tests I disable vsync, so mind the flicker. But here is the other interesting thing. I realized that at the same zoom level (like the reset values at the beginning: center of the screen, no zoom or rotation) the LRFORM zig-zag VRAM structure image happens to be faster (250-270 fps) than the linear 16-bit bitmap (200 fps). They both have exactly the same amount of data, it's just that one of them follows the video RAM structure. That was unexpected!
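Frame rates are easier to compare as time per frame. The small Python sketch below only converts the numbers quoted above; it measures nothing on real hardware, and the helper name is made up.

```python
def frame_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


measurements = [
    ("huge zoom, no superclipping (low)", 7),
    ("huge zoom, no superclipping (high)", 12),
    ("huge zoom, superclipping", 100),
    ("linear 16-bit, no zoom", 200),
    ("LRFORM, no zoom (low)", 250),
    ("LRFORM, no zoom (high)", 270),
]

for label, fps in measurements:
    print(f"{label}: {fps} fps = {frame_ms(fps):.2f} ms per frame")

# Time saved per frame by the LRFORM image compared to the linear one.
saved_low = frame_ms(200) - frame_ms(250)
saved_high = frame_ms(200) - frame_ms(270)
print(f"LRFORM saves {saved_low:.2f} to {saved_high:.2f} ms per frame")
```

Seen this way, the LRFORM image is roughly 1 ms per frame faster than the linear one, while superclipping on the huge sprites saves more than 70 ms per frame.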

You might say (and maybe that's what the creator of BMPto3DOCEL thought) that because you write to video RAM, reading the texture in the video RAM structure format would be faster. But I can't see why that would be the case. You are going to read the bitmap and then scale/rotate it or interpolate anyway; it's not a direct copy from VRAM structure to VRAM. I also tested with the image rotated or zoomed at the same level, in case a 1x1 sprite is copied differently by the hardware without filling the texels. Same results. I thought a linear bitmap would perform the same. I need to investigate this more. I am not even using the same picture. At some point I wondered whether there are transparent pixels in the cat that get skipped, making it appear faster. But I think it's a block of opaque pixels, and dracul is the same. There are dark pixels but they are not black. Next I will retry with the exact same bitmap, stored in the two different formats.

3DO hardware quirks are mysterious. There is still lots of stuff to learn. And lots of CDs to burn.
