The mysterious CEL flags that disable loading elements
There is a series of CEL flags with the LD prefix that I hadn't properly used before, except for LDPLUT, which seemed to be the only one that worked. Actually, I phrased that wrong. When you use CreateCel to make a new CCB, all four of these flags (LDSIZE, LDPRS, LDPIXC and LDPLUT) are enabled by default. It is when you decide to disable them that things go awry!
I have (ab)used LDPLUT in OptiDoom once (although I may have removed this trick in recent versions, as some refactoring for speed made it impractical to use anymore), in the case where a wall segment sends individual CELs (in a linked list) for each column of a wall. So if a single wall is covering the whole screen, 280 CCBs will be passed (in reality Doom has a bigger array fed with various different elements that is only sent and flushed when it reaches the max). Since a wall uses a palettized texture that is the same for every column, why does every CCB have to be fed with the palette pointer, and why does the hardware have to reload the same palette again and again? Needless to say, this was a hack proving the concept of "premature optimization is the root of all evil", as the CPU bottleneck is so big compared to this small optimization that there wouldn't be any difference. But I was just experimenting with whatever I had in mind at the time. And I learned a bit about these LD flags.
So, these flags are telling the CEL hardware to load things from a CCB or skip it if the flags are absent. And it works like this. If I send a CEL with a pointer to the palette, the palette will be read from memory and change the hardware state to use this palette. There might be certain palette registers in the hardware for all I know, whose state has now changed. But for every individual CEL you send, this palette has to be reloaded from the palette pointer (ccb_PLUTPtr) in the CEL. The idea is that you could reset this flag for all the CCBs except the first one. So the first one will load the palette in the hardware, the rest will not provide the same palette. This saves a lot of unnecessary uploading to the hardware.
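The idea of keeping LDPLUT only on the first CEL of a list can be sketched roughly as below. This is a minimal sketch in plain C, not code from OptiDoom: `DemoCel`, `DEMO_LDPLUT` and `demo_share_palette` are hypothetical names, the struct only mirrors the few fields involved, and the flag bit is a placeholder rather than the real value from the SDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the SDK's CCB: only the fields this sketch touches. */
typedef struct DemoCel {
    uint32_t        flags;
    struct DemoCel *next;
    const uint16_t *plut;   /* plays the role of ccb_PLUTPtr */
} DemoCel;

/* Placeholder bit, NOT the real value: use the one from the SDK headers. */
#define DEMO_LDPLUT (1u << 0)

/*
 * Link cels[0..count-1] into one list that shares a single palette.
 * Only the head keeps the load-palette flag; every other CEL has it
 * cleared so the palette already in the hardware state is reused.
 * Returns how many CELs in the list would still trigger a palette load.
 */
static size_t demo_share_palette(DemoCel *cels, size_t count,
                                 const uint16_t *plut)
{
    size_t loads = 0;
    assert(cels != NULL && count > 0);

    for (size_t i = 0; i < count; ++i) {
        cels[i].plut = plut;
        cels[i].next = (i + 1 < count) ? &cels[i + 1] : NULL;

        if (i == 0)
            cels[i].flags |= DEMO_LDPLUT;   /* head loads the palette */
        else
            cels[i].flags &= ~DEMO_LDPLUT;  /* rest reuse hardware state */

        if (cels[i].flags & DEMO_LDPLUT)
            ++loads;
    }
    return loads;
}
```

With real CCBs the same loop would touch ccb_Flags and ccb_NextPtr instead. The trick only holds as long as nothing else drawn in between changes the palette in the hardware.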
That worked in a test with 1024 sprites of size 16*16 at 8bpp. The 8bpp palettized mode really uses 5 bits for the palette index (32 colors), and the rest of the bits affect other things (they still affect colors, but by adding shade to the existing color; I've not experimented with this one, to be honest). With LDPLUT enabled (forcing a load of each individual palette) I got 35fps. With it disabled I got 38fps. A slight improvement, but good for a simple trick. I even think the bottleneck here is reading the texture: 1024*16*16 = 256k of pixel data, while 1024*32*2 (16-bit palette entries) = 64k of palette data. I could try later with smaller sprites or 4bpp data.
So this works. However, I didn't have the same luck with the rest of the flags. All my sprites would disappear and/or I would get an extremely low frame rate, depending on where I was running it (different emulators, real machine). Yet I guessed and found out what you should do if you disable the other three flags, LDSIZE, LDPRS and LDPIXC. You should truncate the CCB struct! And I did it by hand, as I haven't seen a suitable function in Lib3DO to do this.
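The byte counts above can be checked with a few lines of C. This is only a back-of-the-envelope sketch: the helper names are made up, and it counts just pixel and palette bytes, ignoring CCB reads and framebuffer traffic.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of pixel data read for `sprites` sprites of w*h pixels at `bpp` bits. */
static uint32_t demo_pixel_bytes(uint32_t sprites, uint32_t w, uint32_t h,
                                 uint32_t bpp)
{
    assert(bpp > 0);
    return sprites * w * h * bpp / 8;
}

/* Bytes of palette data read when `loads` CELs each load a palette of
 * `entries` colors, 2 bytes (16 bits) per entry. */
static uint32_t demo_plut_bytes(uint32_t loads, uint32_t entries)
{
    return loads * entries * 2;
}
```

For the test above, `demo_pixel_bytes(1024, 16, 16, 8)` gives 262144 bytes (256k) and `demo_plut_bytes(1024, 32)` gives 65536 bytes (64k). With LDPLUT kept only on the first sprite, the palette side drops to `demo_plut_bytes(1, 32)`, which is 64 bytes, while the pixel side stays the same. At 4bpp the pixel data would halve to 131072 bytes, so the palette reloads would be a bigger share of the total.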
First of all what are these flags doing (when resetting them)?
- LDSIZE: If not set, it will avoid loading the scale vectors HDX, HDY, VDX and VDY. You don't need to load them for several sprites that do not have any scale or rotation. But the values already in the hardware state will be used, so you have to send a single sprite (your 1st in the list) with this flag enabled and those four vectors in their default values for 1x1 scale.
- LDPRS: If not set, it will not load the 3rd vector HDDX, HDDY. Whatever was in the hardware state before will be reused. These are only necessary if you are rendering arbitrary polygons. But if you have a big list of sprites, either scaled/rotated or unscaled, then these should always be set to zero. The first sprite with LDPRS enabled will pass them as zero, and the rest of the sprites on the list can simply disable the flag.
- LDPIXC: This one affects the shading/blending capabilities of the CEL. Why reload it again and again (with the default value 0x1F001F00 for opaque) if you have a thousand sprites where no shading or blending is applied? Or, better said, where the same shading/blending is applied. If you have a thousand sprites that all blend with the background using the exact same PIXC value, you could enable the flag for the 1st one and disable it for the rest, so that PIXC is loaded into the hardware once and all the rest of the sprites use the value already in the hardware, with no need to reupload.
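Putting the three flags together, the setup for a list of plain unscaled, opaque sprites could look roughly like the sketch below. It only models the flag bookkeeping and the values the first sprite has to carry. It does not do the CCB truncation mentioned earlier, so on its own it is not enough for real hardware. `DemoCcb`, the `DEMO_*` bits and `demo_prepare_list` are hypothetical, and the flag bits are placeholders. The 1x1 values assume the usual convention of HDX in 12.20 and VDY in 16.16 fixed point, as far as I know.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bits, NOT the real values: use the ones from the SDK headers. */
#define DEMO_LDSIZE (1u << 0)
#define DEMO_LDPRS  (1u << 1)
#define DEMO_LDPIXC (1u << 2)
#define DEMO_LD_ALL (DEMO_LDSIZE | DEMO_LDPRS | DEMO_LDPIXC)

/* Stand-in for the part of the CCB that these three flags control. */
typedef struct DemoCcb {
    uint32_t flags;
    int32_t  hdx, hdy, vdx, vdy;  /* loaded only when LDSIZE is set */
    int32_t  hddx, hddy;          /* loaded only when LDPRS is set  */
    uint32_t pixc;                /* loaded only when LDPIXC is set */
} DemoCcb;

/*
 * First sprite: all three flags set, carrying 1x1 scale, a zero third
 * vector and the opaque PIXC value, so the hardware state is known.
 * Remaining sprites: the three flags cleared, so that state is reused.
 */
static void demo_prepare_list(DemoCcb *cels, size_t count)
{
    assert(cels != NULL && count > 0);

    cels[0].flags |= DEMO_LD_ALL;
    cels[0].hdx  = 1 << 20;       /* 1.0 in 12.20 (assumed convention) */
    cels[0].hdy  = 0;
    cels[0].vdx  = 0;
    cels[0].vdy  = 1 << 16;       /* 1.0 in 16.16 (assumed convention) */
    cels[0].hddx = 0;
    cels[0].hddy = 0;
    cels[0].pixc = 0x1F001F00u;   /* opaque default, as given in the text */

    for (size_t i = 1; i < count; ++i)
        cels[i].flags &= ~DEMO_LD_ALL;
}
```

The point of setting the values on the first sprite explicitly is that the later sprites inherit whatever the hardware last loaded, so leaving them to chance would make the whole list depend on what was drawn before it.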