ViTi95 wrote on 2022-02-09, 11:28:
observed that the CGA mode was faster than both VGA mode 13h and mode Y. This result blows my mind, even having to do the conversion from the 256 color backbuffer to the CGA format (packed colors and even/odd scanlines), is just faster due to transferring less data over the bus.
On fast machines where the ISA bus is the only bottleneck, you could improve FPS in CGA mode by updating only the parts of the screen that changed between frames.
To do this, keep the previous CGA frame in RAM, and while building the new one from the 256-color framebuffer, compare the two in 2-, 4- or 8-byte chunks. If the change is small, say 8 or fewer differing bits in a 32-bit chunk, leave the old chunk on screen. If the change is big, update it (by adding it to a list of chunks to update).
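A minimal sketch of that comparison in C, assuming 32-bit chunks and a RAM copy of the frame laid out exactly like CGA video memory. The names `find_dirty_chunks`, `flush_dirty_chunks`, `shadow` and `max_ignored_bits` are made up for illustration and do not come from any existing code:

```c
#include <stddef.h>
#include <stdint.h>

/* Count set bits in a 32-bit word (portable, no compiler builtins). */
static unsigned popcount32(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Compare the freshly built CGA frame with `shadow`, the RAM copy of what
 * is currently on screen. A chunk is marked dirty only when more than
 * `max_ignored_bits` bits differ; smaller changes are left alone.
 * Dirty chunks are copied into `shadow` and their indices stored in
 * `dirty`. Returns the number of dirty chunks.
 * A 320x200 4-color frame is 16000 bytes = 4000 chunks, so a 16-bit
 * index is enough.
 */
static size_t find_dirty_chunks(const uint32_t *new_frame, uint32_t *shadow,
                                size_t chunk_count, unsigned max_ignored_bits,
                                uint16_t *dirty)
{
    size_t n = 0;
    for (size_t i = 0; i < chunk_count; i++) {
        uint32_t diff = new_frame[i] ^ shadow[i];
        if (popcount32(diff) > max_ignored_bits) {
            shadow[i] = new_frame[i];
            dirty[n++] = (uint16_t)i;
        }
    }
    return n;
}

/* Write only the dirty chunks to video memory (the slow ISA side). */
static void flush_dirty_chunks(volatile uint32_t *vram, const uint32_t *shadow,
                               const uint16_t *dirty, size_t dirty_count)
{
    for (size_t i = 0; i < dirty_count; i++)
        vram[dirty[i]] = shadow[dirty[i]];
}
```

With `max_ignored_bits` set to 0 this is lossless. Because `shadow` only changes when a chunk is actually sent, small ignored differences add up across frames and get corrected once they pass the threshold.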
This is similar to how early lossy compression algorithms worked, and it should be especially effective for CGA with only 4 colors. That way you could save 15-90% of ISA bandwidth, depending on the scene.
Of course it would require even more work on the CPU and main RAM side, but that shouldn't be a problem for 200+ MHz CPUs. Such an algorithm should also allow for a dynamic level of detail. For example, it could try to be lossless at first, but if the number of updates exceeded the ISA bus limit, it could start to increase the allowed lossy difference per chunk.
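The dynamic level of detail could be driven by a small feedback rule like the one sketched here in C. Here `threshold` is the number of differing bits a chunk may have before it gets resent, and `budget` is how many chunks the bus can take per frame. The name `adapt_threshold`, the cap `MAX_LOSSY_BITS` and the "under half the budget" rule are all assumptions that would need tuning on real hardware:

```c
#include <stddef.h>

/* Assumed upper bound on ignored bits per 32-bit chunk; purely a guess. */
#define MAX_LOSSY_BITS 16u

/*
 * Called once per frame with the number of chunks that had to be updated.
 * Over budget: tolerate one more differing bit per chunk (more lossy).
 * Well under budget (less than half): tolerate one bit less, drifting
 * back towards lossless. In between: keep the threshold, so it does not
 * flip back and forth every frame.
 */
static unsigned adapt_threshold(unsigned threshold, size_t dirty_count,
                                size_t budget)
{
    if (dirty_count > budget) {
        if (threshold < MAX_LOSSY_BITS)
            threshold++;
    } else if (dirty_count < budget / 2 && threshold > 0) {
        threshold--;
    }
    return threshold;
}
```

Starting from a threshold of 0, a quiet scene stays lossless, and a busy one gets gradually lossier until the update count fits the bus.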
It should be possible to get a stable, full 35 fps in CGA on a fast CPU, but with image artifacts. However, such artifacts might even look cool.