As promised: a long, boring article about Nintendo DS texture compression. This article describes the compression format. I may follow it up with another one about how my compressor works specifically.
Hardware texture compression on the NDS allows you to display very large textures for the device. You can store two compressed 1024x512 textures on a device with a total screen size of 256x384. There is no performance penalty, as the hardware can decompress them on the fly. Unfortunately, the tools to generate compressed textures are not available to the homebrew community.
Compressed textures have a slightly complicated format, and the hardware is not very forgiving about how they are stored. The texture is divided into three parts:
- Pixmap: The pixels of the texture, grouped into 4x4 blocks
- Index: A map from each pixel block to a part of the palette
- Palette: The list of colors, used in groups of 2 or 4
The pixmap can be stored in texture slots 0 or 2. The index must be stored in slot 1. This prevents textures larger than 1024x512, since the index would be in the middle of them. Palettes are stored normally.
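To make the 1024x512 limit concrete, here is a small sizing sketch. It assumes each 4x4 block costs 4 bytes of pixmap and 2 bytes of index (both detailed further down) and that a texture slot holds 128 KiB. The function name `compressed_sizes` is made up for illustration.

```python
SLOT_SIZE = 128 * 1024  # assumed size of one texture slot, in bytes


def compressed_sizes(width, height):
    """Return (pixmap_bytes, index_bytes) for a compressed texture."""
    if width % 4 or height % 4:
        raise ValueError("width and height must be multiples of 4")
    blocks = (width // 4) * (height // 4)
    # One 32-bit word of pixels and one 16-bit index entry per block.
    return blocks * 4, blocks * 2


pixmap_bytes, index_bytes = compressed_sizes(1024, 512)
print(pixmap_bytes == SLOT_SIZE)       # the pixmap fills one whole slot
print(index_bytes == SLOT_SIZE // 2)   # its index fills half of slot 1
```

Under these assumptions a 1024x512 pixmap exactly fills slot 0 or slot 2, and the two indexes together exactly fill slot 1.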
Pixmaps are packed as one 32-bit word per 4x4 block of pixels. Each pixel is two bits (valued 0-3). The top-left pixel is the least significant 2 bits, and the bottom-right pixel is the most significant 2 bits. The words are stored one after another, block by block, from the top-left of the image to the bottom-right.
|A bitfield as a 4x4 block|
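The bit layout can be sketched in a few lines of Python. This assumes the pixels between the two corners are ordered row by row (left to right, then top to bottom), which is how GBATEK describes it. The names `texel_value` and `pack_block` are mine, not part of any SDK.

```python
def texel_value(block_word, x, y):
    """Return the 2-bit value (0-3) of the pixel at column x, row y of a block."""
    if not (0 <= x < 4 and 0 <= y < 4):
        raise ValueError("x and y must be in the range 0-3")
    shift = 2 * (4 * y + x)  # the top-left pixel sits in bits 0-1
    return (block_word >> shift) & 0b11


def pack_block(rows):
    """Pack a 4x4 grid of 2-bit values (rows listed top to bottom) into one word."""
    word = 0
    for y, row in enumerate(rows):
        for x, value in enumerate(row):
            word |= (value & 0b11) << (2 * (4 * y + x))
    return word
```

For example, `pack_block([[0, 1, 2, 3], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])` gives `0xE4`: only the top row is non-zero, so only the lowest byte is set.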
The color each pixel value represents depends on the index entry for its block. Each index entry is a 16-bit field containing two values.
|Bits 14-15||Bits 0-13|
|Mode||Palette Index (increments of 4 bytes)|
Bits 0-13 contain an offset into the palette in increments of 4 bytes, i.e. a value of 1 is an offset of 4 bytes, a value of 2 is an offset of 8 bytes, etc. Since each color is a 16-bit word, this gives you a granularity of 2 colors. Bits 14-15 contain the mode, which specifies how the pixel values map to colors in the palette. These modes are as follows:
- Mode 0: Three colors and transparent
- Mode 1: Two colors, a 50/50 blend of those colors, and transparent
- Mode 2: Four colors
- Mode 3: Two colors, and 3/8 and 5/8 blends of those colors
This means a pixel valued 0-3 can have any of these meanings (quoting GBATEK):
|Pixel||Mode 0||Mode 1||Mode 2||Mode 3|
|0||Color 0||Color 0||Color 0||Color 0|
|1||Color 1||Color 1||Color 1||Color 1|
|2||Color 2||(Color 0 + Color 1) / 2||Color 2||(Color 0 * 5 + Color 1 * 3) / 8|
|3||Transparent||Transparent||Color 3||(Color 0 * 3 + Color 1 * 5) / 8|
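Here is a rough sketch of how an index entry and the table above turn into the four colors a block can use. The function names are made up. I am assuming the blends are done per 5-bit channel with truncating integer division; the exact rounding on real hardware is not something this sketch verifies. `None` stands in for a transparent pixel.

```python
def split_index_entry(entry):
    """Split a 16-bit index entry into (mode, palette offset in bytes)."""
    mode = (entry >> 14) & 0b11
    palette_offset = (entry & 0x3FFF) * 4  # bits 0-13, in units of 4 bytes
    return mode, palette_offset


def blend(c0, c1, w0, w1):
    """Blend two 15-bit colors channel by channel (assumed truncating division)."""
    total = w0 + w1
    out = 0
    for shift in (0, 5, 10):
        a = (c0 >> shift) & 0x1F
        b = (c1 >> shift) & 0x1F
        out |= ((a * w0 + b * w1) // total) << shift
    return out


def block_colors(mode, palette, first):
    """Return the colors for pixel values 0-3; None means transparent.

    palette is a list of 16-bit colors, first is the position of Color 0.
    """
    c0, c1 = palette[first], palette[first + 1]
    if mode == 0:
        return [c0, c1, palette[first + 2], None]
    if mode == 1:
        return [c0, c1, blend(c0, c1, 1, 1), None]
    if mode == 2:
        return [c0, c1, palette[first + 2], palette[first + 3]]
    return [c0, c1, blend(c0, c1, 5, 3), blend(c0, c1, 3, 5)]
```

Since each color is 2 bytes, `first` would be `palette_offset // 2` when the palette is held as a list of 16-bit values.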
The index must be stored in texture slot 1. The first 16-bit word affects the first 4x4 block in texture slot 0, and the words that follow cover the rest of slot 0, continuing on for 64 KiB. The word at offset 0x10000 in slot 1 affects the first 4x4 block in texture slot 2, and so on until the end of slot 1. There is no flexibility at all in storing this index.
|Slot 1 Offset||Memory Contents|
|00000 - 0FFFF||Index for texture in slot 0|
|10000 - 1FFFF||Index for texture in slot 2|
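Because a block takes 4 bytes in the pixmap and 2 bytes in the index, the index position is just half the pixmap position, plus 0x10000 for slot 2. A minimal sketch, with offsets measured from the start of each slot and a 128 KiB slot size assumed (`index_offset` is an illustrative name):

```python
SLOT_SIZE = 0x20000         # assumed size of one texture slot (128 KiB)
SLOT2_INDEX_BASE = 0x10000  # second half of slot 1 serves slot 2


def index_offset(slot, pixmap_offset):
    """Return the slot 1 byte offset of the index entry for one block.

    slot is 0 or 2; pixmap_offset is the block's byte offset within that slot.
    """
    if slot not in (0, 2):
        raise ValueError("compressed pixmaps live in slot 0 or slot 2")
    if pixmap_offset % 4 or not 0 <= pixmap_offset < SLOT_SIZE:
        raise ValueError("offset must be 4-byte aligned and inside the slot")
    base = 0 if slot == 0 else SLOT2_INDEX_BASE
    return base + pixmap_offset // 2
```

So the second block in slot 0 (pixmap offset 4) has its index entry at offset 2 in slot 1, and the first block in slot 2 has its entry at 0x10000.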
The palette is stored no differently than a palette for an 8-bit paletted texture. Each color is a 16-bit word packed in the framebuffer format, with 5 bits per channel.
|Bit 15||Bits 10-14||Bits 5-9||Bits 0-4|
|Unused||Blue||Green||Red|
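Packing and unpacking a color in this layout might look like the following. The channel order (red in bits 0-4, green in bits 5-9, blue in bits 10-14) follows GBATEK's description of the DS color format, and bit 15 is left at zero. The helper names are hypothetical.

```python
def pack_rgb15(r, g, b):
    """Pack three 5-bit channels (0-31 each) into one 16-bit color."""
    for value in (r, g, b):
        if not 0 <= value <= 31:
            raise ValueError("channels must be in the range 0-31")
    return r | (g << 5) | (b << 10)


def unpack_rgb15(color):
    """Return the (r, g, b) channels of a 16-bit color, each 0-31."""
    return color & 0x1F, (color >> 5) & 0x1F, (color >> 10) & 0x1F


def from_rgb888(r, g, b):
    """Reduce an 8-bit-per-channel color by dropping the low 3 bits of each channel."""
    return pack_rgb15(r >> 3, g >> 3, b >> 3)
```

Dropping the low bits is the simplest possible reduction; a real compressor would probably want to round or dither instead.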
Because each block can only address two to four consecutive colors, there needs to be a lot of redundancy to get decent results. Luckily, you can use a palette up to 64 KiB in size. Picking this palette is the key to writing a good texture compressor.
Nintendo gives you a lot of room for cleverness in writing a texture compressor. If each block had its own palette, compression would be relatively straightforward. Since VRAM is too small for that, you will have to find blocks to share palettes, overlap palettes, reduce them to 2 colors, and find even more tricks.