In this post, tepples wrote:
"Is this Battletoads?"
Some people are fans of CHR ROM because it allows rapid switching of tiles for smooth animation of the player character. But in Kirby's Adventure, it ends up causing a lot of duplication because all frames of all enemies on screen at once need to fit in the same 2K bank of enemy tiles. So instead, I'm a fan of the Battletoads technique of loading sprite tiles into video memory as they're needed. I've already described how this works on Game Boy Advance, but the NES has far less video memory bandwidth and thus needs a somewhat cleverer technique.
The engine I'm developing for this project has four object slots in video memory: one for the hero and three for enemies. These occupy CHR RAM $1800-$19FF, $1A00-$1BFF, $1C00-$1DFF, and $1E00-$1FFF. Each slot is divided into a pair of 16-tile buffers and has several variables in main RAM:
- Current cel: The cel ID currently being displayed in this slot.
- Next cel: The cel ID whose tile data needs to be loaded into the back buffer of this slot.
- Current buffer: Whether the slot's first or second buffer is its front buffer.
- Information about what data has been loaded into each buffer of each slot.
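A minimal sketch of that layout as a host-side C model, not the engine's actual code. The struct and field names are hypothetical, and the last two fields are only one guess at what "information about what data has been loaded" might hold. The addresses follow from the ranges given above, with 16 bytes per tile.

```c
#include <stdint.h>

enum {
    SLOT_COUNT       = 4,       /* one hero slot plus three enemy slots */
    TILES_PER_BUFFER = 16,
    BYTES_PER_TILE   = 16,      /* 8x8 pixels at 2 bits per pixel */
    SLOT_BASE        = 0x1800,  /* CHR RAM address of the first slot */
    SLOT_SIZE        = 0x0200   /* two 16-tile buffers per slot */
};

/* Hypothetical per-slot bookkeeping kept in main RAM. */
typedef struct {
    uint8_t current_cel;     /* cel shown from the front buffer */
    uint8_t next_cel;        /* cel being loaded into the back buffer */
    uint8_t current_buffer;  /* 0 or 1: which buffer is the front buffer */
    uint8_t loaded_cel[2];   /* assumed: cel whose tiles each buffer holds */
    uint8_t loaded_tiles[2]; /* assumed: how many of its tiles have arrived */
} SpriteSlot;

/* CHR RAM address of the first byte of one buffer of one slot. */
static uint16_t buffer_address(unsigned slot, unsigned buffer)
{
    return (uint16_t)(SLOT_BASE + slot * SLOT_SIZE
                      + buffer * TILES_PER_BUFFER * BYTES_PER_TILE);
}

/* Index of the back buffer: the one that is not being displayed. */
static unsigned back_buffer(const SpriteSlot *s)
{
    return s->current_buffer ^ 1u;
}
```

For example, `buffer_address(3, 1)` gives $1F00, the second half of the last enemy slot.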
On each frame that doesn't have any updates to tiles or map caused by scrolling, the sprite cel loader finds pieces of a cel to load. It prioritizes slots whose request bit is set, switching buffers and clearing the request bit if the cel is ready and loading a piece into the VRAM transfer buffer if not. Up to 8 tiles can be copied in each frame (NTSC without extended blanking). If a particular cel uses all 16 tiles, its update is split across two frames.
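One way to picture a single pass of the loader is the C model below. It is a sketch under assumptions, not the engine's code: `SlotState`, `Transfer`, and `loader_step` are hypothetical names, at most one transfer is queued per frame, and queued tiles are counted as loaded on the assumption that the copy finishes during the next vblank. The caller is expected to skip the call on frames that carry scrolling updates.

```c
#include <stdbool.h>
#include <stdint.h>

enum { SLOT_COUNT = 4, MAX_TILES_PER_FRAME = 8 };

typedef struct {
    bool    requested;      /* request bit: the game wants next cel shown */
    uint8_t current_buffer; /* 0 or 1: which buffer is the front buffer */
    uint8_t cel_tiles;      /* tiles used by the next cel, 1 to 16 */
    uint8_t loaded_tiles;   /* tiles of it already in the back buffer */
} SlotState;

typedef struct {
    int     slot;        /* -1 when nothing was scheduled */
    uint8_t first_tile;  /* index of the first tile in this piece */
    uint8_t tile_count;  /* number of tiles in this piece, at most 8 */
} Transfer;

/* Run once per frame that has no scrolling updates. */
static Transfer loader_step(SlotState slots[SLOT_COUNT])
{
    Transfer t = { -1, 0, 0 };
    for (int i = 0; i < SLOT_COUNT; i++) {
        SlotState *s = &slots[i];
        if (!s->requested)
            continue;
        if (s->loaded_tiles >= s->cel_tiles) {
            /* Cel is ready: flip buffers and clear the request bit. */
            s->current_buffer ^= 1;
            s->requested = false;
        } else if (t.slot < 0) {
            /* Cel is not ready: queue its next piece. */
            uint8_t left = (uint8_t)(s->cel_tiles - s->loaded_tiles);
            t.slot = i;
            t.first_tile = s->loaded_tiles;
            t.tile_count = (uint8_t)(left < MAX_TILES_PER_FRAME
                                     ? left : MAX_TILES_PER_FRAME);
            s->loaded_tiles = (uint8_t)(s->loaded_tiles + t.tile_count);
        }
    }
    return t;
}
```

In this model a 16-tile cel takes two calls to queue and a third to flip the buffers.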
If there is still no scheduled VRAM transfer after the loader has processed all request bits, it loads pieces of the next cel speculatively. Speculative loading sets the next cel to the cel most likely to follow a slot's current cel, such as the next cel of a walk cycle. I count about five mispredicts per second on average, usually when an enemy spawns or when the player takes an unpredicted action, such as jumping, stopping a walk, beginning a punch combo, allowing a punch combo to expire, or taking a hit. A mispredict may delay loading a cel for a frame or two, but otherwise, speculative loading puts a cel into VRAM just when it is needed, allowing the player and enemies to be animated at an acceptable frame rate.
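The prediction step could be as simple as a successor table, sketched here in C. Everything in it is an assumption for illustration: the cel IDs, the contents of `likely_next`, and the function names are made up, and a mispredict is modeled as discarding whatever was loaded into the back buffer so far.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cel IDs: a four-cel walk cycle plus two one-off cels. */
enum { WALK0, WALK1, WALK2, WALK3, JUMP, HURT, CEL_COUNT };

/* Most likely successor of each cel; made-up data for illustration. */
static const uint8_t likely_next[CEL_COUNT] = {
    [WALK0] = WALK1, [WALK1] = WALK2, [WALK2] = WALK3, [WALK3] = WALK0,
    [JUMP]  = JUMP,  [HURT]  = WALK0
};

typedef struct {
    uint8_t current_cel;  /* cel shown from the front buffer */
    uint8_t next_cel;     /* cel being loaded into the back buffer */
    uint8_t loaded_tiles; /* tiles of next_cel already loaded */
    bool    requested;    /* request bit */
} SlotState;

/* Called for a slot when no requested transfer was scheduled this frame. */
static void speculate(SlotState *s)
{
    if (s->requested)
        return;  /* an explicit request always wins over a guess */
    uint8_t guess = likely_next[s->current_cel];
    if (s->next_cel != guess) {
        s->next_cel = guess;
        s->loaded_tiles = 0;  /* start loading the predicted cel */
    }
}

/* Game logic asks for a cel. Returns true when the guess was wrong. */
static bool request_cel(SlotState *s, uint8_t cel)
{
    bool mispredict = (s->next_cel != cel);
    if (mispredict) {
        s->next_cel = cel;
        s->loaded_tiles = 0;  /* speculative tiles are useless now */
    }
    s->requested = true;
    return mispredict;
}
```

When the guess is right, the request finds the tiles already in the back buffer and the loader can flip buffers immediately; when it is wrong, loading restarts from the first tile, which is where the delay of a frame or two comes from.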
The metasprite drawing code uses values $00-$7F normally for constant tiles. It uses $80-$8F for these switchable slots, ORing in the start tile of the current buffer of the slot being drawn.
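The tile number translation can be sketched in C as follows. The function names are hypothetical, and the start tile numbers assume 8x8 sprites fetched from the pattern table at $1000, so that CHR RAM $1800 is tile $80 and each 16-tile buffer advances the tile number by $10.

```c
#include <stdint.h>

/* First tile number of one buffer of one slot.
   Assumes sprites use the pattern table at $1000, so $1800 is tile $80. */
static uint8_t buffer_start_tile(unsigned slot, unsigned buffer)
{
    return (uint8_t)(0x80 + slot * 0x20 + buffer * 0x10);
}

/* Turn a tile value from metasprite data into the tile number for OAM. */
static uint8_t resolve_tile(uint8_t cel_tile, unsigned slot,
                            unsigned front_buffer)
{
    if (cel_tile < 0x80)
        return cel_tile;  /* $00-$7F: constant tile, used as is */
    /* $80-$8F: tile within the slot, OR in the front buffer's start tile */
    return (uint8_t)(cel_tile | buffer_start_tile(slot, front_buffer));
}
```

For instance, tile $83 drawn from slot 2 while its second buffer is in front resolves to $83 OR $D0 = $D3.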
Is the NES really that bad with sprites? I wrote a quick loading routine, counted the cycles, and ended up with:
Code:
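ldy #$00 // start at the first byte of the tile
-;
lda ({tile_address}),y // 5 cycles (6 if the read crosses a page)
sta {vram_port} // 4, running total 9
iny // 2, running total 11
cpy #$10 // 2, running total 13 (16 bytes per tile)
bne - // 3 when taken, running total 16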
It would take only 2048 cycles to upload 8 tiles, and vblank is more than 4096 cycles long.
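The 2048 figure is just the per-byte cost of the loop multiplied out, as this small C sketch shows. The names are hypothetical, and the model deliberately ignores everything outside the inner loop: setting the VRAM address, updating the source pointer between tiles, page-crossing penalties, and the one cycle saved when the final branch is not taken.

```c
enum {
    /* lda (zp),y + sta abs + iny + cpy # + bne taken */
    CYCLES_PER_BYTE = 5 + 4 + 2 + 2 + 3,
    BYTES_PER_TILE  = 16
};

/* Approximate CPU cycles spent in the copy loop for a number of tiles. */
static unsigned copy_cycles(unsigned tiles)
{
    return tiles * BYTES_PER_TILE * CYCLES_PER_BYTE;
}
```

With these numbers, one tile costs 256 cycles and eight tiles cost 2048.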