In the ROM pubby originally posted in this thread (the ROM is no longer available there), there was a bug that did not occur on any emulator, but was visible on PowerPak/Everdrive (see the YouTube video in that thread).
After building a small test case and doing some testing with Visual NES, it turns out that disabling rendering during OAM evaluation has a side effect, which was causing the visual glitches in pubby's video.
Essentially, disabling rendering while OAM evaluation is ongoing (e.g. between cycles 65 and 256) will cause the current OAM address to be (unexpectedly) incremented by 1. In the game's specific case, rendering was disabled around cycle 200, after all visible sprites had been found and the OAM address had rolled around to about $10, incrementing by 4 every other PPU cycle (like it does when the sprite isn't in range). Disabling rendering at this point will cause the next odd PPU cycle to increment the address by an additional 1.
e.g.:
-If disabled on an even cycle, the next cycle will increment the address by 5 (the normal 4, plus 1)
-If disabled on an odd cycle (e.g. the address just got incremented by 4 already), the next (even) cycle runs normally and the next (odd) cycle after that will increment the address by 1.
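The two cases above can be sketched as a tiny model. This is only an illustration of the observations described here, not code from Mesen or any other emulator: the function name `oam_addr_after_disable` is made up, and it assumes the PPU is in the "sprite not in range" part of evaluation (address stepping by 4 on odd cycles) when the $2001 write lands.

```python
def oam_addr_after_disable(addr: int, disable_cycle: int) -> int:
    """Hypothetical helper: OAM address once the glitch has played out.

    addr          -- OAM address at the moment rendering is disabled
    disable_cycle -- PPU cycle of the $2001 write (assumed to be 65..256,
                     with evaluation skipping out-of-range sprites)
    """
    if disable_cycle % 2 == 0:
        # Even cycle: the next (odd) cycle still does its normal +4,
        # plus the extra +1.
        return (addr + 5) & 0xFF
    # Odd cycle: the +4 for this cycle has already happened, so only
    # the extra +1 is applied on the following odd cycle.
    return (addr + 1) & 0xFF


print(hex(oam_addr_after_disable(0x10, 201)))  # disabled on an odd cycle
print(hex(oam_addr_after_disable(0x0C, 200)))  # disabled on an even cycle
```

Under these assumptions the address ends up one past a multiple of 4 in both cases, which is the part that matters for the DMA.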
So, in this case, it causes the address to jump from, for example, $10 to $11. Since $4014 was written to without writing to $2003 first, the sprite DMA was starting at $11, meaning all the data was offset by 1 byte, causing a ton of unexpected sprites to be shown on the screen.
Something I haven't tested is what would happen when rendering is disabled while a sprite that's visible on the next line is being read byte-by-byte: would the address increment by 2 instead of 1 (unlikely?), or would it trigger both the +4 and +1 increments at once, like it does in other known scenarios?
This is a pretty minor thing that would normally go unnoticed (e.g. because $2003 is usually written to right before $4014). Without emulating this behavior, though, the contents of OAM are only shifted by a multiple of 4 (e.g. because the DMA writes start at $10), which means that, while the order of sprites in OAM is wrong for a frame, it doesn't actually cause any visible issues on the screen. That would explain why the glitch did not show up in emulators.
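To make the difference between starting the DMA at $10 and at $11 concrete, here is a rough sketch. It only assumes the well-known behavior that the $4014 DMA writes 256 bytes starting at the current OAM address and wraps around; the helper names `oam_dma` and `sprites` and the test pattern are made up for the example.

```python
def oam_dma(page: list[int], start_addr: int) -> list[int]:
    """Copy a 256-byte page into OAM, starting at start_addr and wrapping."""
    oam = [0] * 256
    for i, value in enumerate(page):
        oam[(start_addr + i) & 0xFF] = value
    return oam


def sprites(oam: list[int]) -> list[tuple[int, ...]]:
    """Split OAM into 64 (Y, tile, attributes, X) entries."""
    return [tuple(oam[i:i + 4]) for i in range(0, 256, 4)]


# Made-up test pattern: sprite n is (n, 0x40 + n, 0x80 + n, 0xC0 + n).
page = []
for n in range(64):
    page += [n, 0x40 + n, 0x80 + n, 0xC0 + n]

aligned = sprites(oam_dma(page, 0x10))  # emulator without the glitch
shifted = sprites(oam_dma(page, 0x11))  # hardware, after the extra +1

print(aligned[4])  # sprite 0 intact, just moved to slot 4
print(shifted[4])  # fields from two different sprites mixed together
```

With a start address of $10 every entry is still a valid sprite, only in a different slot. With $11, each slot's Y byte is actually the previous sprite's X, the tile byte is a Y, and so on, which is how the unexpected sprites end up on screen.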
Unfortunately, I don't really have a test ROM to share for this one at the moment (the one I made was purely to check the behavior in Visual NES and is essentially worthless in an actual emulator). I've attached a screenshot of my observations in Visual NES (testing how the PPU reacts to a $2001 write that disables rendering at various points during the scanline). Implementing this behavior in Mesen causes glitches identical to the ones seen in the YouTube video.