Just a quick thought about NMI and CPU/PPU synchronisation. I know that it is best to run the CPU continuously and only catch the PPU up occasionally, i.e. when a write is made to a PPU register. However, let's say that I am on VBlank scanline 0, PPU CC 0, and I run the CPU for about 1000 CPU cycles before I make a PPU write. Instead of the NMI firing after one CPU opcode (at the first cycle of the first VBlank scanline), surely its execution would be delayed unnecessarily?
Maybe I'm just mad, or being really stupid, or your emulators cater for this anyway.
A lot of interrupts (e.g. PPU vblank NMI, APU DMC IRQ, APU timer IRQ, and most mapper IRQs) can be predicted well in advance. Whenever the CPU core runs ahead of other devices on the bus, it asks the interrupt sources to estimate approximately how long the CPU can run before the next interrupt occurs. For the NTSC PPU's NMI, this is (262 - scanlinesSinceVblank - 1) * 341 / 3 CPU cycles (341 PPU dots per scanline, 3 PPU dots per CPU cycle). Then it takes the minimum of all those estimates and runs the CPU for that long or until the next I/O write, whichever comes first. Once the number of cycles left gets below about 16, run everything cycle-by-cycle.
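To make the estimate concrete, here is a rough sketch in Python of the formula and the "take the minimum" step described above. The function and constant names are made up for illustration and are not from any particular emulator; only the formula and the 16-cycle threshold come from the post.

```python
# Hypothetical names throughout; only the formula and threshold come from the post.
NTSC_SCANLINES_PER_FRAME = 262
PPU_DOTS_PER_SCANLINE = 341
PPU_DOTS_PER_CPU_CYCLE = 3
CYCLE_BY_CYCLE_THRESHOLD = 16  # "about 16" in the post


def cycles_until_nmi(scanlines_since_vblank):
    """Rough count of CPU cycles the CPU can run before the next NMI.

    Integer division rounds down, so the estimate errs on the early side.
    """
    scanlines_left = NTSC_SCANLINES_PER_FRAME - scanlines_since_vblank - 1
    return scanlines_left * PPU_DOTS_PER_SCANLINE // PPU_DOTS_PER_CPU_CYCLE


def run_ahead_budget(estimates):
    """The CPU may only run as far as the nearest predicted interrupt."""
    return min(estimates)


def should_single_step(budget):
    """Close to an interrupt: stop batching and run cycle-by-cycle."""
    return budget < CYCLE_BY_CYCLE_THRESHOLD
```

For example, right at the start of vblank, cycles_until_nmi(0) gives 261 * 341 // 3 = 29667 CPU cycles. If a mapper IRQ source (hypothetically) reported 5000 cycles at the same moment, run_ahead_budget([29667, 5000]) would pick 5000.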
Why run it until it gets close to the interrupt and then go cycle-by-cycle for the last 16 cycles? I thought that interrupts could be predicted exactly?
matt
I suggested 16 cycles because as I understand it, the longest CPU instruction takes 11 cycles (7-cycle instruction + 4-cycle DPCM DMA).
tepples' "rough estimate" suggestion also has the advantage of leaving the cycle-critical stuff to the cycle-by-cycle emulation code, rather than to the prediction code. All the precise stuff is handled as it comes, rather than having to perfectly predict it in advance.
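A minimal sketch of that two-phase idea, in Python: batch whole instructions while the estimate is still comfortably far away, then hand the remainder over to the cycle-by-cycle code. The name run_until_close and the step_instruction callback are hypothetical, and the "stop at the next I/O write" condition is left out to keep it short.

```python
def run_until_close(estimate, step_instruction, threshold=16):
    """Batch CPU instructions until fewer than `threshold` cycles remain.

    estimate: rough CPU cycles until the next predicted interrupt.
    step_instruction: hypothetical callback that executes one instruction
        and returns the cycles it took (at most 11, counting DPCM DMA).
    Returns (cycles_batched, cycles_left_for_cycle_by_cycle_code).
    """
    batched = 0
    while estimate - batched >= threshold:
        batched += step_instruction()
    # Because threshold (16) exceeds the longest instruction (11 cycles),
    # the batch phase cannot run past the predicted interrupt.
    return batched, estimate - batched
```

With the worst-case 11-cycle instruction and an estimate of 100 cycles, run_until_close(100, lambda: 11) batches 88 cycles and leaves 12 for the precise code, so the interrupt is never overshot as long as the estimate is not too late.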
I was thinking that at one point when coding my emu, but I was trying to minimize the catch-up as it causes a lot of cache misses. Also, my emu is still a work in progress, but I thought that the interrupts could be determined exactly in advance.
matt