Emulating MMC3 - NESdev BBS

Emulating MMC3
by Disch on 2007-01-23 (#21551)

This seems to be a commonly reoccuring topic on the boards. I have a conceptually straightforward way of implimenting MMC3 IRQs which has worked out extremely well for me. I figured I would post it here for a reference for others who might be struggling with this pain-in-the-arse mapper.

Granted -- while the concept is somewhat simple -- implimenting it efficiently can (and probably will) get pretty hairy. Also, this method assumes quite a bit about how you have your emulator set up, so it's possible you won't be able to employ this method -- although maybe it might give you some ideas for a different implimentation.

---- Timestamps/Catchup ----

My emu uses a catchup method of emulation. I run the CPU ahead of everything else, then "catch up" other systems (PPU, APU, Mapper) as required. Mapper/APU triggered IRQs are all predicted ahead of time, so there isn't a need to be running Mapper/APU between every instruction.

I base my unit of time on what I call "master cycles" (some bogus term I made up)... hereon 'mc'. I've explained this quite a few times before, but:

- 1 PPU cycle = 5mc
- 1 NTSC CPU/APU cycle = 15mc
- 1 PAL CPU/APU cycle = 16mc

This allows be to keep a constant timebase (1705mc per scanline) and maintains the different PPU:CPU cycle ratios on NTSC/PAL systems.

Each frame I run for X: 446710mc (NTSC) or 531960mc (PAL). Then subtract X from each timestamp, so that timestamps stay clamped between 0-X. This allows me to look at any given timestamp and know exactly where in the frame that system is. For example, if my MMC3 timestamp is 170810... I know it's 100 scanlines into my frame, on ppu cycle 62 of the scanline.

You may be wondering about the skipped PPU cycle on the pre-render scanline on odd NTSC frames. Rather than distort my timing system to accomidate this cycle, I simply add 5mc to each relevent timestamp when I need to skip that cycle.

---- Seperating MMC3 and the PPU ----

Since my MMC3 code has its own timestamp and is run independently of the PPU, I can avoid a lot of headaches with having to be constantly running the PPU to hit the MMC3 IRQ. In fact -- my PPU emulation code has zero connections with anything MMC3. Although since the PPU state does heavily affect how/if the MMC3 IRQ counter will function, I still need to have my MMC3 code "see" some things in the PPU:

- Which pattern table the BG and sprites are using ($2000.3 and $2000.4)
- 8x8 or 8x16 sprites ($2000.5)
- PPU on or off ($2001.3 and $2001.4)

These things can be watched by having your MMC3 "hijack" writes to $2000 and $2001 so it can see when these states change. In addition to this -- the MMC3 also needs to know what sprites are found to be in-range for each scanline, so in the case of 8x16 sprites, it can see which sprites use the $1xxx pattern table and which use $0xxx (if any).

---- PPU triggered A12 rises ----

The heart of the MMC3. The PPU will cause A12 to rise when it fetches CHR from the right pattern table ($1xxx). In "normal" conditions (BG uses $0xxx, all sprites use $1xxx), this will occur 8 times per scanline (once for each sprite). However the BG could also be the culprit (if BG uses $1xxx and all sprites use $0xxx -- ?as seen in Armadillo?), in which case A12 will rise 34 times. These 42 times per scanline are key times which I call "rise points":

BG rise points: 4, 12, 20, ... , 244, 252
Sp rise points: 260, 268, ..., 308, 316
BG rise points: 324, 332

The idea behind this method of emulating the MMC3 is, when running, looking at the current MMC3 timestamp, and the timestamp we're running to -- and seeing which rise points it goes though, and how many A12 rises that triggers. Remeber that due to the constant timebase and timeframe, we can determine exactly where in the frame the MMC3 emu currently is by doing some basic math on the timestamp.

---- The previous rise ----

As previously stated "Normal" conditions will trigger 8 A12 rises. However the MMC3 IRQ counter is only clocked once! It apparently has some sort of way to filter out rises which occur in rapid succession. So that only the first rise is "valid". My emu handles this by keeping the timestamp of the last/previous rise and only clocking the IRQ counter if rises occur X or more cycles from that timestamp. Where X is the biggest gap between rise points (between 332 and 4 on the next scanline --- so that's 13 PPU cycles, or 65mc)

I also have a "quick and dirty" method of doing this during MMC3 emulation (rather than having to calculate and recalculate timestamps over and over) which will only count a rise on a rise point if the previous rise point did not rise. So if rise point 268 rise, and 260 did as well -- I would not clock the MMC3 IRQ counter (I would however, set the previous rise to that 268).

---- Running AND predicting ----

Even though there are a lot of factors in determining when an MMC3 IRQ will trigger, they can still be predicted! I've found that the easiest way to do this is to do a mock-run. That is -- "pretend" to run the MMC3 for the rest of the frame and find out on what timestamp the desired number of clocks have been reached. This way you can use the same code for emulating the MMC3 IRQs as you can for predicting them.

I have a seperate "Calculate" function which both my running and prediction code call. That keeps all the heavy calculation work out of the running/predicting and keeps it concentrated in one function.

---- Mapper 90 ----

Mapper 90's IRQs can operate in a similar fashion. The only difference is they don't mask out rapid A12 rises -- so the IRQ counter actually does get clocked 8 times in a normal scanline (mapper 90, however, has a prescaler which divides the number of clocks by 8 -- so it all works out).

That's about it. I don't know how useful you guys will find this but I was sick/bored and felt like sharing. I'd be happy to answer any questions you have about it or clarify points or whatever.