So far I've only been using the default mapper, but I want to try using MMC3. I read the relevant wiki articles but I can't get anything to work. It will compile, but the resulting rom just produces a blank screen in the emulator. The identical code works just fine with the default mapper. It's not doing anything fancy, just writing some colors to the palette to make sure it works.
So I guess what I'm looking for is a sort of "quick setup template" for getting something (anything) running using MMC3. For example:
- What do I put in the header?
- Where should my code start?
- Is there something special I need to do at the start of the reset code to get things set up? If so, does that need to happen before or after the usual initialization code (waiting to power up etc)?
Thanks!
Have you tried using a debugging emulator (FCEUX on Windows, Nintendulator, Mesen)? It should help you narrow down what's going wrong.
Guessing, the most likely reason something doesn't work when using MMC3 is that your reset function isn't in the very last 8 KiB of PRG; the specific banks mapped to $8000-$DFFF aren't guaranteed on power-up.
lidnariq wrote:
Have you tried using a debugging emulator (FCEUX on Windows, Nintendulator, Mesen)? It should help you narrow down what's going wrong.
Guessing, the most likely reason something doesn't work when using MMC3 is that your reset function isn't in the very last 8 KiB of PRG; the specific banks mapped to $8000-$DFFF aren't guaranteed on power-up.
I'm using FCEUX, though I'm not sure where to look to find out what's going wrong; all I can tell from that is that there is something going wrong since it's not setting the palettes.
I put the code at $C000.
Yeah, move the code to $E000 instead. Or, insert a stub that writes 0 to $8000 (so that $C000 is the fixed bank).
lidnariq wrote:
Yeah, move the code to $E000 instead. Or, insert a stub that writes 0 to $8000 (so that $C000 is the fixed bank).
Moving the code to $E000 had no effect.
How big is your PRG? What happens when you open the FCEUX debugger, reset, and take the first couple steps? Does it do what you think it should?
What assembler are you using? When you say you moved the code from $C000 to $E000, did it actually move where it was in the file?
lidnariq wrote:
How big is your PRG? What happens when you open the FCEUX debugger, reset, and take the first couple steps? Does it do what you think it should?
What assembler are you using? When you say you moved the code from $C000 to $E000, did it actually move where it was in the file?
Here's my header:
Code:
.db "NES", $1a ;identification of the iNES header
.db $04 ;number of 16KB PRG-ROM pages
.db $08 ;number of 8KB CHR-ROM pages
.db $40 ;MMC3
.dsb 9, $00 ;clear the remaining bytes
When I opened the debugger and reset, the first instruction was 0101 then it was a bunch of 00s followed by a bunch of FFs followed by a bunch of 00s again in a repeating pattern like that. Seems very incorrect...
The assembler I'm using is ASM6. It's served me perfectly well with the default mapper.
Upon opening the file in a hex editor, it appears to be only $2000 in size to begin with, and everything after about $00A8 or so is just completely filled with 00s.
For that header, the file you load should be exactly 16+4*16384+8*8192=131088 bytes. Since your file is only 8 KiB, it seems safe to assume that FCEUX is loading garbage for the remainder of the file. Maybe try starting off with just your known-working NROM bit and just change the mapper number to 4 so that you can reduce the number of variables.
Since you now specify 64 KiB of PRG, you need to put the reset handler starting at 56 KiB into the file. I'm not sufficiently versed in ASM6 to know how to do that...
lidnariq wrote:
For that header, the file you load should be exactly 16+4*16384+8*8192=131088 bytes. Since your file is only 8 KiB, it seems safe to assume that FCEUX is loading garbage for the remainder of the file. Maybe try starting off with just your known-working NROM bit and just change the mapper number to 4 so that you can reduce the number of variables.
Since you now specify 64 KiB of PRG, you need to put the reset handler starting at 56 KiB into the file. I'm not sufficiently versed in ASM6 to know how to do that...
Taking a program that was working as an NROM and changing it to MMC3 and nothing else... works!
Changing the existing MMC3 program to have only 1 PRG page... works!
Thanks!
Although interestingly, even the working NROM rom is only $6000 in size and appears to have no vectors at the end indicating NMI, IRQ, and reset, which makes me wonder how that's even capable of working. The code seems to start right after the header.
$6010 bytes is a valid size for "NROM 128" boards. $10 header+$4000 PRG + $2000 CHR.
If there is only 1 bank of PRG ROM, it is loaded to $8000-bfff and mirrored at $c000-ffff.
I would expect it to work.
To answer your original question.
If you are using asm6, you need to fill banks (if there isn't code yet) with zeros using pad and base statements.
And to initialize the MMC3, you should explicitly set every PRG and CHR bank and set the PPU mirroring.
Your init code should be in the last $2000 bytes of your PRG ROM, which will map as the fixed bank at $e000-ffff. Your vectors will also be in this bank.
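A minimal sketch of that initialization in ASM6 syntax, assuming PRG mode 0 and CHR mode 0. The label names (InitMMC3, BankInit) and the bank numbers in the table are placeholders for illustration, not values from anyone's actual program. The routine itself has to live in the fixed bank at $E000-$FFFF so it is reachable right after reset.

```asm
; Hypothetical reset-time MMC3 setup (call from the reset handler).
InitMMC3:
    ldx #0
@loop:
    stx $8000        ; bank select: register R0-R7, bits 6-7 clear (PRG/CHR mode 0)
    lda BankInit,x
    sta $8001        ; bank number for the selected register
    inx
    cpx #8
    bne @loop
    lda #$00
    sta $A000        ; mirroring: 0 = vertical, 1 = horizontal
    lda #$80
    sta $A001        ; PRG RAM enabled, writes allowed
    sta $E000        ; disable and acknowledge the MMC3 IRQ (value is ignored)
    rts

BankInit:
    .db $00, $02     ; R0, R1: 2 KiB CHR banks at $0000 and $0800
    .db $04, $05, $06, $07 ; R2-R5: 1 KiB CHR banks at $1000-$1FFF
    .db $00, $01     ; R6 -> $8000-$9FFF, R7 -> $A000-$BFFF
```

On a board with hardwired four-screen VRAM the $A000 write has no effect, but it is harmless to leave in.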
dougeff wrote:
To answer your original question.
If you are using asm6, you need to fill banks (if there isn't code yet) with zeros using pad and base statements.
And to initialize the MMC3, you should explicitly set every PRG and CHR bank and set the PPU mirroring.
Your init code should be in the last $2000 bytes of your PRG ROM, which will map as the fixed bank at $e000-ffff. Your vectors will also be in this bank.
Thanks! Although I don't really understand how to use those .base or .pad statements. When I did this:
Code:
.base $6000
.base $8000
.base $A000
.base $C000
.base $E000
The rom was $4000 in size with the reset vectors at $200A-$200F. When loaded in an emulator, it doesn't function. When I did this:
Code:
.base $6000
.pad $2000
.base $8000
.pad $2000
.base $A000
.pad $2000
.base $C000
.pad $2000
.base $E000
The compiler said "Value out of range" at all the lines corresponding to the .pad statements. It wouldn't produce a rom at all.
Through trial and error, the only thing I could get to produce a functioning rom was this:
Code:
.base $8000
Which is fine until the time comes when I'll need more memory.
For ram I use enums:
Code:
.enum $6000
SomeVar: .db 0
SomeWord: .dw 0
SomeArray: .dsb SIZE_OF_ARRAY
.ende
For an 8k bank:
Code:
.base $8000
; code/data
IF $ > $9FFF
ERROR "Bank overflow"
ENDIF
.org $A000
Repeat for however many banks you have, changing the addresses as needed.
So, let's talk about what the assembler is trying to do.
As it goes line by line, it's counting bytes. This is so, when it sees a label, it knows what address it will have, in case it needs to use it in your code.
(Default starting address = 0)
Lda #0 (it counts 2 bytes)
Label: (this label now has the value 0002)
Sta 0 (it counts 2 bytes)
Label2: (this label has the value 0004)
Jmp Label (this will translate to jump to 0002)
Since it's a 6502, the count has a maximum of $FFFF; if it rolls over that, you get an error.
You can change the assembler's count with base statements.
Base $8000
Lda #0 (it counts 2 bytes)
Label: (this label now has the value 8002)
Sta 0 (it counts 2 bytes)
Label2: (this label has the value 8004)
Jmp Label (this will translate to jump to 8002)
You can use 'org' or 'pad' statements, to tell the assembler to start counting UP to a certain address. It has to be larger than (or equal to) the current count.
Base $8000
Some code here
Pad $a000
(Will fill with zero from the end of the code till reaches $a000)
Base $8000
(Resets the count back to $8000)
Pad $a000
(Fills $2000 bytes of zeros)
Etc.
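Putting those pieces together, here is one possible ASM6 skeleton for the header posted earlier (64 KiB PRG, 64 KiB CHR). Treat it as a sketch: the file name tiles.chr, the label names, and the placement of code are all assumptions, and the reset handler is only a stub.

```asm
; iNES header, same values as the one posted earlier in the thread
    .db "NES", $1A
    .db $04            ; 4 x 16 KiB PRG = eight 8 KiB MMC3 banks
    .db $08            ; 8 x 8 KiB CHR
    .db $40            ; mapper 4 (MMC3)
    .dsb 9, $00

; Bank 0: switchable, assembled as if mapped at $8000
    .base $8000
    ; code/data for bank 0 goes here
    .pad $A000

; Banks 1-5: empty for now, $2000 bytes of zeros each
    REPT 5
    .base $8000
    .pad $A000
    ENDR

; Bank 6: fixed at $C000-$DFFF in PRG mode 0
    .base $C000
    .pad $E000

; Bank 7: always fixed at $E000-$FFFF; reset code and vectors live here
    .base $E000
Reset:
    sei
    cld
    ; usual power-up init and MMC3 register setup go here
Forever:
    jmp Forever
NMI:
IRQ:
    rti
    .org $FFFA         ; address is already set, so this pads up to the vectors
    .dw NMI, Reset, IRQ

; CHR data: hypothetical 65536-byte file
    .incbin "tiles.chr"
```

If tiles.chr really is 65536 bytes, the output should come to 16 + 8*8192 + 65536 = 131088 bytes, matching the size the header promises.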
I get it now. That's working.
Thanks so much for your help! That was not at all clear from the ASM6 documentation.
I've poked around with NES programming a tiny bit many years ago but this is the first time I've sat down to actually attempt to make a full game from start to finish; we'll see how far I get.
By the way. Even though MMC3 uses $2000 sized banks, there's no reason you can't program in $4000 sized chunks, as long as you set both the $8000-9fff bank and the $a000-bfff banks every time you switch PRG banks.
That would make it similar to MMC1 or UxROM boards.
MMC3 is still better than both, because of the scanline counter IRQ, which can do parallax scrolling (for example).
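A sketch of what that 16 KiB-style switch could look like in ASM6. SetPRG16 is a made-up name, and the routine assumes PRG mode 0 and CHR mode 0 (it writes zeros to the mode bits of $8000) and that no interrupt handler writes to $8000 in the middle of it.

```asm
; In:  A = 16 KiB bank index (0-2 with 64 KiB of PRG; the last two
;          8 KiB banks stay fixed at $C000 and $E000)
; Clobbers A and X.
SetPRG16:
    asl a            ; first 8 KiB bank = index * 2
    tax
    lda #6
    sta $8000        ; select R6 ($8000-$9FFF)
    stx $8001
    inx              ; second 8 KiB bank = index * 2 + 1
    lda #7
    sta $8000        ; select R7 ($A000-$BFFF)
    stx $8001
    rts
```

If an NMI or IRQ handler also switches banks, the usual precaution is to keep a RAM copy of the last value written to $8000 and restore it before returning.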
Or with a suitable assembler, you can all but ignore bank boundaries altogether. Say you switch bank y into window 6 ($8000) and bank y+1 into window 7 ($A000), and y can be either even or odd. Then you can access anything in ROM that doesn't cross two bank boundaries. The Curse of Possum Hollow does this with its compressed tiles and background map streams.
dougeff wrote:
By the way. Even though MMC3 uses $2000 sized banks, there's no reason you can't program in $4000 sized chunks, as long as you set both the $8000-9fff bank and the $a000-bfff banks every time you switch PRG banks.
That would make it similar to MMC1 or UxROM boards.
MMC3 is still better than both, because of the scanline counter IRQ, which can do parallax scrolling (for example).
Yeah I went with MMC3 because I'm trying to make something resembling a roguelike so I wanted four independent scrollable nametables and a status display on the bottom half or so of the screen.
That all seems to be working so far, at least on FCEUX. At first Nestopia wouldn't load it at all ("Corrupt rom"), but when I used org instead of base it loaded; now it just doesn't display the HUD (even though FCEUX displays it perfectly).
I've got the split scroll status screen working perfectly on FCEUX, working but sometimes jittery on Nestopia and Nintendulator, but not working at all on the EverDrive (as close as I can get to real hardware currently). The game works otherwise; it scrolls and everything, but the screen never splits to show the status bar. I have the background graphics on $0000 and sprites on $1000.
What could I be missing that would cause it to work in all the emulators but not the NES?
Did you write to all the necessary MMC3 registers during startup, ensuring they have the values you want? Not just bank mapping, but also settings like the mirroring, the PRG RAM lock (important if you're working with RAM addresses above $6000), and IRQ enable/disable?
As far as I understand, everything works except your split? How are you doing it, with MMC3's scanline counter? Obviously you need to tell the CPU to enable IRQs for that, but I don't think it would work in any emulator if you hadn't.
Sumez wrote:
Did you write to all the necessary MMC3 registers during startup, ensuring they have the values you want? Not just bank mapping, but also settings like the mirroring, the PRG RAM lock (important if you're working with RAM addresses above $6000), and IRQ enable/disable?
I did not even realize that was a thing. I thought the mirroring and such were set via the header (which itself took some figuring out to get it to work). What's "bank mapping"? After some help earlier in the thread I figured out how to get those .org statements working to define the code banks. What's "PRG RAM lock"?
The only code I have during startup to set up the irq is this:
Code:
lda #$40
sta $4017 ; disable the APU frame counter IRQ
cli ; let the CPU respond to IRQs
And to acknowledge it in the irq itself:
Code:
IRQ:
pha
txa
pha
tya
pha
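sta $E000 ; acknowledge (and disable) the MMC3 IRQ; the value written is ignored
bit PPUSTATUS ; reset the PPU address/scroll write toggle
lda #$2A
sta PPUADDR ; high byte of the split's VRAM address ($2A00)
lda #$00
sta PPUADDR ; low byte; the new address takes effect on this write
sta PPUSCROLL ; X scroll = 0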
pla
tay
pla
tax
pla
rti
And the scanline to trigger the interrupt is set during nmi:
Code:
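lda #$7F
sta $E000 ; disable and acknowledge any pending MMC3 IRQ
sta $C000 ; IRQ latch = $7F (scanline count)
sta $C001 ; reload the counter from the latch at the next clock
sta $E000 ; disable again (redundant)
sta $E001 ; enable MMC3 IRQs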
Sumez wrote:
As far as I understand, everything works except your split? How are you doing it, MMC3's scanline counter? Obviously you need to tell the CPU to enable IRQ, for that, but I don't think any emulators would work if you didn't.
Yes, everything else works correctly, and even the splitting works in the emulators. I'm using the scanline counter.
A bit late, but here's my reply:
pinkpuff wrote:
I've got the split scroll status screen working perfectly on FCEUX
FCEUX is hardly a benchmark when it comes to raster effects; it's extremely lenient with those.
Quote:
What could I be missing that would cause it to work in all the emulators but not the NES?
Hard to tell without a ROM to debug or even seeing any code.
Jittery splits usually mean your timing is a little off, so the scroll changes alternate between taking place before and after the PPU's own scroll increment. If this is the case, the solution is to tweak the timing until the scroll change takes place consistently before or after the PPU's increment.
As for the hardware issue, it sounds like the IRQ isn't firing at all. Maybe you're not setting it up correctly, and the EverDrive's MMC3 implementation is more demanding than the ones in the emulators you tested.
That's entirely likely. How do I set it up?
All the code relevant to the irq is above. The rom file in question is attached.
pinkpuff wrote:
I did not even realize that was a thing. I thought the mirroring and such were set via the header
The header is a convenience for emulators to know more about the hardware they're supposed to emulate, but when said hardware (e.g. mirroring) is controlled by the mapper, the header settings are meaningless.
Quote:
What's "bank mapping"?
It's the mapper registers that say which banks are mapped to each slot. They're not initialized by the mapper itself, and could be pointing anywhere on power on.
Quote:
After some help earlier in the thread I figured out how to get those .org statements working to define the code banks.
.ORGs and the like are meant to structure your ROM properly so the banks can be mapped correctly at runtime, but they don't actively do any mapping, that's a job for the game program itself.
I moved the interrupt acknowledgement to the end of the IRQ routine instead of the beginning and now it miraculously works on the EverDrive as well as all the emulators.
It's still jittery but I feel like I might be able to tackle that.
Thanks everyone for the quick replies.
If you do want to test with FCEUX, at least use the "New PPU" setting (Config > PPU). It does a much more accurate job than the default setting.
As for how to adjust the timing, the simplest way is just to add NOP ("no operation") instructions between the start of the IRQ and the point where you set the scroll. Each NOP will delay 2 cycles (6 pixels) without affecting anything else. The ideal place to alter scroll is within the horizontal blank period. (You can use a loop to make the code smaller, each NOP is 1 byte.)
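For comparison with a run of NOPs, here is a small delay loop in ASM6 syntax. The label and the count of 5 are placeholders to be tuned in a debugger, and the cycle figures assume the branch does not cross a page boundary.

```asm
DELAY_COUNT = 5          ; hypothetical value, tune while watching the debugger

WaitBeforeSplit:
    ldx #DELAY_COUNT     ; 2 cycles
@delay:
    dex                  ; 2 cycles
    bne @delay           ; 3 cycles when taken, 2 on the final pass
    ; total = 5 * DELAY_COUNT + 1 cycles (26 cycles for a count of 5)
```

That is 5 bytes for 26 cycles, where NOPs alone would need 13 bytes, at the cost of clobbering X and only being adjustable in 5-cycle steps. A NOP or two after the loop can fill in the remainder.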
In Nintendulator or FCEUX (or other debugging emulators) you can put a breakpoint on your scroll register write and see where in the scanline it occurred in their debuggers. That can help find the right delay for hblank. (Always test raster effects on hardware, though.)
There's some info here about the timing of the PPU, e.g. where to find hblank:
https://wiki.nesdev.com/w/index.php/PPU_rendering
FCEUX has been set to New PPU the whole time. The status bar is still rock solid in that one.
I put in a loop to delay the scroll write until Nintendulator was reporting it happening at a tick somewhere in the low 300s (305 to about 311; it varies a bit). It was still jittery. Is that during hblank? I wasn't quite sure from the link, but I got the impression that it's basically any time after tick 257?
To clarify a little, the status bar is solid when the screen isn't moving. If you just sit there, it looks solid. When the screen scrolls though, sometimes the status bar almost "jumps to catch up" as it were (but sometimes it doesn't and looks fine). I attached the rom in case you want to see the effect.
rainwarrior wrote:
If you do want to test with FCEUX, at least use the "New PPU" setting (Config > PPU). It does a much more accurate job than the default setting.
Kinda off-topic, but does anyone know why this isn't enabled by default?
Since you have a couple of black pixels between your game window and status screen anyway, consider just putting the split one scanline earlier? Then you won't have to deal with exact timing, as you can't see if a completely black scanline is scrolled wrong. I'm pretty sure that's how most original NES games did it, too.
Edit: Oh, I was a little fast and didn't notice you're using all four nametables and scroll vertically. Nevermind my suggestion above, it obviously doesn't apply here.
Sumez wrote:
rainwarrior wrote:
If you do want to test with FCEUX, at least use the "New PPU" setting
does anyone know why this isn't enabled by default?
I think the tool-assisted speedrun community (TASVideos) kind of standardized on the old PPU to keep old runs in sync.
pinkpuff wrote:
FCEUX has been set to New PPU the whole time. The status bar is still rock solid in that one.
Ah, I am surprised by this because on the first ROM you posted I saw a flickering white artifact on the split line with New PPU but not Old PPU. (The overlay wasn't moving up or down, though, so it was stable.)
pinkpuff wrote:
I put in a loop to delay the scroll write until Nintendulator was reporting it happening at a tick somewhere in the low 300s (305 to about 311; it varies a bit). It was still jittery. Is that during hblank? I wasn't quite sure from the link, but I got the impression that it's basically any time after tick 257?
I think I should have linked Wiki: PPU Scrolling instead; it has more relevantly digested information.
So, the PPU is going to increment the Y scroll at pixel 256, and then won't start incrementing the X scroll until pixel 328. This is your window of opportunity. (The other option is to update the scroll somewhere in the middle of the scanline, any time before pixel 256. That leaves a rough edge, and fine X scroll will be off until the next line, but it's what most games did, and it's a very wide window to hit.)
The timing of any interrupt will vary by up to, I think, 6 cycles (18 pixels) depending on which instruction it interrupted, so try to look for low and high points that far apart. If you've only seen 305 to 311, you might not have caught the extremes. To help with testing: you have an LDA $10 / BNE wait loop, so you could temporarily add a little JSR / RTS into that loop for more variability of timing.
Also, to keep the window as tight as you can, you can do the BIT $2002 early, and preload X or Y if needed to hold values to write quickly to $2006/2005. The first write to $2006 can be done early as well, as its effect is buffered. In your case, only the second $2006 and then the $2005 need to actually occur within hblank, because they're what take immediate visual effect. The actual stuff that happens in hblank should be as minimal as possible, ideally just the write instructions, no loads.
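A sketch of an IRQ handler arranged that way, in ASM6 syntax. It reuses the $2A00 split address from the handler posted earlier. SPLIT_DELAY is a made-up constant that would need tuning, and the MMC3 acknowledge is placed at the end, as in the version reported to work on hardware.

```asm
SPLIT_DELAY = 8      ; hypothetical loop count, tune in a debugger

IRQ:
    pha
    txa
    pha
    bit $2002        ; reset the PPU write toggle (not timing-critical)
    lda #$2A
    sta $2006        ; first $2006 write is only buffered, so it can go early
    lda #$00         ; preload the value for both timing-critical writes
    ldx #SPLIT_DELAY
@wait:
    dex
    bne @wait        ; burn cycles until the writes below land in hblank
    sta $2006        ; second $2006 write: takes effect immediately
    sta $2005        ; fine X scroll = 0
    sta $E000        ; acknowledge (and disable) the MMC3 IRQ
    pla
    tax
    pla
    rti
```

Only the two stores after the loop need to fall between pixel 256 and pixel 328. The delay loop cannot remove the few cycles of jitter that come from whichever instruction the IRQ interrupted, so the target should be the middle of that window.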