Is there any good information on how mappers work in the hardware sense? I'm aware of their use in software, but I've never seen much about how they actually function in hardware. About all I know is that you can connect certain 74 series logic chips to the NES CPU bus and a ROM chip, and have them latch bits that drive the upper address lines of the ROM, and things like that. That's not a lot, and I'd like to understand more, particularly how you can have different banks within the address space, such as four 8K PRG banks, eight 1K CHR banks, etc.
So if anyone can point out some useful information I'd appreciate it.
To have different banks in a window you first start with registers. The register outputs are multiplexed to the high ROM address lines. The multiplexer selects are the CPU address lines. When the CPU accesses that window of memory, the multiplexer will switch the address lines to its associated register. Here's an example:
This mapper has 4 regs at $8000-9FFF, $A000-BFFF, $C000-DFFF, $E000-FFFF. Write the bank you want to appear at that 8K window. Now you know 80% of all mapper logic ;)
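As a rough software model of the example above (a sketch only, not a description of any specific board): the class name, the 8-bit register width, and the choice to let a write anywhere in a window set that window's register are all assumptions made for illustration.

```python
# Hypothetical model of a mapper with four 8K PRG windows.
# A14..A13 of the CPU address act as the multiplexer select.

BANK_SIZE = 0x2000  # 8 KB per window


class FourWindowMapper:
    def __init__(self):
        # One bank register per window: $8000, $A000, $C000, $E000
        self.regs = [0, 0, 0, 0]

    def cpu_write(self, addr, value):
        # Assumption: a write anywhere in a window sets that window's register.
        if addr >= 0x8000:
            self.regs[(addr >> 13) & 3] = value & 0xFF

    def rom_address(self, addr):
        # The selected register supplies the high ROM address lines,
        # the CPU supplies A12..A0 unchanged.
        window = (addr >> 13) & 3
        return self.regs[window] * BANK_SIZE + (addr & 0x1FFF)
```

For example, after `cpu_write(0xA000, 5)` a read from $A123 would fetch ROM offset 5 * 8K + $123, while the other three windows are unaffected.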
The process is basically what you described. The function of a mapper is to make certain pieces of a memory chip appear at certain ranges of the addressable memory of the CPU or PPU.
The simplest mappers simply have a 74161 feed the highest address lines of the memory chip in question (PRG or CHR) with the value it was given the last time there was an attempt to write a byte to ROM. The NES indicates that a value is to be written through the "PRG R/W" line, and since there is no point in writing to ROM, that write signal is sent to the 161, which is also connected to some of the data lines, so it can receive the data it will later present to the memory chip.
More complex mappers, that break the addressable space in smaller pieces and have multiple registers do exactly the same thing, but they also take into consideration the address lines, meaning it will interpret writes differently depending on the address that was written to (multiple registers) and retrieve data from different parts of the memory chip depending on the address being read (banks smaller than the total addressable space).
A CNROM cart, for example, only has a 74161 as a mapper, so all it can do is select an 8KB CHR-ROM page from a larger chip to occupy the whole 8KB of patterns the NES can address. A slightly more complex mapper, UNROM, has a 7432 in addition to the 161. The 7432 ORs the highest bit of the read address against all 4 bits latched by the 161 before sending them to the PRG-ROM chip. Whenever the upper half of the addressable space is accessed (the highest bit will be 1 in this case), the latched bits all become 1, so the last bank of the chip (1111 binary, 15 decimal) is seen at the upper half of the addressable space at all times.
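The two boards described above can be modelled in a few lines. This is a minimal sketch: the class names are made up, all 4 latch bits are assumed to be wired (real boards may use fewer), and the "highest bit of the read address" is taken to be CPU A14 within the 32KB PRG window.

```python
# Hypothetical models of a CNROM-style and a UNROM-style board.

class Cnrom:
    def __init__(self):
        self.latch = 0  # contents of the 74161

    def cpu_write(self, addr, value):
        if addr >= 0x8000:           # any write to ROM space loads the latch
            self.latch = value & 0x0F

    def chr_address(self, ppu_addr):
        # The latch selects which 8KB CHR page fills $0000-$1FFF.
        return self.latch * 0x2000 + (ppu_addr & 0x1FFF)


class Unrom:
    def __init__(self):
        self.latch = 0

    def cpu_write(self, addr, value):
        if addr >= 0x8000:
            self.latch = value & 0x0F

    def prg_address(self, cpu_addr):
        a14 = (cpu_addr >> 14) & 1
        # 7432: each latched bit is ORed with A14, so $C000-$FFFF
        # always sees bank 1111 (the last 16KB bank).
        bank = self.latch | (0x0F if a14 else 0x00)
        return bank * 0x4000 + (cpu_addr & 0x3FFF)
```

With the latch set to 3, reads from $8000-$BFFF come from bank 3 and reads from $C000-$FFFF come from bank 15, whatever the latch holds.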
Mappers such as the MMC3 work in a similar way, but the banks are so small and there are so many registers that it would require too many off the shelf chips, so it's implemented as a custom chip.
Ok, kyuusaku. I get the idea that you can watch A13 and A14 to determine the "window" being 8000, A000, C000, or E000. So that part makes sense now in a loose sort of way.
So some chip or circuit will watch the requested address (which I never thought about before), and based on A13 and A14, a register is used to set the ROM chip's address lines (which I knew before). I'm guessing you could address a WRAM chip by watching A15 along with A13 and A14?
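As a sketch of that decoding idea: the function below treats the address as plain A15/A14/A13 bits and assumes WRAM lives at $6000-$7FFF. The function name and return values are hypothetical, and a real cartridge would also have to account for how the NES presents A15 and bus timing on the cart edge.

```python
# Hypothetical address decoder: which cartridge resource does a CPU address hit?

def decode_region(addr):
    a15 = (addr >> 15) & 1
    a14 = (addr >> 14) & 1
    a13 = (addr >> 13) & 1
    if a15:
        # $8000-$FFFF: A14..A13 pick one of four 8K PRG windows
        return ("PRG window", (a14 << 1) | a13)
    if a14 and a13:
        # $6000-$7FFF: A15=0, A14=1, A13=1
        return ("WRAM", None)
    return ("other", None)
```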
But anyway, I get the basic idea, but what sort of chips and things would you actually put into practice to achieve this?
Tokumaru, I hadn't heard about this PRG R/W connection before, but that certainly is useful. I imagine you would use it to disable ROM to avoid bus conflicts. I do get what you mean by a mapper as complex as MMC3 would be too impractical to make using generic chips. But it would be possible to do so right?
I'm guessing that you would likely use an FPGA type of device if you wanted an advanced custom mapper, since it probably isn't feasible to have a mask ASIC produced like Nintendo would have in the old days?
Part of the reason I've been curious about this is because I think it would be cool to have a new mapper available for homebrew or repro building that is a bit better than the MMC3. Ideally 4x 8KB PRG, 8x 1KB CHR, 8KB WRAM or SRAM, and either a PPU or CPU based IRQ counter. It seems to me that it wouldn't be too difficult to do something like that. I would guess that a CPU cycle based counter would be easier to create than a PPU based one. I wonder if Bunnyboy or anyone else has thought about producing such a mapper for homebrew projects. As it is now, if you have a homebrew project that needs IRQs or smaller bank sizes, you have to butcher a commercial game cartridge.
I'm guessing creating your own board with generic chips to accomplish this would be expensive and take up a lot of space.
MottZilla wrote:
I'm guessing that you would likely use an FPGA type of device if you wanted an advanced custom mapper, since it probably isn't feasible to have a mask ASIC produced like Nintendo would have in the old days?
PowerPak puts a mapper on an FPGA. It's also very expensive. For this reason, I don't see how an FPGA would be cost effective for replicating your work in cartridges. A CPLD, the next cheaper part than an FPGA, is big enough to hold an MMC1 but apparently not an MMC3. But I'd love to be proven wrong.
Once I sketched a mapper with MMC3-style PRG bank switching based around a pair of 74HC670s. You might want to read that topic.
Is the MMC1 Repro a board with a CPLD programmed to act as an MMC1, then? Strange question: if one CPLD is affordable but can't handle the complexity of the MMC3, why not use two or more CPLDs? What if you split the PRG and CHR logic onto two different CPLDs working together?
The most common (legacy, very available, 5V friendly and easy to program) high capacity PLCC CPLDs are the XC95108 from Xilinx and the EPM7128 from Altera, and both can hold an MMC3 with a bit of room to spare. The most common Xilinx surface mount CPLD is the XC95144, which can even hold a full blown FME-7.
With CPLDs, your logic is optimized so tightly that there is barely any overhead (macrocells used as interconnects); pretty much all you have to take into account are register bits, which each occupy a macrocell.
I counted register bits for some mappers:
CNROM 2
UNROM 4
GNROM 4
AOROM 4
MMC1 27
MMC2 21
MMC3 80
MMC4 27
FME7 (full) 113
Note though that if you tailor your FME7 implementation to each commercial game, you'll find they each use few enough bits to fit into a XC95108.
Almost all other 3rd party FC mappers will fit into a XC95108 as well. When they don't, it's because they simultaneously use:
PRG - 3x5/6 (256k/512k)
CHR - 8x8 (256k)
IRQ - 16 + typically 2 control (A12/Phi2 counter both)
mirroring - 1 or 2
--------------------
= 100 + a couple misc macrocells (will barely squeeze in)
but throw in configuration or index registers and it's over the limit.
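The tally above can be written out as a quick calculation. This is only a restatement of the listed figures; the function name is made up, and the 108-macrocell capacity for the XC95108 is taken from the part's naming and should be checked against the datasheet.

```python
# Register-bit estimate for a "worst case" third-party mapper,
# using the figures from the list above.

XC95108_MACROCELLS = 108  # assumption: capacity implied by the part number


def register_bits(prg_bits_per_reg, mirroring_bits):
    prg = 3 * prg_bits_per_reg   # three switchable PRG registers, 5 or 6 bits each
    chr_bits = 8 * 8             # eight 8-bit CHR registers
    irq = 16 + 2                 # 16-bit counter plus 2 control bits
    return prg + chr_bits + irq + mirroring_bits


low = register_bits(5, 1)    # 256K PRG, 1 mirroring bit
high = register_bits(6, 2)   # 512K PRG, 2 mirroring bits
print(low, high, XC95108_MACROCELLS - high)
```

That gives 98 to 102 register bits, which is where the "= 100" figure comes from, and leaves only a handful of macrocells for everything else.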
Well, that sounds good then. It would be sweet to see CPLDs acting as some nicer mappers. I'm guessing if you can do FME7, VRC4 should be doable too.
MottZilla wrote:
Tokumaru, I hadn't heard about this PRG R/W connection before, but that certainly is useful. I imagine you would use it to disable ROM to avoid bus conflicts.
I believe some mappers do that, yes. But most discrete logic mappers don't, which is why, when you write a mapper command, you have to write it to a location that contains the same value.
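To illustrate why the write has to land on a matching byte: the sketch below models a bus conflict as a bitwise AND of the ROM's output and the CPU's output. That AND model is a common working assumption, not something guaranteed by the hardware, and both function names are hypothetical.

```python
# Hypothetical bus-conflict model for a discrete mapper without ROM disable.

def latched_value(rom_byte, cpu_byte):
    # Assumption: when ROM and CPU drive the bus together, a 0 from either wins.
    return rom_byte & cpu_byte


def find_safe_offset(rom, value):
    # Find a ROM location that already holds `value`, so the write is conflict-free.
    # Returns -1 if no such byte exists.
    return rom.find(bytes([value]))
```

This is the reason many programs keep a small table of bank numbers in ROM and write each value to its own table entry.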
Quote:
I do get what you mean by a mapper as complex as MMC3 would be too impractical to make using generic chips. But it would be possible to do so right?
Theoretically yes, because it is possible to do it with the available parts, but I'm pretty sure you'd have power supply problems if you tried to power so many chips at once.
Quote:
Part of the reason I've been curious about this is because I think it would be cool to have a new mapper available for homebrew or repro building that is a bit better than the MMC3. Ideally 4x 8KB PRG, 8x 1KB CHR, 8KB WRAM or SRAM, and either a PPU or CPU based IRQ counter.
I've been wanting this mapper for a while. What you described seems ideal to me; my only requirement is that it doesn't take anything away from the programmer, like the MMC3 does with its scanline counter.
Quote:
It seems to me that it wouldn't be too difficult to do something like that. I would guess that a CPU cycle based counter would be easier to create than a PPU based one.
Some people don't like CPU based counters because code has to be adjusted to work in different regions (PAL x NTSC). I don't think this is a big problem, as there are probably other things that will require changing anyway (the music engine, for example).
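A small sketch of the region adjustment being discussed: it converts a number of scanlines into a CPU-cycle count for each region, using 341 PPU dots per scanline and 3 (NTSC) or 3.2 (PAL) dots per CPU cycle. The function name is made up, and details such as the occasional shortened NTSC scanline and IRQ latency are ignored.

```python
from fractions import Fraction

DOTS_PER_SCANLINE = 341
# PPU dots per CPU cycle: 3 on NTSC, 3.2 (16/5) on PAL
DOTS_PER_CPU_CYCLE = {"NTSC": Fraction(3), "PAL": Fraction(16, 5)}


def cycles_for_scanlines(scanlines, region):
    # Exact rational arithmetic, rounded to the nearest whole CPU cycle.
    return round(scanlines * DOTS_PER_SCANLINE / DOTS_PER_CPU_CYCLE[region])


print(cycles_for_scanlines(32, "NTSC"), cycles_for_scanlines(32, "PAL"))
```

So an IRQ meant to fire 32 scanlines down the screen needs a different counter value per region, which is the adjustment mentioned above.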
tokumaru wrote:
Theoretically yes, because it is possible to do it with the available parts, but I'm pretty sure you'd have power supply problems if you tried to power so many chips at once.
It's not so much a power issue as it is a board space issue. The mapper you would like could take fewer than 15 chips since it maps really well to 74 series. CMOS parts can be used, which are very low power and will talk fine with everything. I don't think even 50 chips would be a problem.
If people would prefer discrete chips to a CPLD, it may cost more and not be reconfigurable, but it would still work and fit inside a NES case (or oversized FC) without compromise.
Quote:
If people would prefer discrete chips to a CPLD, it may cost more and not be reconfigurable, but it would still work and fit inside a NES case (or oversized FC) without compromise.
You probably can't put 15 chips in a NES case, and I'd be surprised if you could simulate an MMC3 with "only" 15 74 series chips. An MMC1 would take about 12 if I remember correctly, so an MMC3 would probably take more than 20. Anyway, you can't put more than about 6 chips inside the NES case; it's really not that big. For the FC this is unlimited, but a cartridge that is 1m long wouldn't be practical.
kyuusaku wrote:
If people would prefer discrete chips to a CPLD, it may cost more and not be reconfigurable, but it would still work and fit inside a NES case (or oversized FC) without compromise.
I wouldn't care whether it is a CPLD or discrete chips, so long as the end result would be an affordable mapper comparable to FME-7 (4x 8K PRG, 8x 1K CHR, CPU IRQ Counter, WRAM). It would help homebrewers and would help me when I'm busy trying to hack some game from one mapper to another. We really are limited to FME-7 in Batman and MMC-5 which isn't very good since both are limited in supply. And as mentioned MMC3 can't always cut it.
Bregalad wrote:
Quote:
If people would prefer discrete chips to a CPLD, it may cost more and not be reconfigurable, but it would still work and fit inside a NES case (or oversized FC) without compromise.
You probably can't put 15 chips in a NES case, and I'd be surprised if you could simulate an MMC3 with "only" 15 74 series chips. An MMC1 would take about 12 if I remember correctly, so an MMC3 would probably take more than 20. Anyway, you can't put more than about 6 chips inside the NES case; it's really not that big. For the FC this is unlimited, but a cartridge that is 1m long wouldn't be practical.
If a PCB was made that filled the entire plastic enclosure you could easily fit 15 chips. Putting chips on the back side would allow even more. If surface mount chips were used you could double that figure in addition to 2 ROMs. There is space I believe even for full DIP chips with WRAM, CHR RAM, 4-screen options...
And no the MMC3 can't easily fit into 15 chips but a FME-7 can approximately. (Perhaps 17 because I forgot the counters are 4-bit.)
Recall this thread:
http://nesdev.com/bbs/viewtopi ... c&start=15
That is a scaled down FME-7 in PRG and WRAM function. Restoring PRG capacity means one more 670 chip. Restoring WRAM would mean remapping $e000's bank to $6000 (with a single AND3 gate!) and decoding the high PRG address bit since we don't need to address 2MiB of PRG ROM. Extending the index (command) register from 4-bits to 8-bits will also give WRAM protection bits and mirroring bits etc. That makes 13 logic chips plus what it takes to replace the PAL--likely some random logic gates for the mirroring MUX and a number of address decoders. It will fit!
I'm almost ready to start designing a discrete version to prove it ;) Technically it's even better than the FME-7 since you can map WRAM to the other CPU banks. (Think WRAM for DPCM samples etc.)
Edit: not good, 5V CPLDs are really expensive all of a sudden and 3.3V ones aren't sure to work. I think discrete chips would actually be cheaper unless old stock CPLDs can be found in bulk o_0
What do you mean technically better?
Would it be fully compatible with existing FME-7 games? Or what differences would it have? Personally I'd like to see a version that is exactly like what is in Batman. It would be neat to see a discrete version design as it would sort of immortalize it wouldn't it.
Whatever you come up with I'd love to see it. I agree it wouldn't be a problem fitting 13, 15 or so chips in a NES cart. After all, that pirate cart fit 13 into a Famicom cart.
If WRAM was mappable to all banks using the method I described, games would probably still work, it's unlikely they'd set unused bits. Basically the less you deviate from the FME-7 design, the more random logic you'll need. The more random logic, the less integration efficiency and the more you'd want to use a PAL, but then that kind of defeats the purpose of having a discrete mapper -- something you can just build that doesn't need programming. I think a FME-7 compatible design would fit in 20 chips still costing less than a 5V CPLD at the moment.
Well, you could make the design for both, couldn't you? Make one that is more costly and uses more chips but is a 100% FME7 clone, and another that is FME7 compatible to a point and is cheaper with fewer chips? It would be nice to have both designs available so anyone could just put together their own FME7 mapper for their purposes, whether they want 100% compatibility or less.
I'm guessing that you would be making the PRG registers for each section the same as the $6000 section where they are 6bits in size and the upper 2 bits are used for selecting ROM enable or RAM enable?
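To make that register layout concrete, here is a sketch of how such a byte might be split. The exact bit positions (bit 7 as RAM enable, bit 6 as RAM/ROM select, bits 5-0 as bank) are an assumption based on how the FME-7's $6000 register is commonly documented, and the function names are hypothetical.

```python
# Hypothetical decode of a PRG register with a 6-bit bank and 2 control bits.

def decode_prg_register(value):
    return {
        "ram_enable": bool(value & 0x80),  # assumed bit 7
        "ram_select": bool(value & 0x40),  # assumed bit 6: 0 = ROM, 1 = RAM
        "bank": value & 0x3F,              # 6-bit bank number
    }


def mapped_resource(value):
    f = decode_prg_register(value)
    if not f["ram_select"]:
        return ("ROM", f["bank"])
    if f["ram_enable"]:
        return ("RAM", f["bank"])
    return ("open bus", None)  # RAM selected but not enabled
```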
I do look forward to seeing what you come up with and hope that I can look at it and try to understand "why" it works and how it works.
kyuusaku wrote:
Edit: not good, 5V CPLDs are really expensive all of a sudden and 3.3V ones aren't sure to work. I think discrete chips would actually be cheaper unless old stock CPLDs can be found in bulk o_0
Use the XC95144XL (~$6). It has 5V tolerant inputs, and you can do an easy tristate/pullup to get 5V outputs. I am sure it will work because a board coming out very soon uses it.
If you use some external registers like the HC670 then the XC9572XL gets down to ~$2.
If a board coming out soon is reconfigurable, there's no reason to make another in the first place. And I dislike the pullup method, I guess if the only outputs are to ROM and they rise higher than 2V, it'll probably be OK without them.
It isn't a board for developers, doubt homebrewers will wanna pay the $85 for full CIB game to throw away most of it!
A through hole discrete logic board still may be a good idea because the CPLDs aren't really hand solderable except for the plcc 44s.
Hand solderable sounds fun to me. As long as all the parts required don't add up to some prohibitive cost then it'll be great. Plus it'll have that cool appearance with all the chips packed in there like that picture of the pirate cart.
I do think designing the discrete version would still be a good project. But then again, I really want to be able to make FME-7s, so I'd say that anyway. Of course, if you have pre-made boards with CPLDs for sale I guess that's OK too. But the discrete version seems more useful since anyone could build it and there is no mystery behind it then.
I designed most of a discrete FME-7 but the CPU mapping is more of an issue than I thought... the easiest thing to do would be to remove $6000 bankswitching and just fix it to WRAM. That's the only thing not implemented, but it may need debugging.
Currently it uses:
2.5x 74 (two can be switched to 161 which are cheaper, 0.5 can be dropped if one screen mirroring isn't needed)
4x 191
2x 139
1x 161
6x 670
2x 00 (just for mirroring... one can be dropped if one screen isn't needed)
6x pullups (for fixed bank... yeah.. but that is what the pirate does too)
I'm guessing it will take 3 more chips at most bringing the total to approx. 20.
Hopefully you can work out the $6000 issue. It would be nice to be able to map ROM or WRAM there like the FME-7 original. 20 chips isn't too bad I guess. =) I wonder what all the components will cost for building a cart. Not that it really matters as anything beats butchering the supply of Batman RotJ cartridges.
You are looking at $11-12 in parts, $3-4 in pcb fab for low quantity. Half of that parts cost is the 670s, possible to use a cheapo sram instead? Something like a 5V 62256 is in the $2 range now.
Use SRAM instead of the register files? I don't see how in any way that makes sense.
The PRG mapping is kinda a big issue; internally the FME-7 probably isn't arranged as a register file so it's hard to fit it into one. By building a custom mapper, all this stuff can be taken care of so easily ;) Every time I build something of 74 series chips I'm astonished how poor the selection is...
Something ghetto I've thought of is using an 8K ROM as a poor man's PAL to take care of the random logic ;) I think it's fast enough, though this is really not a good idea heh
Just curious, does the FME-7 style of mapper require more logic than the VRC4? I wonder about that because it affects the way you write your mapper functions and how big they are: with the FME-7 you have to select a register and then write it through that two address system, whereas with the VRC4 it's a good bit faster, except for the 4-bit CHR registers. Mainly curious since, other than the MMC5, the FME-7 and VRC4 are the most "powerful" mappers. I haven't ever seen inside a pirate Konami VRC game cartridge, but I wonder if they already did the work of creating a discrete version of the VRC4, similar to how a FME7-ish discrete was made in that other pirate cart linked to here somewhere.
FME-7 uses slightly more logic, but the VRC4 would require even more chips.
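For reference, the "two address system" mentioned above can be sketched like this. The class name is made up, the 4-bit index matches the index (command) register width mentioned earlier in the thread, and placing the command register at $8000-$9FFF and the parameter register at $A000-$BFFF is an assumption for illustration.

```python
# Hypothetical index/parameter register pair in the style discussed above.

class IndexedMapper:
    def __init__(self):
        self.command = 0       # 4-bit index of the internal register to update
        self.regs = [0] * 16   # internal registers (CHR, PRG, mirroring, IRQ...)

    def cpu_write(self, addr, value):
        region = addr & 0xE000
        if region == 0x8000:       # step 1: pick the internal register
            self.command = value & 0x0F
        elif region == 0xA000:     # step 2: write its new contents
            self.regs[self.command] = value & 0xFF
```

Every register update costs two writes here, which is the software overhead being compared against mappers where each register has its own address.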