lidnariq wrote:
I mean, sure, if you insist on designing your game to rely on being able to brute-force update the entire visible game state on every 4th frame, you can design a cart that has enough RAM to support it. But there was a licensed port of Boulder Dash published on the NES and it doesn't use anything resembling such heroics.
I've not played the NES Boulder Dash nor seen any frame-accurate recordings of it. Does it ensure that all displayed tiles get updated on the same frame? Because the tile cycling in Boulder Dash doesn't involve motion between tiles, it probably wouldn't matter visually if some tiles were updated on one frame and some were updated on the next, so I'd guess that's what happens. Achieving the smoother motion shown in the Ruby Runner .gif I posted would require, however, that all tile updates occur synchronously with the switch from the last tile set to the first. I think that would probably be achievable even using a basic CNROM cart, but if other mappers could make it easier that would be nice to know.
Quote:
Most of the time, a design that uses the two nametables as a means of double-buffering is trying to make the NES act like some other console rather than work within the limited bandwidth.
I'd say that would depend on whether the desired play mechanic could be achieved better some other way on the NES. I think the NES hardware would seem like an excellent fit for Ruby Runner save for the difficulties updating name-table RAM, and even with those difficulties I would think it would be workable.
Quote:
And to be fair, when I optimized Driar from its original SGROM release down to NROM, I did something similar, using 1K of the CPU's RAM to hold fully-unrolled copying code to do updates to nametables, to work around no longer having meaningful CHR bankswitching.
Not familiar with that game.
Quote:
Quote:
A mapper which could include some dual-port storage (interleaving PPU and CPU cycles) could make things much more convenient, but might be seen as cheating even though some FPGAs which include 7K of RAM cost less than $2. Unfortunately, those parts all have evil packages, and have inputs that are 3.3V tolerant but not 5V tolerant.
Unfortunately those cheaper ones might not have enough I/O pins. If you're willing to limit it to just licensed NESes (no famiclones) and are willing to guess where the ALE cycles are to demultiplex the PPU's address bus and are willing to make the CPU side interface a PITA, you need at least 10(PPU A9, A8, PPU AD7..0)+2(PPU /RD, PPU /WR)+8(CPU D7..0)+2(CPU A0,1)+2(M2,R/W)=24 IO pins. While there are iCE40UL parts in that range, one'd probably prefer to have all the CPU/PPU address/data pins to make the programmer's life less miserable. Which gets us back to the iCE40xx1K parts. At least some come in a TQFP...
The I/O requirements for main CPU interfacing could be reduced by 5 if one adds a 74HC299 universal shift register (reads and writes will be separated by at least 3 main-CPU clocks that don't read or write the register, giving the FPGA enough time to get data to/from the shift register).
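For reference, the arithmetic in that pin count, and what the shift register would bring it down to, can be tallied as below. This is just a throwaway Python sketch: the group labels and the names PIN_GROUPS, SHIFT_REGISTER_SAVING and total_pins are made up for illustration, the widths are the ones listed in the quoted count, and the saving of 5 is simply the figure I gave above rather than anything measured on a real board.

```python
# Signal groups and widths as listed in the quoted pin count.
PIN_GROUPS = {
    "PPU A9, A8 + AD7..0": 10,
    "PPU /RD, /WR": 2,
    "CPU D7..0": 8,
    "CPU A0, A1": 2,
    "M2, R/W": 2,
}

# Assumed saving from putting a 74HC299 on the CPU data bus (figure from the post).
SHIFT_REGISTER_SAVING = 5


def total_pins(groups):
    """Sum the widths of all signal groups."""
    return sum(groups.values())


baseline = total_pins(PIN_GROUPS)
with_shift_register = baseline - SHIFT_REGISTER_SAVING
print(baseline, with_shift_register)
```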
As I think about it, though, I wonder if the best way to make a cheap but versatile Nintendo cart might be to adapt the same approach used by the Atari 2600 melody cart, using one 70MHz ARM7TDMI or similar device on each bus, and maybe running an SPI port between them.
Quote:
Quote:
Is there any way to use burst DMA to update anything other than OAM entries?
Nope.
You can map your own cart device to additionally listen to writes to $2004, but that's it.
That seems like a missed opportunity in the NES design. If the same 6502 address had been used for OAM and PPU data, with the set-address write selecting which kind of data would be written, that would have freed up a 6502 address while also enhancing the usefulness of DMA. Oh well.
Quote:
Quote:
Having a piece of code treat a data structure as occupying 128 banks of 256 bytes each is simpler than trying to have it treat data as two banks of 64 pages of 256 bytes each.
I must be missing something... how does being able to bankswitch on A8 and up help with this particular transformation?
Because the upper byte of the 6502 address will be constant.
If one has a 64KiB data structure on a cart starting at address $010000 which is using an 8K banked region from $8000-$9FFF, and wants to fetch a byte given at offset X:Y, the required code would be something like:
Code:
txa
lsr
lsr
lsr
lsr
lsr ; A = top three bits of X
ora #8 ; $010000 / $2000 = bank 8
sta $8000 ; select the 8K bank
txa
and #$1F ; low five bits of X = page within the bank
ora #$80 ; banked region starts at $8000
sta temp+1
lda #0
sta temp
lda (temp),y
as compared with something like:
Code:
stx $FC ; Set bits 8-15 of address for $7C00-$7CFF region
lda $7C00,y ; uses LSB of address, plus last value accessed at $FC, plus $010000.
One could replace the shifts in the first example with a table lookup, but I think the second would still seem a lot easier. The "normal" banking approach would require that the offset be split into an 8-bit part, a 5-bit part, and a 3-bit part, rather than simply being kept as two eight-bit parts. The page-level granularity could be especially useful if one had multiple adjacent banking regions. If one wanted to load x, y, and a with three consecutive bytes at an offset specified by x:y, the code could be something like:
Code:
stx $FC ; Set bits 8-15 of address for $7C00-$7CFF region
inx
stx $FD ; Set bits 8-15 of address for $7D00-$7DFF region
ldx $7C00,y
lda $7C01,y
sta temp
lda $7C02,y
ldy temp
Note that this code will work even if the object crosses a page boundary. Compare that to what would be needed to fetch three consecutive bytes using normal banking if one had to allow for the possibility of crossing a block boundary.
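To make the address arithmetic concrete, here is a rough Python model of both schemes. It is only a sketch under stated assumptions: the page registers at $FC/$FD and the windows at $7C00/$7D00 are the hypothetical mapper from my examples, not any existing hardware; the 8K window is assumed to span $8000-$9FFF; and all function and variable names are invented for illustration.

```python
DATA_BASE = 0x010000   # cart offset of the 64 KiB structure (from the post)
WINDOW_BASE = 0x7C00   # first of two adjacent 256-byte windows (hypothetical mapper)


def split_for_8k_banks(x):
    """Conventional banking: split X into a 3-bit bank part and a 5-bit page part."""
    bank = (DATA_BASE >> 13) | (x >> 5)   # 8 KiB bank number, 8..15
    addr_hi = 0x80 | (x & 0x1F)           # pointer high byte within $8000-$9FFF
    return bank, addr_hi


def fetch3_paged(cart, x, y):
    """Model of stx $FC / inx / stx $FD followed by three indexed loads."""
    pages = [x, (x + 1) & 0xFF]           # page registers for $7Cxx and $7Dxx
    out = []
    for i in range(3):
        cpu_addr = WINDOW_BASE + i + y    # $7C00,y  $7C01,y  $7C02,y
        window = (cpu_addr - WINDOW_BASE) >> 8   # 0 = $7Cxx, 1 = $7Dxx
        offset = cpu_addr & 0xFF
        out.append(cart[DATA_BASE + (pages[window] << 8) + offset])
    return tuple(out)


def fetch3_linear(cart, x, y):
    """Reference: three consecutive bytes at offset X:Y, wrapping at 64 KiB."""
    start = (x << 8) | y
    return tuple(cart[DATA_BASE + ((start + i) & 0xFFFF)] for i in range(3))


# Fill a fake cart image with an arbitrary pattern so the two fetches can be compared.
cart = bytearray(0x020000)
for n in range(0x10000):
    cart[DATA_BASE + n] = (n * 7 + (n >> 8)) & 0xFF

print(fetch3_paged(cart, 0x12, 0xFF) == fetch3_linear(cart, 0x12, 0xFF))
```

The paged version never has to look at Y to decide whether a boundary is being crossed; the second window simply picks up where the first leaves off. Because inx wraps at eight bits, this model also wraps from the end of the structure back to its start.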
Quote:
Quote:
Have you seen any Nintendo mappers based on page-level banking?
No licensed mappers used anything finer than 8 KiB. And to the best of my knowledge, the finest banking seen in any pirate mapper hack is 1KiB.
Bummer. Page-mapped regions are really nice to work with.