I've started work on an NSF player, and I'm confused about the part of the NSF spec that covers playback. The spec says to call the play address repeatedly, many times per second. I can do that, and each call runs for a while until it hits an RTS, so this all looks fine so far. But now I'm trying to figure out when to sample the audio data, and I'm not sure what to do.
I'm not sure what is supposed to be happening between these play calls. For example, let's say I need to call the play address every 10ms. I set the PC to the play address and run the emulator until it hits the RTS. Assume it only took 5ms to hit RTS. Do I then have to tick the APU for another 5ms (but not the CPU, since it has nothing to do)? Or are these play routines designed to RTS just about when I need to set the PC back to the play address? Does this make sense?
On an NES, after the RTS from PLAY, it would normally return to some busy wait loop (or the rest of the game logic) and spin there until the next frame.
In your emulator, you don't have to spend this time emulating the CPU; you can just generate the audio. Audio is output continuously, though, so for correct sound emulation you need to generate samples while the CPU is active as well.
Some NSFs do not RTS from PLAY, particularly if they generate PCM sound. For these to play correctly you need to have accurate timing between the CPU and the audio generation.
Quote:
I'm not sure what is supposed to be happening between these play calls. For example, let's say I need to call the play address every 10ms. I set the PC to the play address and run the emulator until it hits the RTS. Assume it only took 5ms to hit RTS. Do I then have to tick the APU for another 5ms (but not the CPU, since it has nothing to do)?
Conceptually, the CPU is executing other code while it waits until it's time for the next play call. Since it's not the CPU that's generating the sound, it shouldn't matter whether the CPU is running during the waiting time. The APU is running all the time and conceptually generating a sample every ~0.56 µs.
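To put rough numbers on that, here is a small sketch. It assumes the NTSC CPU clock of about 1.789773 MHz (which also clocks the APU) and a 44.1 kHz output rate picked only for illustration; the function names are made up for this example.

```python
# Approximate NTSC 2A03 CPU clock in Hz (assumption: NTSC timing).
NTSC_CPU_HZ = 1_789_773


def cycles_per_play(play_period_us, cpu_hz=NTSC_CPU_HZ):
    """CPU/APU cycles that elapse between two calls to the play address."""
    return cpu_hz * play_period_us / 1_000_000


def cycles_per_sample(sample_rate_hz, cpu_hz=NTSC_CPU_HZ):
    """CPU/APU cycles that elapse per output audio sample."""
    return cpu_hz / sample_rate_hz


# Duration of one APU tick in microseconds (about 0.56 us).
apu_tick_us = 1_000_000 / NTSC_CPU_HZ

# The 10 ms interval from the question is roughly 17898 cycles.
print(round(cycles_per_play(10_000)))

# At 44.1 kHz, one output sample covers roughly 40.6 APU cycles.
print(round(cycles_per_sample(44_100), 1))
```

Neither ratio is a whole number, so a real player would probably need to carry the fractional remainder from one interval (or sample) to the next rather than rounding each time.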
Conceptually, the CPU is sitting in a loop waiting for a programmable interval timer clocked at 1 MHz to assert an interrupt. Then the IRQ handler calls the play routine and reads the controller for track selection.
I don't think that's really conceptual; it's more of a practical suggestion for implementation, though it differs from the hardware implementations I'm familiar with:
- The PowerPak has a 1MHz timer, but it does not use IRQ. It is polled.
- The TNS-HFC3 uses NMI, so it doesn't have a 1MHz timer.
- CopyNES NSF playback has a timer, but it's polled as well.
The 1MHz timer comes from kevtris's original hardware ("HardNES"), which threw a 2A03, boot ROM, RAM, and an FPGA together.
Not clear where it comes from, ultimately: his pictures are rather low resolution, and I tentatively think I only see the 21.477MHz master crystal.
Anyhow, hardware implementation aside, you will use less of the host CPU time if you don't waste cycles emulating the NES CPU between PLAY's RTS and the next PLAY. There are NSFs out there that do not return from PLAY, though, so keep this in mind (mostly these are modern ones that use PCM sound).
Depending on whether you're interested in playing everything, or just the large "well-behaved" subset of NSFs, you could just do all your CPU emulation by itself, i.e. halt the APU sound generation when you run PLAY to its RTS, thus all the CPU stuff happens immediately at that point. This might simplify the implementation or potentially improve performance.
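A minimal sketch of that simplified scheme, for the well-behaved case only. `StubCpu`, `StubApu`, `call_until_rts`, and `run` are hypothetical stand-ins for whatever your emulator provides, not a real API, and the interval is assumed to be a whole number of cycles.

```python
class StubCpu:
    """Hypothetical stand-in for a 6502 core."""

    def __init__(self):
        self.calls = 0

    def call_until_rts(self, address):
        # A real core would execute instructions from `address` until the
        # routine returns; this stub only counts the calls.
        self.calls += 1


class StubApu:
    """Hypothetical stand-in for the APU; only counts elapsed cycles."""

    def __init__(self):
        self.cycles = 0

    def run(self, cycles):
        # A real APU would advance its channels and emit samples here.
        self.cycles += cycles


def play_frames(cpu, apu, play_address, cycles_per_call, n_calls):
    """Run PLAY as if it took zero time, then run the APU for the interval."""
    for _ in range(n_calls):
        cpu.call_until_rts(play_address)  # all register writes land "at once"
        apu.run(cycles_per_call)          # audio for the whole interval


cpu, apu = StubCpu(), StubApu()
play_frames(cpu, apu, play_address=0x8003, cycles_per_call=29_780, n_calls=60)
```

The trade-off is that every APU register write in a call takes effect at the start of the interval, so tunes that depend on precise timing within PLAY will not sound right.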
I think at this point OP is just trying to get it working (or even get a conceptual idea of what happens), not worried about optimization. As you say, running the play routine as if it takes zero CPU cycles, then running the APU for however many cycles until the next play call, is an acceptable initial approach that will work on many tunes and simplifies many aspects of implementation.
Yes, this all makes sense, thanks. I'm indeed just in the get-it-to-work stage before attempting to optimize. My first approach is as you discussed: run CPU and APU until RTS, then APU only until next play time.
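For reference, that approach might look something like the sketch below. The `StubCpu` and `StubApu` classes and their methods (`begin_call`, `returned`, `step`, `run`) are hypothetical placeholders, and the cycle budget is an added assumption so that a PLAY routine that never returns cannot hang the loop; it does not make such tunes play correctly.

```python
class StubCpu:
    """Hypothetical CPU stub that replays a fixed list of instruction costs."""

    def __init__(self, instruction_cycles):
        self._program = list(instruction_cycles)
        self._left = []

    def begin_call(self, address):
        # A real core would push a return address and set PC to `address`.
        self._left = list(self._program)

    def returned(self):
        # True once the routine's final RTS has executed.
        return not self._left

    def step(self):
        # Execute one instruction and report how many cycles it took.
        return self._left.pop(0)


class StubApu:
    """Hypothetical APU stub; only counts elapsed cycles."""

    def __init__(self):
        self.cycles = 0

    def run(self, cycles):
        self.cycles += cycles


def run_play_interval(cpu, apu, play_address, interval_cycles):
    """Run one play interval and return the number of cycles overshot."""
    cpu.begin_call(play_address)
    elapsed = 0
    # Phase 1: CPU and APU advance together until RTS or the budget runs out.
    while elapsed < interval_cycles and not cpu.returned():
        cycles = cpu.step()
        apu.run(cycles)
        elapsed += cycles
    # Phase 2: the CPU is idle, so only the APU runs for the remainder.
    if elapsed < interval_cycles:
        apu.run(interval_cycles - elapsed)
        elapsed = interval_cycles
    return elapsed - interval_cycles


apu = StubApu()
overshoot = run_play_interval(StubCpu([2, 3, 4, 6]), apu, 0x8003, 100)
```

In this usage example PLAY costs 15 cycles, so the APU runs 15 cycles in step with the CPU and then 85 on its own. A non-zero return value means the last instruction ran past the interval, and the caller could subtract it from the next interval.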