Several months ago, as I was toying with SNESGSS, I noticed that if you play a sample close to the maximum pitch (128 kHz) it starts to alias. I find this kind of fascinating because I thought this was mostly a Genesis phenomenon. I've already found a couple of creative uses for this aliasing, and bsnes faithfully reproduces it as I hear it in SNESGSS, presumably because they use the same or at least a similar audio engine.
Question: I do not (yet) have a flash cart for testing my game on real hardware, so I was wondering if these emulations were hardware-accurate when it comes to aliasing. Is this the case?
My personal (limited) understanding is that aliasing is limited by the Gaussian filter, although some of it still manages to be hearable even with the filter, I guess. So if the Gaussian filter is accurate to the hardware, so is the aliasing.
A good way to test that is to listen to the melody track in Zozo (the Slam Shuffle song) in Final Fantasy VI. It plays a synth sample that has a LOT of harmonics at very high pitch. Old emulators used to emulate this catastrophically; the only way to make it sound decent was to increase the sample rate to 48000+ Hz (the SNES uses 32000). On real hardware, almost no aliasing is audible, but if the melody were playing alone (without the other tracks), maybe a little of it would be. My PowerPak is under repair so I cannot test this, unfortunately.
There are two kinds of aliasing in play here. One is due to imperfect interpolation; the Gaussian interpolator is pretty good at minimizing this. The other is the Nyquist folding effect, which the interpolator doesn't help with. You need a bandlimiting filter to deal with that kind, and the SNES doesn't have one.
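To make the Nyquist folding part concrete, here is a minimal Python sketch of where a pure tone lands after it is point-sampled at the 32000 Hz output rate mentioned in this thread. The function name `folded_frequency` is made up for illustration, and the sketch ignores the Gaussian interpolator's attenuation entirely; it only shows the frequency mapping, not how loud the alias ends up.

```python
OUTPUT_RATE_HZ = 32000  # S-DSP output rate as discussed in this thread


def folded_frequency(freq_hz, sample_rate_hz=OUTPUT_RATE_HZ):
    """Return the apparent frequency of a tone after sampling.

    Tones above half the sample rate reflect ("fold") back
    into the range 0 .. sample_rate_hz / 2.
    """
    remainder = freq_hz % sample_rate_hz
    # The spectrum mirrors around every multiple of the sample rate,
    # so take the distance to the nearest multiple.
    return min(remainder, sample_rate_hz - remainder)


if __name__ == "__main__":
    for tone in (10000, 16000, 20000, 40000):
        print(tone, "Hz is heard at", folded_frequency(tone), "Hz")
```

Under these assumptions, a 20000 Hz partial in a pitched-up sample would be heard at 12000 Hz, which is why the result is inharmonic rather than just dull.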
Nononono, I don't think you folks understand. I want the aliasing. It lets me do stuff like LFO-like effects without having to make large samples for it. I just want to be sure that the audio emulation in bsnes aliases the same as a real console.
That's what I'm saying. If the aliasing you hear is due to cranking the sample past 32 kHz, it's Nyquist folding, and it will happen on real hardware.
To be completely certain, it'd probably be best to test on real hardware, but I'm pretty sure bsnes emulates this accurately. From 2010:
byuu wrote:
blargg's S-DSP core is known to be 100% bit-perfect to real hardware, with the one exception that the mute command is instant, and does not exhibit a very fast fade-out effect.
93143 wrote:
That's what I'm saying. If the aliasing you hear is due to cranking the sample past 32 kHz, it's Nyquist folding, and it will happen on real hardware.
To be completely certain, it'd probably be best to test on real hardware, but I'm pretty sure bsnes emulates this accurately. From 2010:
byuu wrote:
blargg's S-DSP core is known to be 100% bit-perfect to real hardware, with the one exception that the mute command is instant, and does not exhibit a very fast fade-out effect.
And to clarify, the reason the fadeout isn't emulated and can't be emulated in a "bit perfect" way is because it's an analog effect. MUTE doesn't actually disable (or affect in any way, AFAIK) the S-DSP's audio output; it's an output pin connected to the amplifier circuit on the SNES motherboard. That's why it's able to mute external audio sources like the SGB/MSU1/Satellaview.
True, analog components don't have a "bit perfect". But they can still be modeled, just as NES APU channel mixing and other analog aspects of the NES and Super NES audio paths are modeled. How fast does this amplifier turn on and off?
93143 wrote:
There are two kinds of aliasing in play here. One is due to imperfect interpolation;[...] The other is the Nyquist folding effect, which the interpolator doesn't help with.
Are you sure? I thought those were like two sides of the same coin. I could be wrong.
If it's playing faster than the sample rate then you start losing frequencies. Which ones you lose can even depend on the phase of said frequencies. Interpolation on its own is not going to help you here, that only helps when playing slower, not faster.
The "imperfect interpolation" part concerns frequencies greater than half the input sample rate, for which the Gaussian interpolation somewhat compensates. The "Nyquist folding effect" concerns frequencies greater than half the output sample rate, for which nothing in the S-DSP compensates.
Resampling in the S-DSP can be analyzed as two separate stages: upsampling by a factor of 256 with the Gaussian interpolation, followed by downsampling by a factor of (pitch / 16) with point sampling. When pitch is 8192 to 16383, corresponding to a playback rate from 64000 to 127992 Hz, there's still a substantial amount of energy above 16 kHz that the Gaussian interpolation did not filter out, and this gets reflected at 16 kHz down into the audible range.
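The arithmetic behind those numbers can be sketched in a few lines of Python. This assumes, as the figures in this thread imply, that pitch 4096 corresponds to playback at the native 32000 Hz rate; the function names are hypothetical and this models only the rate relationship, not the interpolator itself.

```python
OUTPUT_RATE_HZ = 32000   # S-DSP output rate
UNITY_PITCH = 4096       # assumed: pitch value that plays a sample at 32000 Hz
UPSAMPLE_FACTOR = 256    # Gaussian interpolation stage, per the post above


def playback_rate_hz(pitch):
    """Input samples consumed per second at a given pitch value."""
    return pitch * OUTPUT_RATE_HZ / UNITY_PITCH


def downsample_factor(pitch):
    """Point-sampling stage: step size through the 256x upsampled stream."""
    return pitch / 16


def net_resampling_ratio(pitch):
    """Input samples advanced per output sample (1.0 at unity pitch)."""
    return downsample_factor(pitch) / UPSAMPLE_FACTOR


if __name__ == "__main__":
    for pitch in (4096, 8192, 16383):
        print(pitch, playback_rate_hz(pitch), net_resampling_ratio(pitch))
```

At pitch 8192 this gives 64000 Hz and at pitch 16383 it gives 127992.1875 Hz, matching the range quoted above (rounded down).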
For the mute functionality, there's a transistor that rapidly turns the gain down on the output-stage amplifier.
Putting this into LTspice, I see a turn-off time of about 1.4ms, and a turn-on time of about 0.5ms.
("Off" being defined as a gain of -26.5dB; "on" being defined as a gain of +11.5dB)
I had to make substitutions for both the 2SC2412 (I chose the 2SC4081) and the LM324 (I chose the LT1491); these seemed close based on the available parameters.
Here's my LTspice schematic/sim (remove the ".txt" to load it into LTspice)
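For anyone wanting to fold those "on" and "off" gains into an emulator's output stage, here is a small Python sketch converting the dB figures above to linear factors. It assumes the figures are voltage gains (so 20·log10 applies), which the post does not state explicitly; `db_to_voltage_gain` is a name made up for this example.

```python
def db_to_voltage_gain(gain_db):
    """Convert a gain in dB to a linear voltage ratio (assumes 20*log10 convention)."""
    return 10 ** (gain_db / 20)


MUTE_ON_DB = -26.5   # "off" gain from the simulation above
MUTE_OFF_DB = 11.5   # "on" gain from the simulation above

if __name__ == "__main__":
    print("muted gain:  ", db_to_voltage_gain(MUTE_ON_DB))
    print("unmuted gain:", db_to_voltage_gain(MUTE_OFF_DB))
    print("ratio:       ", db_to_voltage_gain(MUTE_OFF_DB - MUTE_ON_DB))
```

Under that assumption the muted output is roughly 0.047x and the unmuted output roughly 3.76x, a swing of 38 dB.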
I think part of the reason why the YM2612 is prone to aliasing has to do with quantized sine tables and not being able to sweep the phase at a smooth enough rate.
The YM2612 doesn't make any attempt at interpolation or anything like that, period (it is a low-end FM chip, after all), and its output is crystal clear too (any filtering comes from the surrounding circuit, which is why it's possible to get better audio output despite keeping the same chip).
tepples wrote:
Resampling in the S-DSP can be analyzed as two separate stages: upsampling by a factor of 256 with the Gaussian interpolation, followed by downsampling by a factor of (pitch / 16) with point sampling. When pitch is 8192 to 16383, corresponding to a playback rate from 64000 to 127992 Hz, there's still a substantial amount of energy above 16 kHz that the Gaussian interpolation did not filter out, and this gets reflected at 16 kHz down into the audible range.
So, if I understand correctly, this means the Gaussian filter/interpolation still blocks aliasing in the range of pitch #4096 to #8191 (sample rate of 32 kHz to almost 64 kHz), even though it is faster than the output sample rate... right?
Bregalad wrote:
tepples wrote:
Resampling in the S-DSP can be analyzed as two separate stages: upsampling by a factor of 256 with the Gaussian interpolation, followed by downsampling by a factor of (pitch / 16) with point sampling. When pitch is 8192 to 16383, corresponding to a playback rate from 64000 to 127992 Hz, there's still a substantial amount of energy above 16 kHz that the Gaussian interpolation did not filter out, and this gets reflected at 16 kHz down into the audible range.
So, if I understand correctly, this means the Gaussian filter/interpolation still blocks aliasing in the range of pitch #4096 to #8191 (sample rate of 32 kHz to almost 64 kHz), even though it is faster than the output sample rate... right?
At pitches 4097 to 8191, the input rate still exceeds the output rate. In this case, the Gaussian interpolation should make aliasing nearly inaudible for most real-world, non-pathological input waves. If your wave is preemphasized to correct for the Gaussian interpolation's treble cut, and it contains lots of tonal high-frequency energy, there may still be some wraparound.