Accurate and smooth A/V synchronisation in anything on Windows is a shit show. You can ask any of the emulator authors here. The amount of variance in behaviour is absolutely mindboggling, because it depends on a tremendous number of factors. For example, I see occasional "glitches" in many emulators (the behaviour varies slightly), including MAME, ever since I started using a sound card (Asus Xonar DG) that has crappy drivers. I only use that card because my on-board Realtek audio (a company I want to see burn in hell) has tons and tons of interference/bus noise that the rear-panel microphone port picks up, even though its actual audio playback is good (those glitches don't happen with it). This is further compounded by my use of Windows 7: I had several emulators that ran beautifully smooth on Windows XP (e.g. Nestopia) but began having occasional problems once I moved to 7. I'll stop there, because I could go on for days about this subject, all the way down to event timers and SpeedStep/clock frequency throttling.
Back in the "DOS days", none of this was a problem because the OS (for lack of a better term) "sat closer" to the hardware and nothing else was running simultaneously. For example, accurate/flawless Vsync was possible back then -- the CPU only ran what you had actively launched. For Vsync, you could literally do this without any worry of "other stuff" going on:
Code:
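    mov dx, 03dah    ; VGA Input Status Register 1 (colour modes)
Loop1:
    in al, dx        ; read the status byte
    and al, 08h      ; isolate bit 3 (vertical retrace active)
    jnz Loop1        ; already in retrace: wait for it to finish
Loop2:
    in al, dx        ; read the status byte again
    and al, 08h      ; isolate bit 3 again
    jz Loop2         ; not in retrace yet: wait for the next one to start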
...and at the end of that, you'd be in VBlank reliably every time. No tearing, no nonsense.
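For anyone wondering why there are two loops and not one: the first lets any retrace that's already in progress finish, and the second catches the start of the next one, so you always get a full VBlank rather than the tail end of one. Here's a rough C sketch of the same logic, run against a simulated status register so it works on a modern machine. The names `wait_for_vretrace_start`, `fake_crt` and `fake_status` are made up for illustration; on real DOS hardware the read would be a port input from 03DAh instead of a callback.

```c
#include <stdint.h>

#define VRETRACE_BIT 0x08u  /* bit 3 of VGA Input Status Register 1 */

/* Hypothetical callback standing in for "in al, dx" on port 03DAh. */
typedef uint8_t (*status_reader)(void *ctx);

/* Same two-phase wait as the assembly; returns how many polls it spun for. */
unsigned wait_for_vretrace_start(status_reader read_status, void *ctx)
{
    unsigned polls = 0;

    /* Loop1: if a retrace is already under way, wait for it to end. */
    while (read_status(ctx) & VRETRACE_BIT)
        polls++;

    /* Loop2: wait for the next retrace to begin. */
    while (!(read_status(ctx) & VRETRACE_BIT))
        polls++;

    return polls;
}

/* Simulated display (an assumption, not real hardware): each read advances
   one tick, and the retrace occupies the last retrace_len ticks of a frame. */
struct fake_crt {
    unsigned tick;
    unsigned frame_len;
    unsigned retrace_len;
};

uint8_t fake_status(void *ctx)
{
    struct fake_crt *crt = ctx;
    unsigned pos = crt->tick++ % crt->frame_len;
    return (pos >= crt->frame_len - crt->retrace_len) ? VRETRACE_BIT : 0;
}
```

Wherever in the frame you start (even in the middle of a retrace), the last read before the function returns lands on the first tick of a fresh retrace. That only holds because nothing else can steal the CPU between the read and whatever you do next, which is exactly the guarantee a multitasking OS takes away.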
With the introduction of multitasking OSes (not task-switching, but multitasking) like Windows NT and later, graphical layers like DirectX, and later GPU drivers and so on, everything became abstracted. You now have layers upon layers upon layers of things that all "connect", taking away a program's ability to do something as simple as, say, wait for Vsync with the same degree of accuracy as back in the early 90s -- you no longer have ANY idea of how much time _really_ is left in VBlank for your program to use. Now introduce hilarious kludges (I consider them kludges) like NVIDIA G-SYNC and AMD FreeSync, which require special "gamer monitors" and have their own complexities/nuances (think about windowed mode applications and how that's going to work). On the audio layer, you now have USB audio bits (USB DACs), sound cards still, and let's not forget weird things like HDMI audio (and HDCP on top of that, whee!) -- keeping everything "synchronised" is a total nightmare. At least on the graphical end of things, are you ready for the irony? Here it comes:
With the introduction of Vulkan, it seems that we're circling back to the 90s. Someone at the Khronos Group had had enough of this bullshit, it seems. Vulkan is an incredibly tiny API (last I looked, around 1.6 MB) that, from what a colleague of mine who does 3D engines for commercial games told me, basically just says "here's a piece of raw memory that correlates with the framebuffer on the GPU. Do whatever you want with it. If you screw it up and crash the machine, it's your fault. Talk to the hardware directly (or very close to it)".
Hopefully I'll be dead by the time humans come full circle and realise the importance of the KISS principle.