Most of the approaches I've seen described for using DMC interrupts for raster timing require compensation for some rather large timing uncertainties. From what I can tell, at least on Mesen and my own console, there's a much easier and more efficient technique. I haven't figured out the best way to quickly get the DMC synchronized with the frame, but once a lock is acquired it will remain very stable. I've posted a quick test that was sub-optimal, but I've since worked out timings which lock things down much better, and am working on a better demo.
A key observation is that changes to the DMC rate take effect only after the current bit has been transmitted. If an interrupt occurs and one immediately sets the DMC to a slow rate, waits for the bit in progress to get sent, and then sets the DMC to a faster rate, then the first bit will use whatever DMC rate was set previously, the second will use the slow rate, and the remaining six (as well as the first of the next byte) will use the faster rate. If one uses $8E as the fast rate (72 cycles per bit) and alternates between $80 and $81 for the slow rate (428 and 380 cycles per bit), then the interrupts will take 7*72+428 = 932 and 7*72+380 = 884 cycles, so every pair of interrupts will take 1816 cycles. That is only about 2.7 cycles short of the roughly 1818.7 cycles (16 * 341/3) required for 16 scan lines.
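The arithmetic behind the 1816-cycle figure can be checked with a short script. This is only a sketch under some assumptions that aren't spelled out above: it uses the NTSC DMC period table, assumes a one-byte sample (so one interrupt per eight bits), and takes a scan line as 341/3 CPU cycles. The function name `irq_interval` is just for illustration.

```python
from fractions import Fraction

# NTSC DMC rate table: CPU cycles per output bit for rate index $0..$F.
DMC_PERIOD = [428, 380, 340, 320, 286, 254, 226, 214,
              190, 160, 142, 128, 106, 84, 72, 54]

def period(reg_value):
    """Cycles per bit for a $4010 value; only the low nibble selects the rate."""
    return DMC_PERIOD[reg_value & 0x0F]

def irq_interval(prev_rate, slow_rate, fast_rate, bits=8):
    """Cycles between interrupts for a one-byte sample (assumed).

    Bit 1 still runs at the previously set rate, bit 2 at the slow
    rate, and the remaining bits at the fast rate.
    """
    return period(prev_rate) + period(slow_rate) + (bits - 2) * period(fast_rate)

long_irq = irq_interval(0x8E, 0x80, 0x8E)   # slow rate $80
short_irq = irq_interval(0x8E, 0x81, 0x8E)  # slow rate $81
pair = long_irq + short_irq

# Assumed NTSC timing: 341 PPU dots per scan line, 3 dots per CPU cycle.
target = 16 * Fraction(341, 3)
shortfall = target - pair

print(long_irq, short_irq, pair)
print(float(target), float(shortfall))
```

The pair total comes out to 1816, matching the figure above; the 16-line target and the resulting shortfall depend on the cycles-per-scan-line value assumed.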
Using a few other DMC values at the start and end of the frame, it's possible to arrange for a combination of times that will be either 0.5 cycles less than a frame or 1.5 cycles more. Using the former on three out of four frames and the latter on the fourth cancels out exactly (3 * -0.5 + 1.5 = 0), so there's no long-term drift, and jitter will be within 1.5 cycles of what could be achieved with a mapper's IRQ. CPU overhead will be slightly greater because of the need to perform the second rate-setting write after the first bit gets clocked out, but should only be about 10% even with raster splits every eight lines.
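To see why the three-short, one-long pattern stays locked, here is a minimal sketch that just accumulates the per-frame error. It assumes the -0.5 and +1.5 cycle figures above are exact and ignores everything else about the frame; the names `SHORT`, `LONG` and `drift` are made up for the example.

```python
from fractions import Fraction
from itertools import accumulate, cycle, islice

SHORT = Fraction(-1, 2)  # IRQ schedule 0.5 cycles shorter than a frame
LONG = Fraction(3, 2)    # IRQ schedule 1.5 cycles longer than a frame

def drift(pattern, frames):
    """Cumulative timing error (in CPU cycles) after each frame."""
    return list(accumulate(islice(cycle(pattern), frames)))

# Three short frames followed by one long frame, repeated.
errors = drift([SHORT, SHORT, SHORT, LONG], 12)
worst = max(abs(e) for e in errors)

print([float(e) for e in errors])
print(float(worst))
```

The accumulated error returns to zero every fourth frame and never exceeds 1.5 cycles in magnitude, which is where the jitter bound comes from.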