I'm planning to change my CPU core to a "cycle by cycle" one. Is that a good idea?
Ha ha ha ha ha ha ha ha ha ha!
.....
Ha ha ha ha ha ha ha ha ha ha! Ha ha ha ha ha ha ha ha ha ha!
Sorry, Anes but I almost passed out when I read that post. Read here for your answer.
http://nesdev.com/bbs/viewtopic.php?t=547
Yes you read it right
x30 slower.
One might not be aiming for speed. Some designs are simpler to implement, debug, and change. Also remember that you can't time a method of emulation, only an implementation of that method. Some implementations are slower than others.
I couldn't fix timing problems while using scanline rendering. They all went away once I switched to cycle-exact emulation. Plus, the speed wasn't SO bad. Modern machines can run it nicely. Be honest: you won't use a CPU clocked below 800MHz nowadays... -_-;; (NESticle, *bump*, baboo)
WedNESday wrote:
Ha ha ha ha ha ha ha ha ha ha!
.....
Ha ha ha ha ha ha ha ha ha ha! Ha ha ha ha ha ha ha ha ha ha!
Sorry, Anes but I almost passed out when I read that post. Read here for your answer.
http://nesdev.com/bbs/viewtopic.php?t=547
Yes you read it right
x30 slower.
Ha.
Fx3 wrote:
Be honest: you won't use a CPU clocked below 800MHz nowadays...
I disagree. I'm a proud user of a 600 MHz Celeron processor.
Fx3 wrote:
I couldn't fix timing problems while using scanline rendering. They all went away once I switched to cycle-exact emulation.
It's possible to emulate the CPU-affecting aspects of the PPU to 5.4 MHz accuracy, but do the actual scanline rendering instantaneously at some point along the scanline. This way you get full accuracy for anything that doesn't rely on mid-scanline reads and writes, and only minor cosmetic glitches for games that do mid-scanline writes. All the sprite 0 hit and other PPU timing ROMs I've written pass on an emulator written this way.
blargg wrote:
It's possible to emulate the CPU-affecting aspects of the PPU to 5.4 MHz accuracy, but do the actual scanline rendering instantaneously at some point along the scanline. This way you get full accuracy for anything that doesn't rely on mid-scanline reads and writes, and only minor cosmetic glitches for games that do mid-scanline writes.
If you can timestamp all writes to PPU registers and then render pixels in horizontal runs between timestamps, either at the end of a scanline or whenever $2002 is read on an "interesting" scanline, how would you get any sort of cosmetic glitch?
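A rough sketch of that timestamp idea, in case it helps. Everything here is made up for illustration (`RegWrite`, `render_scanline`, and using the register value itself as the "pixel"); a real PPU would fetch tiles and sprites inside each run instead of copying one byte.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of one PPU register write, logged by the CPU core.
struct RegWrite {
    int dot;        // PPU dot on this scanline at which the write takes effect
    uint8_t value;  // stand-in for "what the register holds from now on"
};

// Render one 256-pixel scanline in horizontal runs between timestamps.
// `initial` is the register value at dot 0; `writes` must be in time order.
std::vector<uint8_t> render_scanline(uint8_t initial,
                                     const std::vector<RegWrite>& writes) {
    std::vector<uint8_t> line(256);
    uint8_t current = initial;
    int x = 0;
    for (const RegWrite& w : writes) {
        assert(w.dot >= x && "writes must be logged in time order");
        int end = w.dot < 256 ? w.dot : 256;
        for (; x < end; ++x) line[x] = current;  // run before this write
        current = w.value;                       // write takes effect here
    }
    for (; x < 256; ++x) line[x] = current;      // run after the last write
    return line;
}
```

The renderer is still called only once per scanline (or at a $2002 read), yet each mid-scanline write lands at its own dot.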
You wouldn't get any cosmetic glitch if you rendered partial scanlines; my point was that you can avoid that complexity without affecting the accuracy of the CPU and game, just with slight cosmetic issues for those few games that rely on mid-scanline changes. The comment was directed at the mentality that you can either run the CPU in uninterrupted bursts of 113/114 clocks and have the resulting low accuracy, or run the PPU every pixel and pay a big performance penalty.
SO... I would have to accept the hilarious "HAHAHA" from WedNESday after all. No, just don't label cycle-exact emulation as "slow" or anything else that dismisses all the effort put into emulating the NES hardware accurately. *I* couldn't fix the timing problems; if you have the magic, great. If a lot of timestamps are useful rather than pixel-exact emu, go laugh a bit... -_-;;
Timestamps are pixel-exact. It's just that emulation of one processor at a time, switching only when processors interact, might be a lot more friendly to unrolling and caching mechanisms.
Quote:
Timestamps are pixel-exact. It's just that emulation of one processor at a time, switching only when processors interact, might be a lot more friendly to unrolling and caching mechanisms.
The "switching only when processors interact" part is important. You don't have to simulate the separate pieces of hardware in lock-step unless they can have unilateral side effects on each other at any time (such as raising interrupts). I don't know much about NES hardware but it sounds like the CPU can unilaterally affect PPU but not the other way around. So you simulate the CPU ahead and pause to let the PPU simulation "catch up" whenever you need to cause a side effect on the PPU or read a port value which depends on what the PPU has been doing up to the CPU's current simulated time.
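To make the catch-up idea concrete, here is a minimal sketch. The `Ppu` and `System` types are hypothetical, the vblank flag is computed from the beam position instead of being a real latch (so the clear-on-read behaviour of $2002 is left out), and the only hardware numbers used are the NTSC ones: 341 dots per line, 262 lines per frame, 3 PPU dots per CPU cycle.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy PPU: its only state is how far it has been simulated.
struct Ppu {
    int64_t dot = 0;         // PPU dots simulated so far
    bool in_vblank = false;  // simplified: derived from position, not a latch

    // Advance the PPU to `target` dots. A real core would render here.
    void run_until(int64_t target) {
        assert(target >= dot);  // the PPU never runs backwards
        dot = target;
        int64_t line = (dot / 341) % 262;
        in_vblank = line >= 241 && line < 261;  // simplified vblank window
    }
};

struct System {
    int64_t cpu_cycles = 0;  // CPU time; allowed to run ahead of the PPU
    Ppu ppu;

    // Ordinary CPU execution: no PPU work at all.
    void cpu_tick(int n) { cpu_cycles += n; }

    // Called only when the CPU actually reads $2002.
    uint8_t read_2002() {
        ppu.run_until(cpu_cycles * 3);  // catch up: 3 dots per CPU cycle
        return ppu.in_vblank ? 0x80 : 0x00;
    }
};
```

With this shape, a game that polls $2002 once per frame makes the PPU advance in one big step per frame, while a game that polls in a tight loop makes it advance a few dots at a time. The result is the same either way; only the batching changes.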
This concept can be applied even when the two pieces of hardware don't share a clock. The important thing is to identify what kind of side effects are possible from each piece of hardware to each other piece, and ensure (e.g. with timestamps) that all visible side effects are ordered correctly. For example, in an SNES, the CPU and the SPC700 sound chip run on totally unrelated clocks, but any communication between the two requires a port write by one chip *and then* a port read by the other chip. Because both chips have to participate in the communication, you can actually let either of them run "ahead" in its simulation and just log the writes in case the "behind" chip wants to know what values were in the ports at a certain time. Only when the "ahead" chip wants to read from the "behind" chip do you have to context switch.
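The write log described above could look something like this. It's only a sketch: `PortLog` and its methods are invented names, and it assumes both chips' clocks have already been converted to one shared timeline, which is the hard part I'm glossing over.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One logged write to a communication port by the chip that is running ahead.
struct PortWrite {
    int64_t time;   // timestamp on a shared master timeline (an assumption)
    uint8_t value;
};

// Hypothetical port history. The "ahead" chip appends writes; the "behind"
// chip asks what the port held at its own, earlier, simulated time.
class PortLog {
public:
    explicit PortLog(uint8_t initial) : initial_(initial) {}

    void write(int64_t time, uint8_t value) {
        assert(log_.empty() || time >= log_.back().time);  // time order
        log_.push_back({time, value});
    }

    // Value visible at `time`: the last write at or before it.
    uint8_t read_at(int64_t time) const {
        uint8_t v = initial_;
        for (const PortWrite& w : log_) {
            if (w.time > time) break;
            v = w.value;
        }
        return v;
    }

private:
    uint8_t initial_;
    std::vector<PortWrite> log_;
};
```

Once the "behind" chip has caught up past a given timestamp, older entries can be dropped, so the log stays short in practice.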
optimistic devil's advocate post:
If it really is simpler to develop, it doesn't seem particularly crazy to me to go ahead and write it (a learning exercise in itself) and then have it available later on for reference and development purposes when writing a faster implementation if you later choose. This is especially true when NES emulation is already such a crowded space. Why not write the best cycle-based NES emulator? Who knows.. you may even end up with interesting techniques if you continue to attack the accuracy+speed problem from a different starting point, and there's a possibility you could even be fortunate enough to first attain high accuracy sooner this way - possibly accelerating development of a faster revision. Even if it can never run full-speed, you can still log all you want and record videos, or even run cores in parallel to check for diffs. If you want to see things run full-speed and accurate enough for generally sane purposes, the easy solution is to download Nintendulator (or whatever) and you're done without the effort.. but if you're going through the trouble of writing one yourself, that's hardly the point anyway.
Fx3 wrote:
SO... I would have to accept the hilarious "HAHAHA" from WedNESday after all...
Ha ha ha (Cough! Cough!)... Ouch...
I've gone back to the normal method of opcode execution, and I'm pleased to say it only needs 25MHz. If I had stayed with the cycle-for-cycle version I would need 750MHz just for the CPU alone. So if 1.79MHz needs 750MHz, god knows what a PSX/Gamecube emulator would need. Here is a snippet of my cycle-for-cycle accurate 6502:
Code:
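inline void OpticCodeAD()   // LDA absolute ($AD), called once per CPU cycle
{
    switch(CPU.Cycle)
    {
    case 0:   // opcode byte: step past it
        CPU.PC++;
        CPU.Cycle++;
        break;
    case 1:   // fetch low byte of the operand address
        CPU.Address = CPU.Memory[CPU.PC];
        CPU.PC++;
        CPU.Cycle++;
        break;
    case 2:   // fetch high byte of the operand address
        CPU.Address += (CPU.Memory[CPU.PC] << 8);
        CPU.PC++;
        CPU.Cycle++;
        break;
    case 3:   // read the operand into A and update the flags
        CPU.A = CPU.Memory[CPU.Address];
        CPU.P &= 0x7D;              // clear Z (bit 1) and N (bit 7)
        if( !CPU.A )
            CPU.P |= 0x02;          // set Z when A is zero
        CPU.P |= (CPU.A & 0x80);    // copy bit 7 of A into N
        CPU.Cycle = 0;              // instruction finished
        break;
    }
    CPU.CC++;   // count this cycle
}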
Though I have to admit that Quietust's method is better.
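For comparison, here is roughly what the same opcode looks like done the per-instruction way. This is only a sketch: the `Cpu` struct and `op_ad` are made-up names, not taken from any emulator mentioned in this thread, and memory is a flat array with no mappers or I/O.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU state, loosely mirroring the fields in the snippet above.
struct Cpu {
    uint8_t  memory[0x10000] = {};
    uint16_t pc = 0;
    uint8_t  a = 0;
    uint8_t  p = 0;   // status flags: bit 1 = Z, bit 7 = N
    uint64_t cc = 0;  // running cycle counter
};

// LDA absolute ($AD) executed in a single call.
// Returns the cycles consumed so the caller can advance the other chips.
int op_ad(Cpu& cpu) {
    assert(cpu.memory[cpu.pc] == 0xAD);  // pc must point at the opcode byte
    uint16_t lo = cpu.memory[(uint16_t)(cpu.pc + 1)];
    uint16_t hi = cpu.memory[(uint16_t)(cpu.pc + 2)];
    cpu.pc += 3;
    cpu.a = cpu.memory[(uint16_t)(lo | (hi << 8))];
    cpu.p &= 0x7D;                   // clear Z and N
    if (cpu.a == 0) cpu.p |= 0x02;   // Z
    cpu.p |= cpu.a & 0x80;           // N
    cpu.cc += 4;                     // LDA absolute takes 4 cycles
    return 4;
}
```

The work done is the same as the four `case` blocks, but there is one dispatch per instruction instead of one per cycle, and nothing else in the system can observe the CPU in the middle of the instruction.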
WedNESday wrote:
Ha ha ha (Cough! Cough!)... Ouch...
bump.
WedNESday wrote:
I've gone back to the normal method of opcode execution, and I'm pleased to say it only needs 25MHz. If I had stayed with the cycle-for-cycle version I would need 750MHz just for the CPU alone. So if 1.79MHz needs 750MHz, god knows what a PSX/Gamecube emulator would need. Here is a snippet of my cycle-for-cycle accurate 6502:
Though I have to admit that Quietust's method is better.
bump++;
Well, the latest public release of my emu is simply slower than a turtle, but it has been drastically improved. A few overloads were slowing things down, plus the frame sync problem. I remember from old times when timestamps were used for APU emulation (read: it failed) and... there was a glitchy game, making real-time emulation required for it to work properly. ^_^;;
Anyway, if you w4nn4 sp33d, you should try NESticle! >_<
Fx3 wrote:
Anyway, if you w4nn4 sp33d, you should try NESticle! >_<
And if you want a really old and incompatible emulator, you should try NESticle! >_<
But sometimes acceptable compatibility with 25 percent of ROMs is important if you have a several-year-old computer or a handheld device. NESticle is obsolete, but its philosophy of taking shortcuts to save an order of magnitude of cycles lives on in PocketNES, the GBA successor to LoopyNES.