PPU timing? - NESdev BBS

PPU timing?
by Zepper on 2016-01-07 (#162110)

Note: I'm discussing the NTSC standard, where 1 CPU cycle = 3 PPU dots.
For the 1st time, I played with the PPU timing. What's up? Well, there's an unfinished discussion about clocking the PPU (running for 3 cycles) before or after a CPU clock. Since a few games have flickering scorebars, I decided to play with the PPU timing. To my surprise, things are "completely fixed" when the number of PPU cycles alternates between 3 and 2 for every CPU cycle.

Anything about it?

Re: PPU timing?
by Drag on 2016-01-08 (#162118)

The CPU's clock signal is divided from the same crystal the PPU's clock signal comes from, and the CPU's clock is 1/3 the PPU's clock, so there should always be 3 PPU cycles between each CPU cycle.

Shaking status bars usually come from a badly-timed screen split, so there may be a problem somewhere else, like with the timing of the scanline counter, the CPU's interrupts, etc. There should definitely always be 3 PPU cycles to every 1 CPU cycle though.
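
As a rough illustration of the fixed ratio described above, here is a minimal C sketch that derives both clocks from one master counter. The divider values (4 master ticks per PPU dot, 12 per CPU cycle) are the commonly documented NTSC dividers; the struct and function names are made up for this example and do not come from any emulator in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* NTSC dividers: the PPU divides the master clock by 4, the CPU by 12. */
enum { MASTER_PER_PPU_DOT = 4, MASTER_PER_CPU_CYCLE = 12 };

/* Hypothetical bookkeeping struct for this sketch. */
typedef struct {
    uint64_t master;     /* master clock ticks elapsed */
    uint64_t ppu_dots;   /* PPU dots elapsed */
    uint64_t cpu_cycles; /* CPU cycles elapsed */
} Clocks;

/* Advance the master clock; both derived clocks follow from it. */
static void run_master_ticks(Clocks *c, uint64_t ticks) {
    assert(c);
    for (uint64_t i = 0; i < ticks; i++) {
        c->master++;
        if (c->master % MASTER_PER_PPU_DOT == 0)   c->ppu_dots++;
        if (c->master % MASTER_PER_CPU_CYCLE == 0) c->cpu_cycles++;
    }
}
```

Because 12 is a multiple of 4, every completed CPU cycle spans exactly 3 PPU dots in this model; there is no point at which a CPU cycle could cover only 2.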

Re: PPU timing?
by Joe on 2016-01-08 (#162140)

Zepper wrote:

Well, there's an unfinished discussion about clocking the PPU (running for 3 cycles) before or after a CPU clock.

It doesn't matter which you choose, as long as you write it so that events where one depends on the other (reads, writes, NMI, etc.) occur as if the two were running in parallel.

Zepper wrote:

To my surprise, things are "completely fixed" when the number of PPU cycles alternates between 3 and 2 for every CPU cycle.

Since slowing down the PPU makes status bars appear normal, you probably aren't giving your CPU enough time to respond to the interrupt. You might be triggering the interrupt at the wrong time, or spending too many CPU cycles on receiving the interrupt.

Re: PPU timing?
by Zepper on 2016-01-09 (#162190)

With respect, but being told "running in parallel" over and over sucks. A lot of picky cases, such as reading $2002 1 cycle before/after VBlank or cases involving APU reads ($4015), make me think that clocking the PPU before/after the CPU MAKES a difference in emulation. As I said, I already debugged the PPU/APU test suites like crazy, and I still stand by my conclusion!

Re: PPU timing?
by Joe on 2016-01-09 (#162216)

If you clock the CPU first, then when you clock the PPU, the PPU needs to decide what the CPU will see if it performs a read during its next cycle, because the current CPU cycle has already finished.

You don't have to run them in parallel, you have to make them look like they're running in parallel.

Re: PPU timing?
by Zepper on 2016-01-09 (#162219)

Joe wrote:

(...) the PPU needs to decide what the CPU will see if it performs a read during its next cycle (...)

How so?

Re: PPU timing?
by Disch on 2016-01-09 (#162227)

I'm confused what you're actually asking, Zepper.

You employed a hack and it happened to fix some games, and now you're asking if the hack is correct? The answer to that question is "no". There are definitely 3 PPU cycles to every 1 CPU cycle.

If you want to figure out why games have a shaking screen in your emu, I would dump a trace log and look at the timestamps of when interrupts and PPU writes are happening. Shaky status bars should be pretty easy to spot in a tracelog if you include scanline/dot number in the log. If you count the cycles between interrupt and PPU write you should be able to see the window the game is working with, and should be able to spot where the timing in your emu is off.

Re: PPU timing?
by Zepper on 2016-01-10 (#162264)

Disch wrote:

I'm confused what you're actually asking, Zepper.

You employed a hack and it happened to fix some games, and now you're asking if the hack is correct? The answer to that question is "no". There are definitely 3 PPU cycles to every 1 CPU cycle.

It's about executing ppu_run()->cpu_run(), or the reverse. I'm saying the order does matter for picky reads, mostly $2002 and $4015. Well, the ANes guy (or aNes?) had great results when he made that change (PPU first, then CPU). If someone keeps saying "runs in parallel", it doesn't make sense to me, at least in C. :oops:

Got it?

Quote:

If you want to figure out why games have a shaking screen in your emu, I would dump a trace log and look at the timestamps of when interrupts and PPU writes are happening. Shaky status bars should be pretty easy to spot in a tracelog if you include scanline/dot number in the log. If you count the cycles between interrupt and PPU write you should be able to see the window the game is working with, and should be able to spot where the timing in your emu is off.

You mean NMIs? The shaking/flickering scorebars occur due to bad timing of the $2006 (and possibly $2005) writes. On the other hand, it could be my gfx engine.

Re: PPU timing?
by Joe on 2016-01-10 (#162270)

Zepper wrote:

It's about executing ppu_run()->cpu_run(), or the reverse. I'm saying the order does matter for picky reads, mostly $2002 and $4015. Well, the ANes guy (or aNes?) had great results when he made that change (PPU first, then CPU). If someone keeps saying "runs in parallel", it doesn't make sense to me, at least in C. :oops:

Got it?

The order of execution is fundamental to an emulator's design. It should be no surprise that changing it makes a big difference in how things work.

Let's say I have an emulator that runs the CPU first, then the PPU. When the CPU performs a write to $2002 during its time slice, the PPU will receive that write during its time slice. But what about a read from $2002? If the PPU tries to return the value during its time slice, it will be too late: the CPU has already finished, so it can't receive the correct value.

Now let's see what happens if I put the PPU first, and then the CPU. When the CPU performs a read from $2002, the PPU has already executed and can return the correct value. But what about a write to $2002? The PPU has already finished, so it won't receive the write until its next time slice.

In order to get timing to work correctly, you must solve that problem. Changing the order of CPU and PPU execution does not solve that problem, it only changes where the problem occurs.
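
One common way to sidestep the ordering problem described above is a "catch-up" design: whenever the CPU touches a PPU register, the PPU is first run forward to the CPU's current timestamp. The sketch below is a toy model under that assumption, not code from anyone in this thread. The names (ToyPpu, ppu_run_until, cpu_read_status) are hypothetical, and the choice that a read at CPU cycle c sees every dot before 3*c is an assumed alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Toy PPU: raises one status flag while executing dot number flag_dot. */
typedef struct {
    uint64_t dot;      /* number of PPU dots executed so far */
    uint64_t flag_dot; /* dot during which the flag gets raised */
    int      flag;     /* status flag as a CPU read would see it */
} ToyPpu;

/* Execute PPU dots until target_dot dots have been executed in total. */
static void ppu_run_until(ToyPpu *p, uint64_t target_dot) {
    while (p->dot < target_dot) {
        if (p->dot == p->flag_dot)
            p->flag = 1;
        p->dot++;
    }
}

/* Status read performed when cpu_cycle CPU cycles have elapsed (NTSC 3:1). */
static int cpu_read_status(ToyPpu *p, uint64_t cpu_cycle) {
    /* In a catch-up design the PPU is never allowed to run ahead of the CPU. */
    assert(p->dot <= cpu_cycle * 3);
    ppu_run_until(p, cpu_cycle * 3);
    return p->flag;
}
```

With the flag raised during dot 7, a read at CPU cycle 2 (dots 0 to 5 executed) returns 0 and a read at CPU cycle 3 (dots 0 to 8 executed) returns 1. A register write would be handled the same way: catch the PPU up to the write's timestamp first, then apply the write, so neither side sees the future or misses the past.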

Re: PPU timing?
by Zepper on 2016-01-10 (#162275)

Write to $2002? Never do.

Re: PPU timing?
by zeroone on 2016-01-10 (#162290)

Zepper wrote:

Write to $2002? Never do.

Joe did not mean that PPUSTATUS ($2002) is a register to which data can be written. Rather, he was referring to updates to the sprite overflow flag, the sprite 0 hit flag and the vertical blank flag.

Re: PPU timing?
by Zepper on 2016-01-10 (#162298)

"Solving" the problem only has 1 solution... AFAIK, PPU clocks before a CPU read; and after a CPU write. I'll explain with examples.

- Reading $4015 or $2002 seems to work only if the PPU is clocked first. Example: the PPU sets flags (sprite zero, VBlank) and the CPU catches them in the subsequent read, and not before.

- Writing to $2000 or $2001 seems to take immediate effect if the PPU is clocked after the CPU write cycle, and not before. Example: setting the screen ON should be done after the CPU write, because if done before, the PPU wouldn't catch the setting during the first 3 PPU cycles.

Correct me if I'm wrong about this. However, the shaking continues. :mrgreen:

Re: PPU timing?
by Disch on 2016-01-10 (#162312)

This all seems hacky and unnecessary.

Doing things "before" the CPU or "after" shouldn't matter as long as you're consistent.

It's confusing to visualize what happens when the CPU and PPU try to access something at the same time -- so just assume they have one of the bootup syncs where they never do:

Code:

P = ppu cycle
C = cpu cycle

PPPCPPPCPPPCPPPC

You might look at this and say "he's running the PPU first" .. but this sequence could just as easily be rewritten as:

Code:

CPPPCPPPCPPPCPPP

And now the CPU is first. There's no difference -- they're the exact same thing.

The only thing that matters is when something happens on one P cycle, how it impacts the surrounding C cycles (and vice versa). Logically, something that happens on a P cycle can't impact a C cycle that is already complete, so you only have to worry about the next one:

Code:

     PPU generates NMI here
      |
      V
CPPPCPPPCPPP
        ^
        |
        |
      CPU 'sees' NMI here (but won't actually perform
          it until this instruction is complete)

Notice how I "ran the CPU first" here... but this logic doesn't change even if you run the PPU first:

Code:

    PPU generates NMI here
     |
     V
PPPCPPPCPPPC
       ^
       |
       |
     Exactly the same .. even though PPU is "run first"

On this abstract level, it really isn't any more complicated than that. You're making this problem out to be way harder than it really is.

In all likelihood, you're seeing shaky status bars because you have an off-by-1 error somewhere in your code. Maybe you're firing an NMI one PPU cycle too early/late or something like that. That could be easily exposed if you dump a tracelog and examine it.

If you make a trace log that tells you exactly what cycle NMI happens on, and what cycles the CPU is writing to scroll regs, or reading from $2002 -- and compare that to a disassembly of what the code is actually doing, you should be able to spot where your emu is going wrong.

----------------------

Or... if you want to run the CPU as "half-cycles" you might have a pattern like this:

Code:

P = PPU cycle
r = rising edge of CPU clock
f = falling edge of CPU clock

rPfPPrPfPPrPfPP

or

rPPfPrPPfPrPPfP

This might help if you find that writes take effect on the rising edge of a CPU clock, while reads take effect on the falling edge. But I *think* they both take place on the rising edge, so this is probably unnecessary. You could verify that with Visual2A03 + Visual2C02 if you want.

Re: PPU timing?
by Zepper on 2016-01-11 (#162329)

Let's say a "sprite hit event" occurs at PPU cycle N. If a $2002 read occurs before N, the sprite hit will be off up to 3 PPU cycles. Else, the sprite hit flag will be set on $2002 read. It's a difference of 1 CPU cycle or up to 3 PPU cycles.

Regarding the shaking, even zeroone came to the same conclusion: running the CPU prior to the PPU fixes the shaking/flickering in a few games, but the general timing breaks and would need to be adjusted.

Re: PPU timing?
by Disch on 2016-01-11 (#162335)

Zepper wrote:

Let's say a "sprite hit event" occurs at PPU cycle N. If a $2002 read occurs before N, the sprite hit will be off up to 3 PPU cycles. Else, the sprite hit flag will be set on $2002 read. It's a difference of 1 CPU cycle or up to 3 PPU cycles.

So what you're saying is.... if the CPU reads $2002 before the sprite hit flag is set, they won't see it as set.

That's obvious. More than that... that's how it should be. $2002 reads should not see into the future. Why do you think this behavior needs to be changed?

Quote:

Regarding the shaking, even zeroone came to the same conclusion: running the CPU prior to the PPU fixes the shaking/flickering in a few games, but the general timing breaks and would need to be adjusted.

That's great and all, but it doesn't explain the behavior.

An off-by-1 error seems much more likely to me than requiring CPU reads to look into the future.

Like I said before... make a tracelog. Count the number of cycles between NMI/$2002 and the scroll change. That will let you know what kind of window you're working with. Odds are, the window is really small, and the game is falling outside of the window for some frames resulting in a shaking status bar. But seeing all the data should give you an indication of what behavior SHOULD be happening vs. what your emu is actually doing, and that will give you an idea of how to fix it. As of now, you seem to just be trying random stuff in hopes that it will work. You can't solve a problem if you don't know what the problem is.

Re: PPU timing?
by James on 2016-01-11 (#162350)

Which games are you having problems with?

Re: PPU timing?
by Zepper on 2016-01-11 (#162352)

29780 cycles between NMIs, always alternating by 1 cycle (even/odd frames).

James wrote:

Which games are you having problems with?

The Simpsons: Bart vs Space Mutants (shaking scorebar)
Kick Master (waving title screen)
Battletoads (hangs during stage 2)

Re: PPU timing?
by zeroone on 2016-01-11 (#162354)

Zepper wrote:

The Simpsons: Bart vs Space Mutants (shaking scorebar)

If I recall correctly, this has to do with handling the sprite 0 hit too early or too late.

Re: PPU timing?
by Hyde on 2016-01-11 (#162355)

Zepper wrote:

29780 cycles between NMIs, always alternating by 1 cycle (even/odd frames).

Odd frames should be 1 PPU (not CPU) cycle shorter when rendering is enabled.

Re: PPU timing?
by lidnariq on 2016-01-11 (#162356)

341×262 = 89342 PPU cycles; ÷3 = 29780⅔ CPU cycles
341×262-1 = 89341 PPU cycles; ÷3 = 29780⅓ CPU cycles.

There might be some funny edge cases going wrong by rounding these thirds up and down, but it's on average correct.
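
The arithmetic above can be checked with a few lines of C. This is only a sketch of the division; the function names are invented for the example, and the short frame is modeled simply as "one dot fewer".

```c
#include <assert.h>
#include <stdint.h>

enum { DOTS_PER_SCANLINE = 341, SCANLINES_PER_FRAME = 262, DOTS_PER_CPU_CYCLE = 3 };

/* PPU dots in one NTSC frame; a short frame drops a single dot. */
static int32_t frame_dots(int short_frame) {
    return DOTS_PER_SCANLINE * SCANLINES_PER_FRAME - (short_frame ? 1 : 0);
}

/* Whole CPU cycles that fit into a number of PPU dots. */
static int32_t whole_cpu_cycles(int32_t dots) {
    assert(dots >= 0);
    return dots / DOTS_PER_CPU_CYCLE;
}

/* PPU dots left over after the last whole CPU cycle. */
static int32_t leftover_dots(int32_t dots) {
    assert(dots >= 0);
    return dots % DOTS_PER_CPU_CYCLE;
}
```

A full frame is 29780 CPU cycles plus 2 dots and a short frame is 29780 CPU cycles plus 1 dot. A full frame followed by a short one adds up to 178683 dots, which is exactly 59561 CPU cycles, so the thirds cancel out over each pair of frames.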

Re: PPU timing?
by Zepper on 2016-01-11 (#162358)

I could make available my PPU C source code for help. Anyone interested in taking a look?

Re: PPU timing?
by Disch on 2016-01-11 (#162364)

Zepper wrote:

29780 cycles between NMIs, always alternating by 1 cycle (even/odd frames).

No... I mean...

the game is changing the scroll to split the screen. How many cycles are between that and the last CPU interaction with the PPU (NMI? Sprite 0 hit?)

If the game is taking 500 cycles between when it detects sprite 0 and when it changes the scroll, but your emu only gives 499, then that would cause shaking. You need to look at what the game is doing and how it is doing it so you can see where your emu is going wrong.

Re: PPU timing?
by Zepper on 2016-01-11 (#162366)

Fine, but I have no reference about the number of cycles between a sprite #0 hit detection ($2002 reads $40 set) and $2006 (second write). Is the following helpful? (CPU cycles between $2002:$40 and the second $2006 write)

Code:

-frame-
.$2006 (56)
.$2006 (638)
.sprite zero hit
.$2006 (114)
-frame-
.$2006 (59)
.$2006 (641)
.sprite zero hit
.$2006 (114)
-frame-
.$2006 (59)
.$2006 (641)
.sprite zero hit
.$2006 (114)
-frame-
.$2006 (61)
.$2006 (643)
.sprite zero hit
.$2006 (114)
-frame-
.$2006 (57)
.$2006 (639)
.sprite zero hit
.$2006 (114)
-frame-
.$2006 (60)
.$2006 (642)
.sprite zero hit
.$2006 (114)
-frame-
.$2006 (57)
.$2006 (639)
.sprite zero hit
.$2006 (114)
-frame-
.$2006 (56)
.$2006 (638)
.sprite zero hit
.$2006 (114)
-frame-
.$2006 (57)
.$2006 (639)
.sprite zero hit
.$2006 (114)
-frame-
.$2006 (58)
.$2006 (640)
.sprite zero hit
.$2006 (114)

Re: PPU timing?
by Disch on 2016-01-11 (#162368)

Quote:

I have no reference about the number of cycles between a sprite #0 hit detection ($2002 reads $40 set) and $2006 (second write).

That's what you need.

There are 2 critical times:
A) PPU->CPU interaction (NMI / Sprite 0)
B) CPU->PPU interaction (scroll change)

So you need to answer the following questions:

1) How many cycles does the game need between A and B?

2) How many cycles is your emu giving? (it's probably slightly less, or might be more)

3) Why?

Until you can answer those questions, you don't know what is going on and any "solution" you get will be a hack or just dumb luck. You can't solve this problem without knowing what the problem is.

EDIT:

All you need here is a tracelog of CPU execution for a frame that includes timestamps. Something that walks through each instruction that the game is doing, prints register contents, and tells you the time (scanline+dot) of the instruction. That should give you enough info to start putting the pieces together here.
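
A trace line of the kind described above might be produced roughly like this. It is a sketch only: the field layout, the function names and the idea of keeping a per-frame dot counter are assumptions made for the example, and the scanline/dot split assumes every scanline is 341 dots long (it ignores the skipped dot on short frames) and counts scanline 0 from wherever the counter is reset.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

enum { DOTS_PER_SCANLINE = 341 };

typedef struct { int scanline; int dot; } PpuPos;

/* Split a per-frame dot counter into (scanline, dot). */
static PpuPos ppu_pos_from_frame_dot(int32_t frame_dot) {
    assert(frame_dot >= 0);
    PpuPos pos = { frame_dot / DOTS_PER_SCANLINE, frame_dot % DOTS_PER_SCANLINE };
    return pos;
}

/* Format one trace line: PC, registers, then the PPU position of the instruction. */
static int format_trace_line(char *buf, size_t n, uint16_t pc,
                             uint8_t a, uint8_t x, uint8_t y,
                             uint8_t p, uint8_t sp, int32_t frame_dot) {
    PpuPos pos = ppu_pos_from_frame_dot(frame_dot);
    return snprintf(buf, n,
                    "%04X  A:%02X X:%02X Y:%02X P:%02X SP:%02X  SL:%3d DOT:%3d",
                    (unsigned)pc, (unsigned)a, (unsigned)x, (unsigned)y,
                    (unsigned)p, (unsigned)sp, pos.scanline, pos.dot);
}
```

Calling format_trace_line once per instruction, before it executes, gives a log in which the distance between a $2002 read and the following scroll write can be read straight off the SL/DOT columns.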

Re: PPU timing?
by Zepper on 2016-01-11 (#162371)

Fine. I'll try.

Re: PPU timing?
by Disch on 2016-01-11 (#162376)

Also if you can mark the time at which the scroll updates (Y scroll increment and X scroll reset) that would help too. That way you can see whether the $2005/6 writes are coming in too early or too late.

EDIT:

To clarify.... if the game is using Spr0 hit:

time 'A': PPU detects sprite 0 hit, raises flag in $2002
time 'B': CPU reads $2002, sees that flag is set
time 'C': CPU does $2005/6 writes to change the scroll
time 'D': PPU updates scroll at end of scanline

D-A is the 'window' that your PPU is giving

So if the game is expecting that window to be 500 cycles, but in your emu it is only 499 cycles, then you might get shakiness because the game might take too long to change the scroll.

Or... if the game is using IRQs, time 'A' / 'B' would be IRQ generated / IRQ taken.

Re: PPU timing?
by zeroone on 2016-01-12 (#162416)

@Zepper This log may help. The status bar in The Simpsons - Bart vs. the Space Mutants does not shake in my emulator. Maybe you can do a diff and figure out why.

Re: PPU timing?
by Zepper on 2016-01-12 (#162437)

zeroone wrote:

@Zepper This log may help. The status bar in The Simpsons - Bart vs. the Space Mutants does not shake in my emulator. Maybe you can do a diff and figure out why.

No

Found the reason, but I have no clue why.
A taken non-page-crossing branch (BVC #$FB) is executing an extra cycle. This is messing up the sprite zero timing. Isn't that extra cycle correct!? Games like Mega Man V, Kick Master and The Simpsons have no more flickering.

Re: PPU timing?
by tepples on 2016-01-12 (#162438)

Not taken: 2 cycles
Taken: 3 cycles
Taken, different page from not taken: 4 cycles
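
The three cases above can be written as a small C helper. This is a sketch with a made-up function name; it compares the page of the instruction following the branch with the page of the branch target, which is how "different page from not taken" is read here.

```c
#include <assert.h>
#include <stdint.h>

/* Cycle count of a 6502 relative branch located at branch_addr. */
static int branch_cycles(uint16_t branch_addr, int8_t offset, int taken) {
    uint16_t next_pc = (uint16_t)(branch_addr + 2); /* address used when not taken */
    uint16_t target  = (uint16_t)(next_pc + offset); /* address used when taken */
    int cycles = 2;                                  /* not taken */
    if (taken) {
        cycles = 3;                                  /* taken, same page */
        if ((next_pc & 0xFF00) != (target & 0xFF00))
            cycles = 4;                              /* taken, page crossed */
    }
    assert(cycles >= 2 && cycles <= 4);
    return cycles;
}
```

For a BVC with operand $FB (offset -5) sitting in the middle of a page, branch_cycles returns 3 when the branch is taken and 2 when it is not.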

Re: PPU timing?
by Disch on 2016-01-12 (#162439)

Zepper wrote:

Found the reason, but I have no clue why.

If you don't know why, then you don't know that it's the reason. You're just guessing. And you're probably guessing incorrectly.

Quote:

A taken non-page-crossing branch (BVC #$FB) is executing an extra cycle. This is messing up the sprite zero timing. Isn't that extra cycle correct!?

Yes, BVC takes 3 cycles if a branch is taken and does not cross a page.

So that must not be the reason.

Zepper -- you can keep trying random things and hoping that you'll stumble across the solution, or you can do some real debugging and just find out what the problem is. I'm serious -- make a trace log. If your emu doesn't have a logger... write one. You need to know what your emu is doing when you run this game so that you can see what it's doing wrong. Without that information you're flying blind, and solving this problem will be impossible.

Re: PPU timing?
by Zepper on 2016-01-12 (#162440)

Calm down, please. I have a CPU/PPU log system... and I detected that this extra cycle messes up the sprite zero timing. Removing the extra cycle makes everything fine. I'm tracing the code, not "guessing" as you think. :twisted:

tepples wrote:

Not taken: 2 cycles
Taken: 3 cycles
Taken, different page from not taken: 4 cycles

Yes, 4 cycles = 3 + 1 cycle penalty. ^_^;; Correct.

Re: PPU timing?
by Disch on 2016-01-12 (#162447)

Sorry if I'm getting too aggressive, Zepper. I get a little carried away sometimes. :mrgreen:

It's less likely that your CPU timing is off (those kinds of errors would likely be exposed by test ROMs), and more likely that your CPU<->PPU interaction timing is off. So the idea of BVC being off by a cycle is very unlikely. It's more likely your sprite 0 hit is being raised too late or something similar.

If you have the logs, look at the following:

A) What scanline+cycle is Sprite 0 being hit?
B) What scanline+cycle is the CPU reading $2002 and seeing that result?
C) What scanline+cycle is the CPU changing the scroll?
D) What scanline+cycle is the PPU increasing Y scroll?

If you have all that information logged, you should be able to see where the problem is.

D-A provides a "safe" window.. and if C happens before D, then the game will work properly.
If C-B takes too long, it spills outside the window, and causes the scroll change to be pushed back a scanline, resulting in a shaking status bar.

I'm suggesting that your D-A window is too small. Either you're raising sprite 0 too late, or you're increasing scroll too soon. That seems much more likely to me than you being off by an entire cycle in your CPU instructions.

So yeah, if you have the logs, this should be straight-forward to spot, and you can start diagnosing from there. :beer:

Re: PPU timing?
by Zepper on 2016-01-13 (#162463)

Code:

frame <number>
(line,ppu_cycle) action
* acknowledge 1 means reading $2002 1 cycle after the sprite zero hit.
--------------------------------------------------------
frame 1
(195,249) .hit 
(195,250) .acknowledge 1
(195,256) .Y inc
(196,254) $2006 ** v(02C0) <- t(02C0)
(196,256) .Y inc

frame 2
(195,249) .hit 
(195,256) .Y inc
(195,261) .acknowledge 4
(196,256) .Y inc
(196,265) $2006 ** v(02C0) <- t(02C0)

frame 3
(195,249) .hit
(195,256) .Y inc
(195,259) .acknowledge 4
(196,256) .Y inc
(196,263) $2006 ** v(02C0) <- t(02C0)

frame 4
(195,249) .hit
(195,256) .Y inc
(195,258) .acknowledge 3
(196,256) .Y inc
(196,262) $2006 ** v(02C0) <- t(02C0)

frame 5
(195,249) .hit 1
(195,256) .acknowledge 3
(195,256) .Y inc
(196,256) .Y inc
(196,260) $2006 ** v(02C0) <- t(02C0)

frame 6
(195,249) .hit 3
(195,256) .Y inc
(195,264) .acknowledge 5
(196,256) .Y inc
(196,268) $2006 ** v(02C0) <- t(02C0)
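
The (line, ppu_cycle) pairs in a log like the one above can be compared mechanically. The following C sketch is one way to do it; the type and function names are invented for the example, and it assumes every scanline is 341 dots long.

```c
#include <assert.h>
#include <stdint.h>

enum { DOTS_PER_SCANLINE = 341 };

typedef struct { int scanline; int dot; } PpuPos;

/* Signed distance in PPU dots from one position to another within a frame. */
static int32_t dots_between(PpuPos from, PpuPos to) {
    assert(from.dot >= 0 && from.dot < DOTS_PER_SCANLINE);
    assert(to.dot >= 0 && to.dot < DOTS_PER_SCANLINE);
    return (to.scanline - from.scanline) * DOTS_PER_SCANLINE + (to.dot - from.dot);
}

/* Returns 1 if the scroll write lands after the given Y increment, else 0. */
static int write_after_y_inc(PpuPos write, PpuPos y_inc) {
    return dots_between(y_inc, write) > 0;
}
```

Using the logged values, frame 1 has the hit at (195,249) and the $2006 write at (196,254), which is 346 dots later and 2 dots before the Y increment at (196,256). Frame 2 has the write at (196,265), which is 357 dots after the hit and 9 dots after that Y increment.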

Re: PPU timing?
by zeroone on 2016-01-13 (#162464)

Can you sort those by (line, ppu_cycle)?

Re: PPU timing?
by Zepper on 2016-01-13 (#162481)

(The game in question is The Simpsons - Bart vs Space Mutants, with the flickering scorebar.)
I see that frames 2, 3 and 4 seem to be the "correct" situation. The flickering occurs because of an early second $2006 write. The game detects the sprite #0 hit and switches to a scroll-related subroutine. Since the Y increment occurs after the write, the scorebar flickers. The problem is... you can see the sprite #0 hit always occurring at the same line & PPU cycle, but the CPU is "losing sync" with the PPU (off by 1). Since my emu gets a PASS in all those test ROMs (CPU and PPU timing, including sprite timing), I don't know how to solve this puzzle, but I hope to make it. ^_^;;

zeroone wrote:

Can you sort those by (line, ppu_cycle)?

Why?

Re: PPU timing?
by Disch on 2016-01-13 (#162483)

These logs are great and tell us everything we need. Thanks Zepper. :mrgreen:

Looks like the problem was the opposite of what I was expecting. Your window is too large.

Frames 2-6 seem to be running as the game expects, but frame 1 is changing the scroll too early. It should be happening after the Y inc -- not before it.

Could be that your sprite 0 hit is happening just a cycle too early. Are you adding the 1 cycle delay for the pipelining effect?

http://wiki.nesdev.com/w/images/d/d1/Ntsc_timing.png

Note where it says: "Sprite zero hits act as if the visible image starts at h = 2 (i.e., the sprite 0 flag will be set during the third tick of a scanline at the earliest)"

Re: PPU timing?
by zeroone on 2016-01-13 (#162486)

Disch wrote:

Note where it says: "Sprite zero hits act as if the visible image starts at h = 2 (i.e., the sprite 0 flag will be set during the third tick of a scanline at the earliest)"

Does that note mean that the sprite 0 flag is set 2 PPU cycles after the hit actually occurs?

Re: PPU timing?
by Disch on 2016-01-13 (#162503)

No. The numbering on that page is based on what I assume is the numbering used in Visual2C02, which renders pixels on dots 1-256, and not 0-255 like you might be thinking.

So the very first pixel of the scanline is output on h=1, which means the earliest spr0 can hit is h=2. So only a 1 cycle delay.

(this might also explain why spr0 never hits when h=256 ... as the pipelining delay pushes it outside that "phase" of ppu operation)

Re: PPU timing?
by Zepper on 2016-01-13 (#162508)

Delaying by 1 CPU cycle fixes the game & still got a PASS in sprite_hit_timing test. But I got a FAILED #2 with "sprite hit left clipping" test. Also fails in "sprite hit order FAILED #3" and "edge timing FAILED #2".

Re: PPU timing?
by zeroone on 2016-01-13 (#162512)

Zepper wrote:

Delaying by 1 CPU cycle fixes the game & still got a PASS in sprite_hit_timing test. But I got a FAILED #2 with "sprite hit left clipping" test. Also fails in "sprite hit order FAILED #3" and "edge timing FAILED #2".

What about a delay of 1 PPU cycle?

Re: PPU timing?
by Disch on 2016-01-13 (#162514)

Zepper: As zeroone said... 1 CPU cycle is too much. It should only be 1 PPU cycle

Re: PPU timing?
by Zepper on 2016-01-13 (#162516)

Disch wrote:

Zepper: As zeroone said... 1 CPU cycle is too much. It should only be 1 PPU cycle

Not good.

If there are 3 or 2 PPU cycles left, there's no difference in delaying it by 1. If there's only 1 PPU cycle left, well... the sprite hit would take effect only in the next CPU cycle.
Still no luck here... still trying...

Re: PPU timing?
by Disch on 2016-01-13 (#162520)

Quote:

the sprite hit would take effect only in the next CPU cycle.

But isn't that what you need? From the log you posted, your sprite 0 hit is just 1 cycle early, so if you push it forward a cycle, wouldn't that solve it?