This little thing has been bothering me for some time now, so I thought I should try to get to the bottom of it. It's a bit hard to explain, but basically it's a one-pixel (one-scanline) tall, variable-length line of "glitchiness" that appears seemingly at random during SMB gameplay, about once a minute on average on my AV Famicom. However, it only happens on some "runs" of the game. Power cycling the system can either make it appear or go away. It also appears in SMB2 for the Disk System in the same manner.
I've been looking on the internet for real hardware videos of SMB showing this and it's not easy to find anything, but here is what I got:
First is this niconico video of some guy playing Mario blindfolded:
http://www.nicovideo.jp/watch/sm17492296
http://www.nicozon.net/watch/sm17492296 (no account needed)
http://www.nicozon.net/downloader.html? ... sm17492296 (download link page)
6:45~6:49 you can clearly see it around the middle of the screen.
Again at 7:02.
Second, this shaky video of a US NES playing back a TAS:
http://www.youtube.com/watch?v=T1Ps1O6sZX4
1:10, you can see it above the "WELCOME TO WARP ZONE" text.
3:05, above the castle.
So...what is going on here? Of course no emulator does this. Does this ever happen on your NES/Famicom? What causes it?
For all I know this has been discussed before, but googling this kind of thing is, of course, impossible.
(I wasn't sure which subforum to post this in, by the way)
It always seems to happen close to a screen transition (when entering a warp, when a level has been finished), so I'm going to guess that it's an artifact of doing something with the PPU (e.g. turning off rendering momentarily) while rendering happens to be in the middle of the screen. Hard to say what exactly, but doppel's SMB disassembly should have the answer. It might be hard to find, though, if it's not reproducible.
Technically I don't see a reason why it shouldn't also happen on emulators.
Is this glitchy line exclusive to Super Mario Bros? I can't think of any reason at the moment why that line should be occurring at all, since the only time anything is written to the PPU during display is after a sprite #0 hit is detected. And that happens farther up the screen.
Weird, I've never seen it.
I don't think this glitch is exclusive to Super Mario Bros. I'm actually the one who made that second shaky video, and I think the glitch is not code related at all, but power related. Specifically, tiny dips in the supply voltage combined with an oxidized cartridge slot mean the PPU will read incorrect data for a few bytes. That Super Mario Bros. cartridge I was using is fairly temperamental even after cleaning; it needs to be jiggled around in the slot several times to work without showing horizontal or vertical glitchy lines.
I should recreate the recording setup I used there and try to get a less shaky video, but I would want to port the code to the new ARM Launchpad and I haven't gotten around to it. I never got it to work with any other TAS but Super Mario Bros. because I only had room for 6 minutes of controller data on the chip, and streaming more from the PC would eventually cause a missed controller interrupt and desync.
Ah, I didn't see the long glitchy line (I only looked at the short white line). The long line showed up long before the screen transition so my theory is out.
Yes, I've seen the glitches in FDS SMB2 as well... I'm imagining it is power-related, or perhaps the CPU reads the PPU in some odd way mid-screen?
My theory is that the issue has to do with bad pin/edge connector contact.
As has been discussed here on nesdev many times over, the classic slot-loader NES, the Famicom, and the top-loader NES all have problems relating to cartridge connection (I will be a dick and state boldly that anyone who tells you otherwise is flat out wrong); this has plagued the console since its existence. Sometimes jostling/wobbling can cause a single pin to lose contact (or possibly still have contact but with a weaker signal), which can cause anomalies of sorts; the results would vary depending upon what the entire system (CPU, PPU, and all other circuitry) was doing at that moment in time (we're talking microseconds here). Such jostling/wobbling could be caused by wiggling of the controller cord, or even stomping of feet against a floor (vibration going up through the table, etc.). Naturally every environment/situation is different, thus troubleshooting this is impossible. The older the carts (dirty/damaged contacts) and the older the console (more worn edge connector), the worse it gets -- even slot-loading NESes with new/replaced edge connectors have this problem (sometimes the edge connector is too tight, as crazy as that sounds).
This would also explain why such anomalies are not seen using emulators but only actual hardware.
Power-related issues, sure -- I can't refute that possibility. But please apply Occam's razor first.
P.S. -- I seriously don't know what it is about the NES/Famicom that drives OCD people to it. I have found this to be the case for decades now; you won't find the quantity of OCD people with any other console. It's utterly creepy. I don't know why things like "strange anomalies" (see OP), blowing on cartridge connectors, turning the console upside-down, etc. can't just be considered part of the whole actual experience of using the console. I had to do such in the late 80s/early 90s, so why is this such a big deal to people now? *shakes head*
Well, I actually replicated this, with a completely different NES, TV, and cartridge in a different city. The glitch is only really visible in certain areas of World 1-2. It also only seems to appear 1/3rd of the time when the console is powered on. Once it's doing it, resetting won't make it stop but powering on/off will. Maybe it only happens on one of the CPU/PPU clock alignments? Uninitialized RAM? Who knows.
And as for why to be OCD about this, it's because the NES is a quirky system and this is a forum full of people who want to emulate those quirks.
Grapeshot wrote:
Well, I actually replicated this, with a completely different NES, TV, and cartridge in a different city. The glitch is only really visible in certain areas of World 1-2. It also only seems to appear 1/3rd of the time when the console is powered on. Once it's doing it, resetting won't make it stop but powering on/off will. Maybe it only happens on one of the CPU/PPU clock alignments? Uninitialized RAM? Who knows.
Taken from the OP's own statements:
http://www.youtube.com/watch?v=T1Ps1O6sZX4 -- ~01:10 -- happens in World 1-2
http://www.youtube.com/watch?v=T1Ps1O6sZX4 -- ~03:05 -- happens in World 8-1
http://www.nicovideo.jp/watch/sm17492296 -- ~06:45 -- happens in World 1-2
http://www.nicovideo.jp/watch/sm17492296 -- ~07:02 -- happens in World 1-2
I suppose another possibility is that it's purely something between the PPU and display (TV) that's happening, possibly at the voltage level or something relating to NTSC. My absolute firm belief is that it's something physical or electrical, and is not a "quirk" or "characteristic" of the CPU or PPU (thus cannot be emulated, but I've been wrong before) because what I see in those videos looks like brief "video signal noise", and it's something I've seen on consoles as well as some arcades. If there is a reliable way to reproduce it ("reproduce" means at least 90% reproducible as consistently as possible), someone with an oscilloscope could probe different spots of the video circuitry (this is not necessarily the PPU itself) and see if they can figure out what's happening.
I've seen stuff like this occasionally. I don't think it's limited to SMB. I just think the PPU wigs out a little sometimes, nothing to do with anything going on in software.
Edit: I was completely wrong about that. Read on!
To make it a little easier to see, I downloaded the capture from NND and isolated the frames where the glitch occurs. Note: frame numbers may be slightly inaccurate, as I merely threw the .mp4 file into a barebones Avisynth script using DirectShowSource and loaded it into VirtualDub, and I noticed that it didn't seem to seek accurately on my machine when scrubbing backwards then forwards again. Still, here you go:
Frame 12172
Frame 12238
Frame 12657
Frame 12797
I find it intriguing that the instances of the glitch in close succession seem to always be on the same scanline, and also seem to pull from the same tile data:
The glitching also seems to have both a vertical and a horizontal offset. Once again, even though the offset seems to differ between the two "sets" of frames, it seems to be the same within each "set". It also seems to be bringing in the other half of the nametable. In the first set, I can easily tell that the right-hand "floating" glitch is a sliver of the coins, and when sliding those pixels down to the coins in Photoshop, they don't line up with the normal tiles on the left-hand side of the screen. However, as most of us probably know, there are single-wide blocks approximately four blocks to the right of the edge of the screen, and those are visible in the second glitch set. (In that set, the two-block-wide glitch to the left of Mario may be either the blocks he's right next to but at a different y-pos, or the two-block-wide column just off the right-hand side of the screen that contains a multi-coin block.)
Whether this visual analysis helps or not, I dunno. Seems like the glitching is at least a bit deterministic, however.
Here's a screen grab of a glitchy line. It does seem to be patterned, which is interesting.
Perhaps the copying of the horizontal bits from t to v gets messed up for that line (http://wiki.nesdev.com/w/index.php/The_ ... _scrolling, dot 257). That's the kind of problem that would sort itself out automatically on the next line, provided the next attempt is successful.
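For anyone who wants to poke at this idea, here is a minimal Python sketch of the dot-257 copy. It assumes the 15-bit v/t layout from the wiki page linked above (fine Y, two nametable bits, coarse Y, coarse X); the names `HORIZ_MASK` and `copy_horizontal` are made up for this post and are not from any emulator.

```python
# Assumed v/t layout (from the nesdev wiki scrolling page):
#   yyy NN YYYYY XXXXX = fine Y, nametable select, coarse Y, coarse X
HORIZ_MASK = 0x041F  # coarse X (bits 0-4) plus the horizontal nametable bit (bit 10)

def copy_horizontal(v, t):
    """Return v with its horizontal bits replaced by those of t (the dot-257 copy)."""
    return (v & ~HORIZ_MASK & 0x7FFF) | (t & HORIZ_MASK)

# Example values, chosen arbitrarily for illustration.
v_end_of_line = 0x2ABC   # wherever v ended up after fetching a scanline
t_scroll = 0x0405        # scroll the game asked for: nametable X = 1, coarse X = 5

v_next_line = copy_horizontal(v_end_of_line, t_scroll)
print(hex(v_next_line))  # horizontal bits now come from t, vertical bits are untouched
```

Since the copy is redone from t on every scanline, a single bad copy would only affect one line, which fits the one-scanline-tall glitch.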
Is that a piece of pipe on that line in between the 3 and 2 pipes? Given that the rest of the line looks like breakable block tiles, that must be from the very top-most row of blocks that you run across when going to the warp zone normally (the pipe tiles would be the ones that Mario's head goes through when doing the world 36-1 glitch).
OT: This is strangely similar to some analysis I did on some footage of the 80's game show Press Your Luck a few years ago (that is unfortunately no longer online), when trying to track down a pattern to slides that would stick as the board would shuffle from one set of projectors to the other. That was possibly even more deterministic than this, however.
LocalH wrote:
Is that a piece of pipe on that line in between the 3 and 2 pipes? Given that the rest of the line looks like breakable block tiles, that must be from the very top-most row of blocks that you run across when going to the warp zone normally (the pipe tiles would be the ones that Mario's head goes through when doing the world 36-1 glitch).
Looks like it to me. A nametable dump at that point would be interesting.
Something wonky happening to the scrolling bits on that line seems like a decent guess at least. It's interesting that it apparently tends to be on the same line.
The two sets of grabs I posted each exhibit the glitch on two different lines. The first set seems to be exactly in the middle of the fourth block below sprite 0, and the second set seems to be halfway through the bottom tile of the fifth block below sprite 0. Your grab appears to be within one or two lines of the fourth block below sprite 0.
I can't get a nametable dump right now, I don't even have a suitable emulator handy on my computer and I'm about to head out for the night. If someone else hasn't done it by the time I get back to my computer tomorrow then I'll seek out a suitable emu and do it.
LocalH wrote:
Note: frame numbers may be slightly inaccurate, as I merely threw the .mp4 file into a barebones Avisynth script using DirectShowSource and loaded it into VirtualDub, and I noticed that it didn't seem to seek accurately on my machine when scrubbing backwards then forwards again.
Yeah, DirectShowSource can be a bit inaccurate. You might want to transcode to AVI using FFmpeg or something and then use AVISource. And you might want to deinterlace at some point in the chain to decompose each frame into the two fields that make it up, as the NES program operates on a field basis.
I neglected to mention in my first post that there is actually one other game I have noticed this in, namely Zelda 2 (on FDS).
I haven't tested it as much as SMB, but it seemed similar enough. It only appeared on 1 or 2 startups (out of maybe 10~15 it took me to play through the game) and I think the line only showed in the sidescrolling scenes, not on the world map.
So, SMB1 (cart), SMB2 (FDS), and Zelda 2 (FDS) is the full list of games I've seen it in.
Do these games have anything in common? (Well, Mario 1 and 2 are obvious, but Zelda?)
I am still pretty damn sure it can't appear in all games, as I have played many others on the same system after noticing it in Mario. Like for example Zelda 1, Metroid, and Kid Icarus on the FDS (beat them all), many runs through Mario 3 and Rockman 2, Kirby's Adventure, Punch Out... and lots of others. None showed glitchy lines ever.
I'm also sceptical about a bad connection with the cartridge causing this. It was my first hypothesis, but I have cleaned the SMB cart and it is very stable now. I can wiggle it around while the game is running without any problems. Wiggling it also cannot make the glitchiness appear or go away. Neither can resetting, by the way. Only a full power cycle can.
Here's some more video I took of this while testing yesterday.
http://www.youtube.com/watch?v=uPbakGFGceY
tepples wrote:
Yeah, DirectShowSource can be a bit inaccurate. You might want to transcode to AVI using FFmpeg or something and then use AVISource. And you might want to deinterlace at some point in the chain to decompose each frame into the two fields that make it up, as the NES program operates on a field basis.
I thought about transcoding it but I figured the frame numbers weren't really that important. Also, the .mp4 was only 512x384 from the download link posted and as the video looked to have had the fields decimated prior to upload, there were no interlaced frames to separate (another reason why frame numbers weren't important). Anybody got a 480i capture of the glitch occurring? With the ability to see it happening at full NES resolution and frame rate, we may discover that, for example, the glitch only happens on a tile boundary or something similar. Would probably make it easier to pinpoint exactly where in the nametable the glitch is pulling from, as well.
I recall seeing the exact same behavior in Zelda 2 when I was playing it recently. I first wondered if it was a software issue and then later figured maybe it has something to do with the age of the system.
I have had this happen on my old NES when I had it, and it happens on my Famicom playing the SMB cart as well. I do not think it has anything to do with cartridge cleanliness / contact.
I remember seeing it on some famiclones with some pirate carts.
Now I've seen it in Zelda 1, too.
It's a lot less frequent than in Mario. It only happened on a few horizontal screen transitions in the overworld.
Surely this is related to horizontal scrolling somehow?
I've been seeing this on SMB for the longest time, but I've also seen it in Kid Icarus. On Kid Icarus, I only see it in the horizontally scrolling stages, and it seems to occur more frequently on certain playthroughs, and less frequently on others.
This happens both on my Power Pak and on the actual cartridge. Strangely, I don't recall seeing it on any newer games, just really older generation games. I wonder why?
I'm thinking the timing occasionally creates a bus conflict (or some other glitch) when reloading the horizontal scroll bits at x=257. Bus conflicts in NMOS tend to use AND logic, as seen in the SAX instruction and discrete logic mappers, which produces a preponderance of zero bits. Games that don't scroll horizontally are already more likely to leave a zero in the horizontal scroll bits.
Implication for developers: This could be another reason why games need a timeout on their sprite 0 wait routines. The simplest timeout increases the sprite 0 jitter from 7 cycles (bit/bvc) to 9 (bit/bmi/bvc).
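The 7 and 9 figures fall straight out of the standard 6502 cycle counts. Here is a small Python tally as a sanity check; it assumes the loops are `bit $2002` / `bvc` and `bit $2002` / `bmi` / `bvc` with no page-crossing branches, and the helper name `loop_period` is invented for this post.

```python
# Standard 6502 timings (assuming branches do not cross a page boundary).
CYCLES = {
    "bit abs": 4,           # BIT $2002
    "branch taken": 3,
    "branch not taken": 2,
}
PPU_DOTS_PER_CPU_CYCLE = 3  # NTSC

def loop_period(*steps):
    """CPU cycles for one pass through a polling loop."""
    return sum(CYCLES[s] for s in steps)

# bit $2002 / bvc loop
plain = loop_period("bit abs", "branch taken")
# bit $2002 / bmi timed_out / bvc loop
with_timeout = loop_period("bit abs", "branch not taken", "branch taken")

for name, cycles in (("bit/bvc", plain), ("bit/bmi/bvc", with_timeout)):
    print(name, cycles, "cycles =", cycles * PPU_DOTS_PER_CPU_CYCLE, "dots")
```

The loop period is the worst-case delay between the flag going up and the loop noticing it, which is why it doubles as the jitter figure.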
It's worth noting that I have never seen this glitch occur while playing Vs. Super Mario Bros.--or any other game--on a real Vs. board, but it DOES occur in my MMC1 NES port of the game. Presumably, whatever PPU quirk is causing this was fixed in the RGB PPU (or it's caused by some other component that's different/not present on a Vs. board).
I'm glad this is finally getting some attention. I've wondered about this since I was a kid!
Yeah, whatever it is, it's some kind of flaw within the PPU. I don't think the cart connector or the cartridge has anything to do with it.
It seems that it's only Loopy_V getting corrupt, and not Loopy_T. That's why it's only a single scanline, and not the entire bottom part of the screen.
Does Kid Icarus have some kind of sprite zero detection? Again, I saw this scanline glitch on the horizontal segments of that game too, and I don't believe it uses sprite zero detection for anything.
The only way it would interact with sprite 0 is if the glitch caused the background under sprite 0 to shift such that the intended pixels are not opaque. A timeout would keep the game from freezing in that instance.
tepples wrote:
I'm thinking the timing occasionally creates a bus conflict (or some other glitch) when reloading the horizontal scroll bits at x=257. Bus conflicts in NMOS tend to use AND logic, as seen in the SAX instruction and discrete logic mappers, which produces a preponderance of zero bits. Games that don't scroll horizontally are already more likely to leave a zero in the horizontal scroll bits.
Can you elaborate? I don't think SMB is rewriting PPU registers mid-frame, or any registers for that matter. Do you mean a conflict in mapper PRG bank switching putting some spikes on the power rails and affecting the PPU?
Quote:
Implication for developers: This could be another reason why games need a timeout on their sprite 0 wait routines. The simplest timeout increases the sprite 0 jitter from 7 cycles (bit/bvc) to 9 (bit/bmi/bvc).
Huh, timeout? You're suggesting that sometimes it doesn't catch sprite hit and this causes the glitch?
BTW, can't you use a bit/beq loop followed by a bmi timed_out to keep 7-cycle latency?
blargg wrote:
tepples wrote:
I'm thinking the timing occasionally creates a bus conflict (or some other glitch) when reloading the horizontal scroll bits at x=257. Bus conflicts in NMOS tend to use AND logic, as seen in the SAX instruction and discrete logic mappers, which produces a preponderance of zero bits. Games that don't scroll horizontally are already more likely to leave a zero in the horizontal scroll bits.
Can you elaborate? I don't think SMB is rewriting PPU registers mid-frame, or any registers for that matter. Do you mean a conflict in mapper PRG bank switching putting some spikes on the power rails and affecting the PPU?
No, I'm just saying that in NMOS, conflicts tend to cause things to become 0 more often than 1, and you discovered that bank switching is an example of this tendency to become 0. SMB1 doesn't use bank switching at all.
Quote:
Quote:
Implication for developers: This could be another reason why games need a timeout on their sprite 0 wait routines. The simplest timeout increases the sprite 0 jitter from 7 cycles (bit/bvc) to 9 (bit/bmi/bvc).
Huh, timeout? You're suggesting that sometimes it doesn't catch sprite hit and this causes the glitch?
No, the other way around. I'm saying the glitch causes it not to catch sprite hit, and games need to be prepared for a failure to catch sprite hit.
Quote:
BTW, can't you use a bit/beq loop followed by a bmi timed_out to keep 7-cycle latency?
Good point.
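To spell out why the bit/beq variant works, here is a rough Python model of the flag logic. It assumes the accumulator holds $C0 before the loop (that value is a guess at what blargg meant, not something stated above), so that Z only clears once bit 7 (vblank) or bit 6 (sprite 0 hit) of $2002 is set. The names `bit` and `poll` are made up for illustration.

```python
def bit(a, m):
    """Flags after the 6502 BIT instruction: Z from A AND M, N and V copied from M."""
    return {"Z": (a & m) == 0, "N": bool(m & 0x80), "V": bool(m & 0x40)}

A = 0xC0  # assumed accumulator value: watch vblank (bit 7) and sprite 0 hit (bit 6)

def poll(status):
    """What one pass of 'bit $2002 / beq loop / bmi timed_out' would do."""
    flags = bit(A, status)
    if flags["Z"]:
        return "keep looping"   # beq loop: neither watched flag is set yet
    if flags["N"]:
        return "timed out"      # bmi timed_out: vblank arrived before a hit was seen
    return "sprite 0 hit"       # fell through both branches

for status in (0x00, 0x20, 0x40, 0x80):
    print(hex(status), poll(status))
```

The loop itself is still only bit plus one taken branch, so it keeps the 7-cycle period; the bmi is paid once, after the loop exits.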
Question for anyone who understands Visual 2C02: Would writing to $2000 on one specific cycle (possibly only on one PPU/CPU clock alignment) while the bits in the PPU address are updated cause this?
The last thing Super Mario Bros. does in its NMI routine is write to $2000 to turn the NMI back on. If you mark every write to $2000 by changing the tint bits, you can see that this tends to occur about halfway down the screen (approximately where the glitchy line would be). Zelda does something similar when scrolling horizontally.
Nice, next chance I get I'm going to see if I can continuously trigger this effect. That's got to be it.
I managed to consistently trigger it at a spot in 1-2 towards the end. In about a year I can have a video uploaded, given my current connection speed...
Edit: here it is:
http://www.youtube.com/watch?v=P6DAWhLz ... e=youtu.be
If Grapeshot's hypothesis turns out true, it reminds me of the bug causing missed frames when spinning on bit 7 of $2002. Then perhaps one way to avoid it is not to write to $2000 except near a scroll split, and instead use some other way to work around NMI-in-NMI. If, as Drag claims, it is less common in later games, that could be three things: later games using SMB3-style horizontal mirroring with horizontal scrolling (and thus a bus conflict on bit 10 of v has no effect), later games doing less processing in the NMI handler as opposed to the SMB1-style "super loop", or just ignoring nested NMIs in software. But if $2000 write conflicts are the cause, sprite 0 spin waits would be less affected.
Statistics would be appreciated for what games turn NMI off and on mid-frame.
I believe I noticed this in Zelda 2, perhaps someone wants to check if it's programmed this way as well?
MottZilla wrote:
I believe I noticed this in Zelda 2, perhaps someone wants to check if it's programmed this way as well?
Yes it is.
It happens merely by writing $01 to $2000 repeatedly in an infinite loop with vertical mirroring. Next is to nail down the timing.
Attachment: ppu 2000 glitch.jpg
EDIT: Seems that a write around pixel 255 is the cause. H/V scroll positions are irrelevant. Only occurs for two of the four CPU-PPU alignments (the two middle ones in the table I posted a while back, one of which is the "preferred" one). Hmmm, and only occurs reliably for preferred alignment (third in table).
Also, only occurs when write is at one particular dot. Not more than one. Clear now why it occurs so rarely in SMB, since you can only even hit this dot on every third scanline.
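The "every third scanline" part is just arithmetic: an NTSC scanline is 341 dots and a CPU cycle is 3 dots, and 341 is not a multiple of 3. A quick Python check follows; the target dot of 255 is only taken from the "around pixel 255" estimate above, and `lines_hitting` and `phase_dot` are names invented for this sketch.

```python
DOTS_PER_SCANLINE = 341
DOTS_PER_CPU_CYCLE = 3   # NTSC

def lines_hitting(target_dot, phase_dot, lines=12):
    """Scanlines (0-based) on which a CPU cycle can land exactly on target_dot.

    phase_dot is a dot on line 0 where some CPU cycle lands; it stands in for
    the CPU-PPU alignment. Every later CPU cycle lands a multiple of 3 dots
    after it.
    """
    hits = []
    for line in range(lines):
        absolute = line * DOTS_PER_SCANLINE + target_dot
        if (absolute - phase_dot) % DOTS_PER_CPU_CYCLE == 0:
            hits.append(line)
    return hits

print(lines_hitting(255, 255))  # alignment that reaches dot 255 on line 0
print(lines_hitting(255, 256))  # shifted by one dot: different lines, same spacing
```

Because 341 mod 3 is 2, the set of reachable dots shifts on each line and repeats with a period of three lines, so any one specific dot is only reachable on one line in three.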
Maybe someone can probe this with Visual 2C02 now to see what's happening and exactly which dot it occurs on.
tepples wrote:
If Grapeshot's hypothesis turns out true, it reminds me of the bug causing missed frames when spinning on bit 7 of $2002. Then perhaps one way to avoid it is not to write to $2000 except near a scroll split, and instead use some other way to work around NMI-in-NMI. If, as Drag claims, it is less common in later games, that could be three things: later games using SMB3-style horizontal mirroring with horizontal scrolling (and thus a bus conflict on bit 10 of v has no effect), later games doing less processing in the NMI handler as opposed to the SMB1-style "super loop", or just ignoring nested NMIs in software. But if $2000 write conflicts are the cause, sprite 0 spin waits would be less affected.
Statistics would be appreciated for what games turn NMI off and on mid-frame.
I would expect most games to use NMI to just set a flag and do the actual processing outside the interrupt. That's the usual way to handle vblank timing on most systems, really (though of course that didn't stop many programmers from doing it in other ways). Note that this means the interrupt itself would be most likely less than half a scanline long, it's just write a single value to RAM and then return =P Of course this would also happen in vblank so it would never be visible.
What Sik is referring to is an NMI handler like that used in Concentration Room, Lawn Mower, Thwaite, and Zap Ruder:
Code:
nmi:
  inc nmis  ; count vblanks; the main loop polls this variable
  rti
In before tokumaru points out that doing VRAM updates, the music engine, and the status bar split in the NMI handler makes sure that your music doesn't slow down and your sprite 0 split doesn't have a seizure even if the game slows down.
tepples wrote:
In before tokumaru points out that doing VRAM updates, the music engine, and the status bar split in the NMI handler makes sure that your music doesn't slow down and your sprite 0 split doesn't have a seizure even if the game slows down.
Oh, you know me too well! I almost clicked the "quote" button as soon as I read the first sentence in Sik's post. But yeah, this does make a huge difference in NES games IMO. It's as simple as this: if you don't interrupt the game logic for crucial PPU/APU updates that are timed from VBlank, things will screw up if there's any lag. The flag method is perfectly fine if you're absolutely sure the game logic will never take longer than a frame to finish (i.e. the entire game has no lag).
From looking at the SMB1 screenshot, it appears as if the X nametable bit for scrolling becomes 0, and no other bit of X scrolling is affected at all.
So... writing to $2000 mid-frame near the end of the scanline messes up the horizontal scroll for the next scanline, is that it?
I'm with tokumaru w.r.t. what to put in your NMI.
tokumaru wrote:
So... writing to $2000 mid-frame near the end of the scanline messes up the horizontal scroll for the next scanline, is that it?
Yes, if you're not using horizontal mirroring and are writing an odd value (bit 0 set) to $2000.
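Both conditions can be seen in a small Python model. It assumes the usual mapping of v onto the console's two internal nametable pages (vertical mirroring selects the page with bit 10, horizontal mirroring with bit 11) and assumes the glitch simply forces bit 10 of v to 0 for one line, as the screenshots suggest. `physical_page` and `glitched` are hypothetical helper names.

```python
def physical_page(v, mirroring):
    """Which of the two internal 1 KiB nametable pages the address in v selects."""
    if mirroring == "vertical":      # pages side by side: bit 10 picks the page
        return (v >> 10) & 1
    if mirroring == "horizontal":    # pages stacked: bit 11 picks the page
        return (v >> 11) & 1
    raise ValueError(mirroring)

def glitched(v):
    """Assumed effect of the glitch: the horizontal nametable bit reads as 0."""
    return v & ~0x0400

odd_write = 0x0400 | 0x000A   # $2000 bit 0 set: nametable X = 1, coarse X = 10
even_write = 0x000A           # $2000 bit 0 clear: nametable X = 0

for m in ("vertical", "horizontal"):
    print(m, physical_page(odd_write, m), "->", physical_page(glitched(odd_write), m))
print(glitched(even_write) == even_write)  # nothing to clear when the bit is already 0
```

With horizontal mirroring both values of bit 10 land on the same page, so the line fetches the same tiles either way; with an even value there is no set bit to lose.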
I see... so only the highest scroll bit is affected?
Hmmm, yeah. Shows how familiar I am with using the PPU in practice. I wonder whether the latch for bit 0 (along with the others) momentarily clears the bit, then lets the written bit re-set it.
I also wonder whether this same thing can occur for vertical when there's a $2000 write at the beginning of the frame when the V counter is initialized. I forget whether V is even reloaded once per frame or it's just implicit in the $2006 address.
T is repeatedly copied to V during the sync pulse after the pre-render scanline.
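For comparison with the per-line horizontal copy, here is a Python sketch of that once-per-frame vertical copy. It assumes the same 15-bit v/t layout as the wiki's scrolling page; the wiki places the copy at dots 280 to 304 of the pre-render line, which I'm taking on trust. `VERT_MASK` and `copy_vertical` are names made up for this post.

```python
VERT_MASK = 0x7BE0   # fine Y (bits 12-14), vertical nametable bit (11), coarse Y (bits 5-9)
HORIZ_MASK = 0x041F  # coarse X (bits 0-4), horizontal nametable bit (10)

def copy_vertical(v, t):
    """Return v with its vertical bits replaced by those of t."""
    return (v & ~VERT_MASK & 0x7FFF) | (t & VERT_MASK)

# The two masks split the 15 bits between them with no overlap.
print(hex(VERT_MASK | HORIZ_MASK), hex(VERT_MASK & HORIZ_MASK))
print(hex(copy_vertical(0x041F, 0x7BE0)))
```

If a $2000 write during that window could corrupt bit 11 the same way, the error would not be corrected on the next line, since the vertical bits are only reloaded once per frame.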
blargg, do you have a test ROM that could be run on a PowerPak? I played through the entirety of Vs. Super Mario Bros. on my top-loader yesterday, and didn't notice any scanline glitches. I'd like to see if this was fixed in late PPU/motherboard revisions or something.
tokumaru wrote:
tepples wrote:
In before tokumaru points out that doing VRAM updates, the music engine, and the status bar split in the NMI handler makes sure that your music doesn't slow down and your sprite 0 split doesn't have a seizure even if the game slows down.
Oh, you know me too well! I almost clicked the "quote" button as soon as I read the first sentence in Sik's post. But yeah, this does make a huge difference in NES games IMO. It's as simple as this: if you don't interrupt the game logic for crucial PPU/APU updates that are timed from VBlank, things will screw up if there's any lag. The flag method is perfectly fine if you're absolutely sure the game logic will never take longer than a frame to finish (i.e. the entire game has no lag).
The problem is that, if I understand correctly, Super Mario Bros. does the game logic within NMI, effectively not being any better than just polling when vblank happens.
Super loop has 3 cycle latency (forever: jmp forever). Polling has 6 (wait: cmp nmis / beq wait). Setting up CPU-cycle-counting IRQ sources, such as VRC, FDS, and FME-7, may benefit from such a latency win to keep raster effect register changes in hblank.
BMF54123 wrote:
blargg, do you have a test ROM that could be run on a PowerPak? I played through the entirety of Vs. Super Mario Bros. on my top-loader yesterday, and didn't notice any scanline glitches. I'd like to see if this was fixed in late PPU/motherboard revisions or something.
It only occurs reliably on one of the four possible CPU-PPU alignments at power/reset. For one other alignment, it occurs sometimes when you do the write at the particular dot on a scanline.
Here's the test ROM. Keep pressing reset until you get the glitch (get the appropriate CPU-PPU alignment)
tepples wrote:
Super loop has 3 cycle latency (forever: jmp forever). Polling has 6 (wait: cmp nmis / beq wait). Setting up CPU-cycle-counting IRQ sources, such as VRC, FDS, and FME-7, may benefit from such a latency win to keep raster effect register changes in hblank.
When I did Chu Chu Rocket, I did IRQ setup stuff inside NMI. I'd imagine doing it at the beginning of NMI would have the least amount of jitter. No need for the main thread to do interrupt stuff.
blargg wrote:
Here's the test ROM. Keep pressing reset until you get the glitch (get the appropriate CPU-PPU alignment)
On a Famicom resetting isn't enough, you need to power cycle the console.
This is because the PPU isn't reset on the Famicom, right?
MARIO CHIP 1 wrote:
blargg wrote:
Here's the test ROM. Keep pressing reset until you get the glitch (get the appropriate CPU-PPU alignment)
On a Famicom resetting isn't enough, you need to power cycle the console.
This is because the PPU isn't reset on the Famicom, right?
Yep.
MARIO CHIP 1 wrote:
blargg wrote:
Here's the test ROM. Keep pressing reset until you get the glitch (get the appropriate CPU-PPU alignment)
On a Famicom resetting isn't enough, you need to power cycle the console.
This is because the PPU isn't reset on the Famicom, right?
Same thing applies (PPU not reset) on the toploader NES as well.
So on a cold boot, there is a 1/4 chance that the glitch will possibly occur, and other times the game will play through without it ever happening simply from the lack of appropriate alignment?
From what I can tell, pretty much. One other of the four alignments can occasionally cause it, but nothing like what the other alignment does, where it's 100% reliable (the test ROM does it every three scanlines).
Now that folks have figured it out...
It's very important that this be emulated. It is key to accurate console emulation, and is an absolute deal-breaker. I can't wait to use emulators that "act wonky" 1/4th of the time, with emulator authors saying "if you don't like this fact, use the Power Off/Power On feature until things stop acting wonky". I look forward to such future emulators.
(Note: Yup, I'm being a sarcastic dick intentionally. But at least it was discovered that this is hardware-level behaviour in the PPU itself, and not like what I originally speculated. So despite me being a dick, I can happily say I was wrong in my assumption/belief of what the issue was. No shame in admitting I was wrong, but there's also no shame in me admitting the driving force to emulate this is total, absolute, complete OCD. Just Say No!)
I'd still like to see someone reproduce it on Visual 2C02 so we can understand why it happens, and better understand the conditions it occurs under.
blargg wrote:
tokumaru wrote:
So... writing to $2000 mid-frame near the end of the scanline messes up the horizontal scroll for the next scanline, is that it?
Yes, if you're not using horizontal mirroring and are writing an odd value (bit 0 set) to $2000.
What about setting bits 10 and 11 of t with the first write to $2006? Or is this somehow only a problem with $2000?
koitsu wrote:
Now that folks have figured it out...
It's very important that this be emulated. It is key to accurate console emulation, and is an absolute deal-breaker. I can't wait to use emulators that "act wonky" 1/4th of the time, with emulator authors saying "if you don't like this fact, use the Power Off/Power On feature until things stop acting wonky". I look forward to such future emulators.
(Note: Yup, I'm being a sarcastic dick intentionally. But at least it was discovered that this is hardware-level behaviour in the PPU itself, and not like what I originally speculated. So despite me being a dick, I can happily say I was wrong in my assumption/belief of what the issue was. No shame in admitting I was wrong, but there's also no shame in me admitting the driving force to emulate this is total, absolute, complete OCD. Just Say No!)
You're thinking entirely from a casual user's/gamer's point of view. As a developer, I sure would like the option of switching between different power on configurations (no need to have it randomized!) and being able to detect possible glitches, even if rare, also on emulators. Besides, nothing wrong with OCD.
Figuring out why said visual artefacts happen is cool -- hooray, people now have a better understanding, are on the right track, etc. -- but I have yet to see a reason for emulating this. That was the justification initially given (from one person anyway).
Think of it this way -- the developer of the games and the console itself didn't give a shit (the examples in this thread are proof), so why should we (casual gamers OR developers alike)? Just my take on it, others obviously have a different opinion, which is (honestly/truly!) cool. I just don't see it that way is all, and it often pisses me off when I see people throw in the "it's for emulation purity!" card (as if a nuclear holocaust is going to happen and destroy all the NES/Famicoms on the planet and we'd never be able to see this wonderfully important visual artefact otherwise).
How many emulators out there emulate the noise floor and/or environmental interference in the audio signal? What about the same problems in the video signal? There are some things that almost universally detract from the experience, while adding nothing of value. Another example that came up with Famitracker: correct emulation of the N163's 8-channel audio is an option most people would not take, given the choice.
However, with this particular thing, I don't see any good reason for an accuracy-oriented emulator not to implement this, unless it makes a real impact on CPU demand, which I doubt. It's something that could affect some of us trying to do some very specific timings, and we could benefit from this particular failure case being apparent. It's important to know the real capabilities, limitations, and failure modes of the hardware if you're developing for it.
An emulator that is only oriented toward playing games well need not bother, obviously.
Ideally all these things are options. In real cases, there may be an unpleasant tradeoff involved in implementing the option, so a judgement call is made based on who/what the emulator is for.
I personally would prefer to have the choice of just the two extremes: "glitchiest" and "prettiest". Even if all the bad behaviors don't happen on the same cpu-ppu alignment.
koitsu wrote:
it often pisses me off when I see people throw in the "it's for emulation purity!" card (as if a nuclear holocaust is going to happen and destroy all the NES/Famicoms on the planet and we'd never be able to see this wonderfully important visual artefact otherwise).
There are some precedents for that though. Good luck getting your hands on a working Vectrex or certain other classic machines. And it's entirely possible that after a few decades, governments might ban imports of "vintage" electronic equipment manufactured prior to the adoption of WEEE/RoHS.
lidnariq wrote:
I personally would prefer to have the choice of just the two extremes: "glitchiest" and "prettiest".
Such an option might make it easier for developers.
I can think of one use for this, anyway. If you set up a Sprite 0 hit on the left side of that pattern in the test ROM, you can use this glitch to detect 1 of the clock alignments on real hardware within a frame, instead of with several seconds of timed loops.
Grapeshot wrote:
I can think of one use for this, anyway. If you set up a Sprite 0 hit on the left side of that pattern in the test ROM, you can use this glitch to detect 1 of the clock alignments on real hardware within a frame, instead of with several seconds of timed loops.
That and a sprite 0 hit allows this glitch to move from "cosmetic, no long-term-effect" to "CPU can detect it and act differently".
Though, I found a while back that the four alignments can be differentiated with trivial code that does double writes to $2007 and then sees what was written. I ought to post a ROM to see whether that works on other NES consoles reliably.
koitsu wrote:
Figuring out why said visual artefacts happen is cool -- hooray, people now have a better understanding, are on the right track, etc. -- but I have yet to see a reason for emulating this. That was the justification initially given (from one person anyway).
Think of it this way -- the developer of the games and the console itself didn't give a shit (the examples in this thread are proof), so why should we (casual gamers OR developers alike)? Just my take on it, others obviously have a different opinion, which is (honestly/truly!) cool. I just don't see it that way is all, and it often pisses me off when I see people throw in the "it's for emulation purity!" card (as if a nuclear holocaust is going to happen and destroy all the NES/Famicoms on the planet and we'd never be able to see this wonderfully important visual artefact otherwise).
I think it's great that this glitch is identified and explained. When I noticed it recently, I had no idea what was up with it. I was worried it was something about the age of my console. While I am not opposed to someone emulating this glitch, I couldn't care less whether it is emulated. It's more important that this glitch was noted and explained. Overall it's just a minor footnote.
Not emulating quirks like these is the reason we have homebrew games and demos that don't work properly on real hardware.
MottZilla wrote:
I think it's great that this glitch is identified and explained. When I noticed it recently, I had no idea what was up with it. I was worried it was something about the age of my console. While I am not opposed to someone emulating this glitch, I couldn't care less whether it is emulated. It's more important that this glitch was noted and explained. Overall it's just a minor footnote.
Same here. I thought it was from my console overheating or something. It already has issues when I go a long time without turning it on, so it wouldn't be farfetched.
I don't think the world is going to end if this scanline bug isn't emulated (it's MUCH less of a problem compared to, say, the sprite bug), but it most certainly should be documented either way, since it can hypothetically cause sprite 0 hits to fail.
I don't think there will ever be a substitute for testing on hardware with a "real" cartridge that doesn't run its own bootloader before one's code.
That, and testing on the real hardware is just more genuine on the whole anyway.
BMF54123 wrote:
Not emulating quirks like these is the reason we have homebrew games and demos that don't work properly on real hardware.
Maybe in the early days, but most of the crap that is made today works on real hardware as opposed to most old stuff which doesn't. I find the main thing isn't NES quirks today, but mapper quirks like indirect indexed write on MMC1 and just crap like that.
I'd agree that the main reason older demos and hacks don't work is not unemulated obscure glitches, but much more basic things. For example, BS-Zelda on SNES, when it was first hacked, didn't respect the simple VBlank rules for writing VRAM. ZSNES, at least at the time, did not care if a game tried to write VRAM during rendering, and it would work just fine.
Much older NES emulators and other emulators didn't have the benefits of much more detailed and correct hardware info.
3gengames wrote:
BMF54123 wrote:
Not emulating quirks like these is the reason we have homebrew games and demos that don't work properly on real hardware.
Maybe in the early days, but most of the crap that is made today works on real hardware as opposed to most old stuff which doesn't. I find the main thing isn't NES quirks today, but mapper quirks like indirect indexed write on MMC1 and just crap like that.
I can think of a way for this quirk to severely affect homebrew though: have a sprite 0 that's just one pixel high, and have it sit right on the scanline that's glitched. If missing the sprite 0 hit causes the game to misbehave completely (especially if it has long-term effects), now suddenly the game will crash randomly on real hardware without any apparent explanation, and will be a debugging nightmare. With emulators at least you can catch it earlier (no need to deploy new builds to the cartridge all the time and hope the power cycle is right).
But is that really going to happen? When you are waiting for that sprite 0 hit, are you going to be writing $2000? I don't think so. Besides, look at Battletoads: it uses a single-pixel sprite, at least in level 2.
MottZilla wrote:
But is that really going to happen? When you are waiting for that sprite 0 hit, are you going to be writing $2000? I don't think so. Besides, look at Battletoads: it uses a single-pixel sprite, at least in level 2.
Yes it could, if you write to $2000 (to re-enable your NMI after your NMI routine finishes, for example) just before the scanline sprite 0 is on and glitch it, and then spin on $2002 right afterwards.
This has a 1/357368 chance of happening (1/4 for CPU/PPU alignment, 1/262 for correct scanline, 1/341 for correct pixel), so it's pretty obscure, so the programmer is free to ignore it, but that doesn't make it not happen.
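As a quick check of that figure, here is a minimal sketch of the same multiplication, assuming (as the estimate does) that the alignment, scanline and pixel are independent and uniformly distributed. The constant and function names are just illustrative.

```python
from fractions import Fraction

# Factors from the estimate above; uniform and independent by assumption.
ALIGNMENTS = 4             # CPU/PPU power-on alignments
SCANLINES_PER_FRAME = 262  # NTSC scanlines per frame
DOTS_PER_SCANLINE = 341    # PPU dots (pixels) per scanline


def uniform_glitch_chance():
    """Chance of hitting one alignment, one scanline and one dot at once."""
    return (Fraction(1, ALIGNMENTS)
            * Fraction(1, SCANLINES_PER_FRAME)
            * Fraction(1, DOTS_PER_SCANLINE))


print(uniform_glitch_chance())
```

The product comes out to 1/357368, matching the number quoted, though the uniform-distribution assumption is the weak part.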
I think the chance is higher. Remember, it can happen on any scanline if the write is on the correct dot. If your game is writing to $2000 after VBL, it's going to be at a fairly consistent place. It may vary by say 100 cycles, with one of them being the bad dot. Since it's due to the 9th bit of X scroll being 1, there's a 1 in 2 chance there. With normal rendering the write can only occur at the bad spot every other frame, so 1 in 2 chance. This totals 1 in 400 chance each frame. With the game powering up on the bad alignment (which is pretty much certain to happen over a few days of play), and the game running millions of frames over weeks of play, it's bound to happen thousands of times.
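To put rough numbers on that reasoning, here is a small sketch. It assumes the console is already on the bad alignment, that the write lands somewhere in a 100-cycle window (the "say 100 cycles" guess above), and that the three factors are independent. The function names are made up for illustration.

```python
from fractions import Fraction


def per_frame_chance(timing_window_cycles=100):
    """Per-frame chance of the glitch on the bad power-on alignment."""
    bad_dot = Fraction(1, timing_window_cycles)  # one bad cycle in the window
    nametable_bit_set = Fraction(1, 2)           # 9th bit of X scroll is 1
    frame_parity = Fraction(1, 2)                # bad spot reachable every other frame
    return bad_dot * nametable_bit_set * frame_parity


def expected_glitches(frames, chance):
    """Expected number of glitched frames over a given number of frames."""
    return frames * chance


chance = per_frame_chance()
print(chance)
print(float(expected_glitches(1_000_000, chance)))
```

With these assumptions the chance is 1/400 per frame, and one million frames gives an expected 2500 glitched frames, which fits the "thousands of times" estimate.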
I also saw this in Zelda 2 a couple years ago, and was wondering what the hell was going on. IIRC it was even right at the beginning (in the temple where Zelda sleeps). It would be a little interesting to hear if people notice this in other games now.
But I guess this is a good reason to be careful about mid-frame $2000 writes, just the same as $2001. If one needs to disable recursive NMIs, it's usually better to do it in software. In practice, I'd expect disabling NMI in hardware to freeze the music/sfx for that frame.
Code:
inc nmi_recursive_depth
pha
; play music, check the state of nmi_recursive_depth, possibly skip all other processing
pla
dec nmi_recursive_depth
rti
Another time $2000 often is written mid-frame is when changing BG tiles after a sprite zero hit or timed wait. But obviously this should be happening on a known position.
blargg wrote:
I think the chance is higher. Remember, it can happen on any scanline if the write is on the correct dot. If your game is writing to $2000 after VBL, it's going to be at a fairly consistent place. It may vary by say 100 cycles, with one of them being the bad dot. Since it's due to the 9th bit of X scroll being 1, there's a 1 in 2 chance there. With normal rendering the write can only occur at the bad spot every other frame, so 1 in 2 chance. This totals 1 in 400 chance each frame. With the game powering up on the bad alignment (which is pretty much certain to happen over a few days of play), and the game running millions of frames over weeks of play, it's bound to happen thousands of times.
You're absolutely right. My figure assumed uniform distribution because it's been probably 3 years since my college Statistics class, and I've forgotten how to do probability curves.
Either way, you can completely avoid this glitch by not writing to $2000 outside of vblank, unless it's part of a timed screen-split, in which case you can simply time it away from the problem pixel. You'd probably be timing it anyway to make the split nice and neat.
Drag wrote:
Either way, you can completely avoid this glitch by not writing to $2000 outside of vblank, unless it's part of a timed screen-split, in which case you can simply time it away from the problem pixel. You'd probably be timing it anyway to make the split nice and neat.
What if you miss the calculation and happen to do the split right in the wrong spot? Especially if whatever effect you want happens to look just fine otherwise (e.g. if done in the nearby dots it'd look right).
Sik wrote:
What if you miss the calculation and happen to do the split right in the wrong spot? Especially if whatever effect you want happens to look just fine otherwise (e.g. if done in the nearby dots it'd look right).
Then you get a flickery scanline. The same problem you get when you time a pair of $2006 writes incorrectly, and just like a badly timed $2006 split, the remedy is to adjust the timing.
I think it should be emulated, personally, though I won't flame those who disagree. I do have one question, the answer to which may nudge some opinions towards or away from it: does anyone here with more experience on the NES think that this glitch could be harnessed and made useful for some sort of demo effect? I love seeing things done with hardware that can't normally be done, such as the entirety of the modern C64 demoscene.
If it can be harnessed and "tamed", then it should be emulated, full stop. Perhaps if nothing else, in today's time where no emulator does emulate it, it could be used as a cheap way to programmatically determine whether one is on an emulator or not? Or is there another, better way to do so? I'm not talking about for "anti-emu" code that stops a ROM from running, but perhaps someone might want to make a demo with some slight differences in text or graphics depending on whether it's an emulator or hardware.
An emulator could defeat this "cheap way" by emulating an NES that never powers on in the alignment that produces this glitch.
LocalH wrote:
think that this glitch could be harnessed and made useful for some sort of demo effect?
I'd be surprised. At best it can only be used every 3 scanlines, because it depends on the write to $2000 happening at a very precise PPU phase. At worst, it can't be used at all. And an equivalent effect should be possible with timed writes to $2006 on two successive scanlines.
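The "every 3 scanlines" limit follows from 341 not being divisible by 3. A rough sketch of that arithmetic, assuming NTSC timing (3 PPU dots per CPU cycle), a constant 341 dots per line (ignoring the short pre-render line), and dot 257 picked purely as an illustrative position:

```python
DOTS_PER_SCANLINE = 341
DOTS_PER_CPU_CYCLE = 3  # NTSC assumption


def cpu_phase_at_dot(scanline, dot):
    """Position of a dot within the CPU cycle (0, 1 or 2), counted from line 0."""
    return (scanline * DOTS_PER_SCANLINE + dot) % DOTS_PER_CPU_CYCLE


def lines_with_phase(dot, phase, scanlines):
    """Scanlines on which the given dot falls at the given CPU-cycle phase."""
    return [line for line in scanlines if cpu_phase_at_dot(line, dot) == phase]


print(lines_with_phase(257, 0, range(12)))
```

Under these assumptions the same dot lines up with the same CPU phase only on every third line (here lines 2, 5, 8 and 11), so a write cadence that hits the critical dot on one line misses it on the next two.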
What struck me about this is that the "glitch" spans no more than one scanline.
The images posted by LocalH show the full double-240 (interlaced) scanlines.
Shrunken by a factor of 1.25: 384p for 480i, with 96 of the 480 lines interpolated out.
(Add compression loss; the footage presented is not 100% reliable.)
By my calculation, lines 171-172 of the "frame 12657" PNG correspond to a scanline between 106 and 107.
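For anyone wanting to redo that conversion, a minimal sketch, assuming the 1.25 scale factor and two interlaced rows per NES scanline as described above (the function name is hypothetical):

```python
def video_row_to_scanline(row, scale=1.25, rows_per_scanline=2):
    """Map a row of the downscaled capture back to an NES scanline number."""
    full_height_row = row * scale            # undo the 480 -> 384 downscale
    return full_height_row / rows_per_scanline  # two 480i rows per scanline


print(video_row_to_scanline(171), video_row_to_scanline(172))
```

Rows 171 and 172 land close to scanline 107, in line with the estimate above, give or take the interpolation and compression error.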
Seeing that this line has a fixed horizontal scroll, I used FCEUX to look at the name table for similarities— nothing out of the ordinary; just two sections of the map. So I looked at the execution range during this part of World 1-2.
The cart (using the PAL version) uses vertical mirroring, and only sets the x-scroll (for the scoreboard).
The NMI routine, with no interrupt pins held low, quickly sets the x-scroll to zero (from $80A6 on, with a double-STA $2005 subroutine), and continues well past the Sprite-0 hit code (where the x-scroll is reset to the game scroll). After the Spr-0 hit you have the updates: $73F and $740 are piped directly to $2005 at scanline 30.
The NMI routine ends here at scanline 91, but I've seen it go up to 114, which fits with the PNG image. This would also mean, depending on game processing, inconsistencies in the vertical offset of the "glitch" line— appears to be the case.
Code:
LDA $2002 A:04 X:05 P:A4 Y:04 L: 91 -- clears status reg bit #7
PLA A:51 X:05 P:24 -- pushed from an earlier routine
ORA #$80 A:11 X:05 P:24 -- hold NMI enable bit on
STA $2000 A:91 X:05 P:A4 -- forces PPU register changes
RTI A:91 X:05 P:A4
JMP * P:A5
If it were to occur after the JMP loop started, _only_ a PPU NMI could do that. It is unlikely to be some stray NMI signal, as the code would then be updating parts of the game 50/100% more often/faster, and I've not seen that in the videos. If the NMI is delayed by overwhelming processing, the game skips, with the scoreboard scroll not zeroed and sound not updated (producing sfx artifacts).
This occurs easily in SMB.
I reject the possibility of these frames being merely due to loose pin contacts, because one bad bit would mean more than one bad scanline (and easily 8 consecutive lines). No single binary digit would result in a glitch at scanline 107 (or actual equivalent with the NT daisy chain address).
It's been figured that this is due to the $2000 write, performed mid-scanline, forcing an update of PPU registers, registers in a pipeline that are only fully fixed on the next line; the PT-NT pipeline tracking is a little flaky/lossy here. (Power-on synchronizes it.) The every-three-lines pattern reflects the 3 (NTSC) or 3.2 (PAL) pixels per CPU cycle against the indivisible 341 pixels per line (it takes three lines to come to the same pixel offset without a forced interrupt, as seen in full_nes_palette). It's a sudden change of pipeline flow right at the point of a calculation phase, where the low-level read address is calculated.
Super Mario Bros. is a bit cheap to carry a one-size-fits-all routine to execute things normally only executed before gameplay. This "glitch" should also occur in other games that write $2000 mid-scanline like this.
On emulating this, I would expect Nestopia, which has a very close PT-NT pipeline, to bring out the every-3-lines effect of ppu_2000_glitch.nes, but it doesn't. It shouldn't be hard to make the code change there.
I just played a little bit of Super Mario Bros on my NES, and now I can't un-see this.
Dwedit wrote:
I just played a little bit of Super Mario Bros on my NES, and now I can't un-see this.
My first NES I bought 8 years ago for $12 because it was broken (someone had soldered some stupid hack garbage into it). I got home, removed the non-working hack, and after much cleaning I got it booting Super Mario Brothers. Having played plenty of SMB on both Game Boy Color (Mario DX) as well as in emulators, the glitchy line stood out to me right away, as I wondered whether this NES was on its last legs. Years later I would get a Famicom and see the same bug.
If it happens around dot 256 on the scanline, then it's almost certainly due to updating loopy t (specifically bit 10 - the horizontal nametable bit) at the same time that it's being copied into loopy v. Looking in Visual 2C02, the only extraordinary thing that would happen in that situation is that you would have a direct logic path from the first data bus line (line 0, since $2000:0 is the relevant bit) to bit 10 of loopy v (both the write_2000_vramaddr signal and the copy_vramaddr_hscroll signal would be high, connecting _io_db0 to vramaddr_t10 to vramaddr_v10).
Not sure how that could cause the value to disappear though, since the data bus line should still have the "right" value. Maybe there's some analog thing going on, or maybe it has something to do with M2 out from the CPU having a >50% duty cycle, while the actual write only happens during the "true" last 50% (the PPU "sees" the write while M2 is high).
Edit: It works exactly the same for updates to loopy t via $2005 and $2006 by the way. The "connect data bus to relevant t bits" signal in all cases is M2 && address == $200x, where x is 0, 5, or 6, and the "connect horizontal bits of t to v" signal is triggered at the end of the scanline.
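For reference, the two register updates involved can be sketched as below. This only models the normal, non-glitched behaviour (a $2000 write loading the nametable bits of loopy t, and the end-of-scanline copy of the horizontal bits from t into v). It makes no attempt to reproduce the race between the two, and the function names are made up for illustration.

```python
NAMETABLE_BITS = 0x0C00   # bits 10-11 of t, loaded from $2000 bits 0-1
HORIZONTAL_BITS = 0x041F  # coarse X (bits 0-4) plus horizontal nametable (bit 10)
ADDRESS_MASK = 0x7FFF     # loopy t and v are 15 bits wide


def write_2000(t, value):
    """Return loopy t after a write of `value` to $2000."""
    return (t & ~NAMETABLE_BITS & ADDRESS_MASK) | ((value & 0x03) << 10)


def copy_horizontal(v, t):
    """Return loopy v after the horizontal bits of t are copied into it."""
    return (v & ~HORIZONTAL_BITS & ADDRESS_MASK) | (t & HORIZONTAL_BITS)


t = write_2000(0x0000, 0x91)   # $91 is the value stored in the SMB trace above
v = copy_horizontal(0x0000, t)
print(hex(t), hex(v))
```

Writing $91 sets bit 10 of t (bit 0 of the written value), and the copy then carries it into v. The glitch discussed here would be the case where both steps overlap in time.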
Some of the latches for the $2000/$2001 flag bits have a transistor on their outputs to prevent the value from being read while the bit is being written (though the old value will probably still be read during the write due to capacitance on the output wire). It looks like the monochrome bit and the blue emphasis bit, for example, don't have such transistors though (or don't use them at least; you get a cut-off wire on the other side).
Here's an experiment that would be interesting to try: Set either the monochrome bit or the blue emphasis bit to 1 in a loop during rendering. If that produces colored/un-emphasized spots on the screen, then it's a problem with reading the value at the wrong time while it is being written.
Bit of a bump, but I seem to have fixed this problem in my NES port of Vs. Super Mario Bros. All I had to do was modify the game so that it never disables NMI, but uses a flag in RAM instead (as most newer games seem to do):
- When NMI hits, check this flag; if it is clear, proceed with the usual NMI routine, otherwise RTI
- Flag is set at beginning of NMI routine (main game loop) and cleared at the end
Voila, no more glitchy line. Of course, this doesn't fix the status bar flickering when the game slows down, since that would be a somewhat more involved hack, and difficult to do without a disassembly of the Vs. version (and I didn't really want to alter the game's normal behavior anyway, for accuracy's sake).
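The flag logic described above can be modelled roughly as follows. This is only a Python sketch of the control flow, not the actual patch: the class, the `nested_nmi` argument (which stands in for a slow frame where the next vblank arrives before the handler finishes) and the counters are all hypothetical.

```python
class NmiGuard:
    """Models an NMI handler guarded by a RAM flag instead of $2000 bit 7."""

    def __init__(self):
        self.nmi_busy = False      # the flag in RAM
        self.frames_processed = 0  # full NMI routines that ran
        self.nmis_skipped = 0      # NMIs that returned immediately (RTI)

    def nmi(self, nested_nmi=False):
        if self.nmi_busy:
            # Flag set: a previous NMI is still running, so just return.
            self.nmis_skipped += 1
            return
        self.nmi_busy = True       # set at the beginning of the routine
        self.frames_processed += 1
        if nested_nmi:
            self.nmi()             # next vblank fires mid-routine
        self.nmi_busy = False      # cleared at the end


game = NmiGuard()
game.nmi()                 # normal frame
game.nmi(nested_nmi=True)  # slow frame: the nested NMI is ignored
print(game.frames_processed, game.nmis_skipped)
```

The point of the design is that $2000 is never written just to mask NMI, so there is no stray mid-frame write left to land on the bad dot.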
Yes, I agree that is a safe/easy/simple way to prevent re-entrant NMI. It's what I now do in my own projects.