Need some help. Working on a game for the nesdev competition and we're seeing a glitch. Basically, when a level starts, we disable sprites, setup the level, then enable sprites and off we go. This works fine on emulator (FCEUX) and powerpak, but we're seeing a weird glitch on everdrive. Basically, for a brief moment (presumably one frame), the sprites show up all wacky. I assume its the sprites because the background has already been loaded and looks fine. Its also weird because the first 2 sprites (the top of the frog) are always in the correct place and some of the health is in the right place, and there are some consistencies in each levels glitched state... like some health seems to be changed to r's and some of them are shifted to the same spot halfway across the screen... but which ones aren't always consistent. There are a couple other consistent ones, but I'm not sure how much that matters.
The difficult part is I can't debug on my everdrive. I did try to recreate the "screen garbage" on emulator by never disabling the sprites to test if it was correct sprite states but being displayed too early on everdrive, but when I did, the sprites never looked out of place. I wasn't sure what to try after this. I sort of assume its either something sketchy the everdrive does (although I couldn't find any meaningful differences between everdrive and powerpak) or we're interacting innappropriately with the PPU and some way the Everdrive works exposes it... but I couldn't figure anything out by walking through the code and the PPU is confusing
. We're not doing anything with mappers, if that helps.
Our code is here:
https://github.com/mitch3b/ChickenOfThe ... 1583-L1701 It's a lot to look at, but I've linked to the section that we use to load the level so it might be useful.
Here is the video:
https://clips.twitch.tv/AgileGracefulYakinikuPrimeMe that shows the first 2 levels. The first level the glitch happens in real time and the second level I pause buffered so it loads into the first frame and stays that way. I've included screen shots including what the second level looks like when loaded, although its a little dark (sorry about that, I just plucked an image from the paused twitch video).
Any ideas to recreate or theories would be helpful.
Thanks!
No. The only emulator I've tried is FCEUX although the person I'm developing with (link_7777) I THINK uses EmuHawk and didn't see the issue either. Is there a reason why Mesen might show something different than FCEUX or does it have a feature that might help? (BTW, i'll try Mesen anyway bc I assume it'd be a better emulator to be using when I dev on my mac)
Mesen is more accurate overall. For example, it doesn't start RAM with all zeroes, which is more accurate.
Also it presents some more debugging features.
mitch3a wrote:
The only emulator I've tried is FCEUX
As a developer, you should not rely on a single emulator, specially if accuracy isn't one of that emulator's strongest suits. FCEUX is a great debugging tool, but a little lacking in accuracy.
Unfortunately I can't help much with C code, but if a ROM is included I will try to do a little debugging when I'm on my PC.
Attached the latest version. Just please don't go overboard on playing too far/gameplay feedback
Saving that for the competition and still hoping to make some significant changes before submission.
I had a similar bug before and it was because the OAM was not initialized properly and you would see some residual data at at startup. It could be the cause here.
It could be an issue with missing a frames worth of the DMA to the OAM (which we're doing), but in this case, because sprites are set and can't move, assuming nothing is corrupting it then the OAM should remain unchanged. The question would be (if that is the issue), why would it be getting corrupted on Everdrive and not powerpak/emu.
Different menu software using RAM differently, I guess. That and OAM DMA appears to use slightly different memory access timings compared to CPU code or data fetches, and early PowerPak revisions had to be modded with resistors to account for that.
Seems like we're starting to narrow down the issue. There are a handful of examples here that can corrupt the OAM so I'll probably step through one by one to verify if any are happening:
https://wiki.nesdev.com/w/index.php/Err ... nd_SpritesI'm also going to try Mesen to see if I can recreate the issue in an emulator. and take a harder look at PPU/OAM DMA writes... unfortunately the code isn't organized the best way to do this so it'll probably be a little error prone and take a chunk of time, but it definitely feels like it could expose my issue.
nesrocks wrote:
Mesen is more accurate overall. For example, it doesn't start RAM with all zeroes, which is more accurate.
Actually,
it does pre-initialise RAM with all zeros (that's the default). You can change/select this behaviour, however, but it's up to the user to change that: Options -> Emulation -> Advanced -> Default power on state for RAM: All 0s (Default), All 1s, or Random Values are the choices. I tend to recommend folks try all of them when testing a game they've worked on.
The subject of default RAM values on power-on has been discussed heavily in the distant and recent past, and there is no universal default that works for every game. Some games depend on certain values otherwise they act weird/different than on hardware, while others rely on the exact opposite/different values than the other games.
Anyway, let's not let this subject distract from the thread unless it turns out to be the root cause.
In general, testing on actual hardware still has to be done on real carts (read: not Everdrive, not PowerPak, etc.) for reasons demonstrated here. The implementation of those flash carts using native 6502 code and thus the NES's RAM can cause unexpected behaviour.
It's probably going to turn out to be some piece of code that is relying on values not initialised (incl. OAM), or possibly some PPU quirk ($2000/2005/2006), or timing thing. It almost always is on the NES.
Someone I know used to have an issue with Everdrive where sprites would mess up during lag frames in Zelda (IIRC, sprite RAM bits would get set). I believe this turned out to be a power supply issue, since the cart draws more power than is typical. During lag frames in Zelda, OAM DMA isn't performed, so maybe inadequate power was impacting OAM decay in some way. I notice here that your glitching is setting bits; for example, tiles from the health bar have been moved what looks like #$80 pixels to the right. So, seems similar.
The behavior of your code is that you do OAM DMA while rendering is off, and then enable rendering almost 30,000 cycles later. This should be plenty of time for decay. In fact, in Mesen, if you turn on decay emulation (Options -> Emulation -> Advanced -> Enable OAM RAM decay), you can see by viewing Sprite / OAM RAM in the Memory Tools that by the time you've enabled rendering, OAM has decayed. Mesen currently emulates this by setting decayed RAM to #$10 (and you get a frog head on the top left of the screen). You need to not wait a long time between OAM DMA and enabling rendering.
@sour: Speaking of which, could you make Mesen use random values on OAM decay? IIRC, the manual says they get set to random, I remember seeing a Mesen code comment saying they get set to a specific value, and then the code sets it to a different value. With everything at #$10, it's hard for me to see the glitching here because it goes under your pause icon.
As expected, the OAM was being corrupted on Everdrive so an extra DMA OAM was required (code fix here:
https://github.com/mitch3b/ChickenOfThe ... arm.c#L916). Didn't really investigate any more than giving it a try, but seemed to work and competition deadline approaching so I'll take it and move on. Thanks for the help. Always impressed by how quick/helpful you folks are!
Fiskbit wrote:
@sour: Speaking of which, could you make Mesen use random values on OAM decay? IIRC, the manual says they get set to random, I remember seeing a Mesen code comment saying they get set to a specific value, and then the code sets it to a different value. With everything at #$10, it's hard for me to see the glitching here because it goes under your pause icon.
Not sure why I wrote that they "randomly change" in the documentation.
This used to return $FF, which would hide all the sprites (that's what the comment says), but it was changed to $10 to make the sprites visible (and because it was a value that some people said their OAM bytes tended to decay to). Looks like I didn't change the comment to match.
I could change this to something even more obvious, though. E.g $05 multiplied by the sprite index, which in this case would result in this:
Attachment:
ChickenOfTheFarm_02_000.png [ 2.43 KiB | Viewed 7113 times ]
Pretty hard to miss, at the very least.
Also, there is a "Break on decayed OAM read" option in the debugger that will cause execution to break whenever a decayed byte is read by the PPU (which is the safest way to find these issues really). The only downside is that this requires keeping the debugger window opened.
That's certainly a lot more visible! I'm actually planning to do some OAM decay testing in the near future; I'd like to get an idea of how long it takes before decay occurs and how it tends to decay. Beyond just understanding it better, I'm curious if it could have utility (could it reasonably be used as a source for randomness?). For now, I'll be testing on Everdrive, which I can apparently expect to influence my results.
As far as emulation goes, for development, having realistic decay timings seems pretty important. High visibility might be helpful, but as you mentioned with the debugger feature, it's not necessary (though the debugger breaks on decayed reads even when sprite rendering disabled, which is of questionable utility and makes the feature pretty useless for this Chicken of the Farm ROM because of so many false positives). If flash carts do actually impact decay timings and/or behavior even on healthy power supplies, having an option to run with these more strict timings would also be helpful (though in the case of decay possibly being faster than vblank as I described with Zelda, I'm not sure what the right solution is. OAM DMA during lag frames is a bad idea because it can draw sprites that aren't fully written yet. This is especially noticeable in Metroid with the HUD).
For casual play, I like accuracy and would prefer standard realistic timings and realistic decay, especially given that it's possible a game could require this at some point if decay is sufficiently random.
(All of this depends on getting good data about how this tends to work across different consoles, of course.)
Fiskbit wrote:
I'm actually planning to do some OAM decay testing in the near future; I'd like to get an idea of how long it takes before decay occurs and how it tends to decay. Beyond just understanding it better, I'm curious if it could have utility (could it reasonably be used as a source for randomness?).
Early Famicoms and NESes don't let you read OAM at all, since OAMDATA ($2004) is a write-only register on those consoles, so even if it were a decent source of randomness, you wouldn't be able to use it on those consoles.
On the 2C02{A/B/C/D/E} you could theoretically still use sprite 0 hits to track decay.
But even then, the 2C07 never fully disables refresh and DRAM decay doesn't happen there.
Fiskbit wrote:
As far as emulation goes, for development, having realistic decay timings seems pretty important.
To be fair, I don't think anything beyond "the RAM will never decay in less than X milliseconds" is really necessary (and even then, I would expect this to vary to some extent from one unit to another?) And I don't believe OAM DMA needs to be run to preserve OAM RAM during lag frames? The PPU's regular rendering should keep the OAM refreshed, afaik.
Fiskbit wrote:
though the debugger breaks on decayed reads even when sprite rendering disabled, which is of questionable utility and makes the feature pretty useless for this Chicken of the Farm ROM because of so many false positives
This can be considered a bug - there's no reason to break the debugger because the PPU read a decayed byte if sprite rendering is disabled (unless the byte was read through OAMDATA) - I'll fix this.
In terms of existing licensed/bootleg games, it can probably be assumed that the majority of them do not have decay-related issues and are not affected by OAM decay. "Proper" OAM decay emulation is probably not really possible since it's a physical phenomenon that is essentially random - it's kept as an option in Mesen to help homebrew developers and romhackers, but since it would likely reduce compatibility with older romhacks/homebrew/etc, I don't really plan on ever enabling it by default.
lidnariq wrote:
But even then, the 2C07 never fully disables refresh and DRAM decay doesn't happen there.
Speaking of which, I discovered yesterday that nocash's Magic Floor game actually is affected by OAM decay (which makes sense since he most likely built it on a PAL NES). I had kept the decay option turned on after posting that screenshot and for a little while, it got me convinced that I had somehow broken the mapper's emulation with the code refactoring that I was doing.
I didn't know that the PAL NES never stops its OAM refresh cycle, even when rendering is off... Does that mean that the typical first program everyone makes for the NES (draw a background, set the palette, DMA the sprites and turn rendering on, without any regard for PPU synchronization) is very likely to end up with corrupted sprites?
Realistic decay timings would involve simulating the temperature of the chip since it turned on, and probably also accounting for the temperature of the environment around it. A room temperature setting on an emulator would be interesting, ha ha ha.
I think there was a story about Nintendo demonstrating a baseball prototype and using a can of compressed air to cool the PPU to keep the decay bug from showing in the demo.
rainwarrior wrote:
I think there was a story about Nintendo demonstrating a baseball prototype and using a can of compressed air to cool the PPU to keep the decay bug from showing in the demo.
That or it was a story by Andrew Davie on the NESdev Yahoo! Group about proving to Nintendo that one of the consoles used for lot check on
The Three Stooges was malfunctioning.
@Nicole/lignariq: Yeah, earlier hardware not supporting $2004 reads is an unfortunate limitation, for sure. lidnariq's sprite 0 trick is a neat workaround I hadn't thought of before.
@Sour: The lag frame case I was talking about was specifically with sprite corruption I observed during lag frames with Zelda being run on an Everdrive, which was fixed by using a different AC adapter. Whether this was an issue with faster decay or not, I don't know, but mitch3a's problems only being noticeable on Everdrive definitely makes me wonder.
Defaulting to decay-off seems reasonable particularly given the older homebrew constraint.
@tokumaru: My understanding based on results from
this thread is that PAL enables OAM refresh during scanlines 24-69 to prevent decay during PAL's longer vblank period, during which OAM is not accessible by software (this is why it's wise to do OAM DMA first on PAL-specific software before other vblank work). It does this even when rendering is disabled. Whether you get decay on PAL with rendering disabled depends on whether the rest of the frame is long enough for decay, which seems likely to me given that it's a much larger time window than what this forced refresh was added to cover.
If you did OAM DMA with rendering off on PAL with no regard for timing, there's a nontrivial chance it would overlap that forced refresh region, which I would expect to cause corrupted sprites.
Fiskbit wrote:
@tokumaru: My understanding based on results from
this thread is that PAL enables OAM refresh during scanlines 24-69 to prevent decay during PAL's longer vblank period, during which OAM is not accessible by software (this is why it's wise to do OAM DMA first on PAL-specific software before other vblank work). It does this even when rendering is disabled. Whether you get decay on PAL with rendering disabled depends on whether the rest of the frame is long enough for decay, which seems likely to me given that it's a much larger time window than what this forced refresh was added to cover.
According to the
wiki, there's forced refresh after 21 scanlines of vblank, but also during the visible part of the picture, even if rendering is disabled. This means that OAM decay *never* happens in PAL systems, because except for those 21 scanlines, OAM is constantly being refreshed.
Quote:
If you did OAM DMA with rendering off on PAL with no regard for timing, there's a nontrivial chance it would overlap that forced refresh region, which I would expect to cause corrupted sprites.
The problem is that apparently the forced region is most of the frame (291 scanlines out of 312, or 93% of the time), so there's a huge chance that the typical beginner's program (which does a single OAM DMA whenever) will not work correctly on PAL. Has this ever been reported by anyone before?
Fiskbit wrote:
My understanding based on results from
this thread is that PAL enables OAM refresh during scanlines 24-69 to prevent decay during PAL's longer vblank period, during which OAM is not accessible by software [...] Whether you get decay on PAL with rendering disabled depends on whether the rest of the frame is long enough for decay, which seems likely to me given that it's a much larger time window than what this forced refresh was added to cover.
You would think that. And yet ... decay hasn't been observed on 2C07 ever.
So it can't
just be evaluating OAM = refreshing DRAM during those specific scanlines.
Interesting. So, the wiki references
this thread where thefox couldn't get OAM decay to occur on PAL, citing this as the source for having 20 safe scanlines (which was considered approximate). I don't know if that's also the source for the statements about refresh occurring everywhere but early vblank or not. Looking at the thefox's results from Sour's tests again, with rendering disabled, there were tests and test runs where reads outside of vblank worked, and ones where they didn't. It looks like we never got clarity on why this was the case, so there appears to be further room for research here.
These tests did refine the measurement of the safe vblank region to the first 24 scanlines, however, which is good because that is large enough to cover the length of NTSC vblank.
A few years ago, when I was programming The Banketh, I also had a series of graphic and sometimes ROM problems that proved my progress.
In the PC emulator it worked correctly. When I was recording on PRG and CHR on the board, it also worked correctly. But when I put the ROM in the SD and threw it in the Everdrive, errors appeared.
In the end it turned out that the ROM was not recorded correctly in the SD that it used and the solution was to change SD.
Once this was done, the ROM already worked well in the Everdrive.
I ran thefox's oam-decay-test.nes on my Everdrive and found the results pretty interesting. If I provide no input at all (so rendering is enabled, but OAM is not being rewritten each vblank), I lose 2 sprite tiles (tiles 48 and 56), usually at the same time, after about half a second, and then 4 more (tiles 26, 38, 46, 54) simultaneously at about 3 seconds, and occasionally another (tile 50) a little later. These results are pretty consistent. I also often have a gap of 2 missing, consecutive tiles immediately after reset, though never when opening the ROM from the Everdrive menu, and which 2 tiles seems fairly random.
If I hold A to induce decay, then the tiles scramble and eventually disappear, as expected, except for sprites 0 and 1, which remain untouched even after substantial time without rendering (1+ minute).
I asked someone with video capture to run the test on his own console+Everdrive and the results are similar to mine, but the tile loss with rendering enabled is even more extreme than on my hardware.
This clip shows tile loss while rendering is enabled, as well as the 2 missing tiles on reset.
This other clip shows decay from disabling rendering, with the same behavior of tiles 0 and 1 remaining after everything else has disappeared.
Fiskbit wrote:
except for sprites 0 and 1, which remain untouched even after substantial time without rendering (1+ minute).
This is actually as expected... "pclk0" never stops, and one of the rows of DRAM is always selected. So one row's worth of data, which if OAMADDR is 0 is the eight bytes for sprites 0 and 1 and the first byte of secondary OAM, won't lose its value when refresh is disabled.
lidnariq: Thanks, that makes sense!
I threw together my own OAM decay test because I wanted to gather more information (in particular, adding a numbered graphic for every possible sprite tile ID so they don't just disappear when the ID decays). Curiously, the unusual behavior I encountered with thefox's test is not present with mine: tiles aren't lost when rendering is enabled, and all of the tiles are present on reset [Edit: The tile IDs also don't change, as far as I can tell; there is no apparent change in properties for any tile]. I'll have to dig into his test to figure out what is functionally different.
My NMI loop is pretty different (disables NMIs after first triggering), but I had a more traditional setup along the way that didn't seem different at all. The one major difference between these tests that I'm currently aware of is that mine reads input in a tight loop, enabling and disabling rendering immediately regardless of where we are in rendering the current frame. Sprites 0 and 1 are not always the ones to be unaffected by decay, which I presume is because of this difference in timing.
I've attached the test in case anyone wants to play with it or look at the source.
Fiskbit wrote:
The one major difference between these tests that I'm currently aware of is that mine reads input in a tight loop, enabling and disabling rendering immediately regardless of where we are in rendering the current frame. Sprites 0 and 1 are not always the ones to be unaffected by decay, which I presume is because of this difference in timing.
Yeah, OAMADDR is known to be left at whatever random value when you disable rendering.
We've found that disabling rendering at the "wrong time", even if there are no sprites on that scanline, will corrupt OAM ... at least on the 2C02. I don't know about the 2C07.
Turning rendering off in PPUMASK ($2001) before the PPU has finished evaluating sprites for that line (x=192 for lines with no sprites, x=240 for lines with at least one sprite) can corrupt OAM, leading to sprite flicker.
I've made some progress on the issue with tiles disappearing and have attached a test for this. What I've found on my hardware is that by loading a value from zero page in a loop, bit 7 of the ID of some sprite tiles can become set without any clear cause. Which tiles are affected, if any, depends on the value being loaded from memory.
In the attached test, the dpad can be used to adjust the value (up/down by #$10, left/right by #$01). The current value is displayed using background tiles. On my console+Everdrive, some multiples of #$10 (value #$20, for example) reliably cause immediate change in tile ID. This is seen as a change in sprite color because tile IDs #$01 and #$81 are differently colored squares. All other sprite tiles are fully transparent in this version. The test works by having a tight loop where the value is loaded from zero page. The NMI handler does joypad reads, value adjustment, and background updates. OAM DMA for the diagonal line of sprites is only performed once, just before rendering is first enabled. The NMI doesn't seem to impact the OAM corruption; the glitching still occurs when setting the address to #$20 before the loop and never enabling NMIs.
What I have not yet tested, but intend to, is whether the zero page address matters. In my development of this test while things were still very much in flux, I was unable to get glitching when loading from $00 (the test loads from $1A), loading from an absolute addresses, doing nothing, or writing to zero page, but did get glitching with indexed zero page reads. Consider those results tentative, however; I plan to do more testing on those now that I have the test in a good state.
thefox's test encountered this issue because the NMI incremented a zero page frame counter, which was being waited on outside the NMI for the majority of the frame. Tiles would disappear at different times because of the time it would take for the counter to reach the value that affects those tiles. The other issue I've seen in his test (2 tiles missing on reset) is also present in this test. I've looked into it only a little bit. I know that the tile IDs aren't getting corrupted because the tiles remain invisible when I ensure all sprite tiles are non-transparent. I also encountered some resets where the tiles were visible, but at incorrect positions.
Finally, I asked BMF54123 to run this on his PowerPak and Everdrive. He was unable to reproduce the issue at all on either of two frontloader consoles (E and G revisions, I believe) or AV Famicom, though he did see the 2 missing tiles on reset on both frontloaders with both carts and the Famicom with the Everdrive (but had trouble reproducing the behavior on Famicom+Everdrive when going back to it). Given that changing to another AC adapter fixed similar glitching I've seen with bit 7 getting set on lag frames, I asked BMF if he could try another, but all of his have new caps installed. I still don't know if the power supply is a red herring or not, but I do find it interesting that it didn't happen on any console he tried, but happened for me and (with thefox's test) the person who I clipped on Twitch.
That's what I've got for now. Not too sure how to proceed on the tile ID bit 7 corruption issue at this point.
I decided to look into when reads need to be done in order to cause corruption. I first added a variable sprite 0 hit that controlled when the reads would begin and found that the corruption did not occur regardless of what visible scanline I started doing the reads on. I changed it to add a variable delay to the NMI handler to reduce amount of vblank time available to the read loop and found that this does indeed control whether the corruption happens. The size of the read loop window within vblank and the density of reads in that window determine whether corruption occurs and which / how many tiles are affected.
I've attached a new version of the test. This contains two ROMs. oam_corruption_stress_test.nes is a stress test that reads a frame counter in a fairly unrolled loop (63 zero page loads per loop) for almost all of vblank. In my testing, most corruption is instantaneous, so the frame counter simply increments once per frame and should cause most vulnerable tiles to corrupt within a few seconds. oam_corruption_test_v2.nes uses the same unrolled read loop, but allows the value being read to be controlled by the user via the dpad as well as a delay (9 CPU cycles per unit) via B/A. The delay can be used to find the smallest vblank window capable of corruption.
Note that both of these ROMs pack loads in tighter than the first version of the test, so corruption should be more likely. In my testing, the lowest delay I can go without seeing any corruption at all is 97, which means we start the read loop at about scanline 255 dot 240 (give or take typical NMI jitter). That window is only about 1/4 of vblank.
Edit: I got my hands on a Genesis model 1 AC adapter and gave that a try. Literally no difference; 97 is still the lowest I can go without corruption.
I cannot understand the physics of how you're getting these results.
Visual2C02 shows that the external data bus ("io_dbX") is only copied onto the internal data bus ("_dbX") when "_io_dbe" is asserted. "_io_dbe" is only asserted when the PPU is selected ("_io_ce" is low) and R/W is low.
"_io_ce" externally comes from the 74'139, and is only true while M2 and A13 are high and A14 and A15 are low. So I can see no mechanism why access to just zero page would cause this—your code is executing from $Cxxx, so at no point should A13 even bounce high for a glitchy moment....
Thanks for looking into this at all, though I was hoping it might make more sense to you! There's still a little more testing I can do on this, but I'm definitely reaching a point where I probably need to make a dev board to see how this behaves without a flash cart involved. I'll have to look into what parts I need for that.
Tonight, Eunos streamed these tests on Twitch for me, so I have some results to share from that. He was able to test an AV Famicom with a Famicom Everdrive, and two frontloaders with an NES Everdrive. Like with BMF54123's testing (also done on a Famicom Everdrive), the Famicom didn't exhibit any unusual behavior (though we didn't test for the 2 missing tiles on reset, which BMF did see). However, the frontloaders showed a tremendous amount of corruption of higher bits for ID and position. For tile colors, light blue is #$01, dark blue is #$81, and all other tiles are a mid-blue, which shows up in the first clip. Disregard the capture issue on the right of some of the clips, which is unrelated to the NES.
Frontloader 1 stress test (shows tile ID and position corruption)
Frontloader 2 stress test (shows tile ID bit 7 being both set and cleared)
Frontloader 1 delay test (corrupts around C1, ~scanline 258, cycle 339)
Frontloader 2 delay test (corrupts around 97, similar to my console)
Glitching in Zelda on Everdrive (left side of Link sprite when initiating scrolls, corresponding to 2 frames in normal overworld scrolling)
He also tried 3 different AC adapters on the frontloaders, but it made no clear difference, so that seems pretty unrelated. The full VOD is
here for now, but the clips above are the juicy bits.
For no good reason, the only flashcart I ever bothered to build was a mapper 218 one with only 8KiB of EEPROM. If your test can be crammed into that, I can see what happens.
Alright, I've attached 8 and 16 KB mapper 218 versions of the stress test from the v2 zip. I didn't have any luck getting an 8 KB ROM to work in my current version of Mesen using the NES 2.0 exponent-multiplier format, so I used the 16 KB version for testing and it seems to work fine. The 8KB one does not contain a header, so it should be ready to throw on a chip. Note that this requires horizontal mirroring; unless this causes very severe corruption, sprite tiles #$01 and #$81 need to be different for the corruption to be visible.
I didn't bother converting the other test for now, but can if this causes corruption. The other has to do much more work in the NMI, which reduces the maximum window for loads during vblank; it's really just for finding what values trigger corruption and how long is required in vblank.
You will probably be dismayed to find out that the 64 sprites stayed on-screen, light blue, and nothing ever changed.
(I initially incorrectly rewired the cart, in the PPUA10 variant instead of the needed PPUA11 variant, and then there was garbage in the upper right corner and the sprites all stayed dark blue, but that's not surprising)
Approximately 25% of the time when I (re/)boot it, two sequential sprites are missing.
Well, it's an interesting result, nonetheless, and I definitely appreciate the testing. Right now, it's not clear to me if this behavior is something specific to the console or the cart being tested (or the combination). Notably, this has only been reproduced on the NES-style Everdrive N8 (3 out of 3 tested), with no glitching on a dev board, a PowerPak, and 2 Famicom-style Everdrive, so I need to try to do testing with the NES Everdrive and some other method on the same console. I'll look into putting together a dev cart, and will probably have an opportunity to test on PowerPak and Famicom Everdrive in June. Does it seem plausible to you that the cart could cause this?
Definitely good to hear that you at least had 2 missing tiles on some resets and even on boot; I'd been unable to test the latter case. This issue seems to happen regardless of the hardware. I'll try to figure out what difference between these tests and my first OAM decay test causes this.
Fiskbit wrote:
Does it seem plausible to you that the cart could cause this?
Not really, but I've at least heard of the Everdrive causing other weirdness in the past.
And since you point out it doesn't happen on the FC everdrive nor the powerpak, so there's something extra bonus weird on the N8.
I got the opportunity to test this on some additional hardware a couple weeks ago. On my NTSC frontloader NES (NES-CPU-07, Rev G), both the NES and FC Everdrive N8s produced the same OAM corruption with oam_corruption_stress_test.nes, but the PowerPak did not produce any corruption. Notably, however, none of these same 3 flash carts produced any corruption on an A/V Famicom. I don't know if this means that model is immune to the problem or if that particular console is simply not susceptible to decay, since the severity of the decay seems to vary from one console to another. But, I think it's safe to say this decay issue is caused by something the Everdrive itself is doing.
The other interesting result was that the NES Everdrive on the A/V Famicom had substantial graphical problems in the form of bad slivers and IRQ mistiming. The latter, for example, caused the status bar to jump around in SMB3. The FC Everdrive didn't have any of these problems, though. The obvious difference is that the NES Everdrive had to be connected through an adapter, but the PowerPak on that same adapter had no problems at all, and no amount of cleaning or reseating seemed to make a difference for the Everdrive.