I was reading the manual of SGDK, a widely used software library for programming Sega Genesis homebrew, and the standard practice of reading the controllers in the vertical blanking interrupt handler and calling a button event listener from this handler surprised me. But in order to be prepared when I ask on SpritesMind as a new user why it's commonly done on Genesis, I'd like to have it articulated for the record why it's
not commonly done on NES.
sys.c in SGDK has a standardized vblank interrupt handler whose NES counterpart would read as follows:
Code:
.zeropage
intTrace: .res 1 ; flags for what interrupts are being handled
.bss
user_vint_before_vram: .res 2
user_vint_after_vram: .res 2
.code
nmi_handler:
pha
txa
pha
tya
pha
; Notify of entering interrupt context
lda #IN_VINT
ora intTrace
sta intTrace
; Do vblank tasks
jsr call_user_vint_before_vram
jsr push_vram_updates
jsr push_palette_updates
jsr ppu_set_scroll
jsr call_user_vint_after_vram
jsr audio_update
jsr joy_update
; Notify of leaving interrupt context
lda #<~IN_VINT
and intTrace
sta intTrace
pla
tay
pla
tax
pla
rti
call_user_vint_before_vram: jmp (user_vint_before_vram)
call_user_vint_after_vram: jmp (user_vint_after_vram)
joy.c reads the controllers as part of the vblank handler using code which, glossing over support for controllers other than the standard controller, is analogous to this:
Code:
.zeropage
joy_state: .res 2
.bss
user_input_event_handler: .res 2
.code
joy_update:
ldx #1 ; strobe both controllers
stx $4016
dex
stx $4016
jsr joy_update_port_x
ldx #1
joy_update_port_x:
ldy #$01
buttonloop:
lda $4016,x
and #$03 ; nonzero if button on hardwired or plug-in controller is pressed
cmp #$01 ; carry set iff pressed
tya
rol a ; Save this bit
tay
bcc buttonloop ; once initial $01 gets shifted to carry we're done
tya
eor joy_state,x
sty joy_state,x
beq no_buttons ; skip the callback if nothing changed
jmp (user_input_event_handler)
no_buttons:
rts
; user_input_event_handler is called with
; X: controller port (0 or 1)
; Y: currently held buttons
; A: buttons that have changed since last frame
; It is also called in interrupt context, which means behind
; the back of the game logic and with all the game logic's
; call stack on the stack. A program might get confused
; in the case of a lag frame.
Why is a program structure like this, polling input in the NMI handler and calling a button state change event listener from the NMI handler, not commonly used on the NES? I know "lag frames" and "possibility of stack overflow" and the like, but explain it like I'm five because after I have coded for NES, Super NES, Game Boy, and Game Boy Advance, the Genesis
will make five.
I think regular polling via interrupt has one important use case: detecting a button press during "lag" frames.
If all your poll does is update the gamepad state variable, then it is useless. Unless you're specifically queuing an onset or release event that can be processed at the next opportunity, then there is no point in doing a poll that you're just going to discard.
In practice, on older systems this probably applies most to transition between rooms/screens where you don't normally do any input response for several frames. On modern systems, unpredictable multi-frame lag is commonplace even during regular gameplay, but event queues are also commonplace.
So I think the primary case here is: do you need to respond to a very transient button press and release (pair) that was made during some transition? If the answer is no, then you probably shouldn't be doing extra reads in NMI that you don't use. It can also open you up to threading issues with code that checks input state in various places (one more thing to be careful about).
I'd suggest that most NES games don't need this. "Normal" slowdown to only 30hz (or even 20hz) isn't really much of a use case for this. People just don't press and release buttons that fast. You need several consecutive frames without input response before it really matters.
Of course there's also your OAM DMA synchronization method for avoiding DPCM conflict, but if you know why and how you're doing this you don't need to refer to general rules, you need to think very critically about it. That is a special case technique, and certainly not for beginners.
Tecmo Super Bowl checks the joypad as part of the NMI unless a "task busy" flag is set, indicating a task is in the process of being created, ended, or suspended.
There is one mechanic in the game that requires this, though. How fast you press the A button determines whether you break a tackle or get tackled by another player. It counts the number of A presses in a 64-frame window. Fast A tappers can hit 18-20 presses in a second.
tepples wrote:
I was reading the manual of SGDK, a widely used software library for programming Sega Genesis homebrew, and the standard practice of reading the controllers in the vertical blanking interrupt handler and calling a button event listener from this handler surprised me. But in order to be prepared when I ask on SpritesMind as a new user why it's commonly done on Genesis, I'd like to have it articulated for the record why it's not commonly done on NES.
Ideally, the game would read the player control and process the player's movements as the last step before committing to the contents of the next video frame. Reading the controller in NMI, after everything about the upcoming frame has been determined, would add more controller lag than reading it later in the frame. The controller lag might be more consistent, however, which could be a good thing for some games. If one is using the DMC for raster-split effects, the controller should be read from within an IRQ to ensure reliable results, but one could use an IRQ which is rather far down the frame so as to make the controller lag be short as well as consistent.
Note that if the main-line code is responsible for acting upon controller inputs, and if there's any possibility that it might lag, one should not have the controller-scanning logic try to set a "new button push" flag for a frame and clear it afterward, since a new button push that occurs on a lag frame might get missed. Instead, it may be better to either have the main-line code decide whether a button push is new based upon how the button state compares with the last state
it saw from the IRQ/NMI poll, or have the IRQ/NMI keep a count of how many button pushes it has seen and have the mainline keep a count of how many it has processed; the main line could then process a button push any time the counts don't match without ever missing any events unless the button gets pushed 256 times between polls.
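For what it's worth, the counter scheme described above can be sketched in C (names are mine, purely illustrative; on the NES the same idea is just two zero-page bytes compared with a subtraction):

```c
#include <stdint.h>

/* Incremented by the NMI/IRQ poll each time it sees a new press.
   (All names here are made up for illustration.) */
volatile uint8_t nmi_press_count;

/* Kept by the main-line code. */
static uint8_t processed_count;

/* Nonzero if an unprocessed press is pending. Wrap-around of the 8-bit
   counters is harmless as long as fewer than 256 presses pile up
   between polls, exactly as described above. */
int press_pending(void)
{
    return (uint8_t)(nmi_press_count - processed_count) != 0;
}

void consume_press(void)
{
    processed_count++;
}
```

Because the interrupt side only ever writes one counter and the main line only ever writes the other, no event is lost even if an NMI lands between the two statements.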
hackfresh wrote:
Fast A tappers can hit 18-20 presses in a second.
For most people the number is more like 10, or lower. Here's a
famous video of a former mashing champion doing 13 (supposedly he did 16 in his prime). Here's a
vibrator assisted attempt that actually gets 20.
Though, personally, I think it's immoral to ask a user to destroy their controller and/or hands by rewarding them for doing harshly repetitive actions, but sure, if this is what you want to encourage polling in NMI can make this lag-resistant. (Alternatively you could just ensure the mashy part of your game doesn't lag-- and if your whole game is mashy, then I hate you.)
If you're not rewarding a player more for demi-human mashing speeds, then it's unimportant whether you're capturing the mash at 60 or 30 or 20 or even 15hz... all of these are capable of detecting a normal "rapid" mashing perfectly well. (BTW, a lot of people, especially young children, will have trouble doing even 5 taps per second.)
Otherwise, the question is for situations like: if I quickly tap and release B during the fadeout between rooms, should I fire a bullet immediately when the next room fades in? ("No" is probably a perfectly reasonable answer, but it's really up to you.)
rainwarrior wrote:
For most people the number is more like 10, or lower. Here's a
famous video of a former mashing champion doing 13 (supposedly he did 16 in his prime). Here's a
vibrator assisted attempt that actually gets 20.
Though, personally, I think it's immoral to ask a user to destroy their controller and/or hands by rewarding them for doing harshly repetitive actions, but sure, if this is what you want to encourage polling in NMI can make this lag-resistant. (Alternatively you could just ensure the mashy part of your game doesn't lag-- and if your whole game is mashy, then I hate you.)
Agreed. I
hate the mechanic myself, especially the way the game is played competitively; it definitely skews things quite a bit toward those who can tap fast. Also, I should have said 18-20 only happens occasionally, and it's technically not in one second but in 64 frames. The top guys average ~13-16 per second.
Without getting into why games that require you to tap at inhuman speeds might have problems (there's no room for debate, that kind of design sucks) I will say that I read the controller inputs immediately at the start of vblank
in my megadrive framework. I have not had problems with this design.
tepples wrote:
But in order to be prepared when I ask on SpritesMind as a new user why it's commonly done on Genesis, I'd like to have it articulated for the record why it's not commonly done on NES.
Looking back toward the context of this original question...
Well, just look at joy.c. This probably works OK on Genesis, but would be ludicrously overcomplicated for an NES input handler. It's probably somewhat overcomplicated even for Genesis, but it's not at all overcomplicated compared to a generic library like SDL.
What's appropriate for a very generic library on a (relatively) high powered platform is simply not appropriate in all cases.
There are no one-size-fits-all rules for this. So... apples to oranges, IMO. Yes, there's some overlap between what NES or Genesis might do, but SGDK is
completely outside that overlap. SGDK looks OK to me for what it is, though.
For SGDK it's not just simply "polling in vblank" or not, there's a bunch of library functionality built specifically on top of polling in vblank (event callbacks, waiting on button presses and counting time held, etc.). The alternative isn't an option here, you can't just flick a switch, that'd be a big redesign. The choice was made, and a bunch of stuff was built upon it. Features you don't necessarily need, but are okay to have in this kind of library context.
I'm not trying to make an argument that polling the controller in vblank
(on NES) is
bad, it can work perfectly fine, but I do think it brings a layer of complexity (threading) to input that you don't need otherwise... so I really wouldn't recommend it as a default option
(on NES).
(...and on a modern platform, it's a very standard thing to do, but like I said so are event queues and library APIs and other related things that are totally out of scope for NES.)
Edit: clarify where I'm talking about NES specifically.
rainwarrior wrote:
For SGDK it's not just simply "polling in vblank" or not, there's a bunch of library functionality built specifically on top of polling in vblank (event callbacks, waiting on button presses and counting time held, etc.). The alternative isn't an option here, you can't just flick a switch, that'd be a big redesign. The choice was made, and a bunch of stuff was built upon it. Features you don't necessarily need, but are okay to have in this kind of library context.
And which get linked into the executable, occupying ROM and RAM, even if unused.
rainwarrior wrote:
(...and on a modern platform, it's a very standard thing to do, but like I said so are event queues and library APIs and other related things that are totally out of scope for NES.)
Any platform that uses optical discs or a multitasking OS probably provides library support for a thread-safe queue data structure, which makes this sort of event-based paradigm practical in the first place.
One reason not to read controllers in the NMI handler is the consistency of controller state within the same logic frame. If a lag frame happens, and an NMI fires in the middle of that logic frame, it's possible that different controller states will be readable in that same logic frame. Of course you could circumvent that by not accessing the raw controller state byte directly from the game logic loop, instead making a copy of it when each logic frame begins and using that, but what good would that do if states would still be dropped during lag frames?
I guess that reading controllers in the NMI would only make sense if you OR'ed the new state with the previous one so that button presses would accumulate during lag frames until the game logic was ready to copy that accumulated state for consumption, clearing the buffer to receive new button presses. This would ensure that presses no shorter than 1/60th were always detected and handled, but you'd still have problems with quick tapping.
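A C sketch of that accumulation scheme (variable names invented for the example; on the NES this is just a couple of ORA/STA instructions in the NMI handler):

```c
#include <stdint.h>

static volatile uint8_t pad_accum; /* presses gathered since last consume */
static uint8_t pad_prev;           /* previous raw state, poll-side only */

/* Called from the NMI poll with the freshly read button byte:
   only 0->1 transitions are accumulated. */
void nmi_accumulate(uint8_t new_state)
{
    pad_accum |= (uint8_t)(new_state & (uint8_t)~pad_prev);
    pad_prev = new_state;
}

/* Called once per logic frame. Note the read-then-clear pair is not
   atomic: an NMI landing between the two statements can still lose a
   press, which is exactly the clobbering problem raised later in the
   thread. */
uint8_t consume_presses(void)
{
    uint8_t p = pad_accum;
    pad_accum = 0;
    return p;
}
```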
mikejmoffitt wrote:
I read the controller inputs immediately at the start of vblank
in my megadrive framework. I have not had problems with this design.
I don't know about the Mega Drive, but on the NES, where vblank time is at a premium, wasting part of it on a task that could literally be performed at any other time, would be completely nonsensical.
tokumaru wrote:
consistency of controller state within the same logic frame.
Thank you. This paragraph is the sort of reply I was looking for.
tokumaru wrote:
I guess that reading controllers in the NMI would only make sense if you OR'ed the new state with the previous one so that button presses would accumulate during lag frames until the game logic was ready to copy that accumulated state for consumption, clearing the buffer to receive new button presses.
Which leads me to the bigger problem that I found with SGDK's approach: how to safely "copy that accumulated state for consumption". Normally I guess the event handler would do something like this:
Code:
input_callback:
pha ; 1 bits correspond to buttons whose state changed
and joy_state,x ; 1 bits correspond to buttons just pressed
ora pressed_keys,x
sta pressed_keys,x
pla
eor #$FF ; 0 bits correspond to buttons whose state changed
ora joy_state,x ; 0 bits correspond to buttons just released
eor #$FF
ora released_keys,x
sta released_keys,x
rts
Then the main thread would do this:
Code:
ldy #0
lda pressed_keys,x
sty pressed_keys,x
But consider what happens in a lag frame when an NMI occurs at exactly the wrong place. The main thread would do this:
Code:
ldy #0
lda pressed_keys,x
<NMI>
sty pressed_keys,x
If the input callback ran during NMI and added key press or release events, those events just got clobbered.
rainwarrior wrote:
So... apples to oranges, IMO.
Would it be likewise "apples to oranges" between the NES and Super NES?
tokumaru wrote:
mikejmoffitt wrote:
I read the controller inputs immediately at the start of vblank
in my megadrive framework. I have not had problems with this design.
I don't know about the Mega Drive, but on the NES, where vblank time is at a premium, wasting part of it on a task that could literally be performed at any other time, would be completely nonsensical.
The big difference here is just how much time is allotted during the vblank in terms of CPU cycles. Input reading is something that is so quick that it makes sense to square it away without a lot of thought. However, my response was just pertaining to the Megadrive, as tepples asked about a detail in SGDK. I am fairly against the idea of calling an input handler, though, and prefer to allow game logic to read the last-good input state.
tepples wrote:
But consider what happens in a lag frame when an NMI occurs at exactly the wrong place. The main thread would do this:
Code:
ldy #0
lda pressed_keys,x
<NMI>
sty pressed_keys,x
Yes, that'd be a problem. Thankfully, at least on the NES, that copy would almost certainly happen at the very start of the logic frame, which takes place right after the previous NMI, so there'd be absolutely no chance of the pictured situation ever happening. That's just something the programmer needs to be conscious about, I guess.
If you read controller in NMI, your controller data can change in the middle of a frame. You don't want race conditions.
I imagine there are fighting games from the 90s era that quite possibly poll the joypad once in NMI and later somewhere in the main loop, or some similar model. Games that involve complicated combo moves would fall under this category, I'd think.
Street Fighter II reads inputs at the start of the game's polling loop. This is troublesome for emulator authors who don't update the virtual inputs frequently. If the game has a "turbo" setting on, it means that the number of game ticks that run per visual frame can vary, and with that, the frequency of input polls.
Dwedit - why would reading the gamepad during the NMI produce controller data changes mid-frame?
Just assuming you store your byte after you did the 8 joypad reads.
Race conditions with a vblank interrupt aren't necessarily a big deal. It depends how often you refer to your input, and what you do with it. If you handle your player control early on in a frame and never later on it'd be pretty unlikely to get clobbered even in a lag situation. There are lots of "convenient" reasons one might end up checking again later on in a frame though...
The main disadvantage is just that you've introduced another situation that
can go wrong if you're not aware of the threading / timing. Probably most of the time it wouldn't matter at all, but the kind of errors you get when you forget can be pretty insidious.
That's basically why I'd advocate not doing it
by default on the NES, but if you've got a reason you need it then sure go for it.
tepples wrote:
Would it be likewise "apples to oranges" between the NES and Super NES?
Ask me when you've got a concrete example for SNES, I don't want to get hypothetical. We're comparing a C games library API on Genesis to what's appropriate for NES, or at least, I was.
Polling in a vblank interrupt is a valid technique on NES, SNES, or Genesis. Polling on demand is also valid on all three of these platforms.
Should one way or the other be "standard" for any of these platforms? I don't care. I think I already stated what purposes it can be used for. If you've got a good use case for it, use it. If you don't, I'd probably recommend
not doing it for simplicity's sake. If you're using a library like SGDK, just roll with however it expects you to do things.
tepples wrote:
And which get linked into the executable, occupying ROM and RAM, even if unused.
No, they do not. These are modern tools, with LTO and other such conveniences the ancient platforms can only dream of.
LTO will notice you never call function X. That cascades. If you don't use any SGDK joy functions, they will be optimized out such that only the var update in the vblank func will remain.
tokumaru wrote:
One reason not to read controllers in the NMI handler is the consistency of controller state within the same logic frame. If a lag frame happens, and an NMI fires in the middle of that logic frame, it's possible that different controller states will be readable in that same logic frame.
I was going to say the exact same thing. This is really the only significant problem with reading the controller in NMI. I can't think of anything else. If the main code is designed robustly, either by making its local copies during decision-taking routines, or by making sure the potentially slow code is always executed after the code that uses the controller-read value and takes decisions, it's fine. But the danger of main code behaving weirdly, or, in the worst cases, even crashing, if the controller changes state during a lag frame, is there. This could also be a thing for speedrunners to exploit.
For similar reasons, it would be bad practice to poll the controller multiple times during the same frame (*) - unless the logic is solid, and for example once only the directional pad are used and the other time only the buttons are used.
(*) I'm not including cases where reading the controller multiple times is part of the read routine
Also I disagree with tepples that games usually don't read controller during NMI. This practice might be not recommended, but commercial games still do it, often. Just check the NMI routine of your favourite games.
The only real advantage of reading the controller in NMI I can think of, is that you don't need to call the controller reading routine at many different places across the game's code. This saves a tiny bit of ROM. Other than that I can't think of any advantages.
tokumaru wrote:
I don't know about the Mega Drive, but on the NES, where vblank time is at a premium, wasting part of it on a task that could literally be performed at any other time, would be completely nonsensical.
Doing it before the VRAM updates would be completely nonsensical, but doing it in NMI after all VRAM updates, I really don't see the problem. Just like playing the music engine, by the way. And reading controllers in a constant-cycle-timed routine is not hard to do, so it doesn't even clash with potential raster effects later in the frame that don't rely on sprite zero hit.
Yeah I'm also not sure what Tepples is talking about. It is commonly done in games. I once even thought it was preferable to have input reading in NMI, but Tokumaru taught me about the possible inconsistency it introduces, and I stopped doing it.
Also doesn't the SNES controller auto-reading function read the controllers in vblank or something?
Bregalad wrote:
The only real advantage of reading the controller in NMI I can think of, is that you don't need to call the controller reading routine at many different places across the game's code. This saves a tiny bit of ROM. Other than that I can't think of any advantages.
That could be done in the main loop as well. If you have a jump table for the various modes your game has in the main thread, you can just read the controllers before jumping through the table to the active mode's logic. As long as all modes are fine with the same controller-reading routine, a single call is enough.
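As a sketch (in C, with invented names), the single read feeding a mode dispatch table might look like:

```c
#include <stdint.h>

typedef void (*mode_fn)(uint8_t pads);

static uint8_t last_pads_seen; /* records what the active mode received */

/* Two example modes; both take the same already-read pad state. */
static void mode_title(uint8_t pads) { last_pads_seen = pads; }
static void mode_game(uint8_t pads)  { last_pads_seen = pads; }

static const mode_fn mode_table[] = { mode_title, mode_game };
static uint8_t current_mode;

/* Stand-in for the real controller-reading routine. */
static uint8_t read_pads(void) { return 0x41; } /* e.g. A + Right */

void main_loop_iteration(void)
{
    uint8_t pads = read_pads();      /* one read per frame...       */
    mode_table[current_mode](pads);  /* ...shared by whichever mode */
}
```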
calima wrote:
tepples wrote:
And which get linked into the executable, occupying ROM and RAM, even if unused.
No, they do not. These are modern tools, with LTO and other such conveniences the ancient platforms can only dream of.
LTO will notice you never call function X. That cascades. If you don't use any SGDK joy functions, they will be optimized out such that only the var update in the vblank func will remain.
LTO is only good if there is a means of indicating when a function call needs to be regarded as "opaque", and all functions where that is necessary are marked accordingly. When C was invented, there was no need for such a directive because all or nearly all extant compilers extended the language by treating cross-module function calls as opaque (in fact, such treatment was so universal it wasn't generally thought of as an "extension", but simply as part of how the language worked), and thus very little code that relies upon such treatment is marked to indicate that reliance.
Without such marking, something like:
Code:
volatile unsigned char data_ready_flag;
void wait_for_frame(void)
{
    while (data_ready_flag)
        ;
}
...
databuff[0] = 123;
wait_for_frame();
databuff[0] = 45;
wait_for_frame();
may get "optimized" to remove one of the stores to databuff[0].
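In case it's useful: with GCC or Clang, one common way to restore that opacity without making `databuff` itself volatile is a compiler-level memory barrier. This is a compiler extension, not standard C, and the buffer declaration here is invented since supercat's original elides it; treat the sketch as illustrative:

```c
volatile unsigned char data_ready_flag;
unsigned char databuff[8]; /* illustrative; the original example elides this */

void wait_for_frame(void)
{
    while (data_ready_flag)
        ;
    /* Tell the optimizer that any memory may be read or written across
       this point, so stores on either side of a call can't be merged. */
    __asm__ volatile ("" ::: "memory");
}
```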
Bregalad wrote:
(*) I'm not including cases where reading the controller multiple times is part of the read routine
If one performs an OAM DMA within the NMI routine, and keeps track of whether an even or odd number of cycles have elapsed between then and the code that will read the controller, one can avoid the possibility of DMC clobbering the controller read. Likewise if one reads the controller at a suitable time within a DMC IRQ. While one needs to be mindful of how controller inputs will behave in the presence of lag, and ensure that such behaviors will always be sensible, I think reading then at a consistent point each frame is for many purposes better than just reading them "whenever".
supercat wrote:
If one performs an OAM DMA within the NMI routine, and keeps track of whether an even or odd number of cycles have elapsed between then and the code that will read the controller, one can avoid the possibility of DMC clobbering the controller read. Likewise if one reads the controller at a suitable time within a DMC IRQ. While one needs to be mindful of how controller inputs will behave in the presence of lag, and ensure that such behaviors will always be sensible, I think reading then at a consistent point each frame is for many purposes better than just reading them "whenever".
I edited this note in quickly because, when I claimed that reading the controller twice per frame could be potentially harmful, I was sure someone would come up with an annoying remark about how this is useful for DMC. Apparently, even though I edited this in, I still didn't stop people from quibbling over it.
Bregalad wrote:
supercat wrote:
If one performs an OAM DMA within the NMI routine, and keeps track of whether an even or odd number of cycles have elapsed between then and the code that will read the controller, one can avoid the possibility of DMC clobbering the controller read. Likewise if one reads the controller at a suitable time within a DMC IRQ. While one needs to be mindful of how controller inputs will behave in the presence of lag, and ensure that such behaviors will always be sensible, I think reading then at a consistent point each frame is for many purposes better than just reading them "whenever".
I edited this note in quickly because, when I claimed that reading the controller twice per frame could be potentially harmful, I was sure someone would come up with an annoying remark about how this is useful for DMC. Apparently, even though I edited this in, I still didn't stop people from quibbling over it.
My point was that if one is using DMC for video frame timing and reads controllers from the main loop, one will have to read them multiple times to make them reliable, but the necessity of reading them twice suggests that the controllers should be read differently.
I'm not sure I understand what is bein' posited? I think every NES game I've made strobes the controller in NMI, but not necessarily in vblank. Are we talkin' about readin' the actual inputs that the user... inputs? Like testin' if they pushed the 'A' button and all that jazz?
Yeah, we're talking about reading the state of the controller's buttons. The main argument against doing this in the NMI handler is that the state of the controller will change in the middle of lag frames, which could cause problems depending on how the game uses the input data, while the main argument for reading the controllers in the NMI handler appears to be the possibility of using the OAM DMA to avoid controller data corruption caused by DPCM sample fetches.
haha What a weird debate. Obviously it will depend on the engine bein' used, because if you KNOW there will be no input lag due to lagging frames, then implement it. Alternatively, if you can implement a DMC bug squash with it, then do it.
The way I tend to program though, I would shy away from anything not PPU related during vblank. But that's because everyone has a different way to implement things hehe No one is right or wrong, if I'm reading this correctly. Just all about implementation?
It's not about being right or wrong. It's about different possibilities and pros and cons, but yeah it heavily depends on what you are trying to do.
Also it is not about input lag, but rather the opposite (logic lags but not the controller). It's about a possible bug if you have slowdown in your game and have the controller reading in NMI instead of the main thread.
If the main thread lags and an NMI interrupts it, having the controller reading in the NMI will update the controller state again before the main thread has finished the logic. So the first part of the logic is processing one controller state and the second part (after PC has returned from NMI) is processing an updated controller state, which would potentially be the cause of a bug, although possibly rarely.
If you can guarantee you have no lag frames there is no problem in having the controller reading in NMI though, since there will be an equal number of main thread logic frames as NMI frames.
Either way NMI isn't strictly for vblank related things. Your sound driver is best updated in the NMI or the music will not have a consistent beat if you have lag frames (some commercial games messes this up too).
The DMC fix is very advanced, and I'm not sure it's fully PAL compatible either.
Quote:
The DMC fix is very advanced, and I'm not sure it's fully PAL compatible either.
The theory behind it is in-depth, but the implementation is plug and play, provided there's nothing in your game that makes controller reads directly after OAM DMA unsuitable.
The PAL version doesn't have the DMC bug (it was fixed), so it should work either way you do it, with Rahsennor's cycle-alignment fix or not. There might be problems with several emulators, though - the even/odd observation was only highlighted a few years ago and emulators might not have the bug correctly implemented. Then again, you need games that use the DMC fix to create an incentive to fix emulation inaccuracies.
On a bigger computer, one should handle user input in the irq/nmi handler and write new keystrokes to a keyboard queue. Otherwise, it would be pretty frustrating if the software ignored all keystrokes or button clicks that occurred while the main program was busy with things like command execution, disk loading, or file decompression.
On the NES, it doesn't really matter - mostly because there are no slow mechanical disk drives, and smoothly working games are per design fast and responsive enough to handle user input in main program.
In general, I would say that it can be better to handle user input on irq/nmi level (but, out of laziness, I tend to do it in main program).
For the argument about state changing unexpectedly during game logic, almost everybody has already pointed out that that could be solved by using a local copy, so that's a non-issue. When not using a local copy, I am sure that one could implement the same problem with and without nmi-based reading...
Code:
if read_buttons<>0
start_game
elseif read_buttons=0
do_not_start_game
else
program_has_failed
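The local-copy discipline is a one-byte copy at the top of each logic frame (C sketch, names invented for the example):

```c
#include <stdint.h>

static volatile uint8_t nmi_buttons; /* updated behind our back by the NMI poll */
static uint8_t buttons;              /* frozen snapshot for this logic frame */

void run_logic_frame(void)
{
    /* A single-byte copy, so it can't be torn by an NMI. All game logic
       below reads only 'buttons', so two tests of it in the same frame
       (like read_buttons above) always agree. */
    buttons = nmi_buttons;
    /* ... game logic ... */
}
```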
nocash wrote:
On a bigger computer, one should handle user input in irq/nmi handler and write new keystrokes to a keyboard queue.
[...]
For the argument about state changing unexpectedly during game logic, almost everybody has already pointed out that that could be solved by using a local copy, so that's a non-issue.
Synchronization while moving an event from such a queue into such a local copy might prove tricky, which is a big part of why I started the thread in the first place. Apparently on some other platforms with IRQ but no NMI, synchronization is done by disabling all interrupts, reading the tail element, moving the tail, and reenabling interrupts. This causes the interrupts to fire later, once they are reenabled.
For keyboard queues, I would use a write pointer (updated by the NMI handler) and a read pointer (updated by the main program). That should work without needing to disable NMIs (unless using 16-bit pointers on an 8-bit CPU). But a queue is needed only if the main program is too slow to sense all press/release events at any time - yet still must not miss those events.
A simple game could get away with the raw button state (without a queue, and without distinguishing between clicks and doubleclicks). Just reading the raw state into the accumulator should be no problem: "a=[nmi_buttons]", or copy it to a separate variable: "[buttons]=[nmi_buttons]". The only problem would be reading [nmi_buttons] multiple times (once to process the fire button, and once again if the program had forgotten whether it had processed the fire button - code working that way would be a bit dubious).
But yeah, NMI handling needs to be more robust, and the main program needs to be aware that the NMI handler can change things in the background. To that extent it's easier, and perhaps more fool-proof, to read buttons without NMI.
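The write-pointer/read-pointer queue nocash describes can be sketched in C as a single-producer, single-consumer ring buffer (a model of the idea, not actual NES code; all names here are hypothetical). The NMI side only ever stores the event and then advances the write index; the main-loop side only ever advances the read index, so with single-byte indices neither side needs to mask the other:

```c
#include <stdint.h>

#define QUEUE_SIZE 16  /* power of two so the index wraps with a mask */

static uint8_t queue[QUEUE_SIZE];
static volatile uint8_t write_idx;  /* advanced only by the NMI handler */
static volatile uint8_t read_idx;   /* advanced only by the main loop */

/* Called from the NMI handler: push one press/release event.
   Drops the event if the queue is full rather than overwriting. */
void queue_push(uint8_t event)
{
    uint8_t next = (uint8_t)((write_idx + 1) & (QUEUE_SIZE - 1));
    if (next == read_idx)
        return;                 /* full: the main loop is too far behind */
    queue[write_idx] = event;
    write_idx = next;           /* single-byte store: atomic on a 6502 */
}

/* Called from the main loop: pop one event, or return 0 if empty. */
int queue_pop(uint8_t *event)
{
    if (read_idx == write_idx)
        return 0;               /* empty */
    *event = queue[read_idx];
    read_idx = (uint8_t)((read_idx + 1) & (QUEUE_SIZE - 1));
    return 1;
}
```

With 16-bit indices on an 8-bit CPU the index updates would no longer be single atomic stores, which is exactly the case where nocash says disabling NMIs would become necessary.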
Controller reads in NMI seemed like a natural thing for me to do. Since I use the DMC interrupt for scroll splits, there is no reason to have the DMC run during VBlank, so I'm free to stop it, read the controller in NMI in a consistent number of cycles this way (no need for DMC glitch fix) and restart the DMC at the end of VBlank again for timing (when sprite0 is cleared).
FrankenGraphics wrote:
Quote:
The DMC fix is very advanced, and I'm not sure it's fully PAL compatible either.
The theory behind it is in-depth, but the implementation is plug-and-play, provided there's nothing in your game that makes controller reads right after OAM DMA directly unsuitable.
The PAL version doesn't have the DMC bug (it was fixed), so it should work either way you do it, with Rahsennor's cycle-alignment fix or without. There might be problems with several emulators, though - the even/odd observation was only highlighted a few years ago, and emulators might not have the bug correctly implemented. Then again, you need games that use the DMC fix to create an incentive to fix emulation inaccuracies.
Problems with testing it on emulators are one reason I find this solution very peculiar; it only works in some situations. Doesn't it require the OAM DMA to come last in vblank? PAL requires OAM DMA to execute early in vblank, forcing you to use two different rendering routines for NTSC and PAL compatibility. And it seems there are limitations on how the actual controller-reading routine may look, which would further complicate the matter. The wiki says all the reads must be spaced an even number of cycles apart.
za909 wrote:
Controller reads in NMI seemed like a natural thing for me to do. Since I use the DMC interrupt for scroll splits, there is no reason to have the DMC run during VBlank, so I'm free to stop it, read the controller in NMI in a consistent number of cycles this way (no need for DMC glitch fix) and restart the DMC at the end of VBlank again for timing (when sprite0 is cleared).
As I've posted elsewhere, using the DMC for scroll splits is a good reason to leave it running all the time, but also a good reason to do the controller reading within the DMC interrupt handler.
pokun wrote:
Doesn't it require the OAM DMA to come last in vblank?
It's not a requirement, as far as I remember. There's the inconvenience that you get the order
OAMDMA, controller reads, other graphics updates, music.
rather than
OAMDMA, other graphics updates, controller reads, music.
for an update order that works on both systems.
So, mostly you're trading away the most convenient ordering for a bit of faster execution, with regard to what *needs* to be updated within vblank (graphics, chiefly).
This would be a sizeable chunk of cycles saved if you're reading 2 controllers (or more).
Quote:
and it seems there are limitations on how the actual controller-reading routine may look
Hmm... to what end do you want to alter the example code? Removing one of the two controllers is simple enough; it still aligns correctly.
FrankenGraphics wrote:
There's the inconvenience that you get the order
OAMDMA, controller reads, other graphics updates, music.
rather than
OAMDMA, other graphics updates, controller reads, music.
for an update order that works on both systems.
I was thinking this:
- if (PAL NES) { OAM DMA }
- VRAM updates
- if (not PAL NES) { OAM DMA }
- controller reading if requested
- audio
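That ordering can be written out as a small C sketch (purely illustrative; task names are hypothetical) to make the two per-system sequences explicit:

```c
#include <string.h>

/* Build the proposed vblank task order into `out` (an array of
   task-name strings); `is_pal` selects where OAM DMA lands.
   Returns the number of tasks. */
int vblank_order(int is_pal, const char **out)
{
    int n = 0;
    if (is_pal)  out[n++] = "oam_dma";   /* early, before PAL's forced refresh */
    out[n++] = "vram_updates";
    if (!is_pal) out[n++] = "oam_dma";   /* late, right before the joypad reads */
    out[n++] = "joypads";
    out[n++] = "audio";
    return n;
}
```

On NTSC this leaves OAM DMA immediately before the joypad reads, which is what the DMC-safe controller-reading trick relies on.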
Yeah, as long as those VRAM updates are consistent, or otherwise guaranteed to end on an odd cycle so that the example code (or something equivalent) stays aligned, that would work fine.
But you might have conditionals and varying lengths of VRAM updates. Then it's not so simple anymore. (One might compensate for branches taken/not taken using the same trick as the controller-reading example, i.e. use indexed mode but with an index of 0 to add a cycle. Also, there's more need to watch out for a page boundary adding an extra cycle.)
In the not PAL NES case, the OAM updates immediately precede controller reading. In the PAL NES case, this defect in the DMA unit was fixed in the 2A07.
tepples wrote:
FrankenGraphics wrote:
There's the inconvenience that you get the order
OAMDMA, controller reads, other graphics updates, music.
rather than
OAMDMA, other graphics updates, controller reads, music.
for an update order that works on both systems.
I was thinking this:
- if (PAL NES) { OAM DMA }
- VRAM updates
- if (not PAL NES) { OAM DMA }
- controller reading if requested
- audio
??? Why would you do this? There is no reason to have two separate orderings. PAL doesn't start its forced OAM refresh until 21 scanlines in. If an order works for NTSC's budget, it will still work for PAL.
"OAM DMA first" is just a rule of thumb, not any kind of requirement.
tepples wrote:
In the not PAL NES case, the OAM updates immediately precede controller reading. In the PAL NES case, this defect in the DMA unit was fixed in the 2A07.
lol, I forgot this little detail only a few posts after saying so

rainwarrior wrote:
tepples wrote:
I was thinking this:
- if (PAL NES) { OAM DMA }
- VRAM updates
- if (not PAL NES) { OAM DMA }
- controller reading if requested
- audio
??? Why would you do this? There is no reason to have two separate orderings. PAL doesn't start its forced OAM refresh until 21 scanlines in. If an order works for NTSC's budget, it will still work for PAL.
These 21 PAL lines are as long as 19.7 NTSC lines. The difference matters for games that are right on the bubble of being able to fit all updates into vblank.
But I think I've found the key part of an answer to the previous question: Because a non-maskable interrupt complicates locking an input queue, unlike other platforms where vblank is maskable.
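The 19.7 figure follows from the cycles-per-scanline ratio: every scanline is 341 PPU dots on both machines, but the CPU runs one cycle per 3 dots on NTSC and one per 3.2 dots on PAL. A quick C check of the arithmetic:

```c
#include <math.h>

/* CPU cycles per scanline: each scanline is 341 PPU dots, and the CPU
   runs one cycle per 3 dots on NTSC and per 3.2 dots on PAL. */
static const double NTSC_LINE = 341.0 / 3.0;   /* ~113.667 CPU cycles */
static const double PAL_LINE  = 341.0 / 3.2;   /* 106.5625 CPU cycles */

/* How many NTSC scanlines' worth of CPU time fit in `n` PAL scanlines. */
double pal_lines_in_ntsc_lines(double n)
{
    return n * PAL_LINE / NTSC_LINE;
}
```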
tepples wrote:
But I think I've found the key part of an answer to the previous question: Because a non-maskable interrupt complicates locking an input queue, unlike other platforms where vblank is maskable.
Code which would be ill-prepared to handle an NMI may clear bit 7 of $2000 to prevent it from firing on schedule, such that if vblank happens when the system is unready the handler will either start late within the current vblank or else wait until the next one.
Further, an NMI handler can start with something like:
Code:
neverMind:
inc elapsed_frames
rti
NMI:
lsr nmi_lock
bcc neverMind
sta nmi_A
stx nmi_X
...
inc elapsed_frames ; May as well do this late in the NMI
ldx nmi_X
lda nmi_A
inc nmi_lock ; Note that nmi_lock will be zero during the handler, and nmi_A is not live here!
rti
and then enable the main NMI handler by writing 1 into nmi_lock. Any vblank that triggers an NMI before code writes 1 to nmi_lock will burn 27 cycles and update the elapsed_frames counter, but have no other side effects.
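The handshake can be modeled in C to show the control flow (a sketch only; on the real machine the atomicity comes from `lsr nmi_lock` being a single read-modify-write instruction, so the main program can never observe a half-taken lock):

```c
#include <stdint.h>

static volatile uint8_t nmi_lock;        /* 1 = handler armed, 0 = soft-disabled */
static volatile unsigned elapsed_frames; /* counts every vblank regardless */
static volatile unsigned work_done;      /* stands in for the real vblank work */

/* Models the NMI entry: `lsr nmi_lock` shifts bit 0 into carry and leaves
   0 behind, so an armed handler atomically disarms itself on entry.
   (In C this is two steps; on the 6502 it is one instruction.) */
void nmi_handler(void)
{
    uint8_t carry = nmi_lock & 1;
    nmi_lock >>= 1;                      /* lsr nmi_lock */
    if (!carry) {                        /* bcc neverMind */
        elapsed_frames++;
        return;                          /* rti: count the frame, do nothing else */
    }
    work_done++;                         /* the real handler body goes here */
    elapsed_frames++;
    nmi_lock++;                          /* inc nmi_lock: re-arm for next frame */
}
```

The main program soft-disables the handler body simply by leaving nmi_lock at 0 and re-arms it by storing 1; either way, elapsed_frames keeps counting vblanks, so code can tell how many frames it missed.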
supercat wrote:
Code which would be ill-prepared to handle an NMI may clear bit 7 of $2000 to prevent it from firing on schedule, such that if vblank happens when the system is unready the handler will either start late within the current vblank or else wait until the next one.
Writing to $2000 during rendering can cause a shoot-through glitch, incorrectly clearing the 256s bit of X scrolling for one scanline. This only happens on some subpixel alignments, and only if the write collides with the correct pixel, but without timed code it's hard to be assured of missing that. As a result, if you're using 4-screen or horizontal layout of nametables, it's best to avoid using the NMI enabled bit to selectively mask NMIs at runtime in the NES.
lidnariq wrote:
supercat wrote:
Code which would be ill-prepared to handle an NMI may clear bit 7 of $2000 to prevent it from firing on schedule, such that if vblank happens when the system is unready the handler will either start late within the current vblank or else wait until the next one.
Writing to $2000 during rendering can cause a shoot-through glitch, incorrectly clearing the 256s bit of X scrolling for one scanline. This only happens on some subpixel alignments, and only if the write collides with the correct pixel, but without timed code it's hard to be assured of missing that. As a result, if you're using 4-screen or horizontal layout of nametables, it's best to avoid using the NMI enabled bit to selectively mask NMIs at runtime in the NES.
I don't recall having read of that problem. If one wants to change the B, S, or NN bits mid-frame, what update time must one avoid? And do you see any problems with the "soft-disable approach" shown (which, unlike hardware disabling of NMI, will allow code to know how many frames it missed)?
According to previous findings and discussion, the OAM DMA-safe portion of PAL vblank is likely larger than NTSC vblank, and the information on the wiki here is likely not accurate to the scanline. If your OAM DMA fits in NTSC vblank, it probably works on PAL.
As far as emulator compatibility for synchronized reads goes, I tested various emulators in January and found that support among emulators that emulate joypad bit deletion is still very poor, with only Nintendulator passing all the current tests. Compatibility is better if your joypad read code is shorter than the time between two DMA reads (i.e. you're not reading 4 controllers). Unfortunately, Nestopia UE passes none of them, and it is widely used because of its libretro core, which has made me uncomfortable with using this in something I want people to be able to reliably play.
For the shoot-through glitch, my understanding based on recent findings is that you can avoid the glitching if you write to $2000 or $2100 depending on whether the screen is currently on the left ($2000) or right ($2100) half of the nametables.
Edit: On the matter of the original topic, I would think that reading joypads in the NMI is fine, but I would only do it on non-lag frames, just as I would also only do OAM DMA and VRAM writes on non-lag frames to avoid drawing partial state. On the NES, I don't know what I would do with fresh joypad information partway through a frame, and it definitely adds risk of bugs unless it's put into a separate variable. The only real benefit I can see is being able to continue capturing press vs. down information at a constant rate even when lagging, but I would not expect to ever see a button indicated as freshly pressed on two consecutive frames, so this seems like it could also cause bugs.
Edit: Sorry I didn't see the above two posts before I posted this.
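The press-versus-held distinction mentioned above is conventionally derived each frame from the previous and current raw states; a minimal C sketch (field and function names are hypothetical):

```c
#include <stdint.h>

typedef struct {
    uint8_t held;      /* buttons currently down */
    uint8_t pressed;   /* buttons that went down since the previous update */
    uint8_t released;  /* buttons that went up since the previous update */
} pad_state;

/* Fold a fresh raw controller read into press/held/release edges. */
void pad_update(pad_state *p, uint8_t raw)
{
    uint8_t changed = (uint8_t)(raw ^ p->held);  /* bits that flipped */
    p->pressed  = changed & raw;                 /* 0 -> 1 transitions */
    p->released = changed & p->held;             /* 1 -> 0 transitions */
    p->held     = raw;
}
```

If this runs in the NMI on every frame, lag frames included, a main loop that samples `pressed` only once per game-logic tick can miss or double-handle edges, which is the bug risk Fiskbit describes.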
FrankenGraphics wrote:
pokun wrote:
and it seems there are limitations on how the actual controller-reading routine may look
Hmm... to what end do you want to alter the example code? Removing one of the two controllers is simple enough; it still aligns correctly.
I can think of many situations where you can't use the example code: using other input devices, using a controller-reading FDS BIOS call to save disk space, using a multitap, etc. Having variable-size vblank routines, or not wanting to lose vblank time by doing the controller reading in vblank, is probably very common as well. The DMC workaround seems to be a valid solution to the bug (especially for things like a mouse or keyboard where you may not be able to read twice), but it is an advanced solution that may require extra thought as soon as you are doing something different from the example code.
rainwarrior wrote:
tepples wrote:
FrankenGraphics wrote:
There's the inconvenience that you get the order
OAMDMA, controller reads, other graphics updates, music.
rather than
OAMDMA, other graphics updates, controller reads, music.
for an update order that works on both systems.
I was thinking this:
- if (PAL NES) { OAM DMA }
- VRAM updates
- if (not PAL NES) { OAM DMA }
- controller reading if requested
- audio
??? Why would you do this? There is no reason to have two separate orderings. PAL doesn't start its forced OAM refresh until 21 scanlines in. If an order works for NTSC's budget, it will still work for PAL.
"OAM DMA first" is just a rule of thumb, not any kind of requirement.
So the required OAM DMA timing for NTSC and PAL is the same? I could swear I read somewhere on the wiki that PAL requires an earlier OAM DMA than NTSC, while NTSC may even benefit from a late one when doing the mentioned DMC workaround.
In that case, disregard everything I said about the cycle-counting DMC workaround not being compatible with PAL. NTSC and PAL can have identical vblank routines.
pokun wrote:
So the required OAM DMA timing for NTSC and PAL is the same? I could swear I read somewhere on the wiki that PAL requires an earlier OAM DMA than NTSC, while NTSC may even benefit from a late one when doing the mentioned DMC workaround.
This article describes most of what you said (link) - though it doesn't have to do with the DMC cycle avoidance.
I read it as such:
1) On NTSC you *might* extend the vblank period through forced blanking to gain a longer period in which to update VRAM, as an approximation of the PAL version's extended vblank. But in order to do this, you should put OAM DMA late, lest OAM decay.
2) Not only is this redundant on the PAL version, in that it already has a longer vblank period, but it is also discouraged, as some sort of automatic OAM refresh takes place on the PAL version which you don't want to conflict with. What the visual result of that conflict is goes undescribed.
3) OAM does not decay on the PAL version, due to said automatic OAM refresh.
4) So in order to extend the period in which it is suitable to update VRAM on NTSC, you need two different update orderings for the two versions.
So basically, if we only go by the article, OAM DMA shouldn't happen too late (for PAL compatibility), but it doesn't need to be absolutely first either. This is not regarding what Fiskbit just wrote, about how any point within the normal (not forced) NTSC time window should be safe for PAL as well.
I don't think it was this page I saw. It explicitly said that NTSC and PAL may need different vblank routines, and not via extending the NTSC vblank with forced blanking. It might have been edited out after being regarded as false, though.
Surely you can; it's determined by which period has more spare time in which to execute key scanning: vblank or outside it.