I'm new to programming on the 6502. So far I've only managed simple things, like moving a character (as a sprite, I think) around the screen with the d-pad, using nbasic (which compiles to nesasm assembly). Reading threads, it looks like nesasm is one of the stranger assemblers out there, so I'd like to try a more conventional one, like ca65. The problem is, I have no idea how to use it. The reason I started with nbasic and nesasm is that they were the only tools I could find simple, commented example programs for. And so, I humbly ask, how does one create a simple program that prints out "Hello World!"?
Also, being somewhat relevant, I have plenty of experience programming in higher level languages like C++ and Java, but have only recently learned some x86 assembly, and now, 6502 assembly.
You should learn most of 6502 assembly, as that is the standard syntax all the assemblers use; the only real features individual assemblers add are tags and the like. Look in the assembler's files for the README or something along those lines to find out what it offers and how to access those features.
Here's my copy of "Hello world"
It demonstrates how to wait for the NES to warm up, clear the RAM, clear the nametables, load in a font, set the palette, wait for vblank, and turn the screen back on.
But it's not very well commented.
Dwedit wrote:
Here's my copy of "Hello world"
It demonstrates how to wait for the NES to warm up ...
Can you explain what this means and refers to, both on the hardware as well as in the assembly code? I'd like to know what "warm up" period is required, how long it is, and why it's needed, for the NES to become "usable". I'm not talking about register or RAM initialisation -- I'm talking about what it means to "wait for the NES to warm up".
Thanks!
EDIT: Is this what "warm up" means? If so, okay got it. If not, an explanation would be cool.
I think by warm up, he means the 2 frames needed to go by before the APU is ready to output sound.
koitsu wrote:
I'd like to know what "warm up" period is required, how long it is, and why it's needed, for the NES to become "usable".
It seems that some parts of the NES do not behave consistently if used as soon as the system is powered up. A lot of PPU writes are ignored, for example (this has been verified by blargg, I think).
Because of that, it's good practice to wait at least 2 whole frames before interacting with the PPU or APU. The waiting can be done by polling the VBlank flag, which appears to behave consistently during the warm-up period.
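For reference, the two-frame wait described above is usually written along these lines in ca65. This is only a minimal sketch: the label names are made up, and it assumes nothing else has touched the PPU yet.

```asm
PPUSTATUS = $2002

reset:
    sei              ; ignore IRQs during startup
    cld              ; has no effect on the NES, but clearing it is conventional
    ldx #$FF
    txs              ; set up the stack pointer

    bit PPUSTATUS    ; read once to clear a possibly stale vblank flag
@wait1:
    bit PPUSTATUS    ; BIT copies bit 7 (the vblank flag) into the N flag
    bpl @wait1       ; loop until the first vblank has started
@wait2:
    bit PPUSTATUS
    bpl @wait2       ; second vblank: the PPU should accept writes from here on
```

Polling like this can occasionally miss a vblank, which is harmless during startup but is one reason games normally use the NMI for per-frame timing afterwards.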
FinalZero wrote:
Reading threads, it looks like nesasm is one of the stranger assemblers out there
Yes, NESASM doesn't have the reputation of being the most reliable assembler out there, but, unfortunately, NBASIC isn't that hot either. I seem to remember people here saying that programs made with it had compatibility issues with actual consoles, meaning its internal functions weren't properly coded. I think everyone will agree that a NES program that doesn't run properly on the NES is a huge FAIL.
The best way to make NES programs is obviously straight 6502 assembly, so if you think you can handle it that's definitely the best thing you can do. I'm not aware of any magical programming language that makes it easy to code NES programs that isn't either severely limited or just plain buggy.
tokumaru wrote:
It seems that some parts of the NES do not behave consistently if used as soon as the system is powered up. A lot of PPU writes are ignored, for example (this has been verified by blargg, I think).
Because of that, it's good practice to wait at least 2 whole frames before interacting with the PPU or APU. The waiting can be done by polling the VBlank flag, which appears to behave consistently during the warm-up period.
Two frames for the APU? The only mention of an APU delay is a 2048 cycle delay in Brad Taylor's APU doc, which I conjectured to be waiting for the period dividers to finish their first cycle. My programs have always set up the APU while waiting for the PPU, especially because two IRQ sources are from the APU.
Quote:
I'm not aware of any magical programming language that makes it easy to code NES programs that isn't either severely limited or just plain buggy.
But sometimes severely limited is exactly what's needed. Consider WarioWare DIY: out of all the power of the DS, it limits the developer to 15 sprites of variable size, no sprite flipping, no background scrolling unless faked with sprites, no palette swapping or other sharing of animation between sprites, no state other than which of four animations it's in and one general-purpose flag, input as limited as the Zapper (tapping objects on the screen or tapping the background), and making a game longer than eight seconds requires save hacking.
tepples wrote:
My programs have always set up the APU while waiting for the PPU, especially because two IRQ sources are from the APU.
I see little advantage in doing anything during this waiting period (before you mention clearing RAM, I'll remind you that I'm against this), so aside from disabling IRQs I don't really do anything.
Thank you for all the replies.
Quote:
You should learn most of 6502 assembly, as that is the standard syntax all the assemblers use; the only real features individual assemblers add are tags and the like. Look in the assembler's files for the README or something along those lines to find out what it offers and how to access those features.
Will do.
Dwedit wrote:
Here's my copy of "Hello world"
It demonstrates how to wait for the NES to warm up, clear the RAM, clear the nametables, load in a font, set the palette, wait for vblank, and turn the screen back on.
But it's not very well commented.
Trying to assemble it with ca65 produces a bunch of ".db is not a recognized control command" errors.
I see you made the font by just inserting it into the file with a bunch of .db's. Is there a way to include it from a separate file though?
Code:
load_font:
lda #font&255
sta addy
lda #font/256
sta addy+1
ldx #3
ldy #0
What do the first two lda's do? I don't understand the #font& and #font/ parts. What does a preceding # mean? Can that character be left out?
How would one print something out if their font table didn't match the ASCII ordering?
Code:
main_loop:
jsr waitframe
lda #0
sta $2005
sta $2005
lda #$18
sta $2001
lda #$80
sta $2000
jmp main_loop
Why is "sta $2005" repeated twice? Isn't that redundant?
This seems to be good documentation:
http://www.obelisk.demon.co.uk/6502/instructions.html
I see there's adc and sbc, but no add and sub. What does one do if they wish to add or subtract without the carry? Just clear the carry flag first? Also, it's strange that there's asl and lsr, but no asr and lsl.
Iirc, the NES's processor did away with Binary Coded Decimal capabilities, so what happened to the associated register flag? Does one get errors if it's ever set?
FinalZero wrote:
Dwedit wrote:
Here's my copy of "Hello world"
It demonstrates how to wait for the NES to warm up, clear the RAM, clear the nametables, load in a font, set the palette, wait for vblank, and turn the screen back on.
But it's not very well commented.
Trying to assemble it with ca65 produces a bunch of ".db is not a recognized control command" errors.
ca65 uses .byt instead.
Quote:
I see you made the font by just inserting it into the file with a bunch of .db's. Is there a way to include it from a separate file though?
.include "somesourcefile.s"
.incbin "somebinaryfile.chr"
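For the font specifically, a sketch of the .incbin route could look like this. The segment name "CHARS" and the file name "font.chr" are placeholders, and the segment has to exist in whatever linker config you use.

```asm
.segment "CHARS"          ; hypothetical segment name, must be declared in the linker config
font:
    .incbin "font.chr"    ; raw tile data, copied into the output byte for byte
```

.include is for source text that should be assembled, while .incbin is for binary data that should be copied as-is.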
Quote:
Code:
load_font:
lda #font&255
sta addy
lda #font/256
sta addy+1
ldx #3
ldy #0
What do the first two lda's do? I don't understand the #font& and #font/ parts. What does a preceding # mean? Can that character be left out?
# means the following value is a number to put directly into A, not an address from which to load the value into A. Look at the "immediate" addressing mode.
Quote:
How would one print something out if their font table didn't match the ASCII ordering?
There are two ways. Super Mario Bros. does it by specifying the character encoding within the assembler. (In ca65 use .charmap commands.) Mega Man 5 does it by translating ASCII characters through a lookup table to get glyph tile indices.
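A rough sketch of the first approach in ca65, assuming a made-up font layout where tile $00 is a blank and tiles $0A through $23 hold the letters A to Z:

```asm
; Assumed (hypothetical) font layout: tile $00 = space, tiles $0A..$23 = 'A'..'Z'.
.charmap $20, $00                  ; ' ' -> tile $00
.repeat 26, i
    .charmap $41 + i, $0A + i      ; 'A' + i -> tile $0A + i
.endrepeat

message:
    .byte "HELLO WORLD", $FF       ; stored as tile indices; $FF is an arbitrary end marker
```

With this in place the string is translated by the assembler, so the program can copy the bytes straight to the nametable without any run-time conversion.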
Quote:
Why is "sta $2005" repeated twice? Isn't that redundant?
PPUSCROLL ($2005) is a port on the PPU. Alternating writes set horizontal and vertical background scroll coordinates. You need two writes so that the first one sets the horizontal and the second one sets the vertical.
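As a small sketch, resetting the scroll to the top left corner looks like this. The read of PPUSTATUS is there because it resets the PPU's first/second write toggle, so the two writes are guaranteed to land in the right order.

```asm
PPUSTATUS = $2002
PPUSCROLL = $2005

    bit PPUSTATUS    ; reading PPUSTATUS resets the first/second write toggle
    lda #0
    sta PPUSCROLL    ; first write: horizontal scroll = 0
    sta PPUSCROLL    ; second write: vertical scroll = 0
```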
Quote:
This seems to be good documentation:
http://www.obelisk.demon.co.uk/6502/instructions.html
I see there's adc and sbc, but no add and sub. What does one do if they wish to add or subtract without the carry? Just clear the carry flag first?
Yes. A lot of programmers define ADD and SUB macros that clear carry before adding or set carry before subtracting.
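In ca65 such macros could be sketched as follows. The macro and parameter names are just one possible choice.

```asm
.macro add operand
    clc              ; no carry in, so the result is a plain addition
    adc operand
.endmacro

.macro sub operand
    sec              ; carry set means "no borrow" for SBC
    sbc operand
.endmacro

    lda #10
    add #5           ; A = 15
    sub #3           ; A = 12
```

Note that SBC wants the carry set, not cleared, which is the opposite of what people coming from x86 tend to expect.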
Quote:
Also, it's strange that there's asl and lsr, but no asr and lsl.
LSL is redundant, as both arithmetic and logical shift put a 0 in the one's place. There is ASR, just not as one instruction: do a CMP #$80 followed by ROR.
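A short worked example of that two-instruction arithmetic shift right, using -16 as the input:

```asm
    lda #$F0     ; -16 in two's complement
    cmp #$80     ; carry = 1 when A >= $80, i.e. when bit 7 (the sign) is set
    ror a        ; shift right; the carry re-enters at bit 7, preserving the sign
                 ; A is now $F8, which is -8
```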
Quote:
Iirc, the NES's processor did away with Binary Coded Decimal capabilities, so what happened to the associated register flag? Does one get errors if it's ever set?
The decimal flag exists, but the adder just ignores it. CLD and SED are as good as NOP.
Quote:
ca65 uses .byt instead.
or .byte, apparently. So, I changed those and another to .word, and... it assembled! But it doesn't run on FCEU. =/
Quote:
.include "somesourcefile.s"
.incbin "somebinaryfile.chr"
Simple enough.
Quote:
# means the following value is a number to put directly into A, not an address from which to load the value into A. Look at the "immediate" addressing mode.
Ah, I see.
Quote:
There are two ways. Super Mario Bros. does it by specifying the character encoding within the assembler. (In ca65 use .charmap commands.) Mega Man 5 does it by translating ASCII characters through a lookup table to get glyph tile indices.
That .charmap command looks very attractive. Does the second method described translate everything at assembly time, or at run time? Also, do the .charmap commands take effect globally, or only for the text that follows them? Looking at the Hello World example given:
Code:
.byte "NES",$1A,$01,$00,$20,$00
The .charmap command would change the "NES" away from what was intended, no?
Quote:
PPUSCROLL ($2005) is a port on the PPU. Alternating writes set horizontal and vertical background scroll coordinates. You need two writes so that the first one sets the horizontal and the second one sets the vertical.
Is there documentation that describes all those special things at addresses like that? Looking at that example file given, I see...
Code:
OAM = $0200
PPUCTRL = $2000
PPUMASK = $2001
PPUSTATUS = $2002
PPUSTAT = $2002
SPRADDR = $2003 ; always write 0 here and use DMA from OAM
PPUSCROLL = $2005
PPUADDR = $2006
PPUDATA = $2007
SPRDMA = $4014
SNDCHN = $4015
JOY1 = $4016
JOY2 = $4017
Quote:
Yes. A lot of programmers define ADD and SUB macros that clear carry before adding or set carry before subtracting.
Alright.
Quote:
LSL is redundant, as both arithmetic and logical shift put a 0 in the one's place.
A strange property; x86 does the same. Perhaps they did it because putting a 1 in the one's place would be rather useless, despite making the set of commands regular.
Quote:
There is ASR, just not as one instruction: do a CMP #$80 followed by ROR.
Ok.
Quote:
The decimal flag exists, but the adder just ignores it. CLD and SED are as good as NOP.
Ok.
-----
By the way, is there a way of doing multiline comments like /* */ in C? Or must one prefix every line with a semicolon?
Also, while we're on the topic of character maps, how would something like Dual Tile Encoding be implemented? Also, are there any Japanese games that implement dakuten and handakuten for their characters by having a dakuten/handakuten tile and simply overlapping it with another, as opposed to creating a whole new set of tiles with the dakuten/handakuten added?
FinalZero wrote:
Code:
load_font:
lda #font&255
sta addy
lda #font/256
sta addy+1
ldx #3
ldy #0
What do the first two lda's do? I don't understand the #font& and #font/ parts. What does a preceding # mean? Can that character be left out?
As tepples described, the "#" character means immediate addressing -- think "literal value". More on that in a sec.
The &255 and /256 syntax appears to be a bunch of nonsense for getting the low and high bytes of the 16-bit address of a label (in this case, the low and high bytes, respectively, of font). I call it nonsense because there's some history indicating only devel builds of ca65 support the .LOBYTE and "<" directives, as well as .HIBYTE and ">" directives. God I hate that assembler. I seriously don't understand why people bother with it.
Here's a much more common syntax you'll see, and one that makes a lot more sense IMHO:
Code:
lda #<font
sta addy
lda #>font
sta addy+1
In English: if font is located at location $894F, then the first LDA will load the value $4F into the accumulator, while the second LDA would load $89. If you removed use of immediate addressing, you'd have:
Code:
lda <font
sta addy
lda >font
sta addy+1
In English: if font is located at $894F, then the first LDA will load the value stored in memory at location $4F into the accumulator, while the second LDA would do the same but from location $89.
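To show why the address gets split into two bytes in the first place, here is a sketch of the pointer being used afterwards. The segment names are ca65's usual defaults, the routine name is invented, and the two font bytes are placeholders standing in for the real data.

```asm
.segment "ZEROPAGE"
addy: .res 2              ; two-byte pointer; (zp),y addressing needs it in zero page

.segment "CODE"
read_first_font_byte:     ; hypothetical routine name
    lda #<font            ; low byte of font's address
    sta addy
    lda #>font            ; high byte of font's address
    sta addy+1
    ldy #0
    lda (addy),y          ; A = the byte at font + Y, here the first byte
    rts

font:
    .byte $18, $3C        ; placeholder bytes standing in for the real font
```

The 6502 has no 16-bit registers, so a 16-bit address has to be stored low byte first in two consecutive zero page bytes and then read through with the (zp),y addressing mode.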
FinalZero wrote:
By the way, is there a way of doing multiline comments like /* */ in C? Or must one prefix every line with a semicolon?
There's no "universal standard" for this. It's important to understand that assembler syntaxes vary greatly depending on the assembler, and it's your responsibility to learn/read up on what your assembler-of-choice supports. You should never assume that an assembler has something even remotely "C-like" in it.
Generally speaking, most assemblers treat a semicolon and everything after it on the line as a comment. Example:
Code:
some_code ; in-line comment
some_code ; hello world
some_code ; i like rice
;
; Below code rides snakes around Arabia, or Persia,
; or China, or tepple's living room.
;
some_code
some_code
...
Some assemblers support things like the .COMMENT and .ENDCOMMENT pseudo-directives, which would act more like a /* */ block-style comment in C. E.g.:
Code:
.COMMENT
Below code rides snakes around Arabia, or Persia,
or China, or tepple's living room.
.ENDCOMMENT
some_code
some_code
...
But again, there's no "standard" -- you need to read the documentation associated with your assembler to find out. If there is no documentation, don't bother using that assembler.
Hope this helps. Cheers!
Quote:
God I hate that assembler. I seriously don't understand why people bother with it.
What assembler do you use?
Quote:
There's no "universal standard" for this. It's important to understand that assembler syntaxes vary greatly depending on the assembler, and it's your responsibility to learn/read up on what your assembler-of-choice supports. You should never assume that an assembler has something even remotely "C-like" in it.
Digging through ca65's documentation there's a ".feature c_comments" option, thus allowing multiline comments.
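A minimal sketch of what that looks like in a source file:

```asm
.feature c_comments

/* Everything between the markers is ignored,
   even across several lines. */
    lda #0      ; ordinary semicolon comments still work
```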
FinalZero wrote:
Quote:
God I hate that assembler. I seriously don't understand why people bother with it.
What assembler do you use?
There's already a relevant thread pertaining to assembler debates and choices. I personally tend to stick to asm6 and x816, as their syntax and overall style mimic what I'm used to (ORCA/M and Merlin 16+).
koitsu wrote:
FinalZero wrote:
Quote:
God I hate that assembler. I seriously don't understand why people bother with it.
What assembler do you use?
There's already a relevant thread pertaining to assembler debates and choices. I personally tend to stick to asm6 and x816, as their syntax and overall style mimic what I'm used to (ORCA/M and Merlin 16+).
Ah, ok. I suppose the reason I'm trying out ca65 at the moment is because it has current development, which is likely to fix any bugs found (hopefully), and has lots of documentation.
Quote:
Here's a more much more common syntax you'll see, and makes a lot more sense IMHO: [...]
I understand. Thank you.
FinalZero wrote:
Quote:
PPUSCROLL ($2005) is a port on the PPU.
Is there documentation that describes all those special things at addresses like that?
nesdevwiki: PPU registers
Have fun.
Quote:
Also, while we're on the topic of character maps, how would something like Dual Tile Encoding be implemented?
Wikipedia explains.
Quote:
Also, are there any Japanese games that implement dakuten and handakuten for their characters by having a dakuten/handakuten tile and simply overlapping it with another, as opposed to creating a whole new set of tiles with the dakuten/handakuten added?
Some games write kana on two rows of tiles, where the dakuten is at the lower right corner of the higher tile. You can't "overlap" tiles per se without sprites, and you can't fit more than 8 sprites on a row of text without some of them disappearing.
koitsu wrote:
there's some history indicating only devel builds of ca65 support the .LOBYTE and "<" directives, as well as .HIBYTE and ">" directives.
Every build of ca65 for all the years I've used it has supported the < and > unary operators.
koitsu wrote:
The &255 and /256 syntax appears to be a bunch of nonsense for getting the low and high bytes of the 16-bit address of a label (in this case, the low and high bytes, respectively, of font). I call it nonsense because there's some history indicating only devel builds of ca65 support the .LOBYTE and "<" directives, as well as .HIBYTE and ">" directives. God I hate that assembler. I seriously don't understand why people bother with it.
That's not true, you misunderstood the thread. The "issue" was about .LOBYTES/.HIBYTES, not .LOBYTE/.HIBYTE. .LOBYTE/.HIBYTE/</> have always worked fine. And the problem Neil was having could have been fixed by declaring ZP as a zero page segment with: .segment "ZP": zeropage
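A sketch of what that declaration buys you, assuming a segment called "ZP" exists in the linker config and is placed in zero page. The variable name is borrowed from the Hello World example and the routine name is invented.

```asm
.segment "ZP": zeropage   ; tell ca65 that symbols in this segment have 8-bit addresses
addy: .res 2

.segment "CODE"
peek:                     ; hypothetical routine name
    lda addy              ; can now be assembled as a zero-page LDA (opcode $A5)
    ldy #0
    lda (addy),y          ; indirect indexed addressing requires a zero-page pointer
    rts
```

Without the address size, ca65 has no way of knowing that a custom segment lives in zero page, so it treats the symbols as 16-bit addresses.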
FinalZero wrote:
Digging through ca65's documentation there's a ".feature c_comments" option, thus allowing multiline comments.
Another way to do something similar is to use .if, like so:
Code:
.if 0
lda #0
sta xyzzy
.endif
I seem to remember that the code inside the .if block still has to have valid syntax, though.
thefox wrote:
koitsu wrote:
The &255 and /256 syntax appears to be a bunch of nonsense for getting the low and high bytes of the 16-bit address of a label (in this case, the low and high bytes, respectively, of font). I call it nonsense because there's some history indicating only devel builds of ca65 support the .LOBYTE and "<" directives, as well as .HIBYTE and ">" directives. God I hate that assembler. I seriously don't understand why people bother with it.
That's not true, you misunderstood the thread. The "issue" was about .LOBYTES/.HIBYTES, not .LOBYTE/.HIBYTE. .LOBYTE/.HIBYTE/</> have always worked fine. And the problem Neil was having could have been fixed by declaring ZP as a zero page segment with: .segment "ZP": zeropage
Thank you for the clarification/correction. For me, even more discouraging -- and I didn't know this until reviewing ca65's user manual -- is the fact that there are multiple assembler pseudo-ops which are named near-identically:
.HIBYTE vs. .HIBYTES
.LOBYTE vs. .LOBYTES
.BYT vs. .DBYT
.DEFINE vs. .DEFINED
These might not be negatives to those already familiar with the assembler, but I can see an experienced person messing these up. The last example is a stretch given the syntax and operational context, but you get the idea. I'm sure someone could say the same about other assemblers and their syntactical pieces, but the above naming convention seems to imply either "creeping featurism" or an assembler that's trying to do too many things / cater to too many architectures.
tepples wrote:
FinalZero wrote:
Quote:
PPUSCROLL ($2005) is a port on the PPU.
Is there documentation that describes all those special things at addresses like that?
nesdevwiki: PPU registers
Have fun.
Quote:
Also, while we're on the topic of character maps, how would something like Dual Tile Encoding be implemented?
Wikipedia explains.
Quote:
Also, are there any Japanese games that implement dakuten and handakuten for their characters by having a dakuten/handakuten tile and simply overlapping it with another, as opposed to creating a whole new set of tiles with the dakuten/handakuten added?
Some games write kana on two rows of tiles, where the dakuten is at the lower right corner of the higher tile. You can't "overlap" tiles per se without sprites, and you can't fit more than 8 sprites on a row of text without some of them disappearing.
koitsu wrote:
there's some history indicating only devel builds of ca65 support the .LOBYTE and "<" directives, as well as .HIBYTE and ">" directives.
Every build of ca65 for all the years I've used it has supported the < and > unary operators.
1) Thank you.
2) So, one would make a function to transform something from DTE that operates on every string desired? I don't quite understand.
3) Ah, interesting.
Quote:
.BYT vs. .DBYT
What is the name ".dbyt" supposed to stand for anyways?
-----
I still can't get the Hello World Example to work. It assembles, but doesn't work.
-----
I used the following code in nbasic to read the first gamepad. I suppose it spoiled me, because I can't figure out how to transform the set C1_... lines into ca65 assembly. Am I supposed to push the results onto the stack?
Code:
// Updates the statuses of the first gamepad's buttons.
UpdateGamepad1:
// Strobe Bytes
set $4016 1 // First
set $4016 0 // Second
set C1_A & [$4016] 1
set C1_B & [$4016] 1
set C1_Select & [$4016] 1
set C1_Start & [$4016] 1
set C1_Up & [$4016] 1
set C1_Down & [$4016] 1
set C1_Left & [$4016] 1
set C1_Right & [$4016] 1
return
//------------------------------------------------------------
This might help. It is my entire joypad read routine, coded for ca65:
Code:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; joypad_read
;;
;; On input: Nothing.
;; On Exit: 'joypad1' and 'joypad2' contain 8 button states, in normal NES order:
;; a, b, sel, start, up, down, left, right
;; Destroys X.
;; Code taken from http://nesdevwiki.org/index.php/Gamepad_code.
;; WARNING: Will randomly fail if using the DPCM sound channel.
;; http://nocash.emubase.de/everynes.htm, "Controllers - Joypads"
.segment "KERNEL"
.proc joypad_read
lda joypad1_orig
sta joypad1_prev ; Used for calculating 'one-shot' button mode.
lda joypad2_orig
sta joypad2_prev
ldx #09 ; X = 9 (bit 0 set), written below to raise the strobe
stx JOYPAD_1 ; Strobe pad #1 + #2
dex ; X = 8 (loop counter)
stx JOYPAD_1 ; Clear strobe bits.
: lda JOYPAD_1 ; bit0 = input button.
lsr ; bit0 -> carry
rol joypad1 ; bit0 <- carry
lda JOYPAD_2
lsr
rol joypad2
dex
bne :-
lda joypad1
sta joypad1_orig ; Needed for future one-shot calculation.
lda joypad2
sta joypad2_orig
rts
.endproc
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; joypad_debounce
;;
;; Pass bitmask of buttons to debounce in 'A'. That is, if the button
;; was pushed last frame, and this frame, then make it appear OFF this frame.
.segment "KERNEL"
.proc joypad_debounce
;; On entry, Acc is mask of buttons to cook.
pha ; Save mask for joypad #2
and joypad1 ; Which buttons are pressed down?
and joypad1_prev ; Were they down last frame?
eor joypad1 ; Turn them off this frame.
sta joypad1
pla
and joypad2
and joypad2_prev
eor joypad2
sta joypad2
rts
.endproc
(slightly off topic): If my dual-joypad reading and debouncing code is acceptable to the powers-that-be, then feel free to add it to the wiki.
In my own game I am contemplating merging the two functions. I just have to track down all calls to each to ensure that I never read without debouncing immediately afterwards.
I also have the following in my "./src/nes.inc" header file:
Code:
JOYPAD_1 = $4016
JOYPAD_2 = $4017
BTN_A = $80
BTN_B = $40
BTN_SEL = $20
BTN_START = $10
BTN_UP = $08
BTN_DOWN = $04
BTN_LEFT = $02
BTN_RIGHT = $01
BTN_ALL = $ff
And this in yet another file:
Code:
djenkins@hera ~/code/nesyar $ grep joypad src/globals.s
joypad1: .byte 0 ; Cooked value for this frame.
joypad2: .byte 0
joypad1_prev: .byte 0 ; Original value from last frame.
joypad2_prev: .byte 0
joypad1_orig: .byte 0 ; Original value from this frame.
joypad2_orig: .byte 0
Quote:
Code:
lda joypad1_orig
sta joypad1_prev ; Used for calculating 'one-shot' button mode.
lda joypad2_orig
sta joypad2_prev
Are these defined somewhere earlier?
Edit: Nevermind. Serves me right for not refreshing the page before posting...
I should have made one post instead of two.
It helps if you know what I mean by "joypad debouncing". I don't know if others use the same terms. I adopted "debounce" from the idea of electrically debouncing push buttons using a timer circuit (like a 555 in one-shot mode).
In my game, the user can hold down "B" to make the Yar fly "faster" (like in SMB). But the user must press and release "A" to fire a bullet. Select changes the "weapon/gun" type, and Start pauses/unpauses the game. Both start and select must be debounced, or if the user holds them down the mode will switch every frame. So I want to debounce "A" and not "B" nor the direction keys. So my actual game engine does this:
Code:
jsr joypad_read
lda #(BTN_A | BTN_SEL | BTN_START)
jsr joypad_debounce
jsr check_for_pause_unpause
I was able to infer from the context what you meant. I've already implemented such a thing in nbasic. There was an example that showed me how.
Code:
// Handles input from the first controller.
HandleInput:
// Updates the first gamepad.
gosub UpdateGamepad1
// gosub UpdateGamepad2
// A Button
if C1_A = 0 set C1_AIsPressed 0
if C1_A = 1 if C1_AIsPressed = 0 then
set C1_AIsPressed 1
set SpriteNum + SpriteNum 1
endif
// B Button
if C1_B = 0 set C1_BIsPressed 0
if C1_B = 1 if C1_BIsPressed = 0 then
set C1_BIsPressed 1
set SpriteNum - SpriteNum 1
endif
// Start Button
if C1_Start = 0 set C1_StartIsPressed 0
if C1_Start = 1 if C1_StartIsPressed = 0 then
set C1_StartIsPressed 1
set SpriteX 128 // 128 * 2 = 256
set SpriteY 120 // 120 * 2 = 240
endif
// Select Button
if C1_Select = 0 set C1_SelectIsPressed 0
if C1_Select = 1 if C1_SelectIsPressed = 0 then
set C1_SelectIsPressed 1
set SpriteNum + SpriteNum $F // SpriteNum += 15 ($F)
// Displays the menu.
if MenuIsDisplayed = 0 then
set MenuIsDisplayed 1
// fix
endif
endif
// Up
if C1_Up = 1
set SpriteY - SpriteY C1_Up
goto EndHandleInput
// Down
if C1_Down = 1
set SpriteY + SpriteY C1_Down
goto EndHandleInput
// Left
if C1_Left = 1
set SpriteX - SpriteX C1_Left
goto EndHandleInput
// Right
if C1_Right = 1
set SpriteX + SpriteX C1_Right
goto EndHandleInput
EndHandleInput:
return
//------------------------------------------------------------
Edit:
Ok, so here's a simple piece of code which I think would work to read in stuff from controller without debouncing. I'm itching to try it out!
Code:
; Gamepad Locations
GamepadA = $4016
GamepadB = $4017
;---------------------------------------------------------------------------
; Button Flags
Button_A = %10000000
Button_B = %01000000
Button_Select = %00100000
Button_Start = %00010000
Button_Up = %00001000
Button_Down = %00000100
Button_Left = %00000010
Button_Right = %00000001
;---------------------------------------------------------------------------
GamepadAFlags: .byte 0
GamepadBFlags: .byte 0
;GamepadAPrev: .byte 0
;GamepadBPrev: .byte 0
;GamepadAOrig: .byte 0
;GamepadBOrig: .byte 0
;---------------------------------------------------------------------------
; Updates the flags for the first gamepad.
UpdateGamepadAFlags:
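; Strobe the controller: write 1, then 0, to latch the button states.
ldx #1 ; '#' = immediate; a bare "ldx 1" would load the byte at address $01
stx GamepadA
ldx #0
stx GamepadA
.repeat 8
lda GamepadA ; bit0 = current button
lsr a ; bit0 -> carry (the 6502 mnemonic is lsr, not shr)
rol GamepadAFlags ; bit0 <- carry
.endrepeat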
rts
;---------------------------------------------------------------------------
Edit: Another question: What's the difference between .proc and .scope for ca65? Their descriptions are nearly the same in the documentation:
http://www.cc65.org/doc/ca65-11.html#ss11.79
http://www.cc65.org/doc/ca65-11.html#ss11.86
FinalZero wrote:
So, one would make a function to transform something from DTE that operates on every string desired?
There are two ways to do this: A. make a function that takes a DTE string and returns a decompressed string, or B. make a function that starts a decoding process and retrieves the next decompressed character.
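As a very rough sketch of the streaming idea, here is a decoder that writes tiles straight to the PPU instead of building a buffer. The encoding is an assumption made for the example, not a standard: codes $01..$7F are single tiles, codes $80..$FF stand for a pair of tiles looked up in a table, and $00 ends the string. The names text_ptr, print_dte and dte_pairs are made up, and the routine assumes the PPU address has already been set and rendering is off.

```asm
PPUDATA = $2007

.segment "ZEROPAGE"
text_ptr: .res 2              ; points at the compressed string

.segment "CODE"
print_dte:
    ldy #0
@next:
    lda (text_ptr),y
    beq @done                 ; $00 ends the string
    bpl @single               ; bit 7 clear: an ordinary single tile
    asl a                     ; (code - $80) * 2, since bit 7 shifts out
    tax
    lda dte_pairs,x           ; first tile of the pair
    sta PPUDATA
    lda dte_pairs+1,x         ; second tile of the pair
@single:
    sta PPUDATA
    iny
    bne @next                 ; this sketch handles at most 255 compressed bytes
@done:
    rts

dte_pairs:
    .byte "th", "e ", "in"    ; pairs for codes $80, $81, $82 (invented example table)
```

The compression itself happens on the PC side: a tool picks the most common pairs in the script, builds the table, and replaces each occurrence with its one-byte code.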
Quote:
Quote:
.BYT vs. .DBYT
What is the name ".dbyt" supposed to stand for anyways?
Double byte, I guess.
The difference I can see between .proc and .scope is that .proc defines a label in the outer scope for the start.
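A small sketch of that difference, with invented names:

```asm
.proc clear_a          ; also defines the label clear_a in the enclosing scope
    lda #0
    rts
.endproc

.scope Timer           ; opens a namespace only; no label marks its start
    ticks = 60
.endscope

    jsr clear_a        ; fine: .proc created a label that can be called
    lda #Timer::ticks  ; members of a scope are reached with the :: operator
```

So .proc is the natural choice for subroutines, while .scope is handy for grouping constants or local labels that have no entry point of their own.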
Quote:
The difference I can see between .proc and .scope is that .proc defines a label in the outer scope for the start.
I understand now.
Well, since the hello world example given earlier doesn't work, I've begun to look elsewhere. The following is from lj65's source.
Code:
.segment "INESHDR"
.byt "NES", 26
.byt 1 ; number of 16 KB program segments
.byt 1 ; number of 8 KB chr segments
.byt 0 ; mapper, mirroring, etc
.byt 0 ; extended mapper info
.byt 0 ; number of 8 KB RAM segments
; The next two bytes are supposed to define PAL/NTSC versions
; of a given ROM, but they make Nintendulator cry.
.ifdef PAL
.byt 0
.byt %00000000
.else
.byt 0
.byt %00000000
.endif
.byt 0
.byt 0,0,0,0 ; DiskDude detection
1) What does the .segment "INESHDR" line do? There's no predefined segment by that name, so I'm confused. What would happen if it was left out?
2) Does "number of 16 KB program segments" mean that one has to go through one's source and count how many segments there are?
3) What is a mapper? Why do different games use different ones? I keep finding lists of them, but nothing that actually explains what they are and do.
4) What is a nametable? What's stored in them? And yes, I've read http://wiki.nesdev.com/w/index.php/PPU_nametables
5) From http://wiki.nesdev.com/w/index.php/Mirroring :
Quote:
Vertical [...] This is most commonly used for games which only scroll horizontally. Games that scroll vertically (by any amount and without status bar) and that does [sic] never scroll horizontally by more than one screen would use this mirroring (e.g. Lode Runner, Bomberman, Fire Emblem, Crystal Mines), so that they don't have to load anything when scrolling horizontally.
I'm confused because it seems to contradict itself. Does that mean it's used for games that scroll horizontally or not?
Also, why doesn't one need ".org $0000" at the beginning of the header? Is it implied?
In ca65, you let the linker assign the final location in the output file of each segment. You are discouraged from using ".org".
It seems that each of us that uses ca65 builds our "ines" headers a bit differently (as far as how we get the linker to pre-pend it to the beginning of our ROMs). My example is below. This is for a game that I tinkered with a few years ago and gave up on. But it does link and execute properly.
Specifically, look at "src/linker.cfg", then at "src/kernel/header.s".
In your ".s" files, you just set the "segment" and then write code and define data bytes (words or whatever you need). You tell the linker, through a config file, how to lay each segment out into the ROM image. The linker will take care of doing the "fixups", so that "lda player1_health" gets the correct address for the symbol "player1_health". However, ca65 knows NOTHING about bankswitching. You must ensure that the correct banks are switched in (using the "mapper") before accessing a symbol unique to that bank.
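To make that concrete, here is a sketch of what the source side can look like. The segment names "ZEROPAGE", "CODE" and "VECTORS" are assumptions: they only work if the linker config declares them, with "VECTORS" placed at $FFFA.

```asm
.segment "ZEROPAGE"
player1_health: .res 1    ; reserve one byte; the linker decides the final address

.segment "CODE"
reset:
    lda #3
    sta player1_health    ; the address is filled in ("fixed up") at link time
forever:
    jmp forever

nmi:
irq:
    rti

.segment "VECTORS"
    .addr nmi, reset, irq ; NMI, reset and IRQ vectors, in that order, at $FFFA-$FFFF
```

Nothing in the source says where anything ends up in the ROM. Moving code from $C000 to $8000, for example, is a change to the config file only.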
My game uses MMC1 with a very large PROG-ROM. I have code in a few banks.
If you know what "subversion" (revision control system) is, you can obtain a complete copy of my source and build it on Linux. YMMV when compiling on win32 (my windows build scripts assume that you have msdev studio 98 (vc6.0) installed).
https://www.ecoligames.com/svn/nes-game/trunk/
Subversion copy:
svn co https://www.ecoligames.com/svn/nes-game/trunk clueless-game
View source with syntax highlighting via "trac":
https://www.ecoligames.com/trac/nes-game/browser/trunk
(I forget the user/pass that you'll need. If you want it, ask and I'll go dig it up).
I hope that the above helps you. If not, I know that many others use ca65 and can help too.
Quote:
(I forget the user/pass that you'll need. If you want it, ask and I'll go dig it up).
Could you please?
I thought that I had posted it to this forum in 2008. So I went back and dug through my posts. I did not find the "public" user/pass, so I will reset it in a moment. But I did find this (about ca65 and linking). Maybe it will help you as well:
http://nesdev.com/bbs/viewtopic.php?p=37306
The user / pass for "trac" is "anonymous" and "nesdev". It should have full read access to the trac repository. Note that you shouldn't need a user/pass to do an "svn co".
FinalZero wrote:
1) What does the .segment "INESHDR" line do? There's no predefined segment by that name, so I'm confused. What would happen if it was left out?
You wouldn't get a valid iNES header.
Quote:
2) Does "number of 16 KB program segments" mean that one has go through one's source and count how many segments there are?
Usually your link script template will specify how much PRG ROM it'll create. For example, an NROM-128 template will always make 16400 bytes (16 bytes of header and 16384 bytes of PRG ROM), an NROM-256 or CNROM template will always make 32784 bytes (16 bytes of header and 32768 bytes of PRG ROM), and a template for UNROM, BNROM, or ANROM will probably make 131088 bytes (16 header, 131072 PRG ROM).
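As a rough check of those numbers, here is a small Python sketch. The function name is made up for illustration, and it counts only the header plus PRG ROM, as the figures above do; any CHR ROM in the file would add 8192 bytes per bank on top.

```python
INES_HEADER_SIZE = 16      # bytes in the iNES header
PRG_BANK_SIZE = 16 * 1024  # the header counts PRG ROM in 16 KiB units


def header_plus_prg_size(prg_banks):
    """Return the size in bytes of an iNES header followed by PRG ROM only."""
    return INES_HEADER_SIZE + prg_banks * PRG_BANK_SIZE


# 1 bank = NROM-128, 2 banks = NROM-256 or CNROM, 8 banks = a 128 KiB UNROM-style ROM
for name, banks in [("NROM-128", 1), ("NROM-256", 2), ("128 KiB PRG", 8)]:
    print(name, header_plus_prg_size(banks), "bytes")
```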
Quote:
3) What is a mapper? Why do different games use different ones? I keep finding lists of them, but nothing that actually explains what they are and do.
They turn the page in the ROM, so to speak. Without them, the NES can't see more than 32 KiB of program and 8 KiB of graphic tiles. Some mappers have smaller "pages", which lets them have more than one active at once, and some have extra timing circuitry to make a few scrolling tricks easier.
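To make the "turning the page" idea concrete, here is a toy Python model. It assumes a UNROM-style layout (16 KiB switchable window at $8000-$BFFF, last bank fixed at $C000-$FFFF); other mappers use different window sizes, and the function name is only illustrative.

```python
PRG_BANK_SIZE = 0x4000  # 16 KiB


def prg_rom_offset(cpu_addr, selected_bank, total_banks):
    """Map a CPU address in $8000-$FFFF to an offset into PRG ROM.

    Assumed layout: $8000-$BFFF shows the selected bank,
    $C000-$FFFF always shows the last bank.
    """
    if not 0x8000 <= cpu_addr <= 0xFFFF:
        raise ValueError("address is outside cartridge PRG space")
    if cpu_addr < 0xC000:
        bank = selected_bank % total_banks  # the switchable "page"
    else:
        bank = total_banks - 1              # the fixed bank
    return bank * PRG_BANK_SIZE + (cpu_addr & 0x3FFF)
```

The CPU always reads the same 32 KiB of addresses; only the bank number decides which part of the larger ROM it sees there.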
Quote:
4) What is a nametable? What's stored in them?
In List of background topics, please see Text mode and Text user interface. The nametables are a lot easier to explain if you're familiar with those.
Quote:
Does that mean it's used for games that scroll horizontally or not?
Vertical mirroring is used by some games that scroll horizontally, such as the original Super Mario Bros. and Super Mario Bros. 2.
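A small Python sketch of what the two mirroring types do to the four nametable addresses may help with the wording confusion. It assumes the console's standard 2 KiB of nametable RAM and no four-screen hardware; the function name is made up.

```python
def nametable_ram_offset(ppu_addr, mirroring):
    """Map a PPU address in $2000-$2FFF to an offset in the 2 KiB of nametable RAM."""
    if not 0x2000 <= ppu_addr <= 0x2FFF:
        raise ValueError("not a nametable address")
    table = (ppu_addr >> 10) & 3  # logical nametable 0..3
    if mirroring == "vertical":
        page = table & 1          # $2000 = $2800 and $2400 = $2C00
    elif mirroring == "horizontal":
        page = table >> 1         # $2000 = $2400 and $2800 = $2C00
    else:
        raise ValueError("unknown mirroring")
    return page * 0x400 + (ppu_addr & 0x3FF)
```

With vertical mirroring the left ($2000) and right ($2400) nametables are separate memory, so a game can draw new columns off-screen while scrolling sideways. That is why it suits horizontal scrollers, and also games that scroll vertically but never move more than one screen sideways.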
Quote:
Also, why doesn't one need ".org $0000" at the beginning of the header? Is it implied?
ca65 delegates placement of code in the binary image to the linker.
Quote:
You wouldn't get a valid iNES header.
Ok. Give me some time to figure out the linker...
Quote:
Usually your link script template will specify how much PRG ROM it'll create. For example, an NROM-128 template will always make 16400 bytes (16 bytes of header and 16384 bytes of PRG ROM), an NROM-256 or CNROM template will always make 32784 bytes (16 bytes of header and 32768 bytes of PRG ROM), and a template for UNROM, BNROM, or ANROM will probably make 131088 bytes (16 header, 131072 PRG ROM).
Ok.
Quote:
They turn the page in the ROM, so to speak. Without them, the NES can't see more than 32 KiB of program and 8 KiB of graphic tiles. Some mappers have smaller "pages", which lets them have more than one active at once, and some have extra timing circuitry to make a few scrolling tricks easier.
Ok.
Quote:
In List of background topics, please see Text mode and Text user interface. The nametables are a lot easier to explain if you're familiar with those.
Quote:
Vertical mirroring is used by some games that scroll horizontally, such as the original Super Mario Bros. and Super Mario Bros. 2.
I've read up on nametables and mirroring, and I think I understand what they are, and what kind of games would use different types of the latter.
As a side note, looking at http://tuxnes.sourceforge.net/nesmapper.txt , it gets Dragon Warrior 1, 3, and 4 wrong. I've just fired those games up myself with FCEUX and they all have vertical mirroring. Also, interestingly, Double Dragon 2 has vertical mirroring for its intro/title screen, but horizontal mirroring for everything else.
Quote:
ca65 delegates placement of code in the binary image to the linker.
I see. All this makes me wonder what nbasic is programmed to do...
FinalZero wrote:
As a side note, looking at http://tuxnes.sourceforge.net/nesmapper.txt , it gets Dragon Warrior 1, 3, and 4 wrong.
The mirroring settings in the iNES header are meant for games with hardwired mirroring, and are meaningless in games that have mapper-controlled mirroring. Games that use mappers like MMC1 or MMC3 can change the type of mirroring at any time.
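For reference, this is roughly where that mirroring bit lives. The Python sketch below reads the basic iNES fields (bank counts, mapper number, mirroring bit) and ignores everything else in the header; the function and key names are made up for illustration.

```python
def parse_ines_header(header):
    """Decode the basic fields of a 16-byte iNES header."""
    if len(header) < 16 or header[:4] != b"NES\x1a":
        raise ValueError("not an iNES header")
    flags6, flags7 = header[6], header[7]
    return {
        "prg_banks_16k": header[4],
        "chr_banks_8k": header[5],
        # Mapper number: low nibble from flags 6, high nibble from flags 7.
        "mapper": (flags7 & 0xF0) | (flags6 >> 4),
        # Bit 0 of flags 6; only meaningful when mirroring is hardwired.
        "header_mirroring": "vertical" if flags6 & 1 else "horizontal",
    }


# Example: 8 PRG banks, no CHR ROM, mapper 1 (MMC1), mirroring bit set.
example = b"NES\x1a" + bytes([8, 0, 0x11, 0x00]) + bytes(8)
print(parse_ines_header(example))
```

For a mapper 1 game like the example, the "header_mirroring" value says nothing about what the game does at run time, which matches the mismatch seen in that list.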
Quote:
The mirroring settings in the iNES header are meant for games with hardwired mirroring, and are meaningless in games that have mapper-controlled mirroring. Games that use mappers like MMC1 or MMC3 can change the type of mirroring at any time.
Oh, ok. That still begs to ask why they didn't match the iNES header mirroring value with the one actually used in the game though...
But am I right in assuming that it's the header that says how many PRG and CHR ROM sections there are? That is, the mapper specifies only a range, not the specific amount?
FinalZero wrote:
Oh, ok. That still begs to ask why they didn't match the iNES header mirroring value with the one actually used in the game though...
I bet most dumpers would just guess the mirroring when filling in iNES headers until the games worked, and no matter what they guessed, games with mapper-controlled mirroring would always work.
But even if anyone cared about this, lots of games change the mirroring type at different times, and it would be very tedious to play through entire games just to make sure they use only one kind of mirroring all the way to the end, for the sole purpose of setting a bit in the header that doesn't really do anything.
Quote:
But am I right in assuming that it's the header that says how many PRG and CHR ROM sections there are? That is, the mapper specifies only a range, not the specific amount?
Each mapper has a maximum amount of PRG and CHR it can handle, and these limits are usually specified in detailed documents about them. The values in the iNES header count actual "pages", which are 16KB large in the case of PRG and 8KB in the case of CHR. These sizes were used because they were thought to be the smallest in commercial NES games (this assumption was wrong though, as there is at least one game with 8KB of PRG, and this game has to be doubled up in order to be correctly represented in the iNES format).
Since the sizes of memory chips are always powers of 2 (32KB, 64KB, 128KB, 256KB, etc) it's best that your PRG and CHR sections are like that too, so you should never use a weird iNES configuration like 3 PRG pages (48KB).
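If you want a quick sanity check on the page counts before you put them in a header, something like this Python sketch would do. The function names are invented for the example; a CHR count of 0 is accepted because in iNES that means the board uses CHR RAM.

```python
def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def check_page_counts(prg_pages_16k, chr_pages_8k):
    """Return a list of warnings about page counts that do not match real chip sizes."""
    warnings = []
    if not is_power_of_two(prg_pages_16k):
        warnings.append(f"PRG: {prg_pages_16k * 16} KB is not a power of two")
    # 0 CHR pages means CHR RAM, so only non-zero counts are checked.
    if chr_pages_8k and not is_power_of_two(chr_pages_8k):
        warnings.append(f"CHR: {chr_pages_8k * 8} KB is not a power of two")
    return warnings


print(check_page_counts(3, 1))  # the "weird" 48KB configuration
```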
Quote:
I bet most dumpers would just guess the mirroring when filling in iNES headers until the games worked, and no matter what they guessed, games with mapper-controlled mirroring would always work.
But even if anyone cared about this, lots of games change the mirroring type at different times, and it would be very tedious to play through entire games just to make sure they use only one kind of mirroring all the way to the end, for the sole purpose of setting a bit in the header that doesn't really do anything.
I'd expect them to at least fire up the rom and check though. But you're right, it's not really needed.
Quote:
Each mapper has a maximum amount of PRG and CHR it can handle, and these limits are usually specified in detailed documents about them. The values in the iNES header count actual "pages", which are 16KB large in the case of PRG and 8KB in the case of CHR. These sizes were used because they were thought to be the smallest in commercial NES games (this assumption was wrong though, as there is at least one game with 8KB of PRG, and this game has to be doubled up in order to be correctly represented in the iNES format).
Since the sizes of memory chips are always powers of 2 (32KB, 64KB, 128KB, 256KB, etc) it's best that your PRG and CHR sections are like that too, so you should never use a weird iNES configuration like 3 PRG pages (48KB).
What happens if there's more? It's just simply not available? Linker error?
I suppose mappers varied in price too, thus affecting how a development team chose which one to use.
-----
From http://wiki.nesdev.com/w/index.php/MMC3 ...
Quote:
The MMC3 has 4 pairs of registers at $8000-$9FFF, $A000-$BFFF, $C000-$DFFF, and $E000-$FFFF - even addresses ($8000, $8002, etc.) select the low register and odd addresses ($8001, $8003, etc.) select the high register in each pair.
I don't understand what it means by "in each pair".
So, in addition to setting the header correctly, one must set things at the address in the mapper correctly too, in order for a game to run?
Also, it says:
Quote:
Mirroring ($A000-$BFFE, even)
Does it reserve all of $A000 to $BFFE to say what the mirroring is? Why can't it just use a couple bits?? I think I'm misunderstanding something...
-----
And finally, do I have the following right?: The ca65 assembler creates an object file from assembly files, and then the ld65 linker creates a NES file from the object file?
-----
http://www.cc65.org/doc/ld65-2.html
Looking at the ld65 documentation, the -t (specifies target system) option and -C (specifies custom config file) option can't be used together. What kind of rom does the linker default to when "-t nes" is used?
FinalZero wrote:
What happens if there's more? It's just simply not available? Linker error?
More than what the mapper supports? I'm sure you can assemble very large ROMs, and some emulators will even run them without problems, but you will not be able to put them on real carts.
Quote:
I suppose mappers varied in price too, thus affecting how a development team chose which one to use.
Yes. Things like extra RAM and batteries also played an important part in the manufacturing cost of the cartridges.
Quote:
I don't understand what it means by "in each pair".
I think it's because most operations on the MMC3 are performed with two register writes: the first selects the operation and the second executes it. For example, to bankswitch a page of PRG-ROM you must first tell the mapper where in the addressing space the page will go, and then you tell it which page to put there.
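Here is a toy Python model of that two-step write for the pair at $8000-$9FFF. It is a sketch only: it ignores the mode bits in the upper part of the bank select value and does not model the other register pairs, and the class name is made up.

```python
class Mmc3BankRegisters:
    """Toy model of the MMC3 bank select / bank data pair at $8000-$9FFF."""

    def __init__(self):
        self.selected = 0     # which of R0..R7 the next data write updates
        self.banks = [0] * 8  # R0..R7

    def write(self, addr, value):
        if not 0x8000 <= addr <= 0x9FFF:
            return  # other register pairs are not modelled here
        if addr & 1 == 0:
            self.selected = value & 0x07       # even address: bank select
        else:
            self.banks[self.selected] = value  # odd address: bank data


mapper = Mmc3BankRegisters()
mapper.write(0x8000, 6)  # first write: choose register R6 (a PRG bank register)
mapper.write(0x8001, 5)  # second write: put bank 5 there
print(mapper.banks)
```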
Quote:
So, in addition to setting the header correctly, one must set things at the address in the mapper correctly too, in order for a game to run?
Yes, you must configure the mapper to make sure everything (mirroring, active PRG and CHR pages, etc) is as you expect.
Quote:
Does it reserve all of $A000 to $BFFE to say what the mirroring is? Why can't it just use a couple bits?? I think I'm misunderstanding something...
Because the 6502 relies on memory-mapped registers to communicate with different devices, when you write something to the mapper, the mapper has to decode the address in order to know which register you wrote to. The cart receives 15 address bits, and if you wanted a register to be accessible through a single memory location, the mapper would have to decode all 15 bits, increasing its complexity (and cost) unnecessarily. Since the mapper only has a few registers, it's easier to just decode a few bits and ignore the rest. As a side effect, the few registers are mirrored several times across the addressing space (the actual layout depends on which bits are decoded and which are ignored).
So, in the case you mentioned, it's not that the mapper needs a shitload of bytes just to configure the mirroring, it's just that the register that controls the mirroring can be accessed through any address in that range, but it's still just one register.
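The partial decoding can be sketched in a few lines of Python. Only the top three address bits and the lowest bit are looked at, so every even address from $A000 to $BFFE lands on the same mirroring register. The register names follow the MMC3 wiki page; the function name is made up.

```python
# Keyed by (address with only A15-A13 kept, A0).
MMC3_REGISTERS = {
    (0x8000, 0): "bank select",
    (0x8000, 1): "bank data",
    (0xA000, 0): "mirroring",
    (0xA000, 1): "PRG RAM protect",
    (0xC000, 0): "IRQ latch",
    (0xC000, 1): "IRQ reload",
    (0xE000, 0): "IRQ disable",
    (0xE000, 1): "IRQ enable",
}


def mmc3_register(addr):
    """Return which MMC3 register a CPU write to addr reaches."""
    if not 0x8000 <= addr <= 0xFFFF:
        raise ValueError("address is outside $8000-$FFFF")
    # All other address bits are ignored, which is what creates the mirrors.
    return MMC3_REGISTERS[(addr & 0xE000, addr & 1)]


print(mmc3_register(0xA000), mmc3_register(0xB123 & 0xFFFE), mmc3_register(0xBFFE))
```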
EDIT: I can't help you with the ca65 stuff.
Quote:
More than what the mapper supports? I'm sure you can assemble very large ROMs, and some emulators will even run them without problems, but you will not be able to put them on real carts.
Yeah, that's what I meant.
Quote:
I think it's because most operations on the MMC3 are performed with two register writes: the first selects the operation and the second executes it. For example, to bankswitch a page of PRG-ROM you must first tell the mapper where in the addressing space the page will go, and then you tell it which page to put there.
Quote:
Because the 6502 relies on memory-mapped registers to communicate with different devices, when you write something to the mapper, the mapper has to decode the address in order to know which register you wrote to. The cart receives 15 address bits, and if you wanted a register to be accessible through a single memory location, the mapper would have to decode all 15 bits, increasing its complexity (and cost) unnecessarily. Since the mapper only has a few registers, it's easier to just decode a few bits and ignore the rest. As a side effect, the few registers are mirrored several times across the addressing space (the actual layout depends on which bits are decoded and which are ignored).
So, in the case you mentioned, it's not that the mapper needs a shitload of bytes just to configure the mirroring, it's just that the register that controls the mirroring can be accessed through any address in that range, but it's still just one register.
Ok, I understand the reasoning, and how they're supposed to work now.
FinalZero wrote:
And finally, do I have the following right?: The ca65 assembler creates an object file from assembly files, and then the ld65 linker creates a NES file from the object file?
-----
http://www.cc65.org/doc/ld65-2.html
Looking at the ld65 documentation, the -t (specifies target system) option and -C (specifies custom config file) option can't be used together. What kind of rom does the linker default to when "-t nes" is used?
On my Gentoo Linux development server (with cc65 installed):
Code:
djenkins@hera ~/code/nesyar $ find /usr/local -name "nes*"
/usr/local/lib/cc65/lib/nes.lib
/usr/local/lib/cc65/lib/nes.o
/usr/local/lib/cc65/joy/nes-stdjoy.joy
/usr/local/lib/cc65/asminc/nes.inc
/usr/local/lib/cc65/include/nes.h
/usr/local/share/doc/cc65/nes.cfg
The linker config file is the last one ("nes.cfg"). It is fairly well documented internally. It implements a NES with no mapper (32K PRG ROM) and 8K of CHR ROM, but adds 8K of PRG RAM ($6000 to $7fff). This linker file also allocates 2 pages of RAM ($300 to $4ff) for cc65's internal use with C code.
Personally, I would not use the default linker file, even if I were writing NES software in a mix of C and asm.
In a NES game with a mapper, using ca65, one would give each banked segment a different segment name. The linker will tell you when you try to put too much code (or data) into a segment. The linker will (optionally) produce a "map" file that shows where it put each segment, and list how much free space is in each segment. ld65 can also produce a "debugger" file which is much more detailed.
These files are easy to parse with perl or python. I wrote a small perl script that digests the map file to tell me how much space I have left in the three segments that I care about (in my non bank-switched game):
Code:
djenkins@hera ~/code/nesyar $ cat tools/free-space.pl
#!/usr/bin/perl -w
# NesYar/tools/free-space.pl
# Analyze ld65 linker "map" file, print how much free space I have
# in various segments.

use strict;
use warnings;
use diagnostics;

my $segment_list = 0;
my $zp_end = 0;
my $kernel_end = 0;
my $vectors_start = 0;
my $data_end = 0;

while (<STDIN>) {
    chomp;
    my $line = $_;

    # Only look at lines between "Segment list:" and "Exports list:".
    $segment_list = 1 if ($line =~ m/^Segment list:$/);
    next if (! $segment_list);
    $segment_list = 0 if ($line =~ m/^Exports list:$/);

    # Columns are: name, start, end, size (all hex, whitespace separated).
    if ($line =~ m/^ZEROPAGE\s+([0-9A-F]+)\s+([0-9A-F]+)\s+([0-9A-F]+)$/) {
        $zp_end = hex ($2);
    } elsif ($line =~ m/^KERNEL\s+([0-9A-F]+)\s+([0-9A-F]+)\s+([0-9A-F]+)$/) {
        $kernel_end = hex ($2);
    } elsif ($line =~ m/^VECTORS\s+([0-9A-F]+)\s+([0-9A-F]+)\s+([0-9A-F]+)$/) {
        $vectors_start = hex ($1);
    } elsif ($line =~ m/^DATA\s+([0-9A-F]+)\s+([0-9A-F]+)\s+([0-9A-F]+)$/) {
        $data_end = hex ($2);
    }
}

# Colour the output yellow on terminals that understand ANSI escapes.
print "\x1b[0;33m" unless ("$^O" eq "MSWin32");
print sprintf ("ZP avail: %5d bytes\n", 256 - $zp_end);
print sprintf ("DATA avail: %5d bytes\n", 2048 - $data_end);
print sprintf ("ROM avail: %5d bytes\n", $vectors_start - $kernel_end);
print "\x1b[0m" unless ("$^O" eq "MSWin32");
It produces output like this:
Code:
./tools/free-space.pl < ./nesyar.map
ZP avail: 16 bytes
DATA avail: 961 bytes
ROM avail: 7190 bytes
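For anyone who prefers Python, a rough equivalent of the parsing part might look like the sketch below. It assumes the same map file layout that the Perl regexes above imply (a "Segment list:" section with name, start, end and size columns in hex, ending at "Exports list:"); the function name is made up, and the exact column spacing in your ld65 version may differ.

```python
import re

# name, start, end, size; the last three are hex numbers
SEGMENT_LINE = re.compile(
    r"^(\w+)\s+([0-9A-Fa-f]+)\s+([0-9A-Fa-f]+)\s+([0-9A-Fa-f]+)\s*$"
)


def parse_segment_list(map_text):
    """Return {name: (start, end, size)} from the segment list of a map file."""
    segments = {}
    in_list = False
    for line in map_text.splitlines():
        if line.startswith("Segment list:"):
            in_list = True
            continue
        if line.startswith("Exports list:"):
            break
        if in_list:
            match = SEGMENT_LINE.match(line)
            if match:  # header and separator lines do not match
                name, start, end, size = match.groups()
                segments[name] = (int(start, 16), int(end, 16), int(size, 16))
    return segments
```

From the returned dictionary, free zero page is 256 minus the ZEROPAGE size, and free ROM is the VECTORS start minus (KERNEL start + KERNEL size), using the segment names from the script above.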
FinalZero wrote:
What happens if there's more? It's just simply not available? Linker error?
If you try to add more data than what the link script supports, you will get a linker error.
Quote:
I suppose mappers varied in price too, thus affecting how a development team chose which one to use.
If you want more detail beyond what tokumaru explained, look at one of the few published NES game post-mortems. This one mentions mapper pricing: a scanline counter cost money (because it brought in MMC3), a dedicated switchable bank for DPCM samples cost money (because it also brought in MMC3), PRG RAM cost money (because it brought in a 6264 and at least MMC1), and a battery cost money on top of that.
Quote:
So, in addition to setting the header correctly, one must set things at the address in the mapper correctly too, in order for a game to run?
Yes. The first part of your startup code that sets up the mapper must be located in a fixed bank. For UNROM (mapper 2), this is $C000-$FFFF. For configurations of MMC3 and MMC6 (mappers 4, 118, and 119), this is $E000-$FFFF. For mappers with no fixed bank, such as A*ROM (mapper 7) or S*ROM (mapper 1), the first part of the code must be repeated at the same place in all banks.
Quote:
And finally, do I have the following right?: The ca65 assembler creates an object file from assembly files, and then the ld65 linker creates a NES file from the object file?
Yes, just like a typical C toolchain.
Quote:
http://www.cc65.org/doc/ld65-2.html
Looking at the ld65 documentation, the -t (specifies target system) option and -C (specifies custom config file) option can't be used together. What kind of rom does the linker default to when "-t nes" is used?
It defaults to whatever the built-in link script for NES uses. I believe this built-in link script corresponds to a board that doesn't actually exist (NROM with 8 KiB PRG RAM), though it could be created with circuitry similar to that used in Family BASIC.
Quote:
ca65 delegates placement of code in the binary image to the linker.
So then, does .org do anything at all in ca65 asm?
Quote:
If you try to add more data than what the link script supports, you will get a linker error.
Ok.
Quote:
If you want more detail beyond what tokumaru explained, look at one of the few published NES game post-mortems. This one mentions mapper pricing: a scanline counter cost money (because it brought in MMC3), a dedicated switchable bank for DPCM samples cost money (because it also brought in MMC3), PRG RAM cost money (because it brought in a 6264 and at least MMC1), and a battery cost money on top of that.
Thanks for the link.
Quote:
Yes. The first part of your startup code that sets up the mapper must be located in a fixed bank. For UNROM (mapper 2), this is $C000-$FFFF. For configurations of MMC3 and MMC6 (mappers 4, 118, and 119), this is $E000-$FFFF. For mappers with no fixed bank, such as A*ROM (mapper 7) or S*ROM (mapper 1), the first part of the code must be repeated at the same place in all banks.
Ok.
Quote:
Yes, just like a typical C toolchain.
Ok. It sounds stupid but I didn't realize it was so. It didn't help that other assemblers simply skip the linker step... With this in mind, my project assembles and links, and then plays in FCEUX without generating an error!, but the screen is blank and nothing happens... I don't have the linker set up correctly yet...
Quote:
It defaults to whatever the built-in link script for NES uses. I believe this built-in link script corresponds to a board that doesn't actually exist (NROM with 8 KiB PRG RAM), though it could be created with circuitry similar to that used in Family BASIC.
Interesting.
-----
@clueless: Thank you for the script. I don't know Perl (I know some Python, but have never written anything major in it), but I have to get around to learning it sometime...
Also, in your thread on the second page there's a link to http://nesdev.com/bbs/viewtopic.php?t=2997 . Where is the patch available for download, though?
-----
I don't have the linker set up completely yet, so my next push will focus on that, which is bound to generate more questions, which I'll dump here. Stay tuned!
-----
Thank you to all for answering all my questions so far, especially when the average internet board would eat somebody alive for something like this.
FinalZero wrote:
Quote:
ca65 delegates placement of code in the binary image to the linker.
So then, does .org do anything at all in ca65 asm?
According to the manual, it turns on absolute code mode temporarily. It's one way of making overlay code intended to be copied to RAM before execution; the other (preferred?) way is to specify a segment in the link script whose load and run addresses differ.
Quote:
Quote:
Yes, just like a typical C toolchain.
Ok. It sounds stupid but I didn't realize it was so.
It acts like a C compiler because it's bundled with one.
Quote:
Thank you to all for answering all my questions so far, especially when the average internet board would eat somebody alive for something like this.
Let me tell you part of why I don't bite newbies: I want to help demonstrate the legality of NES emulators. To be legal under US law, a copying technology has to have a substantial noninfringing use. Debian (and hence Ubuntu) accepts NES emulators, but Fedora doesn't because someone on fedora-legal thinks the three dozen or so noninfringing ROMs on pdroms.de are not substantial compared to the hundreds of infringing ROMs in a typical GoodNES set. But every time we train an eager newbie to be an NES coder or artist, we potentially get one step closer to a substantial library of playable homebrew games.
FinalZero wrote:
Quote:
ca65 delegates placement of code in the binary image to the linker.
So then, does .org do anything at all in ca65 asm?
Yeah, it screws it up, don't use it. Seriously, I tried to use ".org" to create a very specific layout for my zero-page variables, to make viewing them in FCEUX's debugger easier. However, after using ".org" and ".reloc" a few times, the assembler produced unlinkable code. I don't remember the exact error. So I replaced my usage of ".org" with lots of ".aligns" and ".assert" and I hand-tune my variable declaration list whenever an assert fails. I also use lots of "structs", but I'm a C junkie.
Ex:
Code:
;; For ease of debugging, we want the "Yar" and "Qotile" at multiples of 16
.align 16
.assert (* = $30), error, "Zero-page layout"
qotile_obj: .tag OBJ
qotile_dat: .tag QOTILE
.align 16
.assert (* = $40), error, "Zero-page layout"
yar1_obj: .tag OBJ
bullet1_obj: .tag OBJ
FinalZero wrote:
Quote:
Yes, just like a typical C toolchain.
Ok. It sounds stupid but I didn't realize it was so. It didn't help that other assemblers simply skip the linker step... With this in mind, my project assembles and links, and then plays in FCEUX without generating an error!, but the screen is blank and nothing happens... I don't have the linker set up correctly yet...
If you wish, post your linker config somewhere and we'll take a look at it.
I also recommend that you enable the linker map file and debug file and review them. They will show you where the linker actually put stuff.
Also just load the ROM into FCEUX and look at it in the debugger. Are the three 6502 vectors ($fffa - $ffff) set properly?
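If you'd rather check the vectors without opening the debugger, a short Python sketch like this one reads them straight from the .nes file. It assumes the last six bytes of PRG ROM are what the CPU sees at $FFFA-$FFFF at reset, which holds for NROM and for mappers whose last bank is fixed; the function name is made up.

```python
import struct


def read_vectors(rom):
    """Return (nmi, reset, irq) from the last six bytes of PRG ROM in an iNES image."""
    if rom[:4] != b"NES\x1a":
        raise ValueError("not an iNES file")
    prg_size = rom[4] * 16384
    prg_start = 16 + (512 if rom[6] & 0x04 else 0)  # skip the optional trainer
    prg_end = prg_start + prg_size
    if prg_size < 6 or len(rom) < prg_end:
        raise ValueError("PRG ROM is missing or truncated")
    # $FFFA = NMI, $FFFC = RESET, $FFFE = IRQ/BRK, each little-endian
    return struct.unpack("<HHH", rom[prg_end - 6:prg_end])
```

Typical use would be `read_vectors(open("game.nes", "rb").read())` with your own file name. If all three come back as 0, the vectors segment most likely never made it into the output file.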
You can use FCEUX's debugger to single-step through the execution of your ROM. FCEUX can also take as input a "symbol file", but I have yet to write a tool to convert ld65's ".deb" file into something that pleases FCEUX. When I do, I'll happily share it (but it will be in perl). Tepples might be willing to create one in python, or convert mine. I don't know python.
FinalZero wrote:
@clueless: Thank you for the script. I don't know Perl (I know some Python, but have never written anything major in it), but I have to get around to learning it sometime...
Sure! I believe in sharing and giving back to the community and helping anyone that genuinely wants it.
FinalZero wrote:
Also, in your thread on the second page there's a link to http://nesdev.com/bbs/viewtopic.php?t=2997 . Where is the patch at/available for download though?
I think that you are slightly mistaken. I did not create that thread, I have no posts in it, and it's only one page long.
I read that thread when it came out. I never noticed the bug in ca65. Since my code works fine, I've not bothered to patch. This is what I use:
Code:
djenkins@hera ~/code/nesyar $ ca65 --version
ca65 V2.13.2 - (C) Copyright 1998-2005 Ullrich von Bassewitz
FinalZero wrote:
Thank you to all for answering all my questions so far, especially when the average internet board would eat somebody alive for something like this.
Certainly. You are most welcome. We feed rude noobs to NovaYoshi. We keep the good ones.
One more suggestion: Read the ca65 and ld65 docs a few times. Just pick some small section and read about a feature. You might not use it right away, but you'll learn what is available for when you might need it.
You won't need all features either, so if something looks really bizarre, skip it.
If you need to, write your own run-time "assert" logic. When an assertion fails, execute the invalid instruction $02. This will "halt" the CPU. Then set a breakpoint inside FCEUX to trip on invalid opcodes (there is a checkbox for this).
Ex:
Code:
    lda player_energy
    cmp #MAX_ENERGY
    bcc ok          ;; player_energy < MAX_ENERGY, so the assertion holds
    .byte $02       ;; 6502 "KILL" opcode, reached only when the assertion fails
ok:
tepples wrote:
According to the manual, it turns on absolute code mode temporarily. It's one way of making overlay code intended to be copied to RAM before execution; the other (preferred?) way is to specify a segment in the link script whose load and run addresses differ.
I had been curious about the load and run addresses, but have never tried to use them because I (wrongly) thought it had to do with the C compiler's startup code or something. How does that work, in practice?
I've often used .org like this:
Code:
outside_label:
.org $0700
inside_label:
nop
.reloc
lda outside_label,x
sta inside_label,x
jmp inside_label
But any kind of cleaner method would be nice to know; this is really useful when you have a lot of RAM. I know I've been doing it "wrong", but haven't needed to change that yet.
The "load address" is where the code is placed in the ROM, and the "run address" is where you're expected to copy it in RAM before running it.
That makes sense, I guess I was thinking there was more to it. So you would put "outside_label" and "inside_label" in different segments, making sure the linker outputs it in that order. That's pretty easy.
Quote:
It acts like a C compiler because it's bundled with one
Oh, I see; cc65 uses that linker, so they hooked ca65 into it also.
Quote:
Let me tell you part of why I don't bite newbies: I want to help demonstrate the legality of NES emulators. To be legal under US law, a copying technology has to have a substantial noninfringing use. Debian (and hence Ubuntu) accepts NES emulators, but Fedora doesn't because someone on fedora-legal thinks the three dozen or so noninfringing ROMs on pdroms.de are not substantial compared to the hundreds of infringing ROMs in a typical GoodNES set. But every time we train an eager newbie to be an NES coder or artist, we potentially get one step closer to a substantial library of playable homebrew games.
I see, though I can't promise I'll turn out any games.
Quote:
Yeah, it screws it up, don't use it. Seriously, I tried to use ".org" to create a very specific layout for my zero-page variables, to make viewing them in FCEUX's debugger easier. However, after using ".org" and ".reloc" a few times, the assembler produced unlinkable code. I don't remember the exact error. So I replaced my usage of ".org" with lots of ".aligns" and ".assert" and I hand-tune my variable declaration list whenever an assert fails. I also use lots of "structs", but I'm a C junkie.
Ok.
Quote:
I also recommend that you enable the linker map file and debug file and review them. They will show you where the linker actually put stuff.
Ok, I've enabled them in the .bat file that assembles and links everything.
Quote:
Also just load the ROM into FCEUX and look at it in the debugger. Are the three 6502 vectors ($fffa - $ffff) set properly?
Nope, they're all zero.
Quote:
I think that you are slightly mistaken. I did not create that thread, I have no posts in it, and it's only one page long.
I meant "your" not as in that thread, but as in this thread: http://nesdev.com/bbs/viewtopic.php?p=37306 , which contains a link to that thread.
Quote:
I never noticed the bug in ca65.
It's not a bug in ca65. It's that one forgets to prepend the # to the beginning of an immediate value, thus instead using the value at an address, which might sometimes work and sometimes not.
Quote:
Certainly. You are most welcome. We feed rude noobs to NovaYoshi. We keep the good ones.
This NovaYoshi sounds scary. I don't recall reading any posts by him.
Quote:
If you wish, post your linker config somewhere and we'll take a look at it.
Ok, here's what I have:
Code:
# Linker
#---------------------------------------------------------------------------
MEMORY {
# .nes Header
HEADER: start = $0000, size = $0010, file = %O, define = no;
# fix
}
#---------------------------------------------------------------------------
SEGMENTS {
HEADER: load = HEADER, type = ro, define = no;
# fix
}
#---------------------------------------------------------------------------
I get an error that says: """ld65.exe: Error: Linker.ld65(1): Block identifier expected""". If I remove the beginning comment lines, it gives: """ld65.exe: Error: Linker.ld65(1): `}' expected""" instead. Oddly, I'm not quite sure how I got it to work/link before this error started happening. Maybe I was still using the default linker, I don't remember.
-----
A final question: Say I include a file that has some scopes with variables and/or enumerations. I shouldn't need to define it as a segment and link it, right? But what if the file included a procedure? or a macro?
So, does anyone know why my linker code fails? If it's for some simple, stupid reason, please just yell at me and tell me why...
At first glance, your linker config file looks syntactically correct.
If you wish, place a zip file (or tarball if you're on unix) someplace where I can download it (pm me if you want to keep the url a secret) and I'll take a really close look at it for you.
clueless wrote:
At first glance, your linker config file looks syntactically correct.
If you wish, place a zip file (or tarball if you're on unix) someplace where I can download it (pm me if you want to keep the url a secret) and I'll take a really close look at it for you.
There's nothing else to look at though. I posted the entirety of the linker file.
Edit: Your response made me uneasy because it was unexpected. So I made a new linker file and literally just copied and pasted the code from the old file into the new one, and it links like it's supposed to. This time I get a """Missing memory area assignment for segment 'CODE'""" error, which rather obviously points out what is wrong.
One of the things that I was going to look for in your raw file was the inclusion of any non-printable characters. The parser for the linker will find them, even if your editor hides them.
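A check like that is easy to script. This Python sketch reports every byte that is not plain printable ASCII, with its line and column; a UTF-8 byte order mark at the very start of a file is one example of something an editor may hide. The function name is made up, and whether this was the actual cause here is only a guess.

```python
def find_suspect_bytes(data):
    """Return (line, column, byte) for every byte that is not plain printable ASCII.

    Tab, LF and CR are allowed; anything else outside 0x20-0x7E is reported.
    """
    hits = []
    line, col = 1, 1
    for b in data:
        if b == 0x0A:  # newline: move to the start of the next line
            line, col = line + 1, 1
            continue
        if b not in (0x09, 0x0D) and not 0x20 <= b <= 0x7E:
            hits.append((line, col, b))
        col += 1
    return hits
```

Typical use would be `find_suspect_bytes(open("linker.cfg", "rb").read())`, with the file name replaced by your own; an empty list means the file is clean.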
What I meant by posting a zip file (I should have been clearer) was a zip file containing your entire project, not just the linker file. I did not realize that was your entire project to date. I assumed that your linker file started out with more segments, you got the error, and then reduced the problem space down to the smallest that it could be and still exhibit the defect, then posted.
cc65, ca65, ld65 have a steeper learning curve than the other common nesdev tool kits. But the cc65 project provides much more power and control over your final binary. I think that you'll do fine if you keep at it.
I stumbled through these frustrations myself. Writing the assembly was ok, as I already knew 6502 assembly and was familiar with various assembler syntaxes. For me, the hard part was getting the linker to arrange my pieces where I wanted them. Then one day it just clicked.
I used to use cc65 to cross compile code for the Apple IIe years ago. Each test cycle I would send my program over a 57600 serial null modem cable using zmodem to a real Apple IIe.
Quote:
One of the things that I was going to look for in your raw file was the inclusion of any non-printable characters. The parser for the linker will find them, even if your editor hides them.
Maybe that was it; I can't check because I've already deleted/exorcised the old linker file.
Ok then, so how do I know where to put the CODE segment? Is this what's called the PRG ROM bank? I'm trying to use MMC3 (http://wiki.nesdev.com/w/index.php/MMC3).
You will probably have many code segments. One in each bank, at least.
Also, some people like to separate "read only data" (think internal data tables) from "code". You don't need to do this, but you may choose to if you wish.
I do not have an MMC3 example, but I do have one MMC1 example and one NROM example. The first one is from my first nesdev attempt, back in 2008.
MMC1 provides for 16 16K ROM banks (well, it might do more, I don't remember; my game wanted to use 16 of them). The top bank is not switched, and is always at $c000 to $ffff (the MMC1 allows other configs, but this is the one that I chose). The permanent segment is called KERNEL (see the very bottom of my linker.cfg).
The other "ROM_0" through "ROM_E" are all bank-switched at $8000 to $bfff. I decided to put the bank number at $bfff in each bank, so that code running in KERNEL can push/pop the active bank during an NMI, should it need to bankswitch to handle the NMI. That is the "ROM_x_MARKER" segment. You can ignore those.
The linker section "SEGMENTS" defines the order that the segments are laid out in the ROM file. That is why "HEADER" is first. "ZEROPAGE", "PLAYER" and "DATA" all define RAM, so those are not stored in the final binary. However, the linker needs to know about them so that it can assign addresses and do the "address fixups" when emitting the actual binary.
Note that my MMC1 example uses CHAR-RAM, so you don't see a CHAR-ROM in the SEGMENTS or MEMORY tables.
https://www.ecoligames.com/trac/nes-gam ... linker.cfg
My second example, NROM:
Below is the linker file for my Yars' Revenge game. This game DOES use char-rom, but no bank-switching. Prog-rom is 16k, char-rom is 8k. Note that in this config, I define four segments that together fill the 16k prog-rom at $c000 to $ffff: BRAND, RODATA, KERNEL and VECTORS. RODATA and KERNEL both load into the memory area called "KERNEL". I just chose to use my own segment names instead of the ld65 defaults. When you see "KERNEL", think "CODE", so my examples might be a bit confusing.
I also split the NES's 2K of ram into different memory regions, and I use the linker config file to assign these memory regions to absolute addresses. QSHIELD is a 128 byte buffer of tiles to be blitted to the name-table, 2 columns 16 bytes tall, every NMI (the entire translated QS is stored in ram). NZONE contains the neutral zone tiles that are to be blitted to the name-table, 2 columns 30 tiles tall, every NMI (but only two columns are stored in RAM). I didn't have to do it this way.
I'm showing this because I think that it helps illustrate the flexibility of what the linker can do for you.
Code:
# NesYar/src/linker.cfg
# http://www.cc65.org/doc/ld65-5.html
MEMORY {
    ZP:      start = $0000, size = $0100, type = rw, define = no;
    OAMBUF:  start = $0200, size = $0100, type = rw, define = no;
    QSHIELD: start = $0300, size = $0080, type = rw, define = no;
    NZONE:   start = $0380, size = $0020, type = rw, define = no;
    RAM:     start = $0400, size = $0400, type = rw, define = no;
    HEADER:  start = $0000, size = $0010, file = %O, fill = yes, define = no;
    BRAND:   start = $c000, size = $0100, file = %O, fill = yes, define = no, fillval = $00;
    KERNEL:  start = $c100, size = $3ef6, file = %O, fill = yes, define = no, fillval = $00;
    VECTORS: start = $fff6, size = $000a, file = %O, fill = yes, define = no;
    CHARROM: start = $0000, size = $2000, file = %O, fill = no, define = no;
}
SEGMENTS {
    HEADER:   load = HEADER,  type = ro,  define = no;
    BRAND:    load = BRAND,   type = ro,  define = no;
    RODATA:   load = KERNEL,  type = ro,  define = no, align = $100;
    KERNEL:   load = KERNEL,  type = ro,  define = no;
    VECTORS:  load = VECTORS, type = ro,  define = no;
    CHARROM:  load = CHARROM, type = ro,  define = no;
    ZEROPAGE: load = ZP,      type = zp,  optional = yes, align = $100;
    OAMBUF:   load = OAMBUF,  type = bss, define = no;
    DATA:     load = RAM,     type = bss, define = no, align = $100;
    NZONE:    load = NZONE,   type = bss, define = no;
    QSHIELD:  load = QSHIELD, type = bss, define = no;
}
FILES {
    %O: format = bin;
}
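If you want to convince yourself that the numbers in a MEMORY block like this add up, a quick arithmetic check helps. The sketch below just copies the start/size values from the config above into Python dictionaries (the names RAM_AREAS, PRG_AREAS, overlaps and is_contiguous are mine, purely for illustration) and checks that the RAM areas do not collide and that the three PRG areas exactly fill 16K.

```python
# Values copied from the MEMORY block above: name -> (start, size).
RAM_AREAS = {
    "ZP":      (0x0000, 0x0100),
    "OAMBUF":  (0x0200, 0x0100),
    "QSHIELD": (0x0300, 0x0080),
    "NZONE":   (0x0380, 0x0020),
    "RAM":     (0x0400, 0x0400),
}
PRG_AREAS = {
    "BRAND":   (0xC000, 0x0100),
    "KERNEL":  (0xC100, 0x3EF6),
    "VECTORS": (0xFFF6, 0x000A),
}
HEADER_SIZE = 0x0010
CHR_SIZE = 0x2000


def overlaps(areas):
    """Return pairs of area names whose address ranges intersect."""
    items = sorted(areas.items(), key=lambda kv: kv[1][0])
    bad = []
    for (name_a, (start_a, size_a)), (name_b, (start_b, _)) in zip(items, items[1:]):
        if start_a + size_a > start_b:
            bad.append((name_a, name_b))
    return bad


def is_contiguous(areas):
    """True if the areas, sorted by start, touch end to end with no gaps or overlaps."""
    items = sorted(areas.values())
    return all(start + size == nxt for (start, size), (nxt, _) in zip(items, items[1:]))


prg_size = sum(size for _, size in PRG_AREAS.values())
rom_file_size = HEADER_SIZE + prg_size + CHR_SIZE
print(hex(prg_size), rom_file_size)  # prints: 0x4000 24592
```

The 24592 figure is simply 16 bytes of header plus 16K of prog-rom plus 8K of char-rom, which is what the output file should weigh if every file-backed area is filled.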
btw, the 6502 only has 3 vectors. You'll notice that my linker config allocated 10 bytes instead of 6. I use the extra four bytes to store the assembler time stamp of when the build was made, thusly:
Code:
.segment "VECTORS"
build_ts:
.dword .time
.word sys_nmi
.word sys_reset
.word sys_irq
And I include my char-rom like this:
Code:
.segment "CHARROM"
.incbin "src/charrom.chr", 0, 8192
And the header:
Code:
.segment "HEADER"
.byte $4e,$45,$53,$1a ; header magic
.byte $01 ; # 16K prog roms
.byte $01 ; # 8K char roms
.byte %00000001 ; Vertical mirroring, NROM, no SRAM
.byte %00000000 ; NES mapper upper nibble
.byte 0,0,0,0,0,0,0,0 ; filler
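To double-check what those 16 header bytes say, here is a small decoder sketch. It assumes the classic iNES layout (magic, PRG count, CHR count, two flag bytes, padding); the function name parse_ines_header and the dictionary keys are made up for this example.

```python
def parse_ines_header(header: bytes) -> dict:
    """Decode the basic fields of a 16-byte iNES header."""
    if len(header) != 16 or header[:4] != b"NES\x1a":
        raise ValueError("not an iNES header")
    flags6, flags7 = header[6], header[7]
    return {
        "prg_rom_bytes": header[4] * 16 * 1024,   # count of 16K prog-rom banks
        "chr_rom_bytes": header[5] * 8 * 1024,    # count of 8K char-rom banks
        "vertical_mirroring": bool(flags6 & 0x01),
        "battery_sram": bool(flags6 & 0x02),
        # Low mapper nibble lives in the top of flags6, high nibble in the top of flags7.
        "mapper": (flags7 & 0xF0) | (flags6 >> 4),
    }


# The same bytes as the .byte lines above.
header = bytes([0x4E, 0x45, 0x53, 0x1A, 0x01, 0x01, 0b00000001, 0b00000000] + [0] * 8)
info = parse_ines_header(header)
print(info)
```

For the header above this reports 16K of prog-rom, 8K of char-rom, vertical mirroring, no battery SRAM and mapper 0 (NROM), matching the comments in the asm.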
I just noticed that in my 2008 MMC1 example, I defined the "ROM_x_MARKERS" in the "MEMORY" section of the linker config (this is used when doing "address fixups" when emitting the actual binary).
But I never used them in the "SEGMENTS", so the linker never should have emitted those magic bytes to the binary. Either I mis-remembered something and have given you a bad example, or my example is slightly flawed.
Either way, I hope that my examples help you understand the relationship between using ".segment" inside your asm source, and what the linker can do with those segments.
Quote:
You will probably have many code segments. One in each bank, at least.
Ok, but how do I decide what to put in each segment? How do I decide when to start a new segment? How do I decide what to put in each bank? How do I decide how large I want each rombank to be? How do I decide how many of them I want?
Quote:
Note that my MMC1 example uses CHAR-RAM, so you don't see a CHAR-ROM in the SEGMENTS or MEMORY tables.
I know what RAM and ROM is, but what does this mean?
Quote:
The linker section "SEGMENTS" defines the order that the segments are laid out in the ROM file. That is why "HEADER" is first. "ZEROPAGE", "PLAYER" and "DATA" all define RAM, so those are not stored in the final binary. However, the linker needs to know about them so that it can assign addresses and do the "address fixups" when emitting the actual binary.
But how do we tell the linker that they're just RAM? Is that what the """type = rw""" signifies?
More Questions:
1) What is "char rom" supposed to stand for anyways? "character (font) rom"?
2) Is the
Code:
FILES {
%O: format = bin;
}
part really needed? Doesn't it do that by default?
3) What is "bss"?
FinalZero wrote:
I know what RAM and ROM is, but what does this mean?
Quote:
1) What is "char rom" supposed to stand for anyways? "character (font) rom"?
NES carts have at least 2 memory chips: one for the program (always ROM) and one for tiles/characters (can be ROM or RAM). If the CHR is ROM, when you assemble the NES file you have to put its contents at the end of the file, but if the CHR is RAM, the NES file doesn't have any tiles at the end. The tiles will instead be stored somewhere in the PRG-ROM (wherever you want, in whatever format you want, they can even be compressed to save space) and the program itself will have to copy/decompress them to CHR-RAM using $2006/$2007.
Sorry if I can't help with the assembler-specific stuff.
So "PRG-ROM" stands for program rom? Which means, like, the actual game code, while "CHR-ROM" means the stored graphics?
FinalZero wrote:
Quote:
You will probably have many code segments. One in each bank, at least.
Ok, but how do I decide what to put in each segment? How do I decide when to start a new segment? How do I decide what to put in each bank? How do I decide how large I want each rombank to be? How do I decide how many of them I want?
The size of a bank (and therefore the most that a segment placed in it can hold) is dictated by the mapper that you are using. For MMC1 they will be 16K or 32K.
http://kevtris.org/mappers/mmc1/index.html
As the developer, you get to decide what you want to put where. The only "requirements" are
1) 6502 vectors (last 6 bytes of CPU address space) are mapped in when they could be invoked (and for reset, that is anytime), and that they point to the correct executable code.
2) If you want to use DCPM, your samples should be mapped into the CPU's address space. I've not experimented with DCPM yet, so I don't know the specifics.
I've read that The Legend of Zelda uses the first bank for its sound driver and music data.
I know that Crystalis (MMC3) uses its first few banks for storing all of the game's map data.
FinalZero wrote:
Quote:
Note that my MMC1 example uses CHAR-RAM, so you don't see a CHAR-ROM in the SEGMENTS or MEMORY tables.
I know what RAM and ROM is, but what does this mean?
The physical cart needs to provide the PPU with 8K of addresses that it can fetch from. This can be ROM or RAM (or rarely, a combination of both).
Crystalis uses multiple char-rom banks, and switches parts of them every few frames to achieve background animation, like waves crashing on the shore-line, or tall grasses waving in the wind.
Final Fantasy (MMC1) uses char-ram. I would speculate that the variety of monster party encounters rules out the possibility of storing all of the monsters in one or two char-rom banks. Plus the over-world map trick (B-Select when on the overworld) is done using char-ram, it is freaking cool!
FinalZero wrote:
Quote:
The linker section "SEGMENTS" defines the order that the segments are laid out in the ROM file. That is why "HEADER" is first. "ZEROPAGE", "PLAYER" and "DATA" all define RAM, so those are not stored in the final binary. However, the linker needs to know about them so that it can assign addresses and do the "address fixups" when emitting the actual binary.
But how do we tell the linker that they're just RAM? Is that what the """type = rw""" signifies?
I think so. I don't recall, I've not had to edit my linker config in a while.
FinalZero wrote:
More Questions:
1) What is "char rom" supposed to stand for anyways? "character (font) rom"?
2) Is the
Code:
FILES {
%O: format = bin;
}
part really needed? Doesn't it do that by default?
3) What is "bss"?
1) Yes. (CHAR|PROG) = Character (PPU) / Program (CPU) address space.
2) Yes, it is redundant. It was required by the redundant department of redundancies, and my pedantic need to explicitly state what I want in the config file, and possibly by the voices in my cat's mind. Seriously, I don't know why I have it like that. I probably had specified other options there in the past, and then removed them, but left that section of the linker config.
3) The Wikipedia page can better articulate it than I can. Basically, BSS contains all non-heap global "data" that can change at run-time. Memory reserved for BSS is typically NOT stored in a binary, and the process's "crt0" code (the stuff that the compiler vendor wrote, it calls your "main()") will initialize it to 0x00 for you before invoking main.
http://en.wikipedia.org/wiki/.bss
Quote:
DCPM
I have no idea what this is.
Quote:
The physical cart needs to provide the PPU with 8K of addresses that it can fetch from.
But what does this mean?
Quote:
I would speculate that the variety of monster party encounters rules out the possibility of storing all of the monsters in one or two char-rom banks.
So what do they do then?
... I don't think I understand the difference between a bank and a segment. Is a bank just part of a segment?
Quote:
3) The Wikipedia page can better articulate it than I can. Basically, BSS contains all non-heap global "data" that can change at run-time. Memory reserved for BSS is typically NOT stored in a binary, and the process's "crt0" code (the stuff that the compiler vendor wrote, it calls your "main()") will initialize it to 0x00 for you before invoking main.
Ok, I understand bss in the context of C, but how does it relate to 6502 asm? Do all the variables have to stored there or something?
FinalZero wrote:
Quote:
DCPM
I have no idea what this is.
Me, misspelling DPCM. Delta Pulse Code Modulation. It is how the NES can play raw sound samples. The APU uses DMA to pull the samples from the high end of the CPU's address space.
FinalZero wrote:
Quote:
The physical cart needs to provide the PPU with 8K of addresses that it can fetch from.
But what does this mean?
The PPU uses two 4K banks, one for background tiles, one for sprites. The PPU can be configured to pull tiles and sprites from the same 4K bank, or from either bank. Together, the PPU "sees" 8K ($0000 to $1fff in the PPU's address space). This memory lives on a cart as ROM or RAM and is called "CHAR-xxx". Physically, on the bus between the NES and the cart, the PPU controls 13 address lines (2^13 = 8192 = 8K bytes (well, technically, the ISO unit is "KiB", not "KB", but we'll ignore that here)) and reads 8 data bits. There are other control signals (RW, /CE) used to tell the memory circuit if the PPU wants to read or write, and "enable" the chip or not.
FinalZero wrote:
Quote:
I would speculate that the variety of monster party encounters rules out the possibility of storing all of the monsters in one or two char-rom banks.
So what do they do then?
I'm sorry. I don't understand your question. What does _what_ do? I suppose that my example won't make any sense to you if you have not played FF1.
FinalZero wrote:
... I don't think I understand the difference between a bank and a segment. Is a bank just part of a segment?
It's all semantics. The cc65 compiler suite uses the term "segment", which can be of arbitrary length. The NES simply has address ranges and two buses (CPU and PPU). The term "bank switching" means that a memory controller (i.e., mapper chip) can swap memory out from underneath the CPU. Banks are typically large, and have sizes that are exact powers of 2, and are aligned on memory address boundaries that are powers of 2.
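To make the bank idea a bit more concrete, here is a hedged sketch of the arithmetic involved. It assumes 16K banks that all get switched in at $8000 (like the MMC1 layout described earlier) and a 16-byte iNES header in front of the PRG data; the names locate, is_power_of_two and is_aligned are invented for the example. A byte in a fixed last bank would show up at $c000 and above instead.

```python
INES_HEADER_SIZE = 16
BANK_SIZE = 0x4000   # 16K banks (assumption for this example)
SLOT_BASE = 0x8000   # CPU address where a switchable bank appears


def is_power_of_two(n: int) -> bool:
    return n > 0 and n & (n - 1) == 0


def is_aligned(addr: int, size: int) -> bool:
    """True if addr sits on a boundary that is a multiple of size."""
    return addr % size == 0


def locate(file_offset: int):
    """Map an offset in the .nes file to (bank number, CPU address when that bank is switched in)."""
    prg_offset = file_offset - INES_HEADER_SIZE
    if prg_offset < 0:
        raise ValueError("offset is inside the iNES header")
    bank, offset_in_bank = divmod(prg_offset, BANK_SIZE)
    return bank, SLOT_BASE + offset_in_bank


bank, addr = locate(INES_HEADER_SIZE + BANK_SIZE + 0x40)
print(bank, hex(addr))  # prints: 1 0x8040
```

A segment, by contrast, is just a named chunk the linker places somewhere inside one of those banks, so several segments can share a bank as long as they fit.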
FinalZero wrote:
Quote:
3) The Wikipedia page can better articulate it than I can. Basically, BSS contains all non-heap global "data" that can change at run-time. Memory reserved for BSS is typically NOT stored in a binary, and the process's "crt0" code (the stuff that the compiler vendor wrote, it calls your "main()") will initialize it to 0x00 for you before invoking main.
Ok, I understand bss in the context of C, but how does it relate to 6502 asm? Do all the variables have to stored there or something?
In a modern computer, the program's "data segment" is copied to RAM. "RODATA" and "TEXT" (an archaic term for "code") are marked read-only (if the CPU's MMU supports that). "Data" is loaded just above the end of the "text", and BSS above that. The app's heap starts above BSS and grows up, and the stack starts at the end of the process's address space and grows down. When the heap and stack collide, bad shit happens.
On the NES it is a bit different. There is no "program loader" that places different segments into memory. Your reset routine _IS_ the loader, so to speak. Typically, a NES program will contain CODE and RODATA in ROM; use all RAM as BSS (but it's not called that) and contain no traditional "DATA" segment. However, there is nothing to stop you from emitting a large chunk of "DATA" into your ROM and then copying it into RAM, where it can then be modified and treated as "DATA" like a C program executing on Unix would.
Quote:
Me, misspelling DPCM. Delta Pulse Code Modulation. It is how the NES can play raw sound samples. The APU uses DMA to pull the samples from the high end of the CPU's address space.
Ok.
Quote:
The PPU uses two 4K banks, one for background tiles, one for sprites. The PPU can be configured to pull tiles and sprites from the same 4K bank, or from either bank. Together, the PPU "sees" 8K ($0000 to $1fff in the PPU's address space). This memory lives on a cart as ROM or RAM and is called "CHAR-xxx". Physically, on the bus between the NES and the cart, the PPU controls 13 address lines (2^13 = 8192 = 8K bytes (well, technically, the ISO unit is "KiB", not "KB", but we'll ignore that here)) and reads 8 data bits. There are other control signals (RW, /CE) used to tell the memory circuit if the PPU wants to read or write, and "enable" the chip or not.
Ok. While we're on the topic of tiles and sprites, is there any way to just grab and display a 4x4 tile/sprite? or is 8x8 the smallest size that can be done? I'm wondering because if so, then smoother fonts could be created and used, rather than the typical ugly monospaced 8x8 one...
Quote:
I'm sorry. I don't understand your question. What does _what_ do? I suppose that my example won't make any sense to you if you have not played FF1.
I have played FF1, but not for the NES. I'm trying to ask how they store the data if they can't store it in one or two char banks. Also, do you mean the monster graphics themselves, or the possible combinations of monsters the player's party might meet? If the latter, why is that stored in char rom? It's not graphical.
Quote:
It's all semantics. The cc65 compiler suite uses the term "segment", which can be of arbitrary length. The NES simply has address ranges and two buses (CPU and PPU). The term "bank switching" means that a memory controller (i.e., mapper chip) can swap memory out from underneath the CPU. Banks are typically large, and have sizes that are exact powers of 2, and are aligned on memory address boundaries that are powers of 2.
So, what does one do if one needs data from two separate banks? Load it from one, switch to the other, and then load the rest?
Quote:
In a modern computer, the program's "data segment" is copied to RAM. "RODATA" and "TEXT" (an archaic term for "code") are marked read-only (if the CPU's MMU supports that). "Data" is loaded just above the end of the "text", and BSS above that. The app's heap starts above BSS and grows up, and the stack starts at the end of the process's address space and grows down. When the heap and stack collide, bad shit happens.
On the NES it is a bit different. There is no "program loader" that places different segments into memory. Your reset routine _IS_ the loader, so to speak. Typically, a NES program will contain CODE and RODATA in ROM; use all RAM as BSS (but it's not called that) and contain no traditional "DATA" segment. However, there is nothing to stop you from emitting a large chunk of "DATA" into your ROM and then copying it into RAM, where it can then be modified and treated as "DATA" like a C program executing on Unix would.
Ok, so "RODATA" is read-only data (in the context of FF) like monster stats (which don't change), "DATA" is data like character stats (which do change), "CODE" is like the battle/menu routine, and "BSS" is the ram? And these are only conventions suggested to be followed, not laws that the assembler enforces?
FinalZero wrote:
Ok. While we're on the topic of tiles and sprites, is there any way to just grab and display a 4x4 tile/sprite? or is 8x8 the smallest size that can be done?
8x8 is the smallest unit. For sprites you can make part of the 8x8 tile transparent so objects can appear to be any size you want, but the backgrounds are always composed of 8x8-pixel blocks.
Quote:
I'm wondering because if so, then smoother fonts could be created and used, rather than the typical ugly monospaced 8x8 one...
If you use CHR-RAM you can manipulate graphics to the pixel level, but because of space and speed constraints, there are severe limitations to what you can do with that.
To write text with proportional fonts you have to use CHR-RAM, reserve a number of tiles that span the maximum length of your text, and in order to write the text you'd update the pattern tables, rather than the name tables. Since there are only 256 tiles for the background, you can conclude that this will waste a lot of tiles quickly.
Under normal circumstances, those 256 tiles are not enough to make an entirely unique screen (such a screen would need 960 tiles). But if your graphics use only 2 colors (which is usually the case with text), you can use a few tricks. First, NES tiles have 4 colors, so they use 2 bits per pixel. 2-color graphics need only 1 bit per pixel, so you can actually store 2 1-bit images in 1 tile (to display one image or the other you have to use different palettes), which basically doubles the tile count to 512. The next trick is to switch to the other half of the pattern tables halfway through the rendering of the screen, which will result in 1024 1-bit tiles, enough for a unique screen with no repeated tiles.
So yeah, it's possible to write a whole screen of text with proportional fonts. In fact, there was some talk about this a while ago when someone wanted an e-book reader made for his portable NES clone. I don't think this would be practical in actual games though.
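The tile budget from the two tricks can be written out as plain arithmetic. This is only the counting from the explanation above, with constant names of my own choosing; 32x30 is the size of one name table in tiles.

```python
SCREEN_TILES = 32 * 30              # tiles needed for a fully unique screen
PATTERN_TABLE_TILES = 256           # background tiles available in one pattern table

# Trick 1: each 2-bit tile holds two 1-bit images, selected by palette.
one_bit_tiles_per_table = PATTERN_TABLE_TILES * 2

# Trick 2: switch to the other pattern table halfway down the screen.
one_bit_tiles_total = one_bit_tiles_per_table * 2

print(SCREEN_TILES, one_bit_tiles_per_table, one_bit_tiles_total)  # prints: 960 512 1024
print(one_bit_tiles_total >= SCREEN_TILES)                         # prints: True
```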
Quote:
I have played FF1, but not for the NES. I'm trying to ask how they store the data if they can't store it in one or two char banks. Also, do you mean the monster graphics themselves, or the possible combinations of monsters the player's party might meet? If the latter, why is that stored in char rom? It's not graphical.
I haven't played FF1, but the thing with CHR-ROM is that the tile combinations you can make are limited. The maximum CHR division in existing mappers is 4 1KB banks per pattern table. This means that there are 4 switchable blocks. If you wanted to combine enemies for example, you'd have to map a different enemy to each slot. But what happens if you want to combine more than 4 different enemies? You can't with blocks that size, unless you start hardcoding the combinations in a huge CHR-ROM chip, which might not even be possible depending on the number of combinations. The solution in this case is to use CHR-RAM, so that each tile can be edited freely and you can make all the combinations of graphics you want, as long as they add up to 256 tiles.
Quote:
So, what does one do if one needs data from two separate banks? Load it from one, switch to the other, and then load the rest?
Yeah, this can get complicated depending on the number of switchable and fixed banks each mapper has. This is something you have to take into consideration when designing your program.
In my current project for example (UxROM: 2 16KB slots, one fixed, one switchable), the main game engine is in the fixed bank, so that it can read level maps and things like that from all the other banks. Other things that are more or less self contained (i.e. don't need to access other banks) stay in switchable banks too (like the music engine, navigation menus/screens, etc).
Of course there are ways to access anything from anywhere by using trampoline code (i.e. call a routine in the fixed bank which will switch banks, read the data, switch back and finally return the data), but that can be very slow because of all the overhead.
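Since the trampoline idea is easier to see than to describe, here is a toy model of it. It is not NES code, just a Python simulation under the assumption of a UxROM-style layout (16KB switchable slot at $8000, last bank fixed at $C000); the class UxROM and the function read_via_trampoline are hypothetical names for this sketch.

```python
class UxROM:
    """Toy model: $8000-$BFFF shows the selected bank, $C000-$FFFF always shows the last bank."""

    BANK_SIZE = 0x4000

    def __init__(self, banks):
        self.banks = banks      # list of 16KB bytes objects
        self.current = 0        # bank currently visible at $8000

    def switch(self, bank):
        self.current = bank

    def read(self, addr):
        if 0x8000 <= addr < 0xC000:
            return self.banks[self.current][addr - 0x8000]
        if 0xC000 <= addr <= 0xFFFF:
            return self.banks[-1][addr - 0xC000]
        raise ValueError("address is outside PRG-ROM")


def read_via_trampoline(mapper, bank, addr):
    """What a routine in the fixed bank would do: save, switch, read, switch back."""
    saved = mapper.current
    mapper.switch(bank)
    value = mapper.read(addr)
    mapper.switch(saved)
    return value


# Four banks, each filled with its own bank number so reads are easy to recognise.
banks = [bytes([i]) * UxROM.BANK_SIZE for i in range(4)]
mapper = UxROM(banks)
mapper.switch(1)
value = read_via_trampoline(mapper, 2, 0x8040)
print(value, mapper.current)  # prints: 2 1
```

Every call pays for two switches plus the bookkeeping, which is where the overhead mentioned above comes from.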
I wasn't thinking of a proportional font though, but only a font that *seems* proportional, when in fact it has a fixed-width of 4. So a letter like 'i' would take up 1x2 (1 wide, 2 tall) 4x4 pixel tiles, but 'm' or 'w' would take 3x2. But this seems to be impossible since you said the smallest unit is 8x8. ... I remember that the fan translation of FF3j made some special tiles so that three skinny letters ('i' or 'l') could fit across two tiles instead of three. Of course, the problem with this is that it takes up more tiles, that one can only choose a limited amount of combinations, and its practical uses are limited to skinny letters like 'i' and 'l', or shortening specific long words. ... Was/is the SNES simply fast enough that more games than before decided to use proportional fonts?
Quote:
Under normal circumstances, those 256 tiles are not enough to make an entirely unique screen (such a screen would need 960 tiles). But if your graphics use only 2 colors (which is usually the case with text), you can use a few tricks. First, NES tiles have 4 colors, so they use 2 bits per pixel. 2-color graphics need only 1 bit per pixel, so you can actually store 2 1-bit images in 1 tile (to display one image or the other you have to use different palettes), which basically doubles the tile count to 512.
I understand how that works, but at the same time I'm not sure how that'd work. Wouldn't you have to clear either the upper or lower bits depending on whether you're reading the tile or its shadow?
Quote:
Yeah, this can get complicated depending on the number of switchable and fixed banks each mapper has. This is something you have to take into consideration when designing your program.
In my current project for example (UxROM: 2 16KB slots, one fixed, one switchable), the main game engine is in the fixed bank, so that it can read level maps and things like that from all the other banks. Other things that are more or less self contained (i.e. don't need to access other banks) stay in switchable banks too (like the music engine, navigation menus/screens, etc).
Of course there are ways to access anything from anywhere by using trampoline code (i.e. call a routine in the fixed bank which will switch banks, read the data, switch back and finally return the data), but that can be very slow because of all the overhead.
How much does bank switching slow things down?
...
If nobody minds, I'm going to try posting some code here so we can discuss what exactly I need to program next to work my way toward displaying "Hello World". Perhaps uploading the files somewhere would be a better idea?
-----
Also, is there any way to raise the number of posts that are visible per page on this board?
FinalZero wrote:
Was/is the SNES simply fast enough that more games than before decided to use proportional fonts?
It's significantly faster when it comes to updating tiles (I believe it even has DMA for that), and it also has much more VRAM.
Quote:
I understand how that works, but at the same time I'm not sure how that'd work. Wouldn't you have to clear either the upper or lower bits depending on whether you're reading the tile or its shadow?
Conveniently enough, NES tiles are stored in planes, 8 bytes for the first plane followed by 8 bytes for the second, so if you open them in a tile editor and set the format to 1-bit, what you'll see is that each NES tile becomes 2 1-bit tiles (without any conversion necessary, the difference is just how the data is interpreted). Pixels can have values ranging from 0 to 3. If a pixel is 0, it's cleared in both planes, if it's 3, it's set in both planes, and if it's 1 or 2 it's set in only one of the planes. That tells you how to configure your palettes so that one image or the other is displayed. Say you want to have black text on a white background. Color 0 would always be white, because that's the color that will be picked when both planes have clear pixels. Similarly, color 3 will always be black. As for colors 1 and 2, one will be white and the other will be black, depending on which plane you want to show. Setting one of those colors to white will effectively "hide" one of the planes, so that only the other one is visible.
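Here is a small sketch of that palette trick, written as a simulation rather than NES code. It assumes the usual layout where the first 8 bytes are the low-bit plane and bit 7 of each byte is the leftmost pixel; WHITE and BLACK are stand-in values (not real NES palette entries), and all function names are mine.

```python
WHITE, BLACK = 0, 1   # stand-ins for palette colors, chosen so they compare easily with bits

# Palettes indexed by pixel value 0..3. Colors 0 and 3 are fixed; 1 and 2 pick the visible plane.
SHOW_PLANE0 = [WHITE, BLACK, WHITE, BLACK]
SHOW_PLANE1 = [WHITE, WHITE, BLACK, BLACK]


def pixel_values(tile: bytes):
    """Decode a 16-byte tile into 8 rows of 8 pixel values (0..3)."""
    if len(tile) != 16:
        raise ValueError("an NES tile is 16 bytes")
    plane0, plane1 = tile[:8], tile[8:]
    rows = []
    for lo, hi in zip(plane0, plane1):
        row = []
        for bit in range(7, -1, -1):          # bit 7 is the leftmost pixel
            row.append(((lo >> bit) & 1) | (((hi >> bit) & 1) << 1))
        rows.append(row)
    return rows


def render(tile: bytes, palette):
    """Apply a 4-entry palette to the decoded tile."""
    return [[palette[v] for v in row] for row in pixel_values(tile)]


def plane_bitmap(plane: bytes):
    """Interpret 8 bytes as a plain 1-bit 8x8 image."""
    return [[(byte >> bit) & 1 for bit in range(7, -1, -1)] for byte in plane]


# Arbitrary test tile: two unrelated 1-bit images packed into one 2-bit tile.
tile = bytes([0x18, 0x3C, 0x66, 0x7E, 0x66, 0x66, 0x66, 0x00,
              0xFF, 0x00, 0xAA, 0x55, 0x0F, 0xF0, 0x81, 0x7E])
print(render(tile, SHOW_PLANE0) == plane_bitmap(tile[:8]))   # prints: True
print(render(tile, SHOW_PLANE1) == plane_bitmap(tile[8:]))   # prints: True
```

So nothing has to be cleared in the tile data; choosing which of colors 1 and 2 is white does the masking.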
Quote:
How much does bank switching slow things down?
It depends. You can read a single byte from PRG-ROM in 4 cycles with LDA $8040, for example. If you wanted to load a byte from an arbitrary bank, you'd have to switch that bank in first, which would take at least 8 cycles in an UxROM game, so it would take you a total of 12 cycles, 3 times more, to read a single byte. Depending on the mapper, switching a new bank might take more time, and if you want to do things like remember the previous bank and switch it back after reading the data, the overhead will be huge (of course, the overhead will not be as significant if you are reading 200 bytes, as opposed to only one).
So trust me, reading lots of data from banks all across the ROM is not something you want to do in the middle of your game loop, because there will be a huge impact on the speed. Try to organize your code and data in a way that minimizes bankswitching as much as possible. To give you an idea, it would be acceptable to bankswitch a couple dozen times every frame (i.e. each iteration of your game logic), but if you are doing it hundreds of times per frame, there's probably something wrong.
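As a back-of-the-envelope version of the numbers above: the sketch below takes the 4-cycle read and the 8-cycle UxROM switch as given and compares a switched read against a plain one. It is a simplification that charges every byte 4 cycles and ignores loop overhead; the function name relative_cost is made up.

```python
READ_CYCLES = 4      # LDA absolute, as quoted above
SWITCH_CYCLES = 8    # minimum UxROM bank switch, as quoted above


def relative_cost(num_bytes: int, switches: int = 1) -> float:
    """Cycles with bank switching divided by cycles without it."""
    plain = READ_CYCLES * num_bytes
    return (SWITCH_CYCLES * switches + plain) / plain


print(relative_cost(1))              # prints: 3.0
print(relative_cost(1, switches=2))  # prints: 5.0 (switch in, read, switch back)
print(round(relative_cost(200), 2))  # prints: 1.01
```

One byte per switch triples the cost, while 200 bytes per switch adds about 1%, which is why batching reads per bank matters more than the switch itself.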
Quote:
If nobody minds, I'm going to try posting some code here so we can discuss what exactly I need to program next to work my way toward displaying "Hello World". Perhaps uploading the files somewhere would be a better idea?
People often post code on sites like this. If you have a number of source files though, it would probably be better to upload a ZIP file somewhere.
Quote:
Also, is there any way to raise the number of posts that are visible per page on this board?
I never actually looked for an option to do this, but I don't think this is possible.
FinalZero wrote:
Ok. While we're on the topic of tiles and sprites, is there any way to just grab and display a 4x4 tile/sprite? or is 8x8 the smallest size that can be done? I'm wondering because if so, then smoother fonts could be created and used, rather than the typical ugly monospaced 8x8 one...
It's possible to render proportional text to CHR RAM. A box for three lines of a character's dialogue might take up 60 of your background tiles (or 30 if you use mid-frame palette hackery), but then those are tiles you don't have to use for 0-9A-Za-z. You'll need it anyway if you plan to produce a version of your game containing Chinese, Arabic, or other complex scripts. Faxanadu, for example, reserves 64 tiles for its text boxes so that the Japanese version can display kanji.
Quote:
I'm trying to ask how they store the data if they can't store it in one or two char banks. Also, do you mean the monster graphics themselves, or the possible combinations of monsters the player's party might meet?
The latter. With CHR ROM, one can't arbitrarily display from different parts of CHR ROM; one is limited by the number of pages that can be selected at once. But with CHR RAM, the program can copy any combination of monster graphics into memory at once.
Quote:
So, what does one do if one needs data from two separate banks? Load it from one, switch to the other, and then load the rest?
If you're referring to PRG data, then for the most part, yes. But with quite a few mappers, when you switch to a bank with data, you also switch right under your own feet because the program itself is in a switchable bank. So there are pieces of the program that have to be at the same place in all banks. This is the "trampoline" that tokumaru was talking about, which lets subroutines in different banks call each other.
And as for rapid fire bankswitching, there are exceptions to nearly every rule. Cosmic Epsilon, for example, abuses the PPU, MMC, and CHR ROM to act together as a makeshift texture mapping unit, and it switches on every scanline. But in more typical use, MMC1 is the slowest to switch banks, taking 9 instructions; it's far faster on most other mappers (2 on most discretes; 3 on MMC3).
Quote:
Ok, so "RODATA" is read-only data (in the context of FF) like monster stats (which don't change), "DATA" is data like character stats (which do change), "CODE" is like the battle/menu routine, and "BSS" is the ram? And these are only conventions suggested to be followed, not laws that the assembler enforces?
DATA is anything copied into RAM at reset time. You shouldn't have much need for this, as things would usually get copied at the start of a game, the start of a level, etc.
tepples wrote:
Quote:
Ok, so "RODATA" is read-only data (in the context of FF) like monster stats (which change), "DATA" is data like character stats (which don't change), "CODE" is like the battle/menu routine, and "BSS" is the ram? And these are only conventions suggested to be followed, not laws that the assembler enforces?
DATA is anything copied into RAM at reset time. You shouldn't have much need for this, as things would usually get copied at the start of a game, the start of a level, etc.
However, it should be noted that some cc65 linker configs out there use DATA as the BSS segment name, as cc65 doesn't enforce any of the segment parameters.
CODE = read-only code
RODATA = read-only data, goes into ROM, so you can't modify it
BSS = uninitialized data, this is RAM that you don't care what the initialization value is (but it's usually cleared to 0 on reset)
DATA = initialized data (in RAM), the initialization values are copied from ROM to RAM on reset, however you have to do that YOURSELF using symbols that the linker supplies
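As a rough picture of what "do that YOURSELF" amounts to, here is a C simulation of the reset-time work. The arrays and offsets are made up for illustration; in a real ld65 project the load address, run address, and size would come from linker-defined symbols (as far as I know these are named `__DATA_LOAD__`, `__DATA_RUN__`, and `__DATA_SIZE__` when the segment has `define = yes`, but check your own linker config).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the DATA segment's initial values stored in ROM. */
static const uint8_t data_load[4] = { 3, 0, 99, 5 };

static uint8_t ram[16];            /* pretend work RAM */

#define DATA_RUN_OFFSET 0          /* assumed place where DATA lives in RAM */
#define DATA_SIZE (sizeof data_load)
#define BSS_RUN_OFFSET 4           /* assumed place where BSS lives in RAM */
#define BSS_SIZE 12

/* What the reset code has to do before the rest of the program runs. */
void reset_init(void) {
    assert(DATA_SIZE + BSS_SIZE <= sizeof ram);
    memset(ram + BSS_RUN_OFFSET, 0, BSS_SIZE);            /* BSS: just clear it */
    memcpy(ram + DATA_RUN_OFFSET, data_load, DATA_SIZE);  /* DATA: ROM image to RAM */
}
```

After `reset_init()`, the first four bytes of `ram` hold the initial values from ROM and the rest is zero, no matter what was in RAM at power-on.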
Quote:
Conveniently enough, NES tiles are stored in planes, 8 bytes for the first plane followed by 8 bytes for the second, so if you open them in a tile editor and set the format to 1-bit, what you'll see is that each NES tile becomes 2 1-bit tiles (without any conversion necessary, the difference is just how the data is interpreted). Pixels can have values ranging from 0 to 3. If a pixel is 0, it's cleared in both planes, if it's 3, it's set in both planes, and if it's 1 or 2 it's set in only one of the planes. That tells you how to configure your palettes so that one image or the other is displayed. Say you want to have black text on white background. Color 0 would always be white, because that's the color that will be picked when both planes have clear pixels. Similarly, color 3 will always be black. As for colors 1 and 2, one will be white and the other will be black, depending on which plane you want to show. Setting one of those colors to white will effectively "hide" one of the planes, so that only the other one is visible.
I understand it now. Wouldn't a downside be, though, that one needs to devote 2 palettes to displaying text if one needs tiles from both sets?
Quote:
It depends. You can read a single byte from PRG-ROM in 4 cycles with LDA $8040, for example. If you wanted to load a byte from an arbitrary bank, you'd have to switch that bank in first, which would take at least 8 cycles in an UxROM game, so it would take you a total of 12 cycles, 3 times more, to read a single byte. Depending on the mapper, switching a new bank might take more time, and if you want to do things like remember the previous bank and switch it back after reading the data, the overhead will be huge (of course, the overhead will not be as significant if you are reading 200 bytes, as opposed to only one).
So trust me, reading lots of data from banks all across the ROM is not something you want to do in the middle of your game loop, because there will be a huge impact on the speed. Try to organize your code and data in a way that minimizes bankswitching as much as possible. To give you an idea, it would be acceptable to bankswitch a couple dozen times every frame (i.e. each iteration of your game logic), but if you are doing it hundreds of times per frame, there's probably something wrong.
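The arithmetic in that post can be written out as a small C sketch. The 4-cycle read and the 8-cycle minimum switch are the figures quoted above; the function names are made up, and the model ignores loop and indexing overhead, so treat the results as a lower bound.

```c
#include <assert.h>

#define READ_CYCLES   4   /* LDA absolute, figure from the post above */
#define SWITCH_CYCLES 8   /* minimum UxROM bank switch, figure from the post above */

/* Total cycles to switch banks once and then read `count` bytes. */
unsigned long banked_read_cycles(unsigned long count) {
    assert(count > 0);
    return SWITCH_CYCLES + READ_CYCLES * count;
}

/* Cost of the switch as a percentage of the plain read cost (rounded down). */
unsigned long switch_overhead_percent(unsigned long count) {
    assert(count > 0);
    return (100UL * SWITCH_CYCLES) / (READ_CYCLES * count);
}
```

For one byte this gives 12 cycles and 200% overhead (three times the plain read); for 200 bytes it gives 808 cycles and about 1% overhead, which is why bulk reads barely notice the switch.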
Ah, ok.
Quote:
People often post code in sites like this. If you have a number of source files though, it would probably be better to upload a ZIP file somewhere.
I'll probably load it onto my webpage.
http://jc.tech-galaxy.com/ That way I can best control how long it's available for. I'll get around to uploading the files eventually.
Quote:
It's possible to render proportional text to CHR RAM. A box for three lines of a character's dialogue might take up 60 of your background tiles (or 30 if you use mid-frame palette hackery), but then those are tiles you don't have to use for 0-9A-Za-z. You'll need it anyway if you plan to produce a version of your game containing Chinese, Arabic, or other complex scripts. Faxanadu, for example, reserves 64 tiles for its text boxes so that the Japanese version can display kanji.
What do you mean by "mid-frame palette hackery"?
Quote:
DATA is anything copied into RAM at reset time. You shouldn't have much need for this, as things would usually get copied at the start of a game, the start of a level, etc.
What would an example of such a thing be?
FinalZero wrote:
Quote:
Color 0 would always be white, because that's the color that will be picked when both planes have clear pixels. Similarly, color 3 will always be black. As for colors 1 and 2, one will be white and the other will be black, depending on which plane you want to show. Setting one of those colors to white will effectively "hide" one of the planes, so that only the other one is visible.
I understand it now. Wouldn't a downside be, though, that one needs to devote 2 palettes to displaying text if one needs tiles from both sets?
Using two palettes is excusable if you use mid-frame palette rewrites.
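For anyone who wants to see the two-plane trick spelled out, here is a minimal C sketch. `tile_pixel` and `one_plane_palette` are made-up names; the layout (8 bytes of plane 0 followed by 8 bytes of plane 1, bit 7 being the leftmost pixel) follows the description quoted above, and $30 and $0F are the NES palette values usually used for white and black.

```c
#include <assert.h>
#include <stdint.h>

/* Pixel (x, y) of a 16-byte tile: bytes 0-7 are plane 0, bytes 8-15 are plane 1. */
uint8_t tile_pixel(const uint8_t tile[16], int x, int y) {
    assert(x >= 0 && x < 8 && y >= 0 && y < 8);
    int shift = 7 - x;                        /* bit 7 is the leftmost pixel */
    uint8_t lo = (tile[y] >> shift) & 1;      /* plane 0 bit */
    uint8_t hi = (tile[y + 8] >> shift) & 1;  /* plane 1 bit */
    return (uint8_t)((hi << 1) | lo);         /* color 0 to 3 */
}

enum { WHITE = 0x30, BLACK = 0x0F };

/* Fill a 4-color palette so that only the chosen plane shows as black on white. */
void one_plane_palette(int plane, uint8_t palette[4]) {
    assert(plane == 0 || plane == 1);
    palette[0] = WHITE;                       /* both planes clear */
    palette[1] = plane == 0 ? BLACK : WHITE;  /* only plane 0 set */
    palette[2] = plane == 1 ? BLACK : WHITE;  /* only plane 1 set */
    palette[3] = BLACK;                       /* both planes set */
}
```

One palette built with `plane = 0` and another with `plane = 1` are the two palettes being discussed: the same tile shows a different glyph depending on which of them the attribute table selects.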
Quote:
Quote:
It's possible to render proportional text to CHR RAM. A box for three lines of a character's dialogue might take up 60 of your background tiles (or 30 if you use mid-frame palette hackery)
What do you mean by "mid-frame palette hackery"?
The palette (PPU $3F00-$3F1F) can be modified whenever rendering is turned off (forced blank or vertical blank). If you do this during draw time (not horizontal or vertical blanking), you get rainbow artifacts in the visible area. Turning on monochrome in PPUMASK while rewriting the palette, as seen in the "Sayoonara" demo by Chris Covell, helps minimize their visibility. But if you just want to display subtitle text in a bottom status bar, something simple like loading eight colors' worth of black-and-white palettes can be done in a couple lines' horizontal blanking.
Quote:
Quote:
DATA is anything copied into RAM at reset time. You shouldn't have much need for this, as things would usually get copied at the start of a game, the start of a level, etc.
What would an example of such a thing be?
How well do you know C? It can sometimes be easier to explain ld65 concepts in terms of C, as that's what ld65 was designed around. Segment DATA contains what C would call the values of initialized statically allocated variables, be they global variables or local variables with 'static' storage. But some coding standards discourage initialized declarations, instead preferring to initialize the variables explicitly from 'const' data at runtime. This 'const' data goes in RODATA.
Quote:
How well do you know C? It can sometimes be easier to explain ld65 concepts in terms of C, as that's what ld65 was designed around. Segment DATA contains what C would call the values of initialized statically allocated variables, be they global variables or local variables with 'static' storage. But some coding standards discourage initialized declarations, instead preferring to initialize the variables explicitly from 'const' data at runtime. This 'const' data goes in RODATA.
I know about static and const, but I don't understand what that GNU page is saying one shouldn't write. Is it saying don't write something like:
Code:
int f() {
    static int x = 3;
    ...
    x += whatever;
}
but instead:
Code:
int f() {
    static int x;
    x = 3;
    ...
    x += whatever;
}
(Yes, I know that these functions don't do the same thing. This only underlines that I don't understand what the GNU page is trying to say.)
The GNU page is supposed to say, as I interpret it:
- Don't use static variables inside functions.
- Initialize global variables at the start of main(), so that if you need to initialize them more than once, you can easily refactor that out. This covers the case of, say, returning to the title screen.
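A minimal sketch of that second point, with made-up names (`init_game_state`, `lose_life`, and the starting values are only for illustration). The `const` values end up in read-only data, the plain globals need no initial image in ROM, and the same init routine can be called again when returning to the title screen.

```c
#include <assert.h>

/* Starting values live in read-only data (RODATA when built with cc65). */
static const int START_LIVES = 3;
static const int START_SCORE = 0;

/* Uninitialized globals land in BSS, so nothing is needed in DATA. */
static int lives;
static int score;

/* Hypothetical helper: call at the start of main() and on every return to the title screen. */
void init_game_state(void) {
    lives = START_LIVES;
    score = START_SCORE;
}

void lose_life(void) {
    assert(lives > 0);
    lives -= 1;
}
```

With `static int lives = 3;` instead, the value 3 would be set only once at reset, and a second playthrough would start with whatever was left over from the first.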
tepples wrote:
But if you just want to display subtitle text in a bottom status bar, something simple like loading eight colors' worth of black-and-white palettes can be done in a couple lines' horizontal blanking.
You'd actually have to change only the 2 middle colors of the palette, so it might be possible to avoid color artifacts altogether if you do the writes during HBlank.
Quote:
Don't use static variables inside functions.
Initialize global variables at the start of main(), so that if you need to initialize them more than once, you can easily refactor that out. This covers the case of, say, returning to the title screen.
Ah, that makes sense now.
-----
Here's a couple more questions:
1) I know that sprites can overlap something in the background, but can they underlap it?
2) Were there any mappers that added capabilities for the PPU to use more than just 4 palettes for the background and 4 palettes for sprites? or perhaps add more than 4 colors per palette? Could a mapper even do such a thing?
1) Yes, they can, if you set bit 5 of sprite attribute.
76543210
||||||||
||||||++- Palette (4 to 7) of sprite
|||+++--- Unimplemented
||+------ Priority (0: in front of background; 1: behind background)
|+------- Flip sprite horizontally
+-------- Flip sprite vertically
(from http://wiki.nesdev.com/w/index.php/OAM)
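In C terms, the bit layout above could be wrapped like this. The macro and function names are my own invention, not from any official header; only the bit positions come from the diagram.

```c
#include <assert.h>
#include <stdint.h>

#define SPR_PALETTE_MASK 0x03  /* bits 0-1: palette (4 to 7) of sprite */
#define SPR_BEHIND_BG    0x20  /* bit 5: 1 = behind background */
#define SPR_FLIP_H       0x40  /* bit 6: flip sprite horizontally */
#define SPR_FLIP_V       0x80  /* bit 7: flip sprite vertically */

/* Build an OAM attribute byte; bits 2-4 are unimplemented and stay 0. */
uint8_t make_sprite_attr(int palette, int behind_bg, int flip_h, int flip_v) {
    assert(palette >= 4 && palette <= 7);
    uint8_t attr = (uint8_t)((palette - 4) & SPR_PALETTE_MASK);
    if (behind_bg) attr |= SPR_BEHIND_BG;
    if (flip_h)    attr |= SPR_FLIP_H;
    if (flip_v)    attr |= SPR_FLIP_V;
    return attr;
}

/* Nonzero if the sprite is drawn behind the background. */
int sprite_is_behind_bg(uint8_t attr) {
    return (attr & SPR_BEHIND_BG) != 0;
}
```

So a sprite using palette 5 that should "underlap" the background gets the attribute byte $21.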
2) Were there any mappers that added capabilities for the PPU to use more than just 4 palettes for the background and 4 palettes for sprites? or perhaps add more than 4 colors per palette? Could a mapper even do such a thing?
I think there's no such mapper. I'm not sure but I don't think it could be done.
Quote:
1) Yes, they can, if you set bit 5 of sprite attribute.
Ah, I see. Thank you.
I asked because I had this in mind:
Notice how the lower part of the sprite is behind the wall, but the upper part is in front of it. For a second I thought they were using a trick like removing the lower part of the sprite (since it's actually composed of 4 sprites, 2 upper and 2 lower) whenever a sprite's location is on a certain tile, but notice how the lower part of the sprite that's behind the transparent pixels of the wall is still visible.
As a side note, while looking at the GBC's specs, I was surprised to find out that it supports 8 palettes each for both the background and sprites (hence my other questions), and that it also has a faster processor (4 times as fast, iirc). I knew it came 10 years after the NES, but I didn't think it was so fast, because it was just a handheld...
The NES supports two kinds of sprite masking.
There's the simple masking where the sprite goes behind everything in the background that's not color #0. Lots of games use this, including Super Mario Bros when you go through a pipe.
Then there's another masking mode where you put a sprite behind the background, and then subsequent sprites in the same place are not drawn. Super Mario Bros. 3 uses this 'mode' when you hit a question block and a mushroom comes out of it: it puts a dummy sprite behind the question block. The sprites of the mushroom are covered up by the pixels of the dummy sprite. Time Lord also uses this trick.
The masking sprite must be set to be behind the background, and must be earlier in the sprite table than the sprite which gets masked.
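Both masking modes fall out of one per-pixel rule, which can be sketched in C like this. It is a simplified model with made-up names (`SpritePixel`, `resolve_pixel`), not a description of the actual PPU circuitry: the first sprite in OAM order with an opaque pixel wins among the sprites, and only then is its priority bit compared against the background.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int color;      /* 0 = transparent at this pixel */
    int behind_bg;  /* priority bit from the attribute byte */
} SpritePixel;

enum { SHOW_BG = -1 };

/* Returns the index of the sprite whose pixel is shown, or SHOW_BG.
   `sprites` is in OAM order, earliest entry first. */
int resolve_pixel(int bg_color, const SpritePixel *sprites, size_t count) {
    assert(sprites != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (sprites[i].color == 0) continue;  /* transparent: try the next sprite */
        /* The first opaque sprite is the only candidate, even if it
           then loses to an opaque background pixel. */
        if (sprites[i].behind_bg && bg_color != 0) return SHOW_BG;
        return (int)i;
    }
    return SHOW_BG;  /* no opaque sprite here */
}
```

With a behind-background dummy sprite at index 0 and a front-priority mushroom at index 1, an opaque background pixel gives `SHOW_BG`: the mushroom never gets a chance, which is the second masking mode.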
For example, here's Time Lord, and the sprites it draws to pull off a masking effect:
And here's what the masking effect ends up looking like: (not quite the same screenshot, but close enough)
Also, for that GBC screenshot, they are using the feature where every background palette has its own background color (color #0). If you look closely, the background area of that block is all the same color. GBC doesn't do sprite masking the same way the NES does; it only supports the simple color #0 mode.
The GBA can simulate the NES-style overlapped sprite masking with the GBA's transparency/blending feature, but that's only useful for making NES emulators on the GBA.
FinalZero wrote:
In addition to the masking modes that Dwedit mentioned, the GBC supports background priority per tile, like the Sega Master System, Game Gear, Genesis, and Super NES. Each background has a second plane of tile attributes, and one bit of these attributes is "top priority". A tile can be placed in front of all sprites, even those with the priority bit off. It's often used for text boxes, or for the first tile of a wall. The NES does not have this; it has to use either A. plain backgrounds surrounding an object (SMB1) or B. the triple-overlap effect (SMB3) to simulate this.
Quote:
As a side note, while looking at the GBC's specs, I was surprised to find out that it supports 8 palettes each for both the background and sprites (hence my other questions), and that it also has a faster processor (4 times as fast, iirc).
Please be careful not to fall for the megahertz myth, especially when comparing different CPUs with different instruction sets and different microarchitectures. As I understand it, the consensus from the C=64/Speccy wars is that an 8080-family processor such as that in a ZX Spectrum or Game Boy Color typically needs to be clocked twice as fast as an equivalent 6502. So one can treat the original GB's CPU as roughly the same speed as that of the NES, and the GBC CPU is only twice as fast.
Quote:
There's the simple masking where the sprite goes behind everything in the background that's not color #0. Lots of games use this, including Super Mario Bros when you go through a pipe.
So, this is the only way that the GBC can do it, right? (Other than what Tepples said.)
Quote:
Then there's another masking mode where you put a sprite behind the background, and then subsequent sprites in the same place are not drawn. Super Mario Bros. 3 uses this 'mode' when you hit a question block and a mushroom comes out of it: it puts a dummy sprite behind the question block. The sprites of the mushroom are covered up by the pixels of the dummy sprite. Time Lord also uses this trick.
The masking sprite must be set to be behind the background, and must be earlier in the sprite table than the sprite which gets masked.
So, this is done to stop the mushroom from appearing on top of the block it comes out of, but to let it smoothly flow out of it at the same time, right?
Quote:
For example, here's Time Lord, and the sprites it draws to pull off a masking effect:
I think I understand.
Quote:
Also, for that GBC screenshot, they are using the feature where every background palette has its own background color (color #0). If you look closely, the background area of that block is all the same color.
Indeed, I see it.
Quote:
Also, for that GBC screenshot, they are using the feature where every background palette has its own background color (color #0).
And that's the first type of masking you described for the NES, right?
Quote:
In addition to the masking modes that Dwedit mentioned, the GBC supports background priority per tile, like the Sega Master System, Game Gear, Genesis, and Super NES. Each background has a second plane of tile attributes, and one bit of these attributes is "top priority". A tile can be placed in front of all sprites, even those with the priority bit off. It's often used for text boxes, or for the first tile of a wall.
Can one put background tiles on top of other background tiles? That way, one could easily create text boxes by just laying a new layer of "top"-mode tiles into the background, and then remove them when the text box is finished, thus not needing to recalculate the tiles that are where the text box was.
Quote:
The NES does not have this; it has to use either A. plain backgrounds surrounding an object (SMB1) or B. the triple-overlap effect (SMB3) to simulate this.
I don't understand the two things you're describing here.
Quote:
Please be careful not to fall for the megahertz myth, especially when comparing different CPUs with different instruction sets and different microarchitectures. As I understand it, the consensus from the C=64/Speccy wars is that an 8080-family processor such as that in a ZX Spectrum or Game Boy Color typically needs to be clocked twice as fast as an equivalent 6502. So one can treat the original GB's CPU as roughly the same speed as that of the NES, and the GBC CPU is only twice as fast.
I should've been more wary. Still, I didn't expect the GBC to be faster than the NES at all. I guess mentally I've equated the (graphics) capabilities of the GBC to those of the NES, and likewise for the GBA to the SNES, and the DS to the N64.
Unlike the GBC PPU, the NES PPU has only one color #0. There is space for 28 colors in CGRAM, which would in theory allow a separate color #0 for each separate background palette, but the PPU only ever uses the first background palette's color #0 when rendering is on.
FinalZero wrote:
Quote:
In addition to the masking modes that Dwedit mentioned, the GBC supports background priority per tile.
Can one put background tiles on top of other background tiles?
This is possible only in a very limited way. On the GBC, one would use the "window", and on the NES, one would use mid-screen scrolling.
Quote:
That way, one could easily create text boxes by just laying a new layer of "top"-mode tiles into the background, and then remove them when the text box is finished, thus not needing to recalculate the tiles that are where the text box was.
Is it that hard to recalculate? You recalculate anyway when scrolling a background, unless you're doing a Zelda- or Battle Kid-style flip screen. And the GBC has a lot more work RAM than any licensed NES board, and more than any Famicom board short of SXROM, so you could probably just read the tiles out of VRAM, display your text box, and put them back.
Quote:
Quote:
The NES does not have this; it has to use either A. plain backgrounds surrounding an object (SMB1) or B. the triple-overlap effect (SMB3) to simulate this.
I don't understand the two things you're describing here.
In the first Super Mario Bros. game for the NES, power-ups sprouting from blocks were drawn with back priority sprites. The tiles above a ? block were always solid color 0, and no water levels ever had ? blocks. Mario going into a pipe was likewise drawn with back priority sprites; notice how he disappears immediately when walking into the pipe at the end of 2-2, 7-2, and the water section of 8-4.
Super Mario Bros. 3 for NES couldn't always use this trick because powerups had to sprout behind the ? block but in front of occasionally more detailed background pieces, such as the dots in the background of a cave or the water in the back of a water level. So it used a back priority sprite appearing in OAM before the powerup to force background pixels to be drawn in that position.
Quote:
I guess mentally I've equated the (graphics) capabilities of the GBC to those of the NES, and likewise for the GBA to the SNES
GBA with its 16.8 MHz ARM7TDMI CPU is comparable to a Super NES + Super FX. Compare Doom and Yoshi's Island on both platforms.
Quote:
Unlike the GBC PPU, the NES PPU has only one color #0. There is space for 28 colors in CGRAM,
I thought there were only 4 palettes each for the background and the sprites?
Quote:
which would in theory allow a separate color #0 for each separate background palette, but the PPU only ever uses the first background palette's color #0 when rendering is on.
Hmm, I didn't know that.
Quote:
Is it that hard to recalculate? You recalculate anyway when scrolling a background, unless you're doing a Zelda- or Battle Kid-style flip screen.
I suppose not.
Quote:
And the GBC has a lot more work RAM than any licensed NES board, and more than any Famicom board short of SXROM, so you could probably just read the tiles out of VRAM, display your text box, and put them back.
One would still do the same thing for the NES though, no?
Quote:
In the first Super Mario Bros. game for the NES, power-ups sprouting from blocks were drawn with back priority sprites. The tiles above a ? block were always solid color 0, and no water levels ever had ? blocks. Mario going into a pipe was likewise drawn with back priority sprites; notice how he disappears immediately when walking into the pipe at the end of 2-2, 7-2, and the water section of 8-4.
So, I played until the end of 2-2, and now I see what you were trying to say.
Quote:
Super Mario Bros. 3 for NES couldn't always use this trick because powerups had to sprout behind the ? block but in front of occasionally more detailed background pieces, such as the dots in the background of a cave or the water in the back of a water level. So it used a back priority sprite appearing in OAM before the powerup to force background pixels to be drawn in that position.
And I played until I found a block which didn't have a null background above it, so I can tell the difference now.
Btw, in case anyone was wondering, the screenshot posted on the last page was from Dragon Warrior III, the remake for the GBC. It's interesting to compare it with the NES version, because the graphics were so vastly improved, yet I can't see any reason why the NES version couldn't have done most of the graphical techniques that the GBC version used. On that note, how many pattern tables can the GBC store? I assume at least 4, since it can hold twice as many palettes also?
FinalZero wrote:
I thought there were only 4 palettes each for the background and the sprites?
Yes, the NES has 4 background palettes and 4 sprite palettes. In theory, each palette has 4 colors, but there are some weird considerations about the first color (color 0) of each palette.
For the background, no matter what palette you use, if a pixel uses color 0 it gets drawn with color 0 of the first palette, even though the PPU has all 4 colors (1 for each palette) internally. There is a way to display those colors, though, which is to disable rendering and make $2006 point to them. This is hardly useful, though.
For sprite palettes, their first colors don't even exist, they are mirrors of the first colors in the background palettes. This means that the PPU has 4 * 4 (background) + 4 * 3 (sprites) = 28 colors internally, but because of the way background rendering works you only get to see 25 of them.
Quote:
One would still do the same thing for the NES though, no?
It depends. Personally, I'd rather decode the data from the map again than waste hundreds of bytes just to remember tiles, especially if working only with the built-in 2KB of RAM.
Quote:
On that note, how many pattern tables can the GBC store? I assume at least 4, since it can hold twice as many palettes also?
I'm not sure how much VRAM the GBC has for tiles, but I wouldn't make any assumptions based on the palette count, since there is no direct relation between them. You might even be right, but I doubt the palette count would have anything to do with this.
FinalZero wrote:
Quote:
Unlike the GBC PPU, the NES PPU has only one color #0. There is space for 28 colors in CGRAM,
I thought there were only 4 palettes each for the background and the sprites?
There are 28 colors in these 4 + 4 palettes:
Background palette 0: 0, 1, 2, 3
Background palette 1: 4, 5, 6, 7
Background palette 2: 8, 9, 10, 11
Background palette 3: 12, 13, 14, 15
Sprite palette 0: 17, 18, 19
Sprite palette 1: 21, 22, 23
Sprite palette 2: 25, 26, 27
Sprite palette 3: 29, 30, 31
Colors 16, 20, 24, and 28 do not have distinct memory cells in the PPU. (They're mirrors of 0, 4, 8, and 12 respectively.) An oversight in the PPU design causes colors 4, 8, and 12 to be replaced with color 0 when rendering is turned on.
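The counting can be checked with a small C sketch. The function names are made up; the mirroring rule (16, 20, 24, 28 share cells with 0, 4, 8, 12) and the "color 0 replaces 4, 8, and 12 while rendering" rule are taken from the list above.

```c
#include <assert.h>

/* Map a palette index 0-31 to the memory cell that actually stores it. */
int palette_cell(int index) {
    assert(index >= 0 && index < 32);
    if (index >= 16 && (index & 3) == 0) return index - 16;  /* 16,20,24,28 mirror 0,4,8,12 */
    return index;
}

/* Cell displayed for a background index while rendering is turned on. */
int rendered_bg_cell(int index) {
    assert(index >= 0 && index < 16);
    return (index & 3) == 0 ? 0 : index;  /* 4, 8, and 12 are replaced by 0 */
}

/* Number of distinct memory cells behind the 32 indices. */
int count_distinct_cells(void) {
    int seen[32] = {0};
    int count = 0;
    for (int i = 0; i < 32; i++) {
        int cell = palette_cell(i);
        if (!seen[cell]) { seen[cell] = 1; count++; }
    }
    return count;
}

/* Number of cells that can actually appear on screen while rendering. */
int count_visible_cells(void) {
    int seen[32] = {0};
    int count = 0;
    for (int i = 0; i < 32; i++) {
        if (i >= 16 && (i & 3) == 0) continue;  /* sprite color 0 is transparent */
        int cell = (i < 16) ? rendered_bg_cell(i) : palette_cell(i);
        if (!seen[cell]) { seen[cell] = 1; count++; }
    }
    return count;
}
```

This gives 28 stored colors and 25 visible ones: 13 for the background (the shared color 0 plus 3 per palette) and 12 for sprites.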
Quote:
Quote:
And the GBC has a lot more work RAM than any licensed NES board, and more than any Famicom board short of SXROM, so you could probably just read the tiles out of VRAM, display your text box, and put them back.
One would still do the same thing for the NES though, no?
On the NES, it's a bit harder because reading from VRAM is unreliable if sample playback is turned on. If a sample fetch happens on a certain clock cycle of the readback (LDA $2007), the CPU sends two read requests to the PPU as the DMA unit grabs and releases control of the address bus, and it misses one of the results. This also affects reading the controller ($4016 and $4017), but there are well-known ways to work around that, such as reading the controller twice and using the previous frame's keypresses if the read key states don't match. GBC doesn't have this problem because it has no DMA-based sample playback channel; instead, its triangle channel's waveform is rewritable like the waveforms on the FDS or TG16.
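The controller workaround mentioned above can be sketched in C roughly as follows. `read_pad_safely` and the `pad_reader` callback are hypothetical names, and `fake_pad` is only a test double that replays two canned reads; on real hardware the reader would strobe and shift in the bits from $4016.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t (*pad_reader)(void);  /* hypothetical: returns the 8 button bits */

/* Read twice; trust the result only if both reads agree,
   otherwise fall back to the previous frame's keypresses. */
uint8_t read_pad_safely(pad_reader read_pad, uint8_t previous) {
    assert(read_pad != 0);
    uint8_t first = read_pad();
    uint8_t second = read_pad();
    return (first == second) ? first : previous;
}

/* Test double: replays two canned reads in turn. */
static uint8_t fake_reads[2];
static int fake_pos;

uint8_t fake_pad(void) {
    return fake_reads[fake_pos++ & 1];
}
```

If a sample fetch corrupts one of the two reads, they disagree and the game simply keeps last frame's buttons for one more frame, which players will not notice.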
Quote:
Btw, in case anyone was wondering, the screenshot posted on the last page was from Dragon Warrior III, the remake for the GBC. It's interesting to compare it with the NES version, because the graphics were so vastly improved, yet I can't see any reason why the NES version couldn't have done most of the graphical techniques that the GBC version used.
NES has no background tile flipping, unlike GBC. Tile flipping allows for certain CHR optimizations on trees, walls, grass, etc. GBC also has MMC5-style palette per tile instead of per block of 2x2 tiles. NES has less capacity for sprite overdraw: 8 sprites per line (25% overdraw on a 256px wide screen) vs. GB/GBC 10 sprites per line (50% overdraw on a 160px wide screen), and possibly for this reason, characters using Mega Man-style overlays for extra color appear to be more common on GBC than on NES because they're less likely to cause dropouts and flicker. Oh, and there are thousands of usable colors on the GBC (like on the Game Gear), unlike the NES where 52 colors in an HSV arrangement plus a screen-wide tint control are all you get.
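The overdraw percentages come from simple arithmetic, sketched here in C. `overdraw_percent` is a made-up helper, and it assumes every sprite is 8 pixels wide, which holds for both systems.

```c
#include <assert.h>

/* Percentage of one scanline that the per-line sprite limit can cover,
   assuming 8-pixel-wide sprites placed side by side. */
int overdraw_percent(int sprites_per_line, int screen_width) {
    assert(sprites_per_line > 0 && screen_width > 0);
    return (sprites_per_line * 8 * 100) / screen_width;
}
```

The NES case is 8 sprites * 8 pixels = 64 of 256 pixels, or 25%; the GB/GBC case is 10 * 8 = 80 of 160 pixels, or 50%.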
Quote:
On that note, how many pattern tables can the GBC store? I assume at least 4, since it can hold twice as many palettes also?
The Game Boy has 6 KiB of VRAM devoted to one and a half pattern tables: 128 tiles just for sprites, 128 tiles to share between sprites and backgrounds, and 128 tiles just for backgrounds. The remaining 2 KiB of VRAM is used by two nametables in a single-screen mirroring configuration.
The Game Boy Color has two sets of one and a half pattern tables, and one bit of the attribute selects whether a tile or sprite uses the first or second table.
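The tile counts follow from the tile size. A quick C sketch, with a made-up helper name; it relies on an 8x8 tile at 2 bits per pixel taking 16 bytes.

```c
#include <assert.h>

#define BYTES_PER_TILE 16  /* 8x8 pixels at 2 bits per pixel */

/* Number of tiles that fit in `kib` KiB of pattern memory. */
int tiles_in_kib(int kib) {
    assert(kib > 0);
    return kib * 1024 / BYTES_PER_TILE;
}
```

6 KiB holds 384 tiles (the "one and a half pattern tables" of 128 + 128 + 128), and two such sets on the GBC hold 768. For comparison, one 4 KiB pattern table holds 256 tiles.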
Quote:
Yes, the NES has 4 background palettes and 4 sprite palettes. In theory, each palette has 4 colors, but there are some weird considerations about the first color (color 0) of each palette.
And no mapper can increase the number of palettes, right?
Quote:
There are 28 colors in these 4 + 4 palettes:
Background palette 0: 0, 1, 2, 3
Background palette 1: 4, 5, 6, 7
Background palette 2: 8, 9, 10, 11
Background palette 3: 12, 13, 14, 15
Sprite palette 0: 17, 18, 19
Sprite palette 1: 21, 22, 23
Sprite palette 2: 25, 26, 27
Sprite palette 3: 29, 30, 31
I didn't know that the first color in the sprites (and later 3 background palettes) was ignored.
Quote:
Colors 16, 20, 24, and 28 do not have distinct memory cells in the PPU. (They're mirrors of 0, 4, 8, and 12 respectively.) An oversight in the PPU design causes colors 4, 8, and 12 to be replaced with color 0 when rendering is turned on.
So it wasn't intentional, but accidental?
Quote:
On the NES, it's a bit harder because reading from VRAM is unreliable if sample playback is turned on.
What does "sample playback" mean? Any sound or music?
Quote:
NES has less capacity for sprite overdraw: 8 sprites per line (25% overdraw on a 256px wide screen) vs. GB/GBC 10 sprites per line (50% overdraw on a 160px wide screen)
What do you mean by overdraw? Being able to draw on the part of the screen that isn't displayed?
Quote:
The Game Boy has 6 KiB of VRAM devoted to one and a half pattern tables: 128 tiles just for sprites, 128 tiles to share between sprites and backgrounds, and 128 tiles just for backgrounds. The remaining 2 KiB of VRAM is used by two nametables in a single-screen mirroring configuration.
By "tile", you mean an 8x8 block of pixels, right?
Quote:
NES has no background tile flipping, unlike GBC. Tile flipping allows for certain CHR optimizations on trees, walls, grass, etc. GBC also has MMC5-style palette per tile instead of per block of 2x2 tiles. NES has less capacity for sprite overdraw: 8 sprites per line (25% overdraw on a 256px wide screen) vs. GB/GBC 10 sprites per line (50% overdraw on a 160px wide screen), and possibly for this reason, characters using Mega Man-style overlays for extra color appear to be more common on GBC than on NES because they're less likely to cause dropouts and flicker. Oh, and there are thousands of usable colors on the GBC (like on the Game Gear), unlike the NES where 52 colors in an HSV arrangement plus a screen-wide tint control are all you get.
Still, I think they could've done better. They could've made overlapping walls and trees, trees that are sized properly, bushes that actually look like bushes (I played through the whole NES version and didn't realize until looking at the GBC version again that those clumps of dark and light green on the ground were supposed to be bushes), etc. At least they had learned something since they began DW1 and 2, though, because those were uglier yet.
Btw, with the screenshot I posted earlier in mind, I suppose that with each step the player takes, the game must calculate whether the lower part of a character should underlap the background or not, depending on the character's coordinates on the map?
FinalZero wrote:
And no mapper can increase the number of palettes, right?
No, they really can't. The palettes are stored inside the PPU, so the mappers can't interfere in the process of reading them like they can with pattern/name/attribute tables, which are stored outside of the PPU.
Quote:
I didn't know that the first color in the sprites (and later 3 background palettes) was ignored.
The first color of the sprite palettes doesn't even exist (there's no memory to store them)... It's not like you could do anything with them anyway, since color 0 means transparency for sprites.
Quote:
So it wasn't intentional, but accidental?
I believe this was intentional... I imagine that using the first color globally requires some extra logic, which they wouldn't have used if they didn't explicitly want the PPU to behave like that. Maybe the "oversight" tepples was talking about was failing to realize that the other way would be more versatile, because developers would be able to use different colors if they wanted to but they could also make them all the same.
Quote:
What does "sample playback" mean? Any sound or music?
The NES has 4 audio channels that play simple waveforms that sound like "blips and blops" (i.e. typical 8-bit era sounds) but it also has a PCM channel that can be used to play more complex sounds such as real instruments or the human voice. These sounds are called "sampled sounds" because you record them by sampling the sound wave at a constant frequency, and with that information the same sound (or an approximation of it) can be played back.
One way to play sampled sounds on the NES consists in pointing it to the DPCM data in the ROM that you want to play, and it will do so while the program keeps running. From time to time, the CPU will briefly interrupt the program to read bytes containing sampled sounds from the ROM, and these reads are the ones that can interfere with joypad and PPU reads.
Quote:
What do you mean by overdraw? Being able to draw on the part of the screen that isn't displayed?
In this case, overdraw means how much of the scanline can be filled with sprites. "Over" probably means "on top" here, meaning that the sprite can cover a certain amount of the background.
Quote:
By "tile", you mean an 8x8 block of pixels, right?
Yes. Which means that the GBC has 768 tiles, according to what tepples said.
FinalZero wrote:
Quote:
Yes, the NES has 4 background palettes and 4 sprite palettes.
And no mapper can increase the number of palettes, right?
Correct. But a mapper can make smaller background color areas (8x1 pixels instead of 16x16 pixels), and in theory, a mapper can even include a more sophisticated PPU that can fake more palettes through spatiotemporal dithering.
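To make the stock 16x16 pixel granularity concrete, here is a rough C sketch of how a background pixel position maps to its 2-bit palette selection when no mapper tricks are involved. The function names are made up for illustration; the layout assumed is the standard one (one attribute byte per 32x32 pixel area, two bits per 16x16 quadrant, attribute table at offset $3C0 inside a 1 KiB nametable).

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the attribute byte inside a 1 KiB nametable (hypothetical helper). */
uint16_t attr_offset(uint8_t px, uint8_t py)
{
    assert(py < 240);                    /* visible area is 256x240 */
    return (uint16_t)(0x3C0 + (py / 32) * 8 + (px / 32));
}

/* Which of the 4 background palettes the pixel at (px, py) uses. */
uint8_t attr_palette(const uint8_t *nametable, uint8_t px, uint8_t py)
{
    uint8_t byte  = nametable[attr_offset(px, py)];
    /* Quadrant order inside the byte: top-left, top-right, bottom-left, bottom-right. */
    uint8_t shift = (uint8_t)((((py / 16) & 1) * 4) + (((px / 16) & 1) * 2));
    return (uint8_t)((byte >> shift) & 0x03);
}
```

Every pixel inside the same 16x16 quadrant gets the same answer, which is why two differently colored objects can't share a quadrant without sharing a palette.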
Quote:
Quote:
An oversight in the PPU design causes colors 4, 8, and 12 to be replaced with color 0 when rendering is turned on.
So it wasn't intentional, but accidental?
I'm almost certain that nobody posting to this board worked for Nintendo during 1983 when Nintendo was engineering the PPU. Therefore, we can't know for sure whether the inability to use colors 4, 8, and 12 was intentional or accidental.
Quote:
Quote:
reading from VRAM is unreliable if sample playback is turned on.
What does "sample playback" mean? Any sound or music?
Only channel 5 plays back samples streamed from ROM. The Legend of Zelda and Super Mario Bros. 2: Mario Madness use this for sound effects that were produced with the FDS adapter's channel 6 on the respective FDS versions of these games. Super Mario Bros. 3, Contra, and Blades of Steel use this for drums; several later games by Sunsoft use it for the bass. The other four channels are digital tone generators, do not automatically read data from ROM or RAM, and therefore do not interfere with reading from VRAM or the controllers.
Quote:
Quote:
NES has less capacity for sprite overdraw: 8 sprites per line (25% overdraw on a 256px wide screen) vs. GB/GBC 10 sprites per line (50% overdraw on a 160px wide screen)
What do you mean by overdraw? Being able to draw on the part of the screen that isn't displayed?
You're thinking of either nametable arrangement or overscan. Overdraw is something entirely different, related to the amount of space on the screen that sprites can cover. In 3D graphics, "overdraw" refers to the total area covered by overlapping triangles. Video processors descended from the TI TMS9918 family VDPs used in the TI-99/4A, ColecoVision, and MSX, such as those in the NES, SMS, Game Boy, Genesis, Super NES, and GBC, have limits on the number of sprites that can appear on the screen, the maximum size of sprites, the number of sprites that can appear on one scanline, and the total number of sprite pixels that can appear on one scanline. I have used the term "overdraw" to refer to the amount of area that can be covered by sprites without dropout, in most cases with respect to the amount of a scanline that can be covered with sprites.
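The 25% and 50% figures quoted above follow from simple arithmetic, assuming 8-pixel-wide sprites on both systems. A small sketch (the function and parameter names are only illustrative):

```c
#include <assert.h>

/* Percentage of one scanline that sprites can cover before dropout. */
unsigned overdraw_percent(unsigned sprites_per_line,
                          unsigned sprite_width_px,
                          unsigned screen_width_px)
{
    assert(screen_width_px > 0);
    return sprites_per_line * sprite_width_px * 100u / screen_width_px;
}
```

With these definitions, `overdraw_percent(8, 8, 256)` gives the NES figure and `overdraw_percent(10, 8, 160)` gives the GB/GBC figure.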
Quote:
Quote:
The Game Boy has 6 KiB of VRAM devoted to one and a half pattern tables: 128 tiles just for sprites, 128 tiles to share between sprites and backgrounds, and 128 tiles just for backgrounds. The remaining 2 KiB of VRAM is used by two nametables in a single-screen mirroring configuration.
By "tile", you mean an 8x8 block of pixels, right?
Correct. Tiles on the vast majority of VDPs descended from the TMS9918 are 8x8 pixels in size. On the NES, Game Boy, and Game Boy Color, the data for each 8x8 pixel tile is always 16 bytes in size.
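The 16 bytes come from 8x8 pixels at 2 bits each. As a sketch of how those bytes turn into pixels on the NES specifically: the first 8 bytes hold bit 0 of each row and the next 8 bytes hold bit 1. The Game Boy stores the same amount of data but interleaves the two planes row by row, so the indexing below would need adjusting there. The function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the 2-bit color index (0-3) of pixel (x, y) in one NES tile.
   tile points at 16 bytes: 8 rows of bit plane 0, then 8 rows of bit plane 1. */
uint8_t nes_tile_pixel(const uint8_t tile[16], uint8_t x, uint8_t y)
{
    uint8_t lo, hi;
    assert(x < 8 && y < 8);
    lo = (uint8_t)((tile[y]     >> (7 - x)) & 1);  /* bit 7 is the leftmost pixel */
    hi = (uint8_t)((tile[y + 8] >> (7 - x)) & 1);
    return (uint8_t)((hi << 1) | lo);
}
```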
Quote:
They could've made overlapping walls and trees
Overlapping objects in the background require extra tiles and, if the objects are of different colors, often extra palettes. That's one reason why roofs of RPG houses often sloped with the flat part toward the camera (as seen here), as opposed to toward the sides (as seen in the home button of your web browser): so that the edges could remain aligned to the tile grid.
Quote:
trees that are sized properly
Which would have required extra care to show the correct overlap when the player walks behind them. This is tricky. Even on the Super NES, with its two main background layers and background priority per tile, Sony didn't get player-background occlusion 100% right in Equinox.
Quote:
Correct. But a mapper can make smaller background color areas (8x1 pixels instead of 16x16 pixels), and in theory, a mapper can even include a more sophisticated PPU that can fake more palettes through spatiotemporal dithering.
I don't know what spatiotemporal dithering is, but it sounds like something that would be very unrealistic to implement. Did any licensed games use such a thing?
Quote:
Only channel 5 plays back samples streamed from ROM. The Legend of Zelda and Super Mario Bros. 2: Mario Madness use this for sound effects that were produced with the FDS adapter's channel 6 on the respective FDS versions of these games. Super Mario Bros. 3, Contra, and Blades of Steel use this for drums; several later games by Sunsoft use it for the bass. The other four channels are digital tone generators, do not automatically read data from ROM or RAM, and therefore do not interfere with reading from VRAM or the controllers.
Quote:
[what Tokumaru said] ...
I see.
Quote:
Overlapping objects in the background require extra tiles and, if the objects are of different colors, often extra palettes.
1) Why would it require extra tiles?
2) Ack, I forgot that palettes are distributed to 16 by 16 pixel blocks, not 8 by 8... Perhaps it wouldn't be possible then...
Quote:
The first color of the sprite palettes doesn't even exist (there's no memory to store them)... It's not like you could do anything with them anyway, since color 0 means transparency for sprites.
Oh. I thought one could turn off the transparency for sprites if one wanted to, thus allowing a fourth color.
FinalZero wrote:
Quote:
a mapper can even include a more sophisticated PPU that can fake more palettes through spatiotemporal dithering.
I don't know what spatiotemporal dithering is
Dithering means adding noise to spread out rounding errors. Spatial dithering spreads out rounding errors between neighboring pixels. Temporal dithering spreads out rounding errors from one frame to the next: each pixel is flickered. "Spatiotemporal" dithering spreads out rounding errors both to neighboring pixels and to the next frame. But a PPU-in-a-mapper is largely theoretical (apart from Wide Boy) and not something you need to worry about at this stage.
Quote:
Did any licensed games use such a thing?
Flickering the least significant bit of palette values is a form of temporal dithering that appears to have been common in Genesis games in order to work around the limitation of 3 bits per channel in the palette, compared to 5 bits on the Super NES, so that gradients don't royally suck.
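A minimal sketch of the least-significant-bit flicker idea, not taken from any particular Genesis game: pretend you want a 4-bit intensity but the hardware palette only accepts 3 bits. The function name and the 4-bit "target" representation are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* target4 is a desired 4-bit intensity; the hardware only takes 3 bits (0-7).
   On odd frames the dropped low bit is added back, so two consecutive frames
   average out to target4 / 2. */
uint8_t dither_level(uint8_t target4, unsigned frame)
{
    uint8_t base;
    assert(target4 <= 14);   /* 15 would need hardware level 8, which does not exist */
    base = (uint8_t)(target4 >> 1);
    return (uint8_t)(base + ((frame & 1) ? (target4 & 1) : 0));
}
```

An even target never flickers; an odd target alternates between two neighboring hardware levels, which the eye (and the CRT) blends into something in between.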
Quote:
Quote:
Overlapping objects in the background require extra tiles and, if the objects are of different colors, often extra palettes.
1) Why would it require extra tiles?
Because when you overlap two objects in the background, and they don't have clean edges on tile boundaries, you have to composite all combinations of the two objects into CHR RAM or include all combinations in the CHR ROM.
Quote:
Oh. I thought one could turn off the transparency for sprites if one wanted to, thus allowing a fourth color.
This is possible in SDL, but not in NES.
FinalZero wrote:
Did any licensed games use such a thing?
I don't think so, it's too much trouble.
This set of demos seems to be a good example of the technique, though.
Quote:
Dithering means adding noise to spread out rounding errors. Spatial dithering spreads out rounding errors between neighboring pixels. Temporal dithering spreads out rounding errors from one frame to the next: each pixel is flickered. "Spatiotemporal" dithering spreads out rounding errors both to neighboring pixels and to the next frame. But a PPU-in-a-mapper is largely theoretical (apart from Wide Boy) and not something you need to worry about at this stage.
Ok.
Quote:
Flickering the least significant bit of palette values is a form of temporal dithering that appears to have been common in Genesis games in order to work around the limitation of 3 bits per channel in the palette, compared to 5 bits on the Super NES, so that gradients don't royally suck.
Oh, the Super NES had 32 colors per palette? And 65k-some colors to choose from I assume?
Quote:
Because when you overlap two objects in the background, and they don't have clean edges on tile boundaries, you have to composite all combinations of the two objects into CHR RAM or include all combinations in the CHR ROM.
So, I was right when I said this, right?:
Quote:
2) Ack, I forgot that palettes are distributed to 16 by 16 pixel blocks, not 8 by 8... Perhaps it wouldn't be possible then...
Could overlap like in the screenshot I posted before be possible on the NES? Or would one need more colors per palette?
FinalZero wrote:
Oh, the Super NES had 32 colors per palette? And 65k-some colors to choose from I assume?
32 possible intensities for each of R, G, and B in a color, not 32 colors per palette. Compare with VGA's range of 0-63 for intensity, or later video cards' 0-255 range.
I believe there are 16 16-color palettes on the SNES.
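For the "how many colors to choose from" part of the question, the count follows directly from the bits per channel: 5 bits each for R, G, and B gives 32768 colors rather than 65k-some, and the Genesis's 3 bits per channel gives 512. A quick sketch (the function name is just for illustration):

```c
#include <assert.h>

/* Number of distinct RGB colors when each channel has the given bit depth. */
unsigned long total_colors(unsigned bits_per_channel)
{
    assert(bits_per_channel <= 8);
    return 1ul << (3 * bits_per_channel);
}
```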
FinalZero wrote:
Could overlap like in the screenshot I posted before be possible on the NES? Or would one need more colors per palette?
You mean how the wall overlaps the characters? The best way to do that on the NES would be to use mask sprites... sprites with the same shape as the wall, with higher priority (lower OAM index) than the character sprites and with the "behind background" bit set.
When your character sprites and the mask sprites overlap, the PPU will first check which sprite has higher priority, and since the mask sprites appear first in the OAM they will win. Then, the PPU will try to draw the mask, but since it's configured to be behind the background, the background is displayed instead.
The only problem with this technique is that the sprite count rises quite quickly. You obviously wouldn't keep the masks in place at all times, only when necessary, but even then the limit of 8 sprites per scanline is reached fairly easily, so there will be some flickering.
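In terms of data, this comes down to the order of entries in the OAM buffer and one attribute bit. The sketch below fills a shadow OAM with the mask entries first and the player entries after them. The struct and function names are invented for this example; the 4-byte entry order (Y, tile, attributes, X) and attribute bit 5 as the "behind background" flag are standard PPU facts.

```c
#include <assert.h>
#include <stdint.h>

#define OAM_BEHIND_BG 0x20u  /* attribute bit 5: draw behind the background */

typedef struct {
    uint8_t y, tile, attr, x;   /* byte order of one OAM entry */
} OamEntry;

/* Fills a 64-entry shadow OAM: mask first (higher priority), then the player.
   Returns the number of entries used. */
uint8_t build_oam(OamEntry oam[64],
                  const OamEntry *mask, uint8_t mask_count,
                  const OamEntry *player, uint8_t player_count)
{
    uint8_t n = 0, i;
    assert(mask_count + player_count <= 64);
    for (i = 0; i < mask_count; ++i) {
        oam[n] = mask[i];
        oam[n].attr |= OAM_BEHIND_BG;   /* mask pulls the background to the front */
        ++n;
    }
    for (i = 0; i < player_count; ++i) {
        oam[n++] = player[i];           /* player keeps its normal attributes */
    }
    return n;
}
```

Passing `mask_count` as 0 when the player is nowhere near a wall keeps the per-scanline sprite count down.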
Quote:
You mean how the wall overlaps the characters? The best way to do that on the NES would be to use mask sprites... sprites with the same shape as the wall, with higher priority (lower OAM index) than the character sprites and with the "behind background" bit set.
When your character sprites and the mask sprites overlap, the PPU will first check which sprite has higher priority, and since the mask sprites appear first in the OAM they will win. Then, the PPU will try to draw the mask, but since it's configured to be behind the background, the background is displayed instead.
The only problem with this technique is that the sprite count rises quite quickly. You obviously wouldn't keep the masks in place at all times, only when necessary, but even then the limit of 8 sprites per scanline is reached fairly easily, so there will be some flickering.
I understand. That method wouldn't require having more tiles either, since the top of the character and the bottom of the wall would already be present. But can't colors only be distributed per 16x16 blocks? That'd mean either the character or the wall would have to change its palette to the other's, which would look bad. I wouldn't mind the sprites per line limit so much, since one can tell fceux to ignore that anyways. What's the limit on sprites per screen though?
FinalZero wrote:
I understand. That method wouldn't require having more tiles either, since the top of the character and the bottom of the wall would already be present.
There's no need for any extra sprites for the player, but the mask must have the same shape as the wall. If the wall tiles themselves can't be used as a mask (which can happen, depending on your use of color 0), you will need extra graphics.
Quote:
But can't colors only be distributed per 16x16 blocks? That'd mean either the character or the wall would have to change its palette to the other's, which would look bad.
This is irrelevant. The background will be drawn as normal. The player will use its normal palette(s) also, and it doesn't matter what palette is used for the mask, as it will be hidden behind the background and will not be visible at all.
Maybe you didn't understand how this works exactly, so I'll try to explain a little better. Here's the background and sprite in place:
The player is in front of the background, which you don't want. The easiest solution, if that grass is drawn exclusively with color 0, is to set the "behind the background" bit of the character's sprites, so he'll show up in front of the grass but behind the wall. That's hardly the case though, because grass often isn't flat, and has pixels of other colors on it. Also, you might want to apply the effect in some other areas where the floor is not green at all. The solution that works for all of these cases, is a mask:
The black sprite is a mask (I only used black for the example, the actual color doesn't matter, it will not be seen), with the exact shape of the wall (this is important!). It is successfully hiding the character, because it has higher priority, so it's drawn on top. The only thing left to do is set the "behind" bit of the mask's sprites, which will cause the PPU to bring the background pixels to the front, achieving the desired result:
I hope you can see that there's no issue with palettes whatsoever. The only thing that makes this technique complex is that you will probably need some logic to detect when the player is "touching" the wall in order to put the mask(s) in place only when necessary, to minimize sprite use.
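If it helps, the per-pixel decision described above can be modeled in a few lines of C. This is a simplified sketch of the priority rule, not an emulator: the type and function names are invented, and `sprites[]` is assumed to hold only the sprites covering the current dot, already in OAM order.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t color;      /* 0 = transparent sprite pixel */
    uint8_t behind_bg;  /* nonzero if the "behind background" bit is set */
} SpritePixel;

typedef enum { SHOW_BACKDROP, SHOW_BACKGROUND, SHOW_SPRITE } Winner;

/* *which receives the index of the sprite that was chosen, or -1 if none. */
Winner resolve_pixel(uint8_t bg_color, const SpritePixel *sprites, int count, int *which)
{
    int i;
    assert(which != 0);
    *which = -1;
    for (i = 0; i < count; ++i) {
        if (sprites[i].color != 0) { *which = i; break; }  /* first opaque sprite wins */
    }
    /* The chosen sprite shows unless it is "behind" an opaque background pixel. */
    if (*which >= 0 && (bg_color == 0 || !sprites[*which].behind_bg))
        return SHOW_SPRITE;
    return bg_color != 0 ? SHOW_BACKGROUND : SHOW_BACKDROP;
}
```

Note that the player sprite never gets a say once the mask has been chosen, and that over a color-0 background pixel the mask itself would be visible, which is why its shape has to match the wall exactly.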
Quote:
I wouldn't mind the sprites per line limit so much, since one can tell fceux to ignore that anyways.
But then you are not making NES programs anymore, but FCEUX programs, don't you agree? If you are going to deliberately ignore a system limitation because an inaccurate copy of it doesn't have the same limitation, you could just as well throw all the limitations out the window and make a PC game instead, and save yourself the trouble of learning how to code for an ancient machine.
Quote:
What's the limit on sprites per screen though?
64 sprites per screen, because that's how many entries fit in OAM. You can have more if you interrupt rendering mid-frame for 5 scanlines or so to perform more sprite DMAs. Few games ever did that, because unless you have a reason to split your screen (like, there are 2 gameplay windows) you wouldn't want blank scanlines in the middle of the screen.
Quote:
[tokumaru's text]
Oh, you're describing the same thing that SMB3 does for pipes. I'm not sure why I wasn't able to realize that one could use that for walls too...
Quote:
But then you are not making NES programs anymore, but FCEUX programs, don't you agree? If you are going to deliberately ignore a system limitation because an inaccurate copy of it doesn't have the same limitation, you could just as well throw all the limitations out the window and make a PC game instead, and save yourself the trouble of learning how to code for an ancient machine.
I suppose. Thinking about it further, I realize it wouldn't actually add any more sprites per line though, because the PPU would never get to drawing the ones with the lower priority.
Quote:
64 sprites per screen, because that's how many entries fit in OAM. You can have more if you interrupt rendering mid-frame for 5 scanlines or so to perform more sprite DMAs. Few games ever did that, because unless you have a reason to split your screen (like, there are 2 gameplay windows) you wouldn't want blank scanlines in the middle of the screen.
I see.
-----
Another question:
1) DW3 and DW4 had a day/night cycle, and the people available to talk/interact with, and sometimes even the town itself, changed according to the time. How is this done? I assume every time one enters a time, the time is checked, and then the appropriate people (and their sprites) are added? But what about the town itself? Must a whole different map be stored in ROM?
The basics of day/night can be done by screwing with the palette and/or the tint bits. Look at the difference in SMB1 between 1-1 and 3-1 for a crude idea of how to do this.
Zelda: ALTTP had a light and dark world with mostly the same outdoor map, and there was some sort of differential coding: objects in both worlds, objects in light world only, and objects in dark world only.
FinalZero wrote:
1) DW3 and DW4 had a day/night cycle, and the people available to talk/interact with, and sometimes even the town itself, changed according to the time. How is this done?
Once the time is entered, you have the responsibility of updating it. If you have NMIs enabled, like you should, you can update the time 60 times per second (NTSC).
Quote:
I assume every time one enters a time, the time is checked, and then the appropriate people (and their sprites) are added? But what about the town itself? Must a whole different map be stored in ROM?
You should check the time every frame, to decide whether it's time to change to day or night. Usually, modifying just the palette is enough to switch between day and night, but if you really need more extensive changes, nothing prevents you from changing the pattern tables or even the map.
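A bare-bones sketch of that bookkeeping, with everything here (names, cycle length, the tick at which night starts) invented for illustration. The tick function can be called once per frame from the NMI handler, or once per step on the map if the game counts time that way.

```c
#include <assert.h>
#include <stdint.h>

#define TICKS_PER_CYCLE 512u   /* hypothetical length of a full day, in ticks */
#define NIGHT_STARTS_AT 256u   /* hypothetical tick at which night begins */

typedef struct {
    uint16_t ticks;            /* wraps around at TICKS_PER_CYCLE */
} GameClock;

/* Advance the clock by one tick (one frame, or one step on the map). */
void clock_tick(GameClock *c)
{
    assert(c != 0);
    c->ticks = (uint16_t)((c->ticks + 1u) % TICKS_PER_CYCLE);
}

/* 0 selects the day palette, 1 the night palette. */
uint8_t clock_palette_index(const GameClock *c)
{
    return c->ticks >= NIGHT_STARTS_AT ? 1 : 0;
}
```

The game would compare the palette index against the one currently loaded and only rewrite the PPU palette (during vblank) when it changes.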
The problem with basing a game around a day/night cycle is that no NES mapper has a real-time clock. So when you save the game, turn off the power, and turn the power back on, the program has no idea how much time elapsed while the power was off. It's one of the reasons why the Animal Crossing concept never occurred to Nintendo developers during the NES era. (Here are the others.)
Quote:
The basics of day/night can be done by screwing with the palette and/or the tint bits. Look at the difference in SMB1 between 1-1 and 3-1 for a crude idea of how to do this.
Zelda: ALTTP had a light and dark world with mostly the same outdoor map, and there was some sort of differential coding: objects in both worlds, objects in light world only, and objects in dark world only.
I've already watched the PPU of DW3 when it changes time (I left the PPU window open while playing through most of the game, actually.), and have seen how palette changes do it.
Quote:
The problem with basing a game around a day/night cycle is that no NES mapper has a real-time clock. So when you save the game, turn off the power, and turn the power back on, the program has no idea how much time elapsed while the power was off. It's one of the reasons why the Animal Crossing concept never occurred to Nintendo developers during the NES era. (Here are the others.)
It's not a realtime clock though; instead, taking a step advances the time (maybe; sometimes it doesn't). Standing in place doesn't. I think it works well enough for a game like DW3.
Quote:
Zelda: ALTTP had a light and dark world with mostly the same outdoor map, and there was some sort of differential coding: objects in both worlds, objects in light world only, and objects in dark world only.
But did they store 2 different versions of the map? Or just the original and where the second deviated from it?
So, I've mustered the strength and moral courage to try and do this again. I have some free time before university begins again in the fall.
* * *
So, here's a new round of questions:
1) The registers are only a byte large, right? How does one do/simulate math with 2 or more bytes?
2) What happens if you use JSR when the stack's empty?
3) What happens if you use JSR when the stack's already full?
4) What's the point of SEI and CLI? Are they used when games are paused and unpaused? How does BRK fit into this?
* * *
I'll post the code I'm trying to get to work soon.
FinalZero wrote:
1) The registers are only a byte large, right? How does one do/simulate math with 2 or more bytes?
With the carry flag.
Code:
clc
lda a_lo
adc b_lo
sta result_lo
lda a_hi
adc b_hi
sta result_hi
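For readers coming from C, here is the same carry chain written out longhand. It is only a model of what the CLC/ADC/ADC sequence does, with a made-up function name; a C compiler would normally do all of this for you when you add two 16-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Adds two 16-bit values one byte at a time, the way CLC/ADC/ADC does.
   Returns the final carry (1 if the 16-bit result overflowed). */
uint8_t add16_bytewise(uint8_t a_lo, uint8_t a_hi,
                       uint8_t b_lo, uint8_t b_hi,
                       uint8_t *result_lo, uint8_t *result_hi)
{
    unsigned sum;
    uint8_t carry = 0;                     /* CLC */
    assert(result_lo != 0 && result_hi != 0);

    sum = (unsigned)a_lo + b_lo + carry;   /* LDA a_lo / ADC b_lo */
    *result_lo = (uint8_t)sum;             /* STA result_lo */
    carry = (uint8_t)(sum >> 8);           /* carry out of the low byte */

    sum = (unsigned)a_hi + b_hi + carry;   /* LDA a_hi / ADC b_hi */
    *result_hi = (uint8_t)sum;             /* STA result_hi */
    return (uint8_t)(sum >> 8);
}
```

The same pattern extends to 24 or 32 bits: clear the carry once, then keep adding bytes from low to high without touching the carry in between.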
Quote:
2) What happens if you use JSR when the stack's empty?
The return address (minus one) is pushed, and the new address is loaded into the program counter.
Quote:
3) What happens if you use JSR when the stack's already full?
The same, except that the stack pointer wraps around within the $0100-$01FF page.
Quote:
4) What's the point of SEI and CLI? Are they used when games are paused and unpaused?
Pause is a game state controlled by one of the game's variables. Each frame, the game moves the game pieces only if the game is not paused. It isn't necessarily related to anything in the hardware. SEI and CLI set and clear the CPU's interrupt-disable flag (after CLI, both IRQ and NMI are serviced; after SEI, only NMI is). IRQs, the kind of interrupt this flag controls, are more connected with mappers.
Quote:
How does BRK fit into this?
BRK always performs a syscall to the IRQ vector, regardless of the CPU's interrupt priority level.
Quote:
The return address (minus one) is pushed, and the new address is loaded into the program counter.
I made a mistake. I meant RTS, not JSR...
Quote:
The same, except that the stack pointer wraps around within the $0100-$01FF page.
That sounds bad.
* * *
Thank you for your timely response.
FinalZero wrote:
I meant RTS, not JSR...
RTS on an empty stack pulls 16 bits of undefined data from the stack pointer (which again wraps around), adds one, and jumps there. I guess they didn't have enough transistors back in the late 1970s when the 6502 was designed to raise an interrupt on an attempt to wrap the stack.
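A toy model in C may make the wraparound easier to picture. This is a sketch, not a CPU emulator: the struct and function names are invented, and only the stack page behavior of JSR and RTS is modeled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t  mem[0x200];  /* only pages $00 and $01 are modeled */
    uint8_t  sp;          /* 8-bit stack pointer, wraps naturally */
    uint16_t pc;
} Cpu;

void push(Cpu *c, uint8_t v) { c->mem[0x100 + c->sp] = v; c->sp--; }
uint8_t pull(Cpu *c)         { c->sp++; return c->mem[0x100 + c->sp]; }

/* JSR: pc_of_jsr is the address of the JSR opcode itself (3 bytes long). */
void jsr(Cpu *c, uint16_t pc_of_jsr, uint16_t target)
{
    uint16_t ret = (uint16_t)(pc_of_jsr + 2);   /* return address minus one */
    assert(c != 0);
    push(c, (uint8_t)(ret >> 8));
    push(c, (uint8_t)(ret & 0xFF));
    c->pc = target;
}

/* RTS: pulls whatever two bytes are there, adds one, and jumps. */
void rts(Cpu *c)
{
    uint8_t lo, hi;
    assert(c != 0);
    lo = pull(c);
    hi = pull(c);
    c->pc = (uint16_t)((((uint16_t)hi << 8) | lo) + 1);
}
```

Nothing in `push` or `pull` checks for "full" or "empty": with `sp` at $00 a push writes to $0100 and the pointer becomes $FF, and with `sp` at $FF a pull reads from $0100. The CPU just keeps going with whatever bytes it finds.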
A couple more questions, this time about cc65.
1)
Code:
#pragma charmap('\0', 0xFF)
is illegal, so C doesn't barf. I suppose there's no way to tell C to terminate strings with something other than '\0' (0x00), is there? I realize that it isn't that big of an issue anyway; just move your font around so 0x00 is null, instead of something else.
2) Why is
Code:
#pragma charmap('0', 0xF0)
illegal though? Why does it think '0' is 0x00? Isn't it 0x30?
3) Is there a way to include a ca65 file in cc65 code? How about a binary?
cc65 itself generates an assembly file, so there is no problem with adding an assembly file. Just assemble it and then give the linker the object file. You can include binary files in an assembly file as usual.
I realized the problem with 2) in my previous post. My code actually had:
Code:
#pragma charmap('0', 0x00)
It was complaining about the 0x00. Apparently, one can't map anything from or to 0.
Quote:
cc65 itself generates an assembly file, so there is no problem with adding an assembly file. Just assemble it and then give the linker the object file. You can include binary files in an assembly file as usual.
But what if I have code that depends on the assembly file? I mean, I have an asm file that has ".include "something.s"", and it uses variables/locations declared/defined there.
I don't understand your problem from this explanation. There is .import and .export to pass symbols between modules.
Say I have the following assembly file:
Code:
mainLoop:
jsr readGamepadA
jsr handleInput
jsr vBlankWait
jsr drawSprites
jmp mainLoop
How will it assemble if it doesn't know what any of those identifiers are?
You're probably right though. I don't know how import and export work, and my assembly is still shaky.
If those aren't pointers to ROM, then it should throw an error. In a nutshell, I believe the first pass assembles all the code and data, and the second pass fills in the labels with their addresses.
Quote:
If those aren't pointers to ROM, then it should throw an error. In a nutshell, I believe the first pass assembles all the code and data, and the second pass fills in the labels with their addresses.
They're pointers/addresses, but they're defined/declared in the included assembly file.
Yes, they point to places in the ROM for data and such. If they don't exist, it won't output a file.
Quote:
Yes, they point to places in the ROM for data and such. If they don't exist, it won't output a file.
In assembly I'd just .include the file. How do I include the file in C? I can't just use "#include "something.s"", because it'll think that it's a C file, when it's actually assembly.
FinalZero wrote:
Quote:
Yes, they point to places in the ROM for data and such. If they don't exist, it won't output a file.
In assembly I'd just .include the file. How do I include the file in C? I can't just use "#include "something.s"", because it'll think that it's a C file, when it's actually assembly.
In NESASM3 and most others, you .include. But in C, I believe you have to make header files and then link them to another .exe with the functions and include those with quotes? I am not sure, hopefully someone who knows a little more about C than me will help.
Can you provide a better example of your problem? You can include assembly files in assembly files, you can pass symbols between them, and you can declare extern vars in C.
The most common example is to have binary included in an assembly file:
Code:
.export _my_data
_my_data:
.incbin "data.bin"
Now you can access it from a C file if you declare it there as an extern:
Code:
extern const unsigned char my_data[];
Note that a label needs the _ prefix in the assembly file, but does not need it in the C file.
Quote:
Now you can access it from a C file if you declare it there as an extern:
Okay. I think I get it now. include the header which has
Code:
extern const unsigned char my_data[];
and then assemble the .s file which has the actual code, and then link the .o from its output. Right?
I get an error on the following line of code:
Code:
for(char i = 0; i < limit; i++)
The error is simply: "expression expected".
The whole function is:
Code:
for(char i = 0; i < limit; i++) {
lda gamepadARaw // Bit 0 of Raw == Current Button Status
lsr a // Bit 0 of Raw -> Carry
rol gamepadA // Bit 0 of Storage <- Carry
}
Ignoring that the assembly lines aren't in asm() statements, is there a better way of doing this? I mean, is it better to just leave it in a .repeat block in assembly? That takes more space though, doesn't it? Does the compiler optimize away things like that?
You can't just use assembly code in the middle of C code. Also, signed types are slow, so avoid using them; and local vars are slow too, so it is better to have global vars for common things like loop counters.
The compiler does not optimize things like this. Also, why do you need this code to be fast? It is called about once per frame, so you can do it in pure C and it will not have any real impact on performance.
If you want, you can make this (or any other) function in assembly, and just call it from C code.
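As for the "expression expected" error on the for line itself: a likely cause, assuming the cc65 version in use follows C89 declaration rules here, is the `char i = 0` declaration inside the `for (...)` header, which is a C99 feature. A sketch of the C89-style shape (the function name and placeholder body are made up):

```c
#include <assert.h>

/* Counter declared at the top of the block, C89 style,
   rather than inside the for statement. */
unsigned char count_iterations(unsigned char limit)
{
    unsigned char i;
    unsigned char n = 0;
    assert(limit < 255);
    for (i = 0; i < limit; ++i) {
        ++n;   /* placeholder for the real loop body */
    }
    return n;
}
```

Following the advice above, `i` could also be a global general-purpose counter instead of a local.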
Quote:
The compiler does not optimize things like this.
Okay
Quote:
Also, why do you need this code to be fast? It is called about once per frame, so you can do it in pure C and it will not have any real impact on performance.
I guess I don't know if I do or not.
Quote:
If you want, you can make this (or any other) function in assembly, and just call it from C code.
What's the proper way to do this?
I have a .h file with:
Code:
#define gamepadARaw 0x4016
byte gamepadA, gamepadAOld;
extern void readGamepadA(); // Do I need extern here?
What does the .s file look like? Like this?:
Code:
; Reads the first gamepad.
.proc ReadGamepadA
; Strobes the gamepad.
ldsta 1, GamepadARaw ; First
ldsta 0, GamepadARaw ; Second
; Stores the old value of the gamepad.
ldsta GamepadA, GamepadAOld
.repeat 8
lda GamepadARaw ; Bit 0 == Current Button Status
lsr a ; Bit 0 of Raw -> Carry
rol GamepadA ; Bit 0 of Storage <- Carry
.endrepeat
rts
.endproc
I think it is a good idea to read docs about things like this.
For a function like a gamepad read, which does not have input params and has an output param, it could look like this in assembly:
Code:
.export _pad_check
_pad_check:
; ... your code goes here; leave the result in A
rts
In C you need to have a prototype for the function:
Code:
unsigned char __fastcall__ pad_check(void);
Okay, I'll try that.
I didn't know there was a cc65 wiki. Thanks for the link.
* * *
Also, I use "byte" as a typedef for "unsigned char" (and "ushort" for "unsigned short", etc). The latter is just too long to type.
* * *
Btw, is there an IRC channel for this board?
Code:
.repeat 8
lda GamepadARaw ; Bit 0 == Current Button Status
lsr a ; Bit 0 of Raw -> Carry
rol GamepadA ; Bit 0 of Storage <- Carry
.endrepeat
How would I even do this part in C?
Shiru wrote:
I think it is a good idea to read docs about things like this.
For a function like a gamepad read, which does not have input params and has an output param, it could look like this in assembly:
Code:
.export _pad_check
_pad_check:
; ... your code goes here; leave the result in A
rts
In C you need to have a prototype for the function:
Code:
unsigned char __fastcall__ pad_check(void);
I see that the assembly version names have leading underscores. I can't find the documentation, but I was just reading that one shouldn't do that, and there was some other way to do it. Do you know what I'm talking about?
Also, are you sure that you need to export from the asm file? I get an error: "Symbol '_readGamepadA' is already an import" when I try to export it.
Do you want me to rewrite your assembly code in C, or do you just have a problem with something specific? If you don't know how to read/write a memory location from C, just use a pointer, like:
Code:
unsigned char i=*(unsigned char*)0x4016;//read
*(unsigned char*)0x4016=0;//write
I don't know about 'some other way' and don't see any problem with _, it works fine.
I don't know the structure of your project; in my project, .export is needed.
Quote:
Do you want me to rewrite your assembly code in C, or do you just have a problem with something specific? If you don't know how to read/write a memory location from C, just use a pointer, like:
No, I already know how to do that. I meant the loading, shifting, and rolling. C doesn't have a rolling operator for some reason. (I think it's because of portability issues iirc.)
Quote:
I don't know about 'some other way' and don't see any problem with _, it works fine.
He said it was because it might not work in future versions. He might change how names are mangled.
Quote:
I don't know the structure of your project; in my project, .export is needed.
I'm looking at your source code for Alter Ego, and it is helpful.
Okay, Gamepad.h has 3 things in it:
Code:
#define gamepadARaw 0x4016
byte gamepadA, gamepadAOld;
void fastcall readGamepadA(void);
Main.c includes Gamepad.h. Gamepad.s includes Main.s, because it needs to use gamepadARaw, gamepadA, and gamepadAOld. So while it's exporting _read, it's also importing it from Gamepad.h through Main.c through Main.s. Anyways, I just put the variables/memory locations in question in Gamepad.s, since Main.c doesn't need them.
* * *
Also, I've discovered that leaving out void in a function definition's parameter list means it's variadic. *sigh*
Loading - reading a pointer. Shift << or >>. ROL just involves a bit more logic. In this exact case you don't even need ROL. Like,
Code:
i=*(unsigned char*)GamepadARaw;
GamepadA=(GamepadA<<1)|(i&1);
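For cases where a true rotate through carry is wanted in C, the "bit more logic" can be written as a small helper. This is a sketch with an invented name, modeling the 6502's ROL on one byte with the carry passed in and out through a pointer:

```c
#include <assert.h>
#include <stdint.h>

/* Emulates 6502 ROL on a byte: shifts left, moves the old carry into bit 0,
   and leaves the old bit 7 in *carry. */
uint8_t rol8(uint8_t value, uint8_t *carry)
{
    uint8_t carry_in;
    assert(carry != 0 && *carry <= 1);
    carry_in = *carry;
    *carry = (uint8_t)(value >> 7);
    return (uint8_t)((value << 1) | carry_in);
}
```

For the gamepad loop the bit pushed out of the top is simply discarded, which is why the plain shift-and-OR above is enough there.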
Thanks.
So, my latest error is a linker one: "Error: Input file 'nes.lib' not found". However, I'm confused because I've already set CC65_HOME.
Anyways, cheating and just putting nes.lib in the same directory, it's now complaining about the header segment overflowing, though I don't know why...
FinalZero wrote:
Also, I use "byte" as a typedef for "unsigned char" (and "ushort" for "unsigned short", etc). The latter is just too long to type.
I've seen u8, s8, u16, s16, u32, s32 used in the GBAdev community. Once GCC began to support C99 stdint.h, these became defined in terms of uint8_t and friends.
FinalZero wrote:
I meant the loading, shifting, and rolling. C doesn't have a rolling operator for some reason. (I think it's because of portability issues iirc.)
This (untested) exercises parts of my brain that I haven't used since the GBA was popular:
Code:
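#include <stdint.h>

#define P1 (*(volatile unsigned char *)0x4016)
#define P2 (*(volatile unsigned char *)0x4017)

/* Reads controller 1 and returns the 8 button bits, first bit read in bit 7. */
uint8_t read_pad1(void) {
  uint8_t out = 0;
  uint8_t x;       /* declared up front; C89-style compilers reject declarations inside for(...) */
  uint8_t newBit;

  P1 = 1;  /* strobe on: latch the current button states */
  P1 = 0;  /* strobe off: start shifting them out */
  for (x = 8; x > 0; --x) {
    /* D0 is the standard controller line; the mask also accepts D1 */
    newBit = (P1 & 0x03) ? 1 : 0;
    out = (out << 1) | newBit;
  }
  return out;
}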
If the header segment is overflowing, perhaps someone or something forgot to switch to the code segment.
Quote:
these became defined in terms of uint8_t and friends.
Ick, ugly. I don't see anything wrong with how C# decided to name its primitives.
Quote:
I thought volatile didn't do anything in cc65.
Quote:
If the header segment is overflowing, perhaps someone or something forgot to switch to the code segment.
No, I checked that right away. I've since discovered why, after looking at the map file. nes.lib apparently has a header segment and places some code there. 'Twas simply a name clash.
* * *
I still can't link though. "Unresolved external __BSS_RUN__ in: zerobss.s" and 10 more similar to it.
Another question: What size does cc65 make enums? Please don't tell me they're ints by default. Is there a way to specify the size? This is long-overdue imo; at least they're adding it to C++0x. Hopefully C starts supporting it too.
Shiru wrote:
Also, signed types are slow, avoid using them;
True, this is why CC65 char defaults to unsigned. That is, char == unsigned char.
Quote:
and local vars are slow too, so it is better to have global vars for common things like loop counters.
True, unless the "make local variables static" compiler switch is used. Still it's a good idea to have some global general use variables to avoid wasting memory.
FinalZero wrote:
I see that the assembly version names have leading underscores. I can't find the documentation, but I was just reading that one shouldn't do that, and there was some other way to do it. Do you know what I'm talking about?
Read where? I don't think there's any other way to do it.
Quote:
Also, I've discovered that leaving out void in a function definition's parameter list means it's variadic. *sigh*
That's normal C behavior.
Quote:
I thought volatile didn't do anything in cc65.
It doesn't yet, but might be changed in the future.
FinalZero wrote:
Another question: What size does cc65 make enums?
I think they're int, but it doesn't really matter a whole lot. You can assign enumerated values to unsigned chars.
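A small sketch of that last point, in plain C rather than anything cc65-specific: the enum type itself may be int-sized, but an enumerated value can be kept in a one-byte variable. The names Direction, player_dir, and is_horizontal are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical enum for illustration; the enumerators are int constants. */
enum Direction { DIR_UP, DIR_DOWN, DIR_LEFT, DIR_RIGHT };

/* Store the value in one byte instead of an enum-typed variable. */
static uint8_t player_dir = DIR_LEFT;

/* Returns 1 if the stored direction is horizontal, 0 otherwise. */
int is_horizontal(void) {
    return player_dir == DIR_LEFT || player_dir == DIR_RIGHT;
}
```

The comparison still uses the readable enumerator names; only the storage is a byte.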
One thing I don't understand is why they use indirect addressing for the C stack.
If only they would optionally allow limiting the stack to 256 bytes (or less) and accessing it with indexed addressing. It would save 2 cycles on every read and 1 cycle on every write.
Quote:
That's normal C behavior.
So now I know. I do most of my programming in C++, and it seems that I'm always ambushed by something that works differently in C.
Quote:
Read where? I don't think there's any other way to do it.
It was either in the mailing correspondence or on the website.
Bregalad wrote:
One thing I don't understand is why they use indirect addressing for the C stack.
Because standard practice in C, especially in existing programs originally developed for platforms other than the NES, is to use a stack far bigger than 256 bytes.
Anyways, can I get help with my linker error? I can't even find a file named "zerobss.s", "copydata.s", "condes.s", or "crt0.s".
* * *
Edit: I found what I was talking about earlier. It seems he was talking about inline assembly only.
http://www.cc65.org/doc/cc65-9.html wrote:
Note: Do not embed the assembler labels that are used as names of global variables or functions into your asm statements. Code like this
Code:
int foo;
int bar () { return 1; }
__asm__ ("lda _foo"); /* DON'T DO THAT! */
...
__asm__ ("jsr _bar"); /* DON'T DO THAT EITHER! */
may stop working if the way, the compiler generates these names is changed in a future version. Instead use the format specifiers from the table above:
Code:
__asm__ ("lda %v", foo); /* OK */
...
__asm__ ("jsr %v", bar); /* OK */
Quote:
Because standard practice in C, especially in existing programs originally developed for platforms other than the NES, is to use a stack far bigger than 256 bytes.
Doesn't the NES already have a stack at $0100–$01FF? What does C use its stack for? Is it just stored in normal memory, or someplace special?
* * *
I was looking at:
http://en.wikibooks.org/wiki/NES_Programming
Can someone explain $4020–$5FFF in more depth? Were there ever any ROMs that had more than $FFFF of space?
Doesn't the size of the $6000–$7FFF region depend on the mapper/cart?
Is $8000–$FFFF the space where the game code actually resides? Is that where the linker has to be directed to put things?
Do these things apply to all mappers? or just a certain one?
What kind of things are typically stored in the zeropage segment/range?
* * *
Also, addressing modes in ca65. Is the following right?:
Code:
LDA $00F8 ; Absolute
LDA $F8 ; Zero Page
It'll detect whether a byte or 2 bytes are given to it?
* * *
Also also, what do "prg" and "chr" stand for exactly? "program" and "character"?
FinalZero wrote:
Doesn't the NES already have a stack at $0100–$01FF? What does C use its stack for? Is it just stored in normal memory, or someplace special?
Yes, it's stored in normal RAM. The reason is, like tepples said, that for many applications the default 256 byte stack of 6502 is not big enough.
Quote:
Can someone explain $4020–$5FFF in more depth? Were there ever any ROMs that had more than $FFFF of space?
Cart can theoretically map anything it wants at 4020-5FFF (RAM/ROM/your momma), but most commercial carts don't put anything in there.
Quote:
Doesn't the size of the $6000–$7FFF region depend on the mapper/cart?
I don't understand the question. Size of that region is always 8K. Again, THEORETICALLY, a cart can have bank switching features that allows to switch RAM/ROM/your momma in that region. For example, FME-7 allows to map ROM in that area as well. For the most part, that area is used for 8K RAM expansion, that's included in the cart.
Quote:
Is $8000–$FFFF the space where the game code actually resides? Is that where the linker has to be directed to put things?
Yes, most of the time the code runs in that area of the memory.
Quote:
What kind of things are typically stored in the zeropage segment/range?
Pointers, mostly (to use with the (ind),y addressing, e.g. LDA (foo),y).
Quote:
Also, addressing modes in ca65. Is the following right?:
Code:
LDA $00F8 ; Absolute
LDA $F8 ; Zero Page
It'll detect whether a byte or 2 bytes are given to it?
Nope, ca65 picks zero page whenever the value fits in one byte, no matter how many digits you write. You have to use LDA a:$F8 to tell it the address is absolute.
Quote:
Also also, what do "prg" and "chr" stand for exactly? "program" and "character"?
Yup.
Quote:
I don't understand the question. Size of that region is always 8K. Again, THEORETICALLY, a cart can have bank switching features that allows to switch RAM/ROM/your momma in that region. For example, FME-7 allows to map ROM in that area as well. For the most part, that area is used for 8K RAM expansion, that's included in the cart.
I'm struggling to tell the difference between the mapper (where each has its own set of memory-mapped registers) and the cart itself. Also, I don't understand what's meant by "bankswitching".
Quote:
Nope, you have to use LDA a:$F8 to tell the address is absolute.
Okay.
* * *
I was reading:
http://wiki.nesdev.com/w/index.php/Standard_controller and came across:
Quote:
A Super NES controller can be wired to the NES controller port, and it returns button status in a similar order: B, Y, Select, Start, Up, Down, Left, Right, A, X, L, R.
How would one wire a SNES controller to the NES? Do any emulators even support such a thing?
FinalZero wrote:
I'm struggling to tell the difference between the mapper (where each has its own set of memory-mapped registers) and the cart itself.
A mapper is an integrated circuit on the cart's printed circuit board that takes some of the signals coming from the CPU and PPU over the cart edge and generates signals used by other chips. Most of these signals are related to bank switching; some may be related to raster effects.
Quote:
Also, I don't understand what's meant by "bankswitching".
The NES doesn't have a big enough address bus to see all of the ROM at once. So instead, the CPU writes a "page number" of sorts to various I/O ports on the mapper to tell it which part of the cartridge to read. Then the mapper sends the "page number" to the ROM chips. Please read this Wikipedia article and then let us know what you still don't understand.
Glossary here
Quote:
I was reading:
http://wiki.nesdev.com/w/index.php/Standard_controller and came across:
Quote:
A Super NES controller can be wired to the NES controller port, and it returns button status in a similar order: B, Y, Select, Start, Up, Down, Left, Right, A, X, L, R.
How would one wire a SNES controller to the NES?
Buy an extension cord for Super NES controllers and an extension cord for NES controllers. Cut both in the middle. Match up power, ground, clock, latch, and data wires and solder them. It's even possible to wire a Game Boy or Game Boy Color to act as an NES controller because both NES controllers and Game Boy Game Link are very similar to SPI bus, but that's not very useful without a Game Boy flash cart.
Quote:
Do any emulators even support such a thing?
Any emulator supporting the Four Score adapter can be used with NES games supporting the Super NES controller.
Player 1 A: SNES B
Player 1 B: SNES Y
Player 1 Select, Start, Up, Down, Left, Right: SNES same
Player 3 A: SNES A
Player 3 B: SNES X
Player 3 Select: SNES L
Player 3 Start: SNES R
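To make the mapping above concrete, here is a rough sketch in C. It assumes the 12-bit report has already been shifted in with the first bit received stored in bit 0, and that the NES pad order is A, B, Select, Start, Up, Down, Left, Right. The SNES_* names and both functions are invented for this example.

```c
#include <stdint.h>

/* Bit order from the wiki quote: B, Y, Select, Start, Up, Down,
   Left, Right, A, X, L, R. Bit 0 here is the first bit shifted out. */
enum {
    SNES_B      = 1 << 0,  SNES_Y     = 1 << 1,
    SNES_SELECT = 1 << 2,  SNES_START = 1 << 3,
    SNES_UP     = 1 << 4,  SNES_DOWN  = 1 << 5,
    SNES_LEFT   = 1 << 6,  SNES_RIGHT = 1 << 7,
    SNES_A      = 1 << 8,  SNES_X     = 1 << 9,
    SNES_L      = 1 << 10, SNES_R     = 1 << 11
};

/* First 8 bits: what the game sees as player 1's pad. */
uint8_t as_player1(uint16_t report) {
    return (uint8_t)(report & 0xFF);
}

/* Next 4 bits: land in player 3's A, B, Select, Start slots. */
uint8_t as_player3(uint16_t report) {
    return (uint8_t)((report >> 8) & 0x0F);
}
```

With this layout, SNES B shows up in player 1's A slot and SNES X in player 3's B slot, matching the list.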
Quote:
The NES doesn't have a big enough address bus to see all of the ROM at once.
How big is its address bus? Isn't it two bytes? Or is it half that?, since the PRG only goes from 0x8000 to 0xFFFF.
Quote:
So instead, the CPU writes a "page number" of sorts to various I/O ports on the mapper to tell it which part of the cartridge to read. Then the mapper sends the "page number" to the ROM chips. Please read this Wikipedia article and then let us know what you still don't understand.
What part of the addressing does bank switching switch? Everything? or only a certain range? Is there a limit on the number of banks a cart could hold? What was the largest NES rom ever sold?
Bits A14-A0 of the CPU address bus are on the cart edge. A15 isn't directly made available, but it can be inferred from A14, Phi2, and PRG /CE.
Bankswitching can affect any address in $4020-$FFFF but most commonly affects $8000-$BFFF, $8000-$DFFF, or $8000-$FFFF depending on the mapper. It also commonly affects PPU $0000-$1FFF, and nametable mirroring control can be thought of as a form of bankswitching of PPU $2000-$2FFF.
The number and size of available banks depend entirely on the capability of the mapper, which varies from mapper to mapper.
Largest game: this page
Quote:
Bits A14-A0 of the CPU address bus are on the cart edge. A15 isn't directly made available, but it can be inferred from A14, Phi2, and PRG /CE.
Bankswitching can affect any address in $4020-$FFFF but most commonly affects $8000-$BFFF, $8000-$DFFF, or $8000-$FFFF depending on the mapper. It also commonly affects PPU $0000-$1FFF, and nametable mirroring control can be thought of as a form of bankswitching of PPU $2000-$2FFF.
The number and size of available banks depend entirely on the capability of the mapper, which varies from mapper to mapper.
Largest game: this page
Okay, so a ushort (2 bytes) lets one address 64 kB. Usually only half of that can be bankswitched, which is 32 kB. So anything larger than 32 kB must use bankswitching. 512 kB seems absolutely gigantic then.
A couple new questions:
1) How does one set segments/banks (including the .nes header) with cc65?
2) What happens if the .nes format header is present on a real rom and played in a real NES?
For number 2: all the data will be offset by 16 bytes, so the interrupt vectors won't point to the right places and it won't run... err, shouldn't run.
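The 16-byte shift can be worked out with a little arithmetic. This is only a sketch: it assumes a plain PRG ROM whose last 16 KB sits at $C000-$FFFF, and chip_offset and shifted_offset are names invented here.

```c
#include <stdint.h>

#define INES_HEADER_SIZE 16u   /* size of the iNES header in bytes */

/* Offset inside a bare PRG ROM chip that the CPU reads for an address
   in the last 16 KB bank ($C000-$FFFF). prg_size is the chip size. */
uint32_t chip_offset(uint32_t prg_size, uint16_t cpu_addr) {
    return (prg_size - 0x4000u) + ((uint32_t)cpu_addr - 0xC000u);
}

/* If the header is burned in front of the PRG data, the byte meant
   for cpu_addr ends up this far into the chip instead. */
uint32_t shifted_offset(uint32_t prg_size, uint16_t cpu_addr) {
    return chip_offset(prg_size, cpu_addr) + INES_HEADER_SIZE;
}
```

For a 32 KB PRG, the CPU fetches the reset vector from chip offset $7FFC, which now holds the byte that was meant for $FFEC; the real vector has been pushed to $800C, past the end of the chip.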
But that might be one way to freak out people who dump your games: include a copy of the iNES header at the start, but make it subtly wrong. When they hex edit the resulting dump, they wonder why the header is in there twice and think there might be a problem with the dumper.
Why does this remind me of how Super Mario Bros 2 contains the word "ZELDA" at the end of the PRG data?
Quote:
Any emulator supporting the Four Score adapter can be used with NES games supporting the Super NES controller.
Player 1 A: SNES B
Player 1 B: SNES Y
Player 1 Select, Start, Up, Down, Left, Right: SNES same
Player 3 A: SNES A
Player 3 B: SNES X
Player 3 Select: SNES L
Player 3 Start: SNES R
One couldn't use three players in such a game then, right?
Correct. Only two players would be possible in a game using Super NES controllers through SNES-to-NES adapters. Were you thinking three players as in Jeopardy! or three players as in Secret of Mana? Because if you have three 16-pixel-wide players in one place, anything around them will start to flicker due to the PPU's limit of 64 sprite pixels per scanline.
Quote:
Correct. Only two players would be possible in a game using Super NES controllers through SNES-to-NES adapters. Were you thinking three players as in Jeopardy! or three players as in Secret of Mana? Because if you have three 16-pixel-wide players in one place, anything around them will start to flicker due to the PPU's limit of 64 sprite pixels per scanline.
But 16 * 3 == 48, not 64.
FinalZero wrote:
But 16 * 3 == 48, not 64.
Yeah, which means that there's not much left for other game objects. I guess you could design levels with lots of platforms and few wide plain areas, so that the 3 players don't stay horizontally aligned very often.
Actually, when you think about it, the NES has a lot of games that defy the sprites per scanline limit, but were made anyway. Look at Double Dragon for example: you often have 3 people (who are often wider than 16 pixels, BTW, especially when lying down) on the screen at once, plus weapons and even things you can throw... And the game was a big hit. From that we can conclude that a little flickering doesn't hurt anyone.
Nightmare on Elm Street gets pretty bad if you turn on 4-player mode. If you idle for a long time until "Freddy's™ Coming!", you fight against Freddy. 4 Players (each 16 pixels wide, wider if punching), the 24-32 width sprite for Freddy, and up to two random hands (16 pixels wide each) grabbing you. Tons of sprites in the same row, many sprites vanish because Rare's OAM cycling code wasn't flexible enough.
Questions again:
1) How does one set segments/banks (including the .nes header) with cc65?
2) How is the switch/case statement implemented in cc65? I'd check myself, but I can't get anything to compile...
3) In cc65, when an address is pushed onto the stack, does it take up two bytes, thus increasing the stack size/height by two?
FinalZero wrote:
Questions again:
1) How does one set segments/banks (including the .nes header) with cc65?
By creating a memory area and a segment for each bank. See the ld65 documents, and see my demo of how to set up banks for SGROM and SNROM.
Thank you for the example. I was looking at your linker file (nes.ini), and I'm confused. What happens when multiple segments are put into the same "memory" part? Are they just stacked together so where one ends, the next begins? What happens when multiple "memory" parts have the same value for their "start" parameter?
FinalZero wrote:
Thank you for the example. I was looking at your linker file (nes.ini), and I'm confused. What happens when multiple segments are put into the same "memory" part? Are they just stacked together so where one ends, the next begins?
Yes. The common NROM setup has one ROM memory area at $C000-$FFFF or $8000-$FFFF, and the CODE, RODATA, VECTORS, and optional DMC segments all feed into it.
Quote:
What happens when multiple "memory" parts have the same value for their "start" parameter?
They're treated as overlays, which works nicely for architectures with bankswitching. They're stored consecutively in the ROM, but exported labels within these segments end up overlapping.
Quote:
Yes. The common NROM setup has one ROM memory area at $C000-$FFFF or $8000-$FFFF, and the CODE, RODATA, VECTORS, and optional DMC segments all feed into it.
Okay so a "memory area" is really just another word for a "bank", right?
What is a bank? Is it the area that an overlay is switched into?, or the overlay itself?
Quote:
but exported labels within these segments end up overlapping.
So the labels refer to where the code would be if they weren't stacked/stored consecutively?
Also, since I had asked about extended buttons (XYLR) before, is there any way for an NES to use the rumble feature of a gamepad?
Bank, or page, is a piece of large ROM/RAM that could be switched (mapped) into a part of the CPU address space that is sometimes called 'window'.
Quote:
Bank, or page, is a piece of large ROM/RAM that could be switched (mapped) into a part of the CPU address space that is sometimes called 'window'.
Ah, okay. Thank you.
I was reading
http://www.cc65.org/doc/ld65-5.html#ss5.1 , and I'm confused. It says:
Quote:
So, because we specified that the segment with the name BSS is of type bss, the linker knows that this is uninitialized data, and will not write it to an output file.
Umm, how does it do anything if it isn't written to an output file?
FinalZero wrote:
I was reading
http://www.cc65.org/doc/ld65-5.html#ss5.1 , and I'm confused. It says:
Quote:
So, because we specified that the segment with the name BSS is of type bss, the linker knows that this is uninitialized data, and will not write it to an output file.
Umm, how does it do anything if it isn't written to an output file?
BSS segments are usually used for variables. They are in the RAM, and they aren't initialized to any value, so they don't have to be stored in the ROM/the output file.
Quote:
BSS segments are usually used for variables. They are in the RAM, and they aren't initialized to any value, so they don't have to be stored in the ROM/the output file.
Where exactly is the RAM then? and how large is it? Why aren't data and rodata segments put in RAM too? They're variables also.
In the nes.ini file:
Code:
ROM15: start = $C000, size = $4000, type = ro, file = %O, fill=yes, fillval=$FF;
What does "type = ro" do? Does it prevent writing to any of those areas?
FinalZero wrote:
Quote:
BSS segments are usually used for variables. They are in the RAM, and they aren't initialized to any value, so they don't have to be stored in the ROM/the output file.
Where exactly is the RAM then? and how large is it? Why aren't data and rodata segments put in RAM too? They're variables also.
The main RAM in NES is 2KB and it's at $000-$7FF. DATA is initialized data, the initialization values are written to the ROM, and should be copied to RAM by code at reset time. RODATA is not for variables, it's for data whose value doesn't change.
Quote:
In the nes.ini file:
Code:
ROM15: start = $C000, size = $4000, type = ro, file = %O, fill=yes, fillval=$FF;
What does "type = ro" do? Does it prevent writing to any of those areas?
I actually don't know why the linker differentiates between "ro" and "rw", as it's not possible for the compiler to actually check or prevent writing to any segments/memory areas. I think it might be just for checking that the user isn't accidentally putting a "rw" segment in a "ro" memory area.
Quote:
The main RAM in NES is 2KB and it's at $000-$7FF.
The zeropage is $00-$FF, the stack is $100-$1FF, so $200-$7FF is for variables, right?
Quote:
DATA is initialized data, the initialization values are written to the ROM, and should be copied to RAM by code at reset time.
So, RODATA is only stored in ROM, not RAM?
Quote:
RODATA is not for variables, it's for data whose value doesn't change.
The word "variable" is misleading here. I don't mean it in the sense of "something that varies/changes", but something that sits in memory, I suppose. That is, even declaring something with "static const" in C is a "variable". I don't know what other word to use.
Quote:
I actually don't know why the linker differentiates between "ro" and "rw", as it's not possible for the compiler to actually check or prevent writing to any segments/memory areas. I think it might be just for checking that the user isn't accidentally putting a "rw" segment in a "ro" memory area.
Okay.
FinalZero wrote:
Quote:
The main RAM in NES is 2KB and it's at $000-$7FF.
The zeropage is $00-$FF, the stack is $100-$1FF, so $200-$7FF is for variables, right?
Correct. But my link scripts tend to assign $300-$7FF as the memory area in which to put BSS because that way, $200-$2FF can hold the display list that gets copied to OAM every vblank.
Quote:
Quote:
DATA is initialized data, the initialization values are written to the ROM, and should be copied to RAM by code at reset time.
So, RODATA is only stored in ROM, not RAM?
Correct. NES linker scripts generally provide the following segments:
- CODE: in ROM
- RODATA: in ROM
- DMC: in ROM
- VECTORS: near the very end of ROM
- CODE0 through CODE6, RODATA0 through RODATA6: like CODE and RODATA but designed to be paged out by a mapper
- DATA: stored in ROM, but addresses are in RAM; copied from ROM to RAM in your init code
- BSS: not stored anywhere, and addresses are in RAM; initialized to zero in your init code (per C standard) or not initialized at all (per tokumaru)
- ZEROPAGE: a second BSS in $0010-$00FF useful for pointers and the most frequently accessed variables
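As a toy model of the DATA and BSS entries in that list, here is roughly what the init code does at reset, written as ordinary C running on a PC. The addresses, sizes, and names (rom_data_image, ram, init_segments) are all made up; a real project would take them from the linker config.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, for illustration only. */
#define DATA_RUN   0x0300u  /* RAM address where DATA lives at run time */
#define DATA_SIZE  4u
#define BSS_RUN    0x0304u  /* RAM address where BSS lives */
#define BSS_SIZE   8u

/* Stand-in for the copy of DATA that the linker stores in ROM. */
static const uint8_t rom_data_image[DATA_SIZE] = { 3, 0x10, 0x20, 0x30 };

/* Stand-in for the console's 2 KB of RAM. */
static uint8_t ram[0x0800];

/* What init code does at reset: copy DATA from ROM, then clear BSS. */
void init_segments(void) {
    memcpy(&ram[DATA_RUN], rom_data_image, DATA_SIZE);
    memset(&ram[BSS_RUN], 0, BSS_SIZE);
}
```

RODATA never appears here because it is read straight from ROM; skipping the memset gives the "not initialized at all" variant.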
Quote:
The word "variable" is misleading here. I don't mean it in the sense of "something that varies/changes", but something that sits in memory, I suppose. That is, even declaring something with "static const" in C is a "variable". I don't know what other word to use.
I've been referring to objects in RODATA as "tables".
Quote:
Correct. But my link scripts tend to assign $300-$7FF as the memory area in which to put BSS because that way, $200-$2FF can hold the display list that gets copied to OAM every vblank.
Okay.
Quote:
Correct. NES linker scripts generally provide the following segments:
CODE: in ROM
RODATA: in ROM
DMC: in ROM
VECTORS: near the very end of ROM
CODE0 through CODE6, RODATA0 through RODATA6: like CODE and RODATA but designed to be paged out by a mapper
DATA: stored in ROM, but addresses are in RAM; copied from ROM to RAM in your init code
BSS: not stored anywhere, and addresses are in RAM; initialized to zero in your init code (per C standard) or not initialized at all (per tokumaru)
ZEROPAGE: a second BSS in $0010-$00FF useful for pointers and the most frequently accessed variables
Isn't RODATA tied to a specific bank then? Is accessing them in ROM slower than doing so in RAM? Who's Tokumaru?
Quote:
I've been referring to objects in RODATA as "tables".
The name implies... a table (like in HTML) though.
FinalZero wrote:
Who's Tokumaru?
That would be me. I defend the idea that memory shouldn't be cleared at the start of the program, because it makes cases when parts of the program use uninitialized memory (i.e. bugs) hard to catch. A lot of people don't agree with me though. =)
FinalZero wrote:
Isn't RODATA tied to a specific bank then?
Many mappers have a "fixed bank", or a memory area to put segments that should always remain available, at $C000-$FFFF. On these mappers, no matter which bank is switched in, the last bank in the cartridge will always be present at $C000-$FFFF.
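A rough model of that arrangement, in C for a PC rather than for the NES itself: one switchable 16 KB window at $8000-$BFFF and a fixed window at $C000-$FFFF that always shows the last bank. The bank count and the names prg_rom, bank_reg, mapper_write, and cpu_read are assumptions for the sketch and do not describe any particular mapper exactly.

```c
#include <stdint.h>

#define BANK_SIZE  0x4000UL
#define NUM_BANKS  8UL      /* hypothetical 128 KB PRG ROM */

static uint8_t prg_rom[NUM_BANKS * BANK_SIZE];
static uint8_t bank_reg;    /* last bank number written to the mapper */

/* The game writes a bank number to the mapper's register. */
void mapper_write(uint8_t value) {
    bank_reg = (uint8_t)(value % NUM_BANKS);
}

/* What the CPU sees when it reads cartridge space. */
uint8_t cpu_read(uint16_t addr) {
    if (addr >= 0xC000u)    /* fixed window: always the last bank */
        return prg_rom[(NUM_BANKS - 1UL) * BANK_SIZE + (addr - 0xC000u)];
    if (addr >= 0x8000u)    /* switchable window */
        return prg_rom[(uint32_t)bank_reg * BANK_SIZE + (addr - 0x8000u)];
    return 0;               /* RAM and I/O are not modelled here */
}
```

Code and tables in the last bank stay reachable whatever bank_reg holds, which is why vectors and bank-switching routines usually live there.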
Quote:
Is accessing them in ROM slower than doing so in RAM?
ROM and RAM run at the same speed on the NES because the NES has no wait states. In fact, there are other platforms where ROM can be accessed faster than (certain parts of) RAM, such as the Super NES and the Game Boy Advance, because DRAM access introduces a wait state. But in some cases, such as odd pointer manipulation tricks needed to pull data from a table, copying select things to RAM might make them faster to access. And of course, anything compressed may need to be copied to RAM as it is decompressed, though some "random access" compression schemes allow a program to keep only part of the decompressed data in RAM.
Quote:
Who's Tokumaru?
Another regular on this BBS, from Brazil.
Quote:
Quote:
I've been referring to objects in RODATA as "tables".
The name implies... a table (like in HTML) though.
That or a table in SQL, or a lookup table.
Quote:
Many mappers have a "fixed bank", or a memory area to put segments that should always remain available, at $C000-$FFFF. On these mappers, no matter which bank is switched in, the last bank in the cartridge will always be present at $C000-$FFFF.
Okay. Also, do some games that have multiple large amounts of data, like maps, opt to store it in switchable banks instead, swapping them in and out as needed?
Quote:
ROM and RAM run at the same speed on the NES. But in fact, there are other platforms where ROM can be accessed faster than RAM, such as the Super NES and the Game Boy Advance.
That is strange. Is there a reason why? a motive behind such a thing?
Quote:
But in some cases, such as odd pointer manipulation tricks needed to pull data from a table, copying select things to RAM might make them faster to access. And of course, anything compressed may need to be copied to RAM as it is decompressed, though some "random access" compression schemes allow a program to keep only part of the decompressed data in RAM.
Hmm, okay.
FinalZero wrote:
do some games that have multiple large amounts of data, like maps, opt to store it in switchable banks instead, swapping them in and out as needed?
Soytainly. That's what banks are for.
Quote:
Quote:
there are other platforms where ROM can be accessed faster than RAM
That is strange. Is there a reason why? a motive behind such a thing?
To make the console cheap. Cheap DRAM is slower than certain mask ROM technologies.
tepples wrote:
FinalZero wrote:
do some games that have multiple large amounts of data, like maps, opt to store it in switchable banks instead, swapping them in and out as needed?
Soytainly. That's what banks are for.
Quote:
Quote:
there are other platforms where ROM can be accessed faster than RAM
That is strange. Is there a reason why? a motive behind such a thing?
To make the console cheap. Cheap DRAM is slower than certain mask ROM technologies.
Ah okay.
* * *
So, another question:
1) What's the usual method/s to produce random numbers on the NES?
I dunno about others, but I just add a bunch of controller dependent and also engine states together. It's only the same if you don't press any buttons, but then you can't start the game so it works pretty well once you get into the game. If there's a better way please let me know. I made a topic back on it before, maybe I should reread it now since I probably will understand the concepts better myself.
Any simple pseudorandom algorithm will work. The Galois one is very simple and works well for simple games:
Code:
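rand:
lda <RAND_SEED ; load the current seed (zero page)
asl a ; shift left; the old bit 7 goes into carry
bcc @1 ; nothing shifted out: skip the feedback
eor #$cf ; a 1 was shifted out: apply the feedback
@1:
sta <RAND_SEED ; store the new seed (also left in A)
rts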
RAND_SEED is a RAM location that must contain a non-zero value at startup. To randomize, call rand repeatedly on your title screen or in another similar place.
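For anyone coming from C, the same routine can be written out roughly like this. It is a direct transcription of the assembly above; rand_seed and rand8 are just names chosen here.

```c
#include <stdint.h>

static uint8_t rand_seed = 1;  /* must be non-zero, as noted above */

/* One step of the generator; returns the new seed like the asm does in A. */
uint8_t rand8(void) {
    uint8_t carry = rand_seed & 0x80;       /* bit that ASL moves into carry */
    rand_seed = (uint8_t)(rand_seed << 1);  /* ASL A */
    if (carry)
        rand_seed ^= 0xCF;                  /* same constant as the EOR */
    return rand_seed;
}
```

A seed of zero stays zero forever, because shifting zero never sets the carry and the XOR is never applied.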
Quote:
I dunno about others, but I just add a bunch of controller dependent and also engine states together. It's only the same if you don't press any buttons, but then you can't start the game so it works pretty well once you get into the game. If there's a better way please let me know. I made a topic back on it before, maybe I should reread it now since I probably will understand the concepts better myself.
But what if it's an RPG? People usually aren't button mashing in one of those. Golden Sun's RNG method comes to mind ( http://goldensunwiki.net/RNG ). The downside of course is that a player could study and abuse it.
Quote:
Any simple pseudorandom algorithm will work. The Galois one is very simple and works well for simple games:
Is this what you're talking about?:
http://en.wikipedia.org/wiki/Linear_fee ... lois_LFSRs I've never heard of it before. When I get some time I'll read up about it.
For an RPG you would need a 16-bit (or even 32-bit) RNG with a good uniform distribution (i.e. all the numbers have equal probability). It may be slower than simple RNGs.
In any case, you need to find a compromise between pseudorandomness and possibility of manipulation. I.e. if you don't involve any randomization based on user's input, you only can have a pseudorandom sequence, which is always the same. Randomization based on user's input opens possibility of manipulation.
FinalZero wrote:
1) What's the usual method/s to produce random numbers on the NES?
My latest recommendation is to keep a 2-byte (16-bit) random seed and use Greg Cook's implementation of CRC16 as your PRNG. This is like the Galois LFSR that Shiru mentioned, except faster because it works a byte at a time instead of a bit at a time. In your NMI handler, count vblanks at all times. When the player presses start at the title screen, mix the count into the CRC:
Code:
lda vblank_count
jsr CRC16_F
When the player chooses to load a saved game, mix the count in again. After you've mixed in at least two bytes of entropy, you can put zeroes in and get pseudorandom numbers out:
Code:
lda #0
jsr CRC16_F
I used to do the same thing with bit-at-a-time CRC32 as in ZIP. It's slower, but speed is acceptable if you need a really long period and don't need a lot of random numbers per frame.
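The "mix a byte in, get a number out" idea looks roughly like this in C. This is a bit-at-a-time sketch, not Greg Cook's routine: it assumes the CCITT polynomial 0x1021, which may or may not be what CRC16_F uses, and seed and crc16_mix are names invented here.

```c
#include <stdint.h>

static uint16_t seed;  /* the 2-byte random seed */

/* Mix one byte into the seed and return the new seed.
   Bitwise CRC-16, polynomial 0x1021 (an assumption, see above). */
uint16_t crc16_mix(uint8_t byte) {
    uint8_t i;
    seed ^= (uint16_t)((uint16_t)byte << 8);
    for (i = 0; i < 8; ++i) {
        if (seed & 0x8000u)
            seed = (uint16_t)((seed << 1) ^ 0x1021u);
        else
            seed = (uint16_t)(seed << 1);
    }
    return seed;
}
```

Calling crc16_mix(vblank_count) corresponds to the first snippet, and crc16_mix(0) to the second. Note that an all-zero seed fed with zeroes stays zero, which is one reason to mix in some entropy first.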
Some more questions:
1) What does one use interrupts for? (the BRK and RTI instructions)
2) How does one emulate the remainder/modulo function on the 6502?
3) As a corollary, how does one read a number, and print it off with a decimal base?
4) Related, how does one use the debugger in FCEUX? I mean, I know how to set breakpoints and , but how would I find/detect/mark something like a game's routine to convert numbers to decimal?
FinalZero wrote:
1) What does one use interrupts for? (the BRK and RTI instructions)
Interrupts exist to handle events that must be dealt with "right away". Some events are so important that they require immediate attention of the CPU, so they literally interrupt the code that the CPU is running and have it execute the code pointed to by the corresponding interrupt vector. Of course, the address of the code that was running before is pushed to the stack, so that an RTI instruction can return from the interrupt and the CPU can resume its previous work.
On the NES, interrupts are used to indicate the beginning of the vertical blank (the PPU can optionally fire an NMI when VBlank starts), to create raster effects (several mappers can generate IRQs when specific parts of the display are being rendered), and to indicate that a sample has finished playing (the APU can optionally generate an IRQ in this case).
Quote:
2) How does one emulate the remainder/modulo function on the 6502?
If it's a division by a power of 2, you can just mask out (using the AND instruction) the highest bits of the number, otherwise you'll have to implement an actual division routine. The "shift and subtract" method is very common, look it up on Google.
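Here is what both cases look like in C, as a sketch of the method rather than tuned 6502 code. For a power of 2 the remainder is just a mask, e.g. n & 15 for n % 16. For anything else, an 8-bit shift-and-subtract division (divmod8 is a made-up name) goes like this:

```c
#include <stdint.h>

/* Shift-and-subtract division: returns dividend / divisor and stores
   dividend % divisor in *remainder. The divisor must not be zero. */
uint8_t divmod8(uint8_t dividend, uint8_t divisor, uint8_t *remainder) {
    uint8_t quotient = 0;
    uint16_t rem = 0;   /* 16 bits so the shift below cannot overflow */
    uint8_t i;
    for (i = 0; i < 8; ++i) {
        /* Bring the next dividend bit (top first) into the remainder. */
        rem = (uint16_t)((rem << 1) | (dividend >> 7));
        dividend = (uint8_t)(dividend << 1);
        quotient = (uint8_t)(quotient << 1);
        if (rem >= divisor) {   /* the divisor fits: subtract, set bit */
            rem -= divisor;
            quotient |= 1;
        }
    }
    *remainder = (uint8_t)rem;
    return quotient;
}
```

The loop body maps fairly directly onto ASL/ROL, CMP, and SBC on the 6502.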
Quote:
3) As a corollary, how does one read a number, and print it off with a decimal base?
You'll have to convert the number from BIN/HEX to decimal. A few years ago we had a lengthy discussion about this.
Quote:
4) Related, how does one use the debugger in FCEUX? I mean, I know how to set breakpoints and , but how would I find/detect/mark something like a game's routine to convert numbers to decimal?
If you want to pinpoint a specific routine you have to do it backwards. In this case, you should first find the position on the screen where a decimal number appears, so that you can set up a breakpoint on writes to that location (click the "Add" button, write the name table address of one of the decimal digits in the first box, check "Write", check "PPU Mem"). Emulation will pause when a write to that location happens, at which point you can take a look at the code. If you are lucky it will be the code that does the conversion, but most likely not.
Most games wouldn't waste VBlank time with such a conversion, so the code is probably just copying the previously converted number to the name table. But at least you'll know where the number is stored, so you can set up another breakpoint, this time on writes to the "CPU Mem" where the converted number is. Hopefully this will allow you to locate the conversion routine. If it doesn't, just keep going.
Keep in mind that some games don't convert numbers at all, they just store the individual decimal digits (1 or 2 per byte), and they specifically check for values over 9 or under 0 in order to perform carry and borrow operations. If the numbers are not used in complex math operations (anything other than adding and subtracting), it's simpler that way.
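A minimal sketch of that digit-per-byte approach, assuming a hypothetical 4-digit `score` stored least significant digit first:

```asm
.segment "ZEROPAGE"
score: .res 4           ; hypothetical: one decimal digit (0-9) per byte

.segment "CODE"
; Add 1 to the score, carrying by hand whenever a digit passes 9.
add_one_point:
  ldx #0
@loop:
  inc score,x
  lda score,x
  cmp #10
  bcc @done             ; digit is still 0-9, nothing to carry
  lda #0
  sta score,x           ; wrap this digit to 0 and carry into the next one
  inx
  cpx #4
  bne @loop             ; past the last digit the score wraps to 0000
@done:
  rts
```

The digits can then be copied to the name table as-is (plus the tile number of "0"), with no conversion step.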
FinalZero wrote:
1) What does one use interrupts for? (the BRK and RTI instructions)
The BRK instruction isn't very useful on the NES, but as tokumaru pointed out, /IRQ is useful for raster effects.
Quote:
2) How does one emulate the remainder/modulo function on the 6502?
Division must be done in software. There are a few subroutines for this on 6502.org.
Quote:
3) As a corollary, how does one read a number, and print it off with a decimal base?
16-bit conversion in 700 cycles is on the wiki.
8-bit conversion in 84 cycles is in src/bcd.s of Concentration Room.
Quote:
[In an NES debugger,] how would I find/detect/mark something like a game's routine to convert numbers to decimal?
There are two ways to do this: backward and forward.
Forward: Use the cheat finder to find the address holding the number being converted to decimal, such as hit points. Then set a read breakpoint on the address; one of the reads will happen shortly before jsr binary_to_decimal.
Or backward, as tokumaru mentioned: Find where on the nametable the decimal number is stored. Set a write breakpoint there, and see where it's copying from. Set another write breakpoint on the copy source, and you'll probably be inside binary_to_decimal.
Thanksgiving break is pretty much here, so I think that I'll dare to try to make a "hello world" program now (again). I understand bank-switching much better now.
* * *
In the mean time, a couple questions:
1) What NES games had well-done cutscenes? (The Ninja Gaiden games come to mind for me, and to lesser extent, Double Dragon II)
2) What games used the microphone thing that the famicom's controller had? (Besides Zelda.) How does one program the game to use it?
3) Is there a way to compactly fit a font that has letters that don't fit on an 8x8 tile? RPG's rely heavily on text, yet their fonts in the NES era were so ugly and hard to read.
4) Does the SNES use bank-switching? How much space can it address?
5) How does scrolling work? What are nametables? In an overhead view common to RPGs, how does one move a character gradually to the next tile, instead of "jumping" there without a smooth transition? Is there a timer involved?
6) What do conversions to/from NTFS/PAL entail? Is the NOP instruction used to insert/pad cycles? Or must things be completely rewritten somehow?
FinalZero wrote:
2) What games used the microphone thing that the famicom's controller had? (Besides Zelda.) How does one program the game to use it?
I do not know of any other games using the microphone, but AFAIK you can read its state one bit at a time through $4016. I'm not sure if this bit just means "there's sound" vs. "there's no sound" or if it's a delta of some sort, but either way the microphone seems to be pretty useless.
Quote:
3) Is there a way to compactly fit a font that has letters that don't fit on an 8x8 tile? RPG's rely heavily on text, yet their fonts in the NES era were so ugly and hard to read.
Games that use mappers with CHR-ROM bankswitching can easily switch to banks with big, complex fonts when rendering text boxes, so they don't need to sacrifice any of the 256 background tiles for text. Another option, when using CHR-RAM, is to dedicate a few tiles to a rectangular area where you can dynamically render your text. This way you can even have letters of different widths (i.e. an "I" doesn't have to occupy an area as wide as an "M"), but it does cost a few background tiles.
Quote:
4) Does the SNES use bank-switching? How much space can it address?
I don't know much about the SNES, so I'll say what I know about its competitor, the Genesis, which can address up to 4MB, and games larger than that do use bankswitching.
Quote:
5) How does scrolling work?
You write to a few registers ($2000 and $2005, but also $2006 in exceptional cases) to indicate where exactly in the name tables the rendering of the next frame will start. Whatever point you pick will show up at the top left corner of the screen. By modifying that point over time you create the illusion of scrolling. As old parts of the name tables are left behind and new parts are revealed, the areas that are off screen have to be progressively redrawn if you want to scroll over an area larger than the area covered by the 2 (or 4) nametables (levels in most scrolling games are larger than 2 or 4 screens).
Quote:
What are nametables?
A name table is just a grid of tiles. It's a 2-dimensional structure that holds indexes of tiles. When rendering the background, the PPU reads those indexes and draws the specified tiles at the specified positions.
Quote:
In an overhead view common to RPGs, how does one move a character gradually to the next tile?, instead of "jumping" there without a smooth transition. Is there a timer involved?
The NES runs at 60 frames per second, enough for incredibly smooth animation. Games work by computing one frame at a time, so all you have to do is increment the positions of the objects/characters just a little bit each step.
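As a sketch, assuming hypothetical variables `player_x` (the sprite's current X) and `target_x` (the X of the tile being walked to), a routine called once per frame could look like this:

```asm
.segment "ZEROPAGE"
player_x: .res 1        ; hypothetical: current sprite X in pixels
target_x: .res 1        ; hypothetical: X of the destination tile

.segment "CODE"
; Call once per frame: moves the player 1 pixel closer to the target.
step_toward_target:
  lda player_x
  cmp target_x
  beq @done             ; already there
  bcc @move_right       ; player_x < target_x
  dec player_x
  rts
@move_right:
  inc player_x
@done:
  rts
```

With 16-pixel tiles the walk to the next tile takes 16 frames, about a quarter of a second on NTSC.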
Quote:
6) What do conversions to/from NTFS/PAL entail?
Say that you want an object to cross the screen horizontally in 1 second, no matter if the console is NTSC or PAL. The screen is 256 pixels wide, and there are 60 frames per second in NTSC, so this means that the object has to move 256 / 60 ≈ 4.27 pixels per frame. For PAL, that would be 256 / 50 = 5.12 pixels per frame. PAL has fewer frames per second, so you have to move everything a bit farther each frame to compensate. Basically you have to adjust all displacements and delays in accordance with the frame rate.
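Fractional speeds like these are usually handled with fixed-point numbers. A sketch using 8.8 fixed point (one byte of whole pixels, one byte of 1/256ths); the constants and variable names here are assumptions for illustration:

```asm
SPEED_NTSC = $0444      ; 4 + 68/256, about 4.27 pixels per frame
SPEED_PAL  = $051F      ; 5 + 31/256, about 5.12 pixels per frame

.segment "ZEROPAGE"
x_sub: .res 1           ; hypothetical: fractional part of X (1/256 pixel units)
x_pos: .res 1           ; hypothetical: whole-pixel part of X
speed: .res 2           ; set once at startup to SPEED_NTSC or SPEED_PAL

.segment "CODE"
set_speed_ntsc:
  lda #<SPEED_NTSC
  sta speed
  lda #>SPEED_NTSC
  sta speed+1
  rts

; Call once per frame.
move_right:
  clc
  lda x_sub
  adc speed             ; add the fractional parts
  sta x_sub
  lda x_pos
  adc speed+1           ; add the whole pixels plus the carry from the fraction
  sta x_pos
  rts
```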
Quote:
Is the NOP instruction used to insert/pad cycles?
Only in timed loops. For example, if you need to wait 8 scanlines for some reason, that's 909 CPU cycles in NTSC, but only 852 in PAL, so the cycles have to be adjusted if you are using this kind of delay. This is mostly for raster effects (split screens, status bars, etc.) though, and has nothing to do with game logic or animations.
FinalZero wrote:
3) Is there a way to compactly fit a font that has letters that don't fit on an 8x8 tile?
Yes. Blargg and I worked on a VWF (variable-width font) engine for NES several years back. I'm using it in a tech demo whose source I plan to release in a few months if you can answer one question. Who's cuter: Hanson or Alvin and the Chipmunks?
Quote:
4) Does the SNES use bank-switching?
Not in the same sense as that used by NES mappers.
Quote:
How much space can it address?
Just under 12 MiB. But by the time mask ROMs of that size became affordable, the Nintendo 64 was already popular.
Quote:
5) How does scrolling work?
From the programmer's perspective, it's rawther simple:
Code:
lda x_coordinate   ; first write to $2005 sets the horizontal scroll
sta $2005
lda y_coordinate   ; second write to $2005 sets the vertical scroll
sta $2005
This changes which part of the map the player can see, in effect moving the camera. Scrolling consists of moving the camera a small amount every frame and writing new data to the parts of the map that were just revealed, as seen in this diagram:
From the hardware perspective: Scrolling on the NES consists of changing the starting VRAM address for background rendering, along with changing the tap from a delay line that delays the background tiles by 1 to 8 pixels. When you write the background scroll position to $2005 at the end of your VRAM update code, circuitry inside the PPU translates the lower 3 bits of the X coordinate to the delay amount and the rest of the bits to a VRAM address.
Quote:
What are nametables? In an overhead view common to RPGs, how does one move a character gradually to the next tile?, instead of "jumping" there without a smooth transition. Is there a timer involved?
The timer involved in most sprite animation on the NES is an interrupt that signals the start of vertical blanking. The PPU asserts this about 60.1 times a second on NTSC or 50 times a second on PAL.
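A common pattern, sketched here with a hypothetical `frame_count` variable, is to have the NMI handler bump a counter and have the main loop wait for it to change:

```asm
.segment "ZEROPAGE"
frame_count: .res 1     ; hypothetical counter, bumped once per vblank

.segment "CODE"
nmi_handler:
  inc frame_count       ; flags changed here are restored by RTI
  rti

enable_nmi:
  lda #%10000000        ; bit 7 of $2000: generate an NMI at the start of vblank
  sta $2000
  rts

; Blocks until the next NMI has happened.
wait_for_next_frame:
  lda frame_count
@wait:
  cmp frame_count
  beq @wait             ; spin until the NMI handler changes the counter
  rts
```

A game loop then does one step of game logic, calls `wait_for_next_frame`, and repeats.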
Quote:
6) What do conversions to/from NTFS/PAL entail?
NTFS is the file system used by Windows for fixed disks. You probably meant NTSC. Conversions involve changing the pitch and speed of the music and the distance that game critters move every vblank. If a game uses heavy raster effects, the CPU timing for those is altered as well. If a game originally for PAL uses all of the PAL system's greater video memory bandwidth (which is about 3 times as much on a PAL system due to the longer vblank period), the programmers have to use raster effects to squeeze all the updates into the shorter vblank period. This is why Asterix fails on an NTSC system and why Battletoads is a pain to get right in emulators.
Quote:
I'm using it in a tech demo whose source I plan to release in a few months if you can answer one question. Who's cuter: Hanson or Alvin and the Chipmunks?
Um, I don't know who either of those are.
Quote:
NTFS is the file system used by Windows for fixed disks. You probably meant NTSC
Ah, yes, I did. I had to give a speech about hard drives today, so please forgive me!
Another question: What's a "vblank"? Is it just the period when the screen is redrawn?
FinalZero wrote:
Quote:
I'm using [a VWF engine] in a tech demo whose source I plan to release in a few months if you can answer one question. Who's cuter: Hanson or Alvin and the Chipmunks?
Um, I don't know who either of those are.
Popular music acts known for their squeaky voices, but that's beside the point.
Quote:
What's a "vblank"? Is it just the period when the screen is redrawn?
In video, the vertical blanking interval (vblank) is a short time between one picture and the next. The CPU can write data to the PPU only during forced blanking (when the CPU has told the PPU to stop rendering and blank the screen) or during vblank. Battletoads works around the shorter NTSC vblank by introducing an area of forced blanking near the top of the screen, but that's an advanced trick because such programming must be very careful to use a constant amount of time.
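For illustration, a sketch of a single name table write as it might appear inside an NMI handler (so it runs during vblank). The tile number $48 assumes a font laid out in ASCII order in CHR, which is purely an assumption here:

```asm
  bit $2002             ; read PPU status to reset the $2005/$2006 write toggle
  lda #$20
  sta $2006             ; high byte of VRAM address $2000 (first name table)
  lda #$00
  sta $2006             ; low byte
  lda #$48              ; hypothetical tile number for the letter "H"
  sta $2007             ; write the tile; the VRAM address then auto-increments
  lda #0
  sta $2005             ; writes to $2006 disturb the scroll position,
  sta $2005             ; so set the scroll again once the updates are done
```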
Isn't the only official way for the SNES to bankswitch to use a Capcom chip? (Used in Megaman X and other titles.)
Quote:
In video, the vertical blanking interval (vblank) is a short time between one picture and the next. The CPU can write data to the PPU only during forced blanking (when the CPU has told the PPU to stop rendering and blank the screen) or during vblank. Battletoads works around the shorter NTSC vblank by introducing an area of forced blanking near the top of the screen, but that's an advanced trick because such programming must be very careful to use a constant amount of time.
Does it have to be timed in a certain way?
More Questions:
1) How does one increment or decrement the accumulator? There's no INA or DEA instruction? Must one just use ADC instead? Looking at documentation, it still only takes 2 cycles, but it's a byte longer.
2) Are there any undocumented/illegal instructions that are used with any frequency?
http://www.obelisk.demon.co.uk/6502/reference.html#JMP wrote:
An original 6502 has does not correctly fetch the target address if the indirect vector falls on a page boundary (e.g. $xxFF where xx is and value from $00 to $FF). In this case fetches the LSB from $xxFF as expected but takes the MSB from $xx00. This is fixed in some later chips like the 65SC02 so for compatibility always ensure the indirect vector is not at the end of the page.
3) Is this honored in emulators today?
FinalZero wrote:
1) How does one increment or decrement the accumulator? There's no INA or DEA instruction? Must one just use ADC instead? Looking at documentation, it still only takes 2 cycles, but it's a byte longer.
Yep, you need to use ADC (or SBC) to increment or decrement the accumulator. You also need to know the initial state of the carry flag, so you'll probably need a CLC before the ADC (or a SEC before the SBC).
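A quick sketch of the usual idioms:

```asm
  clc
  adc #1                ; A = A + 1 (3 bytes, 4 cycles including the CLC)

  sec
  sbc #1                ; A = A - 1 (3 bytes, 4 cycles including the SEC)

  ; If X is free, going through it leaves the carry flag alone:
  tax
  inx
  txa                   ; A = A + 1 (3 bytes, 6 cycles, clobbers X)
```

If the state of the carry is already known from earlier code, the CLC or SEC can be dropped.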
FinalZero wrote:
2) Are there any undocumented/illegal instructions that are used with any frequency?
If you need to use a read-modify-write instruction on an otherwise unsupported addressing mode (such as the nonexistent "INC nnnn,Y", or "DEC (nn),y" instructions), and you don't care what happens to the accumulator or flags, then you can use an illegal instruction that does the read-modify-write instruction and ALU instruction at the same time.
The possible illegal instructions are these combinations: ORA+ASL, AND+ROL, EOR+LSR, ADC+ROR, CMP+DEC, and SBC+INC. These carry out the read-modify-write instruction to memory, then do the ALU instruction on register A and flags. So if you just wanted the Read-Modify-Write part done, and don't care about register A, they are possibly useful.
The CMP+DEC instruction is probably the best illegal instruction, since it doesn't affect the contents of A. But you have incorrect flags afterwards, since it also did a CMP as part of executing the instruction. You don't get the nice "my value in memory went down to zero so set the zero flag" part.
There are other illegal instructions too, but Martin Korth gave a bunch of warnings about how they didn't work consistently on his 6502-based Commodore 64/128 computers. They might work okay on the NES. The ALU+Read-Modify-Write instructions seem stable enough, though.
FinalZero wrote:
http://www.obelisk.demon.co.uk/6502/reference.html#JMP wrote:
An original 6502 has does not correctly fetch the target address if the indirect vector falls on a page boundary (e.g. $xxFF where xx is and value from $00 to $FF). In this case fetches the LSB from $xxFF as expected but takes the MSB from $xx00. This is fixed in some later chips like the 65SC02 so for compatibility always ensure the indirect vector is not at the end of the page.
3) Is this honored in emulators today?
That just means don't use the Indirect Jump instruction "JMP (nnnn)" with a value of nnnn that ends with FF. Your target address can still be anything with no problem. And the regular jump instruction "JMP nnnn" can have any jump target with no problem.
Hamtaro126 wrote:
Isn't the only way for SNES to officially bankswitch is to use a Capcom chip? (Used in Megaman X, and other titles.)
SNES only bankswitches with certain chips, and Capcom's Cx4 is not one of them. I recall SDD-1 and SA-1 having banking capabilities. The SNES generally doesn't need banking because the 65816 has a 24-bit address space divided into 256 banks of 64K each. Typical mappings allow for up to 32 megabits (4 megabytes) of ROM, and more complicated mappings allow for about 96 megabits, which is 12 megabytes.
The really impressive part of bankswitching on the NES is doing so for the CHR/graphics. By doing so you can instantly load new animation frames for characters on screen, or animate backgrounds with different patterns instead of changing the palette to give an illusion of movement, which is sometimes done.
Quote:
That just means don't use the Indirect Jump instruction "JMP (nnnn)" with a value of nnnn that ends with FF. Your target address can still be anything with no problem. And the regular jump instruction "JMP nnnn" can have any jump target with no problem.
But who programs with the hex directly? Instead, won't one have to check the output of the assembler after every assemblation?
Quote:
or to animate backgrounds with different patterns rather than by changing the palette to give an illusion which is sometimes done.
I recall the second Double Dragon game doing that to animate conveyor belts. I don't know whether it was bank switching or not though.
Usually, JMP (xxxx) is done with an address in RAM. When you define your address, you probably have a general idea of where your variables are. Yes, it screws up if xxxx and xxxx+1 are on different pages, but I usually stick both bytes in the zeropage anyway.
I can't think of many situations where you'd want a 16-bit address sitting in ROM, and using a JMP (xxxx) to go there indirectly.
Conveyor belts are usually animated with palette animation, but I didn't play Double Dragon 2, so don't know how the game animated them.
FinalZero wrote:
But who programs with the hex directly? Instead, won't one have to check the output of the assembler after every assemblation?
Not if you use .align wisely.
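Another option in ca65, sketched here with hypothetical names, is to let the tools do the checking with `.assert`, so nobody has to read the hex output. This assumes the usual BSS and CODE segments exist in the linker config:

```asm
.segment "BSS"
jump_vector: .res 2     ; hypothetical RAM pointer used by JMP (jump_vector)

; Stop the build if the pointer's low byte sits at $xxFF,
; which is the case the indirect JMP bug mishandles.
.assert .lobyte(jump_vector) <> $FF, error, "jump_vector straddles a page boundary"

.segment "CODE"
some_routine:           ; hypothetical jump target
  rts

dispatch:
  lda #<some_routine
  sta jump_vector
  lda #>some_routine
  sta jump_vector+1
  jmp (jump_vector)     ; safe: both pointer bytes are in the same page
```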
Dwedit wrote:
Conveyor belts are usually animated with palette animation.
Unless you have MMC3 or CHR RAM and you can poke in different tiles like SMB3 does.
I was watching http://www.youtube.com/watch?v=bBE4KHKzhKc . Can someone explain what he's explaining at 9:10? I don't understand what he's saying.
He's saying some instructions are read-modify-write. Meaning that on the hardware level they read memory, then write the original value back, then actually change it and write it again. Something like Bill and Ted's adventure makes use of this when resetting the MMC1 with some odd INC opcode, which causes an edge-case behavior in the MMC1. If you don't emulate this correctly the game won't work. If you are programming for the NES this basically has no meaning to you.
He's just saying that, internally, these read-modify-write instructions behave a little differently than programmers think they do. For us programmers, it's logical to think that these instructions read the value from memory, do some sort of operation on it, and finally write the modified value back to memory, but the way the instructions were programmed in the CPU actually causes them to write the original value back to memory before writing the modified value. It was probably easier/cheaper to make the CPU that way, and since there are no disadvantages to this behavior, everything was OK.
It really doesn't make much difference for programmers, but if you look at the hardware level you can detect that the CPU is indeed doing this, and someone might be interested in using these consecutive writes for some obscure purpose. I seem to remember people here mentioning a NES game writing to mapper registers with RMW instructions (INC or DEC maybe).
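For what it's worth, here is a sketch of the plain way to reset an MMC1 next to the INC trick as I understand it. The labels are hypothetical, and the comments describe my understanding of the hardware rather than any specific game's code:

```asm
; Plain way: any write with bit 7 set to $8000-$FFFF resets the MMC1.
reset_mmc1:
  lda #$80
  sta $8000
  rts

; RMW trick: INC on a ROM byte that holds $FF.
; The CPU reads $FF, writes $FF back (bit 7 set, so the MMC1 resets),
; then writes $00 on the very next cycle, which the MMC1 ignores.
reset_mmc1_rmw:
  inc mmc1_reset_byte
  rts
mmc1_reset_byte:
  .byte $FF
```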
This is not very useful, and it's really advanced stuff for beginners, so I'm not sure if you should be bothering about things like these at this time.
A couple more questions:
1) How does the overflow flag work with ADC and SBC?
2)
http://www.obelisk.demon.co.uk/6502/reference.html#ADC wrote:
If overflow occurs the carry bit is set
What does this mean exactly? What if overflow does *not* occur, is the carry bit cleared then? or simply left untouched?
3)
Quote:
On the NES, interrupts are used to indicate the beginning of the vertical blank (the PPU can optionally fire an NMI when VBlank starts), to create raster effects (several mappers can generate IRQs when specific parts of the display are being rendered), and to indicate that a sample has finished playing (the APU can optionally generate an IRQ in this case).
So, are they always used? or only sometimes used?
4) There is no SEV instruction to complement the CLV one. What does one use instead? BIT #$40 ?
5)
Quote:
It really doesn't make much difference for programmers, but if you look at the hardware level you can detect that the CPU is indeed doing this, and someone might be interested in using these consecutive writes for some obscure purpose. I seem to remember people here mentioning a NES game writing to mapper registers with RMW instructions (INC or DEC maybe).
So this doesn't mean anything to the programmer? Instructions are still atomic?
ADC and SBC touch all flags NVZC.
Pretty much all NES games (except for a few obscure ones that almost nobody in my country probably has) use the NMI to detect the start of vertical blanking.
What would SEV be needed for?
BIT immediate does not exist on the original 6502.
FinalZero wrote:
What if overflow does *not* occur, is the carry bit cleared then? or simply left untouched?
The carry flag and the overflow flag are entirely different things. In a way, the C flag can be used to detect unsigned overflow, while the V flag is used to detect signed overflows. If an addition causes a number to go past 255 and wrap back to 0 and up, that's an unsigned overflow, and the C flag will indicate whether this happened or not. But when a number goes past 127 and wraps back to -128 (or vice versa), that's a signed overflow, indicated by the V flag.
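A few concrete additions may make the difference clearer. Put another way, V ends up set when both operands have the same sign and the result has the opposite sign:

```asm
  clc
  lda #$50              ; 80
  adc #$50              ; 80 + 80 = $A0: C = 0 (160 fits in a byte),
                        ; V = 1 ($A0 is -96 when read as signed)

  clc
  lda #$FF              ; 255 unsigned, -1 signed
  adc #$01              ; result $00: C = 1 (256 does not fit in a byte),
                        ; V = 0 (-1 + 1 = 0 is fine as signed)

  clc
  lda #$80              ; 128 unsigned, -128 signed
  adc #$80              ; result $00: C = 1 and V = 1 (both kinds of overflow)
```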
Is there an idiom to detect overflow when adding a signed number to an unsigned number, such as when adding an instantaneous velocity to a critter's position?
tepples wrote:
Is there an idiom to detect overflow when adding a signed number to an unsigned number, such as when adding an instantaneous velocity to a critter's position?
I take the unsigned number and adc the positive value of the signed number if it's positive or sbc the positive value of the signed number if it's negative. At least that's what I'm planning to do on my objects core. I think this way you can use the normal flags.
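A sketch of a close variant: instead of negating the velocity and using sbc, it adds the two's complement value directly and picks the carry test according to the sign. The names and the clamping to 0 and 255 are assumptions for illustration:

```asm
.segment "ZEROPAGE"
position: .res 1        ; hypothetical: unsigned, 0 to 255
velocity: .res 1        ; hypothetical: signed, -128 to +127

.segment "CODE"
add_velocity:
  lda velocity
  bmi @moving_back      ; negative velocity
  clc
  adc position
  bcc @store            ; no carry out: still within 0-255
  lda #255              ; went past 255: clamp to the maximum
  bne @store            ; always taken
@moving_back:
  clc
  adc position          ; adding a negative two's complement value
  bcs @store            ; carry out means the result stayed at or above 0
  lda #0                ; no carry: went below 0, clamp to the minimum
@store:
  sta position
  rts
```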
Quote:
ADC and SBC touch all flags NVZC.
I mean, what is the condition that causes the overflow to be set for those instructions?
Quote:
Pretty much all NES games (except for a few obscure ones that almost nobody in my country probably has) use the NMI to detect the start of vertical blanking.
Okay.
Quote:
What would SEV be needed for?
That is a good question.
Quote:
BIT immediate does not exist on the original 6502.
How about some place in memory then?
FinalZero wrote:
Quote:
ADC and SBC touch all flags NVZC.
I mean, what is the condition that causes the overflow to be set for those instructions?
C: Bit 8 of the sum, that is, (sum >> 8) & 1.
N: Bit 7 of the sum, that is, (sum >> 7) & 1.
Z: Bits 7-0 of the sum NOR'd together, that is, !(sum & 0xFF).
V: I'll let someone else explain, if you don't mind.
Quote:
Quote:
BIT immediate does not exist on the original 6502.
How about some place in memory then?
You can BIT any RTS to set N to 0 and V to 1.
Quote:
The carry flag and the overflow flag are entirely different things. In a way, the C flag can be used to detect unsigned overflow, while the V flag is used to detect signed overflows. If an addition causes a number to go past 255 and wrap back to 0 and up, that's an unsigned overflow, and the C flag will indicate whether this happened or not. But when a number goes past 127 and wraps back to -128 (or vice versa), that's a signed overflow, indicated by the V flag.
Yes, I know that already. By "overflow", I meant unsigned overflow. Is there a term like "carryage" or something to use instead?
So, again: If there is no *unsigned* overflow, is the carry bit cleared then? or simply left untouched?
Quote:
V: I'll let someone else explain, if you don't mind.
Thank you, though I'm still not sure how to add two signed 16-bit integers.
Quote:
You can BIT any RTS to set N to 0 and V to 1.
You mean BIT an RTS ($60) instruction? That's sort of breaking the distinction between code and data...
What does "page" mean in documentation like:
http://6502.org/tutorials/6502opcodes.html wrote:
+ add 1 cycle if page boundary crossed
Does it mean bank?
FinalZero wrote:
By "overflow", I meant unsigned overflow. Is there a term like "carryage" or something to use instead?
I don't think the name matters much as long as you understand what's going on underneath.
Quote:
So, again: If there is no *unsigned* overflow, is the carry bit cleared then? or simply left untouched?
It's cleared.
Quote:
Thank you, though I'm still not sure how to add two signed 16-bit integers.
It's exactly the same as if they were unsigned. In some cases, like when comparing multi-byte signed integers, you must treat the lower bytes as unsigned, and only the most significant byte as signed (i.e., the carry flag is used to propagate overflows up until the highest byte, and only when the last byte has been calculated you can expect something meaningful in the V flag).
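A minimal sketch of a 16-bit addition, with hypothetical little-endian variables `num1`, `num2` and `sum`:

```asm
.segment "ZEROPAGE"
num1: .res 2            ; hypothetical operands, low byte first
num2: .res 2
sum:  .res 2

.segment "CODE"
add16:
  clc                   ; no carry into the lowest byte
  lda num1
  adc num2              ; add low bytes (treated as unsigned)
  sta sum
  lda num1+1
  adc num2+1            ; add high bytes plus the carry from the low bytes
  sta sum+1
  rts                   ; on return: C = unsigned overflow, V = signed overflow
```

The same code works for signed and unsigned values; only the flag you check afterwards (C or V) differs.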
Quote:
Quote:
You can BIT any RTS to set N to 0 and V to 1.
You mean BIT an RTS ($60) instruction? That's sort of breaking the distinction between code and data...
I also have no idea what tepples meant with this comment...
Quote:
What does "page" mean in documentation like:
http://6502.org/tutorials/6502opcodes.html wrote:
+ add 1 cycle if page boundary crossed
Does it mean bank?
The CPU doesn't know anything about "banks", so no. Pages are sections of 256 bytes. $8000 is the start of one page, $8100 is the start of the following page, for example. Some instructions take longer to execute when pages are crossed because the CPU needs more time in order to update the high byte of the address as well as the low byte.
tokumaru wrote:
Quote:
Quote:
You can BIT any RTS to set N to 0 and V to 1.
You mean BIT an RTS ($60) instruction? That's sort of breaking the distinction between code and data...
I also have no idea what tepples meant with this comment...
I'm sure he meant exactly what it looks like, making up for the lack of a SEV instruction by using BIT against some value of $60 sitting somewhere in the code. RTS is $60.
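In code, the trick looks something like this (the labels are made up):

```asm
; BIT copies bit 7 of the operand into N and bit 6 into V.
; The RTS opcode is $60 = %01100000, so bit 7 is 0 and bit 6 is 1.
set_v:
  bit set_v_rts         ; V = 1, N = 0 (Z depends on A AND $60)
set_v_rts:
  rts                   ; doubles as the data byte and as the routine's return
```

After `jsr set_v`, a `bvs` will always be taken. Any other RTS in the program would do as the operand.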
Ah, I see... The way it was worded really threw me off! And I just noticed FinalZero did understand what he meant, I was the only one that was confused!
Quote:
The CPU doesn't know anything about "banks", so no. Pages are sections of 256 bytes. $8000 is the start of one page, $8100 is the start of the following page, for example. Some instructions take longer to execute when pages are crossed because the CPU needs more time in order to update the high byte of the address as well as the low byte.
Ah, okay. So then, if it crosses 2 pages, does that mean it takes 2 more cycles? Are there any time-critical procedures that must be explicitly designed not to cross page boundaries?
It can't cross more than 1 page. "Crossing a page" means when you add an 8-bit value to a 16-bit value, it needs to change the high byte of the 16-bit value. The only time you see crossing a page penalties is in the instructions that add X or Y to an address, or the branch instructions.
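For example, with indexed absolute addressing:

```asm
  ldx #$10
  lda $0200,x           ; $0200 + $10 = $0210, same page: 4 cycles
  lda $02F0,x           ; $02F0 + $10 = $0300, high byte changes: 5 cycles
  sta $02F0,x           ; indexed stores always take 5 cycles, crossing or not
```

This matters mostly in cycle-counted code such as raster effects, where tables and timed loops are often placed so that they never straddle a page.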
Dwedit wrote:
It can't cross more than 1 page. "Crossing a page" means when you add an 8-bit value to a 16-bit value, it needs to change the high byte of the 16-bit value. The only time you see crossing a page penalties is in the instructions that add X or Y to an address, or the branch instructions.
Okay. I see.
Questions:
1) When linking, the iNES header goes at $0000. To play the game on an emulator, does that mean that all the other segments must be offset by $10? That is, the zero page starts at $10 and ends at $110 instead of its normal range?
2) On starting up, what's the status of all the flags? Is it common practice to simply clear them? What about the PC/SP/AC/X/Y ?
3) Somewhat unrelated, but I'll ask anyways. I have the opportunity to get my hands on a real NES, but it doesn't work. Iirc, the power cable is broken/touchy (It needed to be bent in just the right position for it to work.), and the screen will just blink/oscillate when I did manage to get the power to work. What kind of thing would cause the second problem, and how much would replacing the first cost?
1) Only ROM ends up offset after the header, PRG, and CHR are concatenated. For example, $C000-$FFFF of an NROM-128 becomes $0010-$400F in the iNES file, or $8000-$FFFF of an NROM-256 or CNROM becomes $0010-$800F in the iNES file.
BSS-type segments such as zero page do not move because they're not stored in the ROM at all.
2) Do not depend on the flags at the time of reset. You'll usually end up clearing X, S, and half of P in the first six instructions anyway. All you can be sure of is that PC points where $FFFC-$FFFD says.
Code:
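reset:
  sei         ; ignore IRQs
  ldx #$FF
  txs         ; stack pointer = $FF (top of the stack page)
  inx         ; X = 0
  stx $2000   ; disable NMI
  stx $2001   ; disable rendering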
3) The official NES adapter outputs 9 V AC, but it can run just as easily on the -9 V DC that a power supply for the original model Sega Genesis makes. Or you could buy a universal adapter and configure it for -9 V DC.
Quote:
2) Do not depend on the flags at the time of reset. You'll usually end up clearing X, S, and half of P in the first six instructions anyway. All you can be sure of is that PC points where $FFFC-$FFFD says.
What do you mean "where $FFFC-$FFFD says."? How do I set where? Why are you setting the interrupt flag in your code?
Quote:
1) Only ROM ends up offset after the header, PRG, and CHR are concatenated. For example, $C000-$FFFF of an NROM-128 becomes $0010-$400F in the iNES file, or $8000-$FFFF of an NROM-256 or CNROM becomes $0010-$800F in the iNES file. BSS-type segments such as zero page do not move because they're not stored in the ROM at all.
Okay, so that's the overlay thing again. I don't understand which segments overlay what. Is it by bank/memory-area? Basically, I still don't understand how my linker file is supposed to be set up.
FinalZero wrote:
Quote:
2) Do not depend on the flags at the time of reset. You'll usually end up clearing X, S, and half of P in the first six instructions anyway. All you can be sure of is that PC points where $FFFC-$FFFD says.
What do you mean "where $FFFC-$FFFD says."? How do I set where?
That's one of the "vectors". How you set up the vectors depends on your assembler. I remember how only for ca65, not asm6 or nesasm.
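In ca65 it looks roughly like this, assuming the linker config has a segment named VECTORS placed at $FFFA (the handler names are placeholders):

```asm
.segment "CODE"
nmi_handler:
  rti
irq_handler:
  rti
reset_handler:
  sei
  ; ... the rest of the init code goes here ...
forever:
  jmp forever

.segment "VECTORS"
  .addr nmi_handler     ; $FFFA-$FFFB: NMI vector
  .addr reset_handler   ; $FFFC-$FFFD: reset vector, where execution starts
  .addr irq_handler     ; $FFFE-$FFFF: IRQ/BRK vector
```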
Quote:
Why are you setting the interrupt flag in your code?
To prevent IRQs from happening during the init code. True, the CPU already sets the interrupt priority bit to 1 (NMIs only) when coming out of reset, but having an explicit SEI makes the game more compatible with badly coded multicart menus that start the game with a JMP ($FFFC) but put the interrupt priority at 0 (CLI).
Quote:
Okay, so that's the overlay thing again. I don't understand which segments overlay what. Is it by bank/memory-area? Basically, I still don't understand how my linker file is supposed to be set up.
Which assembler are you using again? If ca65, I have a project template that sets most of this up for you.
Quote:
Which assembler are you using again? If ca65, I have a project template that sets most of this up for you.
ca65, yes. Also, I'm trying to use MMC1.
Quote:
To prevent IRQs from happening during the init code. True, the CPU already sets the interrupt priority bit to 1 (NMIs only) when coming out of reset, but having an explicit SEI makes the game more compatible with badly coded multicart menus that start the game with a JMP ($FFFC) but put the interrupt priority at 0 (CLI).
But why would one expect an interrupt? Anyways, do you clear the flag at the end of the init code then?
* * *
Here's my linker code at the moment.
Code:
MEMORY {
# Header
HEADER: start = $0000, size = $0010;
# Ram
# ZEROPAGE: start = $0000, size = $0100;
# STACK: start = $0100, size = $0100;
RAM: start = $0200, size = $0600, fill = yes;
# Rom
LOWER_PROG_ROM: start = $8000, size = $2000, fill = yes;
LOWER_CHAR_ROM: start = $A000, size = $1000, fill = yes;
UPPER_CHAR_ROM: start = $B000, size = $1000, fill = yes;
UPPER_PROG_ROM: start = $C000, size = $4000, fill = yes;
}
#-------------------------------------------------------------------------------
SEGMENTS {
# Header
HEADER: load = HEADER, type = ro;
# Ram
# ZEROPAGE: load = ZEROPAGE, type = zp;
# STACK: load = STACK, type = rw;
RAM: load = RAM, type = rw;
# Rom
CODE: load = LOWER_PROG_ROM, type = ro;
FONT: load = LOWER_CHAR_ROM, type = ro;
TILES: load = UPPER_CHAR_ROM, type = ro;
MORE_CODE: load = UPPER_PROG_ROM, type = ro;
VECTORS: load = UPPER_PROG_ROM, type = ro, start = $FFFA;
}
#-------------------------------------------------------------------------------
Do I need to include the zeropage and stack segments even if nothing gets initialized there? Also, when I open it in FCEUX, I don't see any code at $8000 in the debugger, even though there should be.
And my header:
Code:
.byte $4E, $45, $53, $1A ; "NES", eof
.byte 1 ; Number of 16 kB prog ROM Segments
.byte 1 ; Number of 8 kB char ROM Segments
.byte %00010001 ; Byte 6 (Mirroring (0xx0: Horizontal, 0xx1: Vertical), Ignored if the mapper controls mirroring.)
.byte %00000000 ; Byte 7
.byte 0 ; Number of 8 kB prog RAM Segments
.byte 0 ; Byte 9 (0: NTSC, 1: PAL)
.byte 0 ; Byte 10 (Sporadically Supported)
.byte 0, 0, 0, 0, 0 ; Filler
;-------------------------------------------------------------------------------
Does MMC1 even use 8kb PROG RAM?
FinalZero wrote:
ca65, yes. Also, I'm trying to use MMC1.
If you have only 16 or 32 KiB of PRG ROM and only 8 KiB of CHR, why are you using MMC1 and not just NROM? Is it just for the switchable nametable mirroring?
Quote:
But why would one expect an interrupt?
Sources of IRQs on the NES other than mappers include the APU frame IRQ and the DMC completion IRQ. A few games on simple mappers without their own scanline or CPU cycle counter circuits (ab)use the DMC completion IRQ as a crude scanline counter.
Quote:
Anyways, do you clear the flag at the end of the init code then?
A lot of games appear to just leave the ignore IRQs flag turned on (SEI) because they never use any IRQs.
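A minimal sketch of shutting off those IRQ sources at the top of the init code, assuming the usual register addresses ($4017 for the APU frame counter, $4010 for the DMC). The label and the names APU_FRAME and DMC_FREQ are made up for illustration:
Code:
APU_FRAME = $4017      ; hypothetical name for the APU frame counter register
DMC_FREQ  = $4010      ; hypothetical name for the DMC flags/rate register

reset_sketch:          ; hypothetical label; the reset vector would point here
  sei                  ; ignore IRQs, even if a multicart menu left them enabled
  ldx #$40
  stx APU_FRAME        ; bit 6 set: inhibit the APU frame IRQ
  ldx #$00
  stx DMC_FREQ         ; bit 7 clear: disable the DMC completion IRQ
  ; ... the rest of the init code continues here ...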
Quote:
HEADER: start = $0000, size = $0010;
For this, you may want to specify "fill=yes, fillval=$00", which means you won't need the "; Filler" line at the end of the header.
Quote:
Code:
LOWER_PROG_ROM: start = $8000, size = $2000, fill = yes;
LOWER_CHAR_ROM: start = $A000, size = $1000, fill = yes;
UPPER_CHAR_ROM: start = $B000, size = $1000, fill = yes;
UPPER_PROG_ROM: start = $C000, size = $4000, fill = yes;
[...]
I don't see any code at $8000 in the debugger
That's because with one 16 KiB bank, you need the following in order: UPPER_PROG_ROM, then LOWER_CHAR_ROM, then UPPER_CHAR_ROM. The only time CHR ROM ever comes before PRG ROM is in certain CHR RAM setups, and there is no LOWER_PROG_ROM in a 16 KiB PRG ROM. And your PRG ROM is in fact 16 KiB, as seen next:
Quote:
Code:
.byte 1 ; Number of 16 kB prog ROM Segments
.byte 1 ; Number of 8 kB char ROM Segments
Some MMC1 boards have PRG RAM; others don't. In fact, the possibility of PRG RAM doesn't depend much on the mapper; even one board in the NROM class (mapper 0) has PRG RAM.
If you don't want to spend a lot of time fixing your linker script, I recommend just using the NROM project template that I linked, which has a known working linker script.
Quote:
If you don't want to spend a lot of time fixing your linker script, I recommend just using the NROM project template that I linked, which has a known working linker script.
I want to know why though. I don't want to just be cargo-cult programming.
Quote:
And your PRG ROM is in fact 16 KiB, as seen next:
Can't I just switch it to 2 then? (That doesn't seem to do anything. The debugger still shows nothing.)
Quote:
Some MMC1 boards have PRG RAM; others don't. In fact, the possibility of PRG RAM doesn't depend much on the mapper; even one board in the NROM class (mapper 0) has PRG RAM.
What is PRG RAM used for anyways? It's not any faster than PRG ROM, is it?
FinalZero wrote:
What is PRG RAM used for anyways?
It's used for whatever programmers dissatisfied with the stock 2KB of RAM want. Many games put things like decompressed level maps there. It's just RAM, usually 8KB of it, to complement the 2KB that are built in the console.
Quote:
It's not any faster than PRG ROM, is it?
No.
PRG-RAM is the extra RAM at $6000-$7FFF. It's in the cart and is sometimes battery-backed. SMB3 uses it to decompress levels into. Metroid and Kid Icarus also have it, although none of them back it up; it's just 8KB more RAM for stuff. Other games like Zelda use it partly for extra space, but also to save information on the cart so you can continue later, which is very important in RPGs and is why most of them have PRG-RAM with a battery (Zelda 1+2, Crystalis, Startropics, etc.). MMC5 even has bankswitchable PRG-RAM, so you can have 16KB, maybe more, although nobody used that much back in the day, so more isn't supported by any mapper but that one.
Quote:
If you have only 16 or 32 KiB of PRG ROM and only 8 KiB of CHR, why are you using MMC1 and not just NROM? Is it just for the switchable nametable mirroring?
I want to practice/try mirroring, yes.
Quote:
It's used for whatever programmers dissatisfied with the stock 2KB of RAM want. Many games put things like decompressed level maps there. It's just RAM, usually 8KB of it, to complement the 2KB that are built in the console.
Okay.
Quote:
Other games like Zelda use it partly for extra space, but also to save information on the cart so you can continue later, which is very important in RPGs and is why most of them have PRG-RAM with a battery (Zelda 1+2, Crystalis, Startropics, etc.). MMC5 even has bankswitchable PRG-RAM, so you can have 16KB, maybe more, although nobody used that much back in the day, so more isn't supported by any mapper but that one.
It's only that ram that's battery-backed-up though, right? What else must one do to use battery-backed-up saves?
Yeah, the 8KB is the only stuff that's backed up. To use the backup function, program your game to boot up without clearing that memory. Format it if need be, put something in there so you know it's been formatted, and then treat the rest of the data in that RAM as the saved state of your game and have your game load from it.
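As a rough sketch of that boot-time check, assuming PRG RAM is mapped and enabled at $6000 and the cart actually keeps it powered by a battery. The signature bytes, the addresses inside the save area, and all labels are made up for illustration:
Code:
SRAM_SIG  = $6000          ; hypothetical: first 4 bytes of PRG RAM hold a signature
SRAM_DATA = $6004          ; hypothetical: saved game state starts after it

.proc check_save
  ldx #3
compare:
  lda SRAM_SIG,x
  cmp signature,x
  bne format               ; any mismatch means the RAM was never formatted
  dex
  bpl compare
  rts                      ; signature intact: the saved data can be loaded

format:
  lda #$00
  tax
clear:
  sta SRAM_DATA,x          ; wipe the first 256 bytes of save data (sketch only)
  inx
  bne clear
  ldx #3
write_sig:
  lda signature,x
  sta SRAM_SIG,x           ; written last, so a half-formatted save is never trusted
  dex
  bpl write_sig
  rts
.endproc

signature:
  .byte "SAVE"             ; hypothetical 4-byte marker stored in ROM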
Quote:
Yeah, the 8KB is the only stuff that's backed up. To use the backup function, program your game to boot up without clearing that memory. Format it if need be, put something in there so you know it's been formatted, and then treat the rest of the data in that RAM as the saved state of your game and have your game load from it.
Okay.
Also, can someone explain exactly what's wrong with my linker file?
The first thing that's wrong is that the CHR ROM data isn't last in the file, as it MUST be. Can you paste your revised linker script that fixes this problem?
Quote:
The first thing that's wrong is that the CHR ROM data isn't last in the file, as it MUST be. Can you paste your revised linker script that fixes this problem?
I am confused. Why must it be last? Looking at http://wiki.nesdev.com/w/index.php/MMC1 , it isn't listed last there.
Edit: Okay, wait. I'm not even sure why I put the char rom at the locations I did. It doesn't match the documentation. Where does the char rom go in MMC1?
The relevant wiki article is iNES. In all mappers that use CHR ROM, the CHR data comes last in the file. First the header, then the PRG ROM, then the CHR ROM.
Linker File:
Code:
MEMORY {
# Header
HEADER: start = $0000, size = $0010, fill = yes, fillval = $00;
# General Ram
# ZEROPAGE: start = $0000, size = $0100;
# STACK: start = $0100, size = $0100;
RAM: start = $0200, size = $0600, fill = yes;
EXTRA_RAM: start = $6000, size = $2000, fill = yes;
# Prog Rom
LOWER_PROG_ROM: start = $8000, size = $4000, fill = yes;
UPPER_PROG_ROM: start = $C000, size = $4000, fill = yes;
# Char Rom
LOWER_CHAR_ROM: start = $0000, size = $1000, fill = yes;
UPPER_CHAR_ROM: start = $1000, size = $1000, fill = yes;
}
#-------------------------------------------------------------------------------
SEGMENTS {
# Header
HEADER: load = HEADER, type = ro;
# Ram
# ZEROPAGE: load = ZEROPAGE, type = zp;
# STACK: load = STACK, type = rw;
RAM: load = RAM, type = ro;
DATA: load = RAM, type = rw;
RODATA: load = RAM, type = ro;
EXTRA_RAM: load = EXTRA_RAM, type = ro;
# Prog Rom
CODE: load = LOWER_PROG_ROM, type = ro;
MORE_CODE: load = UPPER_PROG_ROM, type = ro;
VECTORS: load = UPPER_PROG_ROM, type = ro, start = $FFFA;
# Char Rom
FONT: load = LOWER_CHAR_ROM, type = ro;
TILES: load = UPPER_CHAR_ROM, type = ro;
}
#-------------------------------------------------------------------------------
Header File:
Code:
.segment "HEADER"
;-------------------------------------------------------------------------------
.byte $4E, $45, $53, $1A ; "NES", eof
.byte 2 ; Number of 16 kB prog ROM Segments
.byte 1 ; Number of 8 kB char ROM Segments
.byte %00010001 ; Byte 6 (Mirroring (0xx0: H, 0xx1: V))
.byte %00000000 ; Byte 7
.byte 0 ; Number of 8 kB prog RAM Segments
;-------------------------------------------------------------------------------
It assembled, but doesn't show anything in the debugger.
I see your LOWER_PROG_ROM and UPPER_PROG_ROM sum to 32 KiB. Are you making sure to specify two 16k pages in the iNES header? And are you making sure to replicate the vectors and the beginning of the init code in all 16k banks (the Barbie stub) so that no matter how the MMC1's registers are set at power on, the program still starts? If all this is Greek to you, I would recommend removing LOWER_PROG_ROM entirely from your linker script and putting all your code and data in UPPER_PROG_ROM for now.
In MEMORY, do not use fill=yes for any memory area that is RAM. This means remove it from RAM and EXTRA_RAM. If you use fill=yes, the linker will try to (uselessly) put initial values for the memory area into the ROM file. Only memory areas that represent ROM should have fill=yes.
In SEGMENTS, use type=bss instead of type=rw for any RAM segment. For segments of type bss, the linker won't try to (uselessly) put initial values for the segment into the ROM file.
Quote:
I see your LOWER_PROG_ROM and UPPER_PROG_ROM sum to 32 KiB. Are you making sure to specify two 16k pages in the iNES header?
Yes.
Quote:
And are you making sure to replicate the vectors and the beginning of the init code in all 16k banks (the Barbie stub) so that no matter how the MMC1's registers are set at power on, the program still starts?
I didn't know that I needed to.
Quote:
If all this is Greek to you, I would recommend removing LOWER_PROG_ROM entirely from your linker script and putting all your code and data in UPPER_PROG_ROM for now.
Okay, I'll try that first.
Quote:
In MEMORY, do not use fill=yes for any memory area that is RAM. This means remove it from RAM and EXTRA_RAM. If you use fill=yes, the linker will try to (uselessly) put initial values for the memory area into the ROM file. Only memory areas that represent ROM should have fill=yes.
Okay.
Quote:
In SEGMENTS, use type=bss instead of type=rw for any RAM segment. For segments of type bss, the linker won't try to (uselessly) put initial values for the segment into the ROM file.
Why is it useless though? Isn't data stored there? I don't see why it should be any different from the CODE segment.
And... I see my code at $8000! There's stuff at $C000 too, but it's nothing I wrote, and appears to be random garbage. However, running the debugger, it seems to be alternating between $0002 and $FFFF. I don't know why.
Edit: I forgot to change the number of PRG banks in the header back to 1. Doing that, it runs code that I wrote. I think I made an error in the code though, because it isn't doing the addition that I wanted it to do.
Edit Edit: It seems to be interrupting every so often, and then returning to the wrong address. Ideas as to why?
FinalZero wrote:
Quote:
In SEGMENTS, use type=bss instead of type=rw for any RAM segment. For segments of type bss, the linker won't try to (uselessly) put initial values for the segment into the ROM file.
Why is it useless though? Isn't data stored there? I don't see why it should be any different from the CODE segment.
Data is stored in RAM, but the only reason you'd ever want to also store it in ROM is if you plan to copy it to RAM at the start of the program, such as a small piece of code related to bank switching. Here are the traditional definitions of the segments:
- CODE: stored in ROM, used in ROM
- RODATA: stored in ROM, used in ROM. Primary difference from CODE is that some architectures such as 65816 and 8086 allow for a separate program bank and data bank.
- DATA: stored in ROM, copied to RAM in init code (which isn't written automatically for you), used in RAM
- BSS: stored nowhere, cleared to zero* in init code, used in RAM
- ZEROPAGE: like BSS but in $0000-$00FF
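Since nothing writes that init code for you, here is a minimal sketch of the BSS and ZEROPAGE half of it: clearing all 2 KiB of internal RAM to zero near the start of the reset handler. It assumes it runs before any JSR or push, because it also wipes the stack page, and the label name is made up. Copying a DATA segment would follow the same pattern, reading from the ROM copy and writing to the RAM address.
Code:
clear_ram:             ; hypothetical label; this falls inside the reset code
  lda #$00
  tax                  ; X = 0
@loop:
  sta $0000,x          ; zero page (ZEROPAGE segment)
  sta $0100,x          ; stack page, safe only before the stack is used
  sta $0200,x
  sta $0300,x
  sta $0400,x
  sta $0500,x
  sta $0600,x
  sta $0700,x
  inx
  bne @loop            ; X wraps back to 0 after 256 iterations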
Quote:
However, running the debugger, it seems to be alternating between $0002 and $FFFF. I don't know why.
What values end up in $FFFA through $FFFF? There should be three addresses (in the typical reverse byte order of the 6502); do they point anywhere familiar?
Quote:
Edit: I forgot to change the number of PRG banks in the header back to 1. Doing that, it runs code that I wrote. I think I made an error in the code though, because it isn't doing the addition that I wanted it to do.
Edit Edit: It seems to be interrupting every so often, and then returning to the wrong address. Ideas as to why?
Where do $FFFA-$FFFB and $FFFE-$FFFF point? Are you pushing or pulling anything in your NMI and IRQ handlers?
* Yes, tokumaru, ca65 was originally intended to support a C compiler, and the C language does specify clearing uninitialized variables in the init code. In pure assembly language programs, one can use an alternate convention in which each module is responsible for clearing its own memory.
Quote:
Where do $FFFA-$FFFB and $FFFE-$FFFF point? Are you pushing or pulling anything in your NMI and IRQ handlers?
Quote:
What values end up in $FFFA through $FFFF? There should be three addresses (in the typical reverse byte order of the 6502); do they point anywhere familiar?
The vectors point to an RTI, the init code, and an RTI, respectively, and the routines are indeed at those locations. No, I'm not pushing or pulling anything; the NMI and IRQ handlers consist only of an RTI.
Code:
.segment "VECTORS"
;-------------------------------------------------------------------------------
.addr nmi, reset, irq
;-------------------------------------------------------------------------------
.code
;-------------------------------------------------------------------------------
.proc nmi
rti
.endproc
;-------------------------------------------------------------------------------
; Inits everything.
.proc reset
; Clears the flags.
clc
cli
clv
; Sets the stack pointer.
ldx #$FF
txs
jmp main
.endproc
;-------------------------------------------------------------------------------
.proc irq
rti
.endproc
;-------------------------------------------------------------------------------
Having IRQ pointing at an RTI will generally work. But in the NMI handler, you'll usually want to set a variable to notify the main program that a vertical blank has begun. So that I don't have to push and pull A, I just have it do inc nmis.
tepples wrote:
Having IRQ pointing at an RTI will generally work.
Depends on the definition of "work". If the IRQ doesn't get acknowledged automatically somehow (most of them don't), it'll result in an infinite loop. In a debug build the best way to handle the IRQ might be to display a fatal error message. However, if an unexpected IRQ happens in a program, you've got worse problems on your hands...
All I'm saying is people should understand that the RTI in the IRQ routine doesn't really accomplish that much more than having the IRQ vector point to $0000 (for example).
Quote:
Having IRQ pointing at an RTI will generally work. But in the NMI handler, you'll usually want to set a variable to notify the main program that a vertical blank has begun. So that I don't have to push and pull A, I just have it do inc nmis.
What and where is "nmis"? Does it simply count how many NMIs there's been? Is that all that needs fixed?
FinalZero wrote:
What and where is "nmis"?
It's a variable anywhere in RAM.
Quote:
Does it simply count how many NMIs there's been?
Yes. If you keep NMIs always enabled, you can wait for the next VBlank with code like this:
Code:
lda nmis
WaitVBlank:
cmp nmis
beq WaitVBlank
The loop basically waits for the variable to change. This works well in most cases, it's only bad if you have raster effects near the top of the screen (like status bars) and you can't guarantee that frame calculations will never spill into the next frame.
Quote:
Is that all that needs fixed?
Can't tell without seeing what else your program does.
Quote:
The loop basically waits for the variable to change. This works well in most cases, it's only bad if you have raster effects near the top of the screen (like status bars) and you can't guarantee that frame calculations will never spill into the next frame.
But where should this loop be located? in the reset code?
FinalZero wrote:
But where should this loop be located?
Wherever a wait for VBlank is necessary. The typical structure of a game loop is: 1. update the game world using controller input and A.I.; 2. wait for VBlank; 3. update VRAM using data computed last frame; 4. update the audio; 5. go back to 1;
The wait for VBlank not only makes sure that the VRAM updates will take place during VBlank, but it also syncs the game frames to the refresh rate of the console, effectively making the game run at a steady pace.
Quote:
in the reset code?
No, this has nothing to do with the reset.
Quote:
Wherever a wait for VBlank is necessary. The typical structure of a game loop is: 1. update the game world using controller input and A.I.; 2. wait for VBlank; 3. update VRAM using data computed last frame; 4. update the audio; 5. go back to 1;
How do I know when "wherever" is (for the vblank)?
Quote:
The wait for VBlank not only makes sure that the VRAM updates will take place during VBlank, but it also syncs the game frames to the refresh rate of the console, effectively making the game run at a steady pace.
So, this is my next step: Trying to display something on screen, preferably the numbers that I've successfully calculated.
I set the disable interrupt flag for the moment so I could test my math routines. They work! I successfully computed "14 9 7 + -". So, I want to ask, what kind of structuring do people usually use for procedures and macros? The couple routines I made so far are stack-based on the zero page. Is that feasible for a full program? What do NES games usually do? Does it depend whether it's a slow-paced RPG or a quick-paced action/adventure game?
Quote:
CODE: stored in ROM, used in ROM
RODATA: stored in ROM, used in ROM. Primary difference from CODE is that some architectures such as 65816 and 8086 allow for a separate program bank and data bank.
DATA: stored in ROM, copied to RAM in init code (which isn't written automatically for you), used in RAM
BSS: stored nowhere, cleared to zero* in init code, used in RAM
ZEROPAGE: like BSS but in $0000-$00FF
I see now that my understanding of the bss segment was wrong, though it makes sense now; It's simpler to simply remember how many null bytes to reserve in RAM instead of actually storing all of them in ROM.
Also, the zeropage can't have initial values? It's limited like the bss segment?
* * *
A couple of other thoughts:
1) FCEUX's debugger does something it shouldn't. When scrolling up, it scrolls up by byte, instead of instruction. Bytes that are part of multi-byte instructions are misrepresented as their own instructions until one scrolls up far enough so the program realizes that it's part of another instruction.
2) How/where are palettes stored? They're not stored in .chr files, are they? Is there an editor for them (palettes)?
3) I had something else that I was going to say, but I can't remember now. =/
* * *
One last thing: I want to thank everybody who's been so patient and helpful in answering my questions. You've been indispensable so far.
FinalZero wrote:
Quote:
Wherever a wait for VBlank is necessary. The typical structure of a game loop is: 1. update the game world using controller input and A.I.; 2. wait for VBlank; 3. update VRAM using data computed last frame; 4. update the audio; 5. go back to 1;
How do I know when "wherever" is (for the vblank)?
After the reset code finishes and the background is copied into the nametables, you start the game loop.
Quote:
Also, the zeropage can't have initial values? It's limited like the bss segment?
Correct. If you want initial values there you'll have to copy them yourself.
Quote:
1) FCEUX's debugger does something it shouldn't. When scrolling up, it scrolls up by byte, instead of instruction. Bytes that are part of multi-byte instructions are misrepresented as their own instructions until one scrolls up far enough so the program realizes that it's part of another instruction.
How would you predict how many bytes to scroll up? 6502 bytecode makes no self-synchronizing guarantee, unlike code in popular RISC architectures such as MIPS and ARM where every instruction is 16 or 32 bits long.
Quote:
2) How/where are palettes stored? They're not stored in .chr files, are they? Is there an editor for them (palettes)?
A palette is just a list of 32 bytes, ordinarily stored in PRG ROM and copied to $3F00 during vertical blanking. Some NES-specific tile editors have NES-specific palette editors, but some other tile editors have palette editors more suitable for an RGB system (PC, SNES, GBA) than for a hue-lightness system like that of the NES.
FinalZero wrote:
How do I know when "wherever" is (for the vblank)?
There are many things you can't (or shouldn't) do outside of VBlank, such as turning rendering on and off, writing to VRAM, setting the scroll, and so on. Before doing those things, you should wait for VBlank. In the game loop, you typically wait for VBlank once and then perform all the tasks you couldn't before.
Quote:
So, I want to ask, what kind of structuring do people usually use for procedures and macros? The couple routines I made so far are stack-based on the zero page. Is that feasible for a full program? What do NES games usually do? Does it depend whether it's a slow-paced RPG or a quick-paced action/adventure game?
I think you'll have to find your own style. Everyone's coding style is different, and although we could all tell you how we do things so that you can pick the way you prefer, I believe this would confuse you more than it would help! =)
Quote:
When scrolling up, it scrolls up by byte, instead of instruction.
It has no way to know the exact instructions you used in the source code, since it can only see the resulting binary, which can be interpreted in many ways. Scrolling by byte allows you to align the disassembly correctly, which is better than if it tried to guess the instructions and guessed wrong. For example, the command LDA $03A9 assembles to $AD $A9 $03, and LDA #$03 assembles to $A9 $03. When scrolling up, should FCEUX scroll 2 bytes (and see LDA #$03) or 3 bytes (and see LDA $03A9)? It doesn't know, both are valid instructions.
You could argue that it could search for the longest instruction in order to avoid interpreting operands as opcodes, but even then there will be problems. Say that I have a table whose last byte is $AD, followed by the LDA #$03 instruction. The disassembler will think it's a LDA $03A9 instruction, which is not the case.
So trust me, it's better that it scrolls 1 byte at a time. Is it annoying that we see a bunch of garbage instructions before we see what we actually coded? Yes, but it would be far more annoying if the disassembler guessed wrong and we couldn't see what we actually coded at all.
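To see the ambiguity concretely, here is a small sketch of the table case described above (the label and the first two table bytes are made up):
Code:
table:
  .byte $10, $20, $AD  ; data; the last byte happens to be the opcode for LDA absolute
  lda #$03             ; code; assembles to $A9 $03
; Bytes in ROM: 10 20 AD A9 03
; Disassembling from the $AD byte gives LDA $03A9.
; Disassembling one byte later gives LDA #$03.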
Quote:
How/where are palettes stored? They're not stored in .chr files, are they? Is there an editor for them (palettes)?
Depends on the programmer. Simple programs usually just have a list of 32 bytes representing all the colors, but more complex games might have smaller palettes that can be arranged differently.
Again, this is dependent on the coding style. The only important thing is that the colors are written to PPU address $3F00 during VBlank. Where the colors come from is completely dependent on game's architecture.
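For the simple 32-byte case, a minimal sketch of the copy could look like this. It assumes it is called during VBlank (or with rendering off) and that PPUCTRL is set for an address increment of 1. The register names and the palette values are placeholders, not taken from any particular game:
Code:
PPUSTATUS = $2002
PPUADDR   = $2006
PPUDATA   = $2007

.proc load_palette
  bit PPUSTATUS        ; reading $2002 resets the PPUADDR high/low write toggle
  lda #$3F
  sta PPUADDR          ; high byte of $3F00
  lda #$00
  sta PPUADDR          ; low byte of $3F00
  tax                  ; X = 0
loop:
  lda palette,x
  sta PPUDATA          ; the PPU address advances after each write
  inx
  cpx #32
  bne loop
  rts
.endproc

palette:               ; placeholder colors: 8 sub-palettes of 4 bytes each
  .byte $0F,$00,$10,$30, $0F,$00,$10,$30, $0F,$00,$10,$30, $0F,$00,$10,$30
  .byte $0F,$00,$10,$30, $0F,$00,$10,$30, $0F,$00,$10,$30, $0F,$00,$10,$30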
tokumaru wrote:
FinalZero wrote:
When scrolling up, it scrolls up by byte, instead of instruction.
It has no way to know the exact instructions you used in the source code, since it can only see the resulting binary
Can't a debugging emulator see the starting addresses of previously executed instructions (from the Code/Data Logger) and use those as alignment anchors?
I'm pretty sure that there are workarounds (by "workarounds" I mean imperfect solutions that will still break under certain conditions), but is it really worth the trouble?
Quote:
After the reset code finishes and the background is copied into the nametables, you start the game loop.
Okay. Also, a question: So, if I permanently disable interrupts by setting the flag, I won't be able to update the screen, right?
Quote:
How would you predict how many bytes to scroll up? 6502 bytecode makes no self-synchronizing guarantee, unlike code in popular RISC architectures such as MIPS and ARM where every instruction is 16 or 32 bits long.
Quote:
It has no way to know the exact instructions you used in the source code, since it can only see the resulting binary, which can be interpreted in many ways. Scrolling by byte allows you to align the disassembly correctly, which is better than if it tried to guess the instructions and guessed wrong. For example, the command LDA $03A9 assembles to $AD $A9 $03, and LDA #$03 assembles to $A9 $03. When scrolling up, should FCEUX scroll 2 bytes (and see LDA #$03) or 3 bytes (and see LDA $03A9)? It doesn't know, both are valid instructions.
You could argue that it could search for the longest instruction in order to avoid interpreting operands as opcodes, but even then there will be problems. Say that I have a table whose last byte is $AD, followed by the LDA #$03 instruction. The disassembler will think it's a LDA $03A9 instruction, which is not the case.
So trust me, it's better that it scrolls 1 byte at a time. Is it annoying that we see a bunch of garbage instructions before we see what we actually coded? Yes, but it would be far more annoying if the disassembler guessed wrong and we couldn't see what we actually coded at all.
Ack, I suppose that you're right. There's no way for the program to know whether something is code or data.
Quote:
I think you'll have to find your own style. Everyone's coding style is different, and although we could all tell you how we do things so that you can pick the way you prefer, I believe this would confuse you more than it would help! =)
I'm not sure that a description of the structure would confuse me. I'm quite a bit more experienced than some kiddie programmer.
Quote:
A palette is just a list of 32 bytes, ordinarily stored in PRG ROM and copied to $3F00 during vertical blanking. Some NES-specific tile editors have NES-specific palette editors, but some other tile editors have palette editors more suitable for an RGB system (PC, SNES, GBA) than for a hue-lightness system like that of the NES.
What tile editor do you recommend? Are there any good NES-specific ones? I've used TileMolester and yychr, but each lacks capabilities that the other has, so neither is The One Tile Editor to Rule Them All.
I really don't want to write my own, yet the limited choices and capabilities are annoying...
FinalZero wrote:
Quote:
After the reset code finishes and the background is copied into the nametables, you start the game loop.
Okay. Also, a question: So, if I permanently disable interrupts by setting the flag, I won't be able to update the screen, right?
Not exactly. The vertical blank interrupt is an NMI, and NMI comes through whether the interrupt priority level is 0 or 1. SEI blocks only IRQ. You have to block NMI at the source, and bit 7 of PPUCTRL ($2000) controls the source. As long as bit 7 of PPUCTRL is turned on (LDA #$80 STA PPUCTRL), the following will work:
Code:
; $FFFA points here
.proc nmi
inc nmis
rti
.endproc
; your game loop calls this just before shoving stuff into VRAM
.proc wait4vbl
lda nmis
loop:
cmp nmis
beq loop
rts
.endproc
This .proc might confuse you. But it's just a way to hide the labels defined inside the .proc from view outside the .proc, so that multiple subroutines can share the same names for internal labels. For example, the symbol inside wait4vbl ends up called wait4vbl::loop, and other subroutines won't be defining symbols that start with "wait4vbl::".
Quote:
I'm quite a bit more experienced than some kiddie programmer.
In what other language or for what other platform have you made a video game? Perhaps I could help explain things with analogies to that platform.
Quote:
What tile editor do you recommend? Are there any good NES-specific ones? I've used TileMolester and yychr, but each lacks capabilities that the other has, so neither is The One Tile Editor to Rule Them All.
I just use GIMP to make my tile sheets and then run a Python program to convert .png to .chr.
Quote:
This .proc might confuse you. But it's just a way to hide the labels defined inside the .proc from view outside the .proc, so that multiple subroutines can share the same names for internal labels. For example, the symbol inside wait4vbl ends up called wait4vbl::loop, and other subroutines won't be defining symbols that start with "wait4vbl::".
Doesn't ca65 have nameless labels though? Isn't the syntax something like "beq +" or something?
Edit: it's "beq :+". Also, there's local labels, which start with '@'.
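A quick sketch of the three label styles side by side, writing the same wait loop three times. The routine names and the address chosen for nmis are made up for illustration:
Code:
nmis = $10             ; hypothetical RAM address for the NMI counter

.proc wait_proc        ; .proc scope: the inner label is wait_proc::loop
  lda nmis
loop:
  cmp nmis
  beq loop
  rts
.endproc

wait_unnamed:          ; unnamed label: ":" defines it, ":-" refers to the previous one
  lda nmis
:
  cmp nmis
  beq :-
  rts

wait_local:            ; cheap local label: "@loop" is visible only until the next normal label
  lda nmis
@loop:
  cmp nmis
  beq @loop
  rts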
Quote:
In what other language or for what other platform have you made a video game? Perhaps I could help explain things with analogies to that platform.
I didn't mean video games, I meant programming/data structures in general. I've made plenty of programs, though I suppose that the only "game" I've made is a room-to-room text adventure sort of thing.
FinalZero wrote:
I'm not sure that a description of the structure would confuse me. I'm quite a bit more experienced than some kiddie programmer.
I didn't mean it in the sense that you wouldn't understand the explanations, it's just that everyone does things differently, so you could be overwhelmed by all the possibilities. Also, solving these kinds of problems is part of the learning experience, and through trial and error you'll find what works best for you.
I'm pretty sure that passing parameters through the stack is not very common on the 6502, because having to work around the return address is not very efficient. But if you feel comfortable using the stack that way, there's no reason for you not to.
Quote:
What tile editor do you recommend?
I don't recommend any. All tile editors I know of are just trying to reinvent the wheel and poorly implementing drawing features that already exist in much better drawing programs. Like tepples, I like to draw sprites in an actual drawing program (I use both MS Paint and GIMP) and just convert them to the NES format when done.
Quote:
I'm pretty sure that passing parameters through the stack is not very common on the 6502, because having to work around the return address is not very efficient. But if you feel comfortable using the stack that way, there's no reason for you not to.
I'm not using *the* stack (the one at $100 to $1FF), I'm using a stack structure on the zero page, with the x register as the index of the current empty spot. One can then use the zeropage, x addressing mode (always as 0, x) to retrieve and do things with the values.
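A rough sketch of that kind of zero-page data stack, under the assumption that it grows upward from a base address and X always indexes the next empty slot. The names dstack, push_a, add8 and sub8 are made up, and this is not the poster's actual code. Starting with X = 0, pushing 14, 9 and 7 and then calling add8 followed by sub8 leaves 14 - (9 + 7) = -2 ($FE) in the first slot, with X = 1.
Code:
dstack = $10           ; hypothetical base of the data stack in zero page

.proc push_a           ; push A onto the data stack
  sta dstack,x
  inx
  rts
.endproc

.proc add8             ; ( a b -- a+b )
  dex
  lda dstack,x         ; b, the top item
  dex
  clc
  adc dstack,x         ; a + b
  sta dstack,x
  inx                  ; X is the next empty slot again
  rts
.endproc

.proc sub8             ; ( a b -- a-b )
  dex
  dex
  lda dstack,x         ; a
  sec
  sbc dstack+1,x       ; a - b
  sta dstack,x
  inx                  ; X is the next empty slot again
  rts
.endproc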
Quote:
I just use GIMP to make my tile sheets and then run a Python program to convert .png to .chr.
Quote:
I don't recommend any. All tile editors I know of are just trying to reinvent the wheel and poorly implementing drawing features that already exist in much better drawing programs. Like tepples, I like to draw sprites in an actual drawing program (I use both MS Paint and GIMP) and just convert them to the NES format when done.
To be honest, I've never used GIMP extensively. I use mspaint or Paint.Net most of the time. Is there a common program that can convert an image to .chr format?
* * *
Edit: So, I've implemented adding the "nmis" code to my code, but it doesn't seem to fix anything. My code still gets stuck going between places and not adding/subtracting stuff. It seems to be stuck in this:
Code:
loop:
cmp nmis
beq loop
Ideas?
FinalZero wrote:
Is there a common program that can convert an image to .chr format?
There are several. My build environment uses a lot of Python programs, so I wrote my own. It's in the Concentration Room source code archive.
Quote:
Code:
loop:
cmp nmis
beq loop
Ideas?
What have you written to PPUCTRL ($2000)? You could try putting a breakpoint on the NMI handler to make sure it gets called.
FinalZero wrote:
To be honest, I've never used GIMP extensively. I use mspaint or Paint.Net most of the time. Is there a common program that can convert an image to .chr format?
YY-CHR lets you paste images from the clipboard (up to 128x128), and it is converted to CHR data immediately on paste. It works best if the image is in indexed color format, so make sure your paint program lets you specify it. Photoimpact is really good for this task, since it lets you do color reduction and re-ordering the palette the way you want to, I'm not as experienced with other paint programs. Truecolor images often get mangled when pasting into YY-CHR, because it can't match the colors to an index correctly.
Quote:
There are several. My build environment uses a lot of Python programs, so I wrote my own. It's in the Concentration Room source code archive.
Okay, thanks.
Quote:
What have you written to PPUCTRL ($2000)? You could try putting a breakpoint on the NMI handler to make sure it gets called.
I'm not writing anything there. =/
Quote:
YY-CHR lets you paste images from the clipboard (up to 128x128), and they are converted to CHR data immediately on paste. It works best if the image is in indexed color format, so make sure your paint program lets you specify it. Photoimpact is really good for this task, since it lets you do color reduction and re-order the palette the way you want to; I'm not as experienced with other paint programs. Truecolor images often get mangled when pasting into YY-CHR, because it can't match the colors to an index correctly.
Okay, thank you.
FinalZero wrote:
Quote:
What have you written to PPUCTRL ($2000)? You could try putting a breakpoint on the NMI handler to make sure it gets called.
I'm not writing anything there. =/
Your NMI counting code is useless if you are not initializing $2000. Initializing the PPU by writing to $2000 and $2001 at least once is mandatory for NES programs.
tokumaru wrote:
FinalZero wrote:
Quote:
What have you written to PPUCTRL ($2000)? You could try putting a breakpoint on the NMI handler to make sure it gets called.
I'm not writing anything there. =/
Your NMI counting code is useless if you are not initializing $2000. Initializing the PPU by writing to $2000 and $2001 at least once is mandatory for NES programs.
I'm doing something wrong. The wiki pages aren't very helpful either.
http://wiki.nesdev.com/w/index.php/PPU_power_up_state
http://wiki.nesdev.com/w/index.php/Init_code
My code:
Code:
.proc reset
; Clears the flags.
clc
cli ; sei
clv
; Sets the stack pointer.
ldx #$FF
txs
stx PPU_CONTROL
stx PPU_MASK
jmp main
Code:
.proc main
jsr read_input
jsr vert_blank
jsr write_video
jsr write_audio
jmp main
.endproc
.endproc
You're supposed to use SEI on start up, not CLI. CLI allows interrupts to happen, and you don't want that during initialization. After initializing everything, you should only enable interrupts (CLI) if you actually use them.
You're supposed to write $00 to the PPU registers, not $FF. Also, you have to give the PPU some time to warm up (2 frames is enough). Usually we poll the VBlank flag in $2002 for this (actually this is the only time when using $2002 to wait for VBlank is acceptable). After the warm up you can configure the PPU as you wish (this is when you enable NMIs). If you don't let the PPU warm up, it behaves erratically.
Quote:
You're supposed to use SEI on start up, not CLI. CLI allows interrupts to happen, and you don't want that during initialization. After initializing everything, you should only enable interrupts (CLI) if you actually use them.
What do you mean "if you actually use them"? I thought I just read that they're used to draw to the screen.
Quote:
You're supposed to write $00 to the PPU registers, not $FF. Also, you have to give the PPU some time to warm up (2 frames is enough).
How long is a "frame"?
Quote:
Usually we poll the VBlank flag in $2002 for this (actually this is the only time when using $2002 to wait for VBlank is acceptable). After the warm up you can configure the PPU as you wish (this is when you enable NMIs). If you don't let the PPU warm up, it behaves erratically.
What does "poll" mean?
FinalZero wrote:
Quote:
You're supposed to use SEI on start up, not CLI. CLI allows interrupts to happen, and you don't want that during initialization. After initializing everything, you should only enable interrupts (CLI) if you actually use them.
What do you mean "if you actually use them"? I thought I just read that they're used to draw to the screen.
I think you're confusing things, but it's normal.
NMIs are very similar to IRQs, but on the NES the NMI is specifically wired to the PPU and, when enabled, fires when VBlank begins, basically. The code that the NMI vector points to is then executed, and then you can update the tiles, sprites and other stuff (or just set a flag, sometimes).
IRQs are often mapper-specific (excluding here the DMC and frame IRQs, which come from the APU part of the 2A03). To give an example: in a 4-way scrolling game (e.g. Super Mario Bros. 3), IRQs (from the MMC3 mapper) are used to set the scroll somewhere within a frame, so as to perform a 4-way scrolling effect.
So as you can see, NMIs and IRQs look similar but they're different things. And I expect that other members will emphasize the difference between IRQs and NMIs.
FinalZero wrote:
Quote:
You're supposed to write $00 to the PPU registers, not $FF. Also, you have to give the PPU some time to warm up (2 frames is enough).
How long is a "frame"?
Approximately 1/60 of a second (NTSC) or 1/50 of a second (PAL), which is how long a frame appears on your TV. Assume your TV displays 60 frames per second on an NTSC system.
FinalZero wrote:
Quote:
Usually we poll the VBlank flag in $2002 for this (actually this is the only time when using $2002 to wait for VBlank is acceptable). After the warm up you can configure the PPU as you wish (this is when you enable NMIs). If you don't let the PPU warm up, it behaves erratically.
What does "poll" mean?
Basically, it means simply looking at a [software/hardware] flag/condition, normally periodically. You can see "by polling" as the alternative to "by interrupts". Normally, "by interrupts" is better, but because the PPU isn't reliable at power-on, it's better to "poll" the state of the VBlank flag.
I know ~J-@D!~ answered already, but I too want to give it a go.
FinalZero wrote:
What do you mean "if you actually use them"? I thought I just read that they're used to draw to the screen.
NMIs are different from IRQs. NMIs are the ones that fire when VBlank starts and allow you to sync the program to the refresh rate; IRQs are used for things like raster effects (IRQs generated by mappers) and detecting when DPCM samples have finished playing. Many games (especially the ones with mappers older than the MMC3) don't use IRQs at all.
Quote:
How long is a "frame"?
In this context, it doesn't matter.
Quote:
What does "poll" mean?
It means you constantly read a register until the returned value changes. In the case of $2002, the most significant bit tells you when VBlank starts, so you can read it in a loop and wait for the N flag (which is a copy of bit 7) to be set, like this:
Code:
Wait:
lda $2002
bpl Wait
While the flag is clear, you keep waiting. Do that twice and you'll have waited 2 frames. This is why I said that the length of the frame doesn't matter in this case.
Quote:
I think you're confusing things, but it's normal.
NMIs are very similar to IRQs, but on the NES the NMI is specifically wired to the PPU and, when enabled, fires when VBlank begins, basically. The code that the NMI vector points to is then executed, and then you can update the tiles, sprites and other stuff (or just set a flag, sometimes).
IRQs are often mapper-specific (excluding here the DMC and frame IRQs, which come from the APU part of the 2A03). To give an example: in a 4-way scrolling game (e.g. Super Mario Bros. 3), IRQs (from the MMC3 mapper) are used to set the scroll somewhere within a frame, so as to perform a 4-way scrolling effect.
So as you can see, NMIs and IRQs look similar but they're different things. And I expect that other members will emphasize the difference between IRQs and NMIs.
Quote:
NMIs are different from IRQs. NMIs are the ones that fire when VBlank starts and allow you to sync the program to the refresh rate; IRQs are used for things like raster effects (IRQs generated by mappers) and detecting when DPCM samples have finished playing. Many games (especially the ones with mappers older than the MMC3) don't use IRQs at all.
I was confusing them. Thank you for the explanations.
Quote:
Code:
Wait:
lda $2002
bpl Wait
While the flag is clear, you keep waiting. Do that twice and you'll have waited 2 frames. This is why I said that the length of the frame doesn't matter in this case.
And this happens at the very beginning of one's program?
Yes. The spinning happens twice, usually about a dozen instructions into a program.
FinalZero wrote:
And this happens at the very beginning of one's program?
It should be right after SEI, CLD, stack initialization, mapper initialization (if applicable) and PPU and APU resetting.
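A rough sketch of that order might look like the following. Treat it as an illustration rather than tested code: the label main is taken from the code posted earlier in the thread, the $4017 and $4010 writes are an assumption about what "APU resetting" covers here, and mapper initialization is left out.
Code:
.proc reset
sei ; ignore IRQs during init
cld ; decimal mode off (no effect on the NES CPU, but customary)
ldx #$40
stx $4017 ; assumption: disable the APU frame IRQ
ldx #$FF
txs ; set up the hardware stack
inx ; X = 0
stx $2000 ; NMI off
stx $2001 ; rendering off
stx $4010 ; assumption: disable DMC IRQs
bit $2002 ; clear a possibly stale VBlank flag
: bit $2002
bpl :- ; first wait
: bit $2002
bpl :- ; second wait, the PPU is usable after this
lda #%10000000
sta $2000 ; now enable NMIs
jmp main
.endproc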
Okay, so I've made those changes, and have this now:
Code:
; Inits everything.
.proc reset
; Clears the flags.
clc
sei
cld
clv
; Sets the stack pointer.
ldx #$FF
txs
; Waits for the PPU to warm up.
: lda $2002
bpl :-
; Inits the PPU.
inx
stx PPU_CONTROL
stx PPU_MASK
dex
jmp main
.endproc
However, my code still gets stuck in the NMI handler. I don't even understand how it's supposed to escape it. It continually compares nmis to AC, neither of which are programmed to change in the loop.
Code:
lda nmis
: cmp nmis
beq :-
You need to actually enable the NMI.
Read this:
http://wiki.nesdev.com/w/index.php/PPU_registers
Code:
inx
stx PPU_CONTROL
stx PPU_MASK
What you did there is write 0 to both of those registers. Which means no NMI interrupt happens at the start of each frame. (Because bit 7 of $2000 is clear)
So after your setup code is done, set bit 7 of $2000. What this causes to start happening is an NMI "interrupt". An interrupt actually interrupts whatever your program was doing before the interrupt, and starts running something else. When that "something else" is done, it returns to where it was before the interrupt took place.
This is how nmis will change, even though your current loop does nothing to change it.
So while your main loop is doing this:
Code:
lda nmis
: cmp nmis
beq :-
Eventually, a new frame will start. When that happens, your code will be interrupted and the CPU will jump to your NMI routine. Your NMI routine will change the variable nmis, and then return, allowing your main loop to continue down.
I don't know how to set the NMI vectors with your current setup, but you need to put the address for the NMI routine directly before the address of your reset vector.
Edit: Here is a quick NMI to get you started that should at least get you out of that loop:
Code:
nmi:
pha;The interrupt might have happened when you were doing something important
tya;So we save the registers to the stack
pha;And restore them later so that when it
txa;Returns, the same values will be in them.
pha
inc nmis
pla;Restoring the registers
tax;From the stack.
pla
tay
pla
rti;Return from interrupt
Okay, so like this?:
Code:
; Inits everything.
.proc reset
; Clears the flags.
clc
sei
cld
clv
; Sets the stack pointer.
ldx #$FF
txs
; Waits for the PPU to warm up.
: lda $2002
bpl :-
; Inits the PPU.
ldx #%10000000
stx PPU_CONTROL
ldx #%00011000
stx PPU_MASK
; Sets the stack pointer again.
ldx #$FF
jmp main
.endproc
You still want 0 written to both of those PPU registers while the PPU is still warming up. Also, you should wait two frames for the PPU to warm up, not just one.
More like this?
Code:
; Inits everything.
.proc reset
; Clears the flags.
;clc;This isn't needed in your setup code. Its state being unknown
;Doesn't affect how your code will work in the very beginning
sei;
cld;We only do this because NES doesn't actually have decimal mode.
;clv;No need to do this either.
; Sets the stack pointer.
ldx #$FF
txs
inx
stx PPU_CONTROL
stx PPU_MASK
; Waits for the PPU to warm up
: lda $2002
bpl :-;Frame 1
; Waits for the PPU to warm up.
: lda $2002
bpl :-;Frame 2
; Inits the PPU.
ldx #%10000000
stx PPU_CONTROL
ldx #%00011000
stx PPU_MASK
jmp main
.endproc
And you also need to put the NMI routine I posted in your code.
I don't use what you're using, but it looks like the way to do it is this (from source code Tepples recommended looking at earlier in the topic):
Code:
.segment "VECTORS"
.addr nmi, reset, irq
Then:
Code:
.proc irq;Sure, here's an IRQ too for good measure
rti
.endproc
.proc nmi
pha;The interrupt might have happened when you were doing something important
tya;So we save the registers to the stack
pha;And restore them later so that when it
txa;Returns, the same values will be in them.
pha
inc nmis
pla;Restoring the registers
tax;From the stack.
pla
tay
pla
rti;Return from interrupt
.endproc
One other thing:
Code:
; Sets the stack pointer again.
ldx #$FF
That doesn't do anything with the stack pointer. Not that it matters. You already set the stack pointer at the beginning of your program. Once you transfer a number from X to the stack pointer (they are not the same thing) with TXS, you can use X for whatever you like. If you're not going to use the #$FF in X you loaded right then for something, there's no need to do it.
In short:
1. Changing X won't change the stack pointer unless you use TXS.
2. You don't need to change the stack pointer again since you set it up already.
FinalZero, I must say it looks like you are just guessing and hoping for things to somehow work out, instead of actually understanding what you're doing.
Quote:
You still want 0 written to both of those PPU registers while the PPU is still warming up.
But I thought I was writing to them after the PPU has already warmed up.
Quote:
Also, you should wait two frames for the PPU to warm up, not just one.
Oops, it's fixed now.
Quote:
And you also need to put the NMI routine I posted in your code.
I don't understand yours. You push a bunch of stuff onto the stack only to pop it off in the very same routine after only incrementing nmis.
Quote:
That doesn't do anything with the stack pointer. Not that it matters. You already set the stack pointer at the beginning of your program. Once you transfer a number from X to the stack pointer (they are not the same thing) with TXS, you can use X for whatever you like. If you're not going to use the #$FF in X you loaded right then for something, there's no need to do it.
I know. My math routines use a stack based on the zeropage with x as the index register, as its stack pointer. That's what I meant by "stack pointer" here.
Quote:
FinalZero, I must say it looks like you are just guessing and hoping for things to somehow work out, instead of actually understanding what you're doing.
That's sort of how I feel. I *think* I understand it, and then later I find out that I was wrong. Anyways, I've certainly never programmed something where I have to take care of things like vblanks and interrupts before. I admit, lots of it seems cryptic. I apologize if it seems like I'm wasting your time.
FinalZero wrote:
Quote:
And you also need to put the NMI routine I posted in your code.
I don't understand yours. You push a bunch of stuff onto the stack only to pop it off in the very same routine after only incrementing nmis.
Hail John Frum, who brings the cargo.
There are three structures for a game loop:
A. NMI handler just increments nmis. Main loop waits for increment, updates VRAM, updates sound, then runs one frame of game logic.
B. NMI handler updates VRAM and then updates sound, then increments nmis. Main loop waits for increment then runs one frame of game logic.
C. NMI handler updates VRAM, updates sound, then runs one frame of game logic.
B and C type NMI handlers generally require pushing all registers and pulling them at the end. Some people are so used to B or C type NMI handlers that they end up typing out the pushes by habit and forgetting that they're not strictly needed for A type NMI handlers. Forgive them.
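As a rough illustration of the first structure (type A), here is a sketch that assumes the nmis variable and the read_input, write_video and write_audio routines from the code posted earlier in the thread; read_input stands in for "one frame of game logic", and the layout is only one possible arrangement.
Code:
.proc nmi
inc nmis ; INC changes only memory and flags, and RTI restores the flags
rti ; so A, X and Y need no saving here
.endproc
.proc main
lda nmis
: cmp nmis ; spin until the NMI handler bumps the counter
beq :-
jsr write_video ; VBlank has just started, so update VRAM first
jsr write_audio ; then update sound
jsr read_input ; then run the game logic for this frame
jmp main
.endproc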
Quote:
Quote:
FinalZero, I must say it looks like you are just guessing and hoping for things to somehow work out, instead of actually understanding what you're doing.
That's sort of how I feel. I *think* I understand it, and then later I find out that I was wrong. Anyways, I've certainly never programmed something where I have to take care of things like vblanks and interrupts before. I admit, lots of it seems cryptic. I apologize if it seems like I'm wasting your time.
You could try a new thread about starting from my project template and hacking around with it.
FinalZero wrote:
But I thought I was writing to them after the PPU has already warmed up.
It goes like this: before the warm up, write 0 to both PPU registers (the purpose here is to "reset" the PPU... kinda), and after the warm up you write the actual configuration you're gonna use (this includes enabling NMIs).
FinalZero wrote:
Quote:
You still want 0 written to both of those PPU registers while the PPU is still warming up.
But I thought I was writing to them after the PPU has already warmed up.
I apologize for my mistake. I always thought those registers started with an unknown state, which isn't true. Even if they did start at something unknown, writes to them are also totally ignored for the first few frames.
What tokumaru posted is what I was describing, but the wiki says you don't need to.
I'll probably still do it to be safe, because I am paranoid.
Quote:
I don't understand yours. You push a bunch of stuff onto the stack only to pop it off in the very same routine after only incrementing nmis.
Exactly. I explained why it does that in the comments, but Tepples makes a good point. The NMI I posted currently doesn't affect the status registers, but the idea is that you add on to it once you're past this hurdle.
Here's a real world example of why the pushing and popping is done.
Imagine this code is in your main routine:
Code:
lda #$FF
ldx #$00
loop:
sta $0200,x
inx
inx
inx
inx
;*
bne loop
;We can continue only when X is exactly 0. I've used code like this to remove all sprites from the screen (by putting their y position below the screen).
Now imagine this is your nmi routine. It interrupts at some point during the loop. Say... where that * is.
Code:
nmi:
inc nmis
lda #$20;This line will cause a #$20 to be written instead of #$FF
sta $2006
ldx #$03;This line will cause an infinite loop.
stx $2006
ldy #$00
sty $2007
;
rti
When that code returns you're in an infinite loop, because the nmi changed X to something that makes completing the loop impossible.
Can (4*Z+3)%256 ever equal 0?
(Z = the number of times we have looped)
No. But (4*Z+0)%256 can be 0 which would allow your main loop to continue.
We push the registers to the stack in the NMI because it can interrupt our code at ANY time. After they're pushed, we can safely change them in the NMI, and restore them when we're done for when we return.
The inc nmis part might make sense now too, because there the opposite of this example happens: the NMI not changing that variable KEEPS us in an infinite loop when it should break us out of one.
Is that clear? If not I'll try a different approach. Tepples also did an explanation of game loops where you can avoid pushing and pulling if you like.
Quote:
I know. My math routines use a stack based on the zeropage with x as the index register, as its stack pointer. That's what I meant by "stack pointer" here.
My bad. I'm a new guy in this thread, so I might have missed where that was mentioned. I'm just trying to make sure you're not doing anything you don't understand.
Kasumi wrote:
What tokumaru posted is what I was describing, but the wiki says you don't need to.
Perhaps you need to on the Famicom, which doesn't fully reset the PPU when the player presses reset.
When coding in assembly I like to assume that everything is in an unknown state on power up. Better safe than sorry.
Then, I was right the first time! Like I said, I'd do it anyway.
But now a question for my own curiosity: Are writes to $2000 still ignored for a few frames after a Famicom reset? If so, even if $2000 and $2001 were set to something potentially harmful, you still couldn't do anything about it before the PPU was warmed up anyway.
Kasumi wrote:
But now a question for my own curiosity: Are writes to $2000 still ignored for a few frames after a Famicom reset? If so, even if $2000 and $2001 were set to something potentially harmful, you still couldn't do anything about it before the PPU was warmed up anyway.
Going out on a limb here, but I don't think so. As far as I understand, the reset line simply isn't connected to the PPU on the Famicom, so when you push the reset button, the PPU keeps running like nothing happened.