If you search for tutorials on how to program the NES in C, the examples either aren't very clear, or contain poor examples of programming the NES.
Well, you can add my tutorial / blog to that list.
https://nesdoug.wordpress.com/
This was the super top-secret other project that I've been working on for a month. It has lots of example code for programming the NES in C with cc65.
And, a tiny example game (vertical space shooter).
If anyone sees any factual errors, or broken links, unclear code, or just plain bad advice...please let me know. I'm hoping that someone out there can use this to get started on NES game development.
Thanks.
One thing that I found really nice in thefox's KNES is this macro:
Code:
#define PPU_ADDR(a) \
    do { \
        PPU.addr = HIBYTE( a ); \
        PPU.addr = LOBYTE( a ); \
    } while ( 0 )
Proofreading: there are several apostrophe mistakes.
Quote:
apostrophe mistakes
Huh?
Apostrophes:
Post 0, paragraph 11, "mirroring) it’s map" → its (is possessive, not a contraction)
Post 10, paragraph 8, "referenced it's address" → its
Post 10, paragraph 17, "for it’s collision" → its
Post 13, paragraph 3, "the other one’s." → ones (is plural, not possessive nor a contraction)
Post 13, paragraph 13, "with it’s logic" → its
(http://blog.writeathome.com/index.php/2 ... -teachers/)
Anyway, from what I've read so far, it looks like you explain the NES architecture in a very clear, simple way.
Thanks for the tutorial! I will check it out.
I'd recommend mentioning in "Getting Started" that this is geared towards Windows development. It becomes evident pretty quickly, but this could be a stumbling block for beginners.
Have only skimmed over this so far, but it looks pretty neat! Looking forward to reading the whole thing in depth.
That's so weird...
I have a little Stats menu on my blog, and I got like 9000 views yesterday, most from a referral on Reddit by zeroone. Those are almost monetizable numbers... (according to some website, if that traffic was sustainable, I could make $20 a day, or $7-8000 a year off ad revenue.)
My most popular YouTube video only got like 100 views all year.
Actually, that's almost enough to get my Dropbox account suspended. Google tells me there's a 10 GB per day bandwidth limit on public downloads from a free Dropbox account. Hmm.
@dougeff Awesome stuff. Blogs like this tend to be the catalyst for future projects.
Wow, they really tore me a new one over at Reddit (i.e., they didn't like the way I write code).
https://www.reddit.com/r/programming/co ... game_in_c/
Should I edit the hell out of my example code (to make it look pretty) and add a hundred comments? I don't even know what all these things mean...
"no grouping of globals), no arrays for groups of bullets (he instead has Bullet1x,Bullet1y, Bullet2x, Bullet2y, etc.), no structs, CamelCase, magic number galore, repeated code due to the Something1...SomethingN convention, commented-out code"
What the heck is a CamelCase?
Magic numbers...I get that, I should replace with constants that describe what they are.
Commented-out code...because it was redundant, or a removed debugging element.
And, I don't know why, but some other website on cc65 (??) said 'don't use structs'...so I didn't.
Grouping of globals...they look ok to me. Am I missing something?
You appear to be referring to the comment by _georgesim_.
CamelCase is "HelloWorld" as opposed to something like "hello_world". Different people prefer different styles. CamelCase has the connotation that you're trying to be object-oriented.
"Arrays for groups of bullets" would have been something like this:
Code:
BulletX: .res MAX_BULLETS
BulletY: .res MAX_BULLETS
BulletDX: .res MAX_BULLETS
BulletDY: .res MAX_BULLETS
That at least allows efficient indexing. I explained why structs are dispreferred on 6502.
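For readers following the tutorial in C, the same idea can be sketched as parallel arrays, one per field. This is only an illustration: MAX_BULLETS, the array names, and move_bullets() are made up here and are not taken from the tutorial's code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BULLETS 4   /* hypothetical limit, not from the tutorial */

/* One array per field, all indexed by the same bullet number. */
uint8_t bullet_x[MAX_BULLETS];
uint8_t bullet_y[MAX_BULLETS];
uint8_t bullet_active[MAX_BULLETS];

/* Move every active bullet up by `speed` pixels and retire it
   once it would pass the top edge of the screen. */
void move_bullets(uint8_t speed)
{
    uint8_t i;
    assert(speed > 0);
    for (i = 0; i < MAX_BULLETS; ++i) {
        if (!bullet_active[i]) {
            continue;
        }
        if (bullet_y[i] < speed) {
            bullet_active[i] = 0;   /* left the screen */
        } else {
            bullet_y[i] -= speed;
        }
    }
}
```

One loop now handles every bullet, so adding a fifth bullet means changing MAX_BULLETS rather than copying code.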
Most programming stuff I've seen has used the convention of camelCase instead of CamelCase.
dougeff wrote:
What the heck is a CamelCase?
If someone wants to pick an argument about an arbitrary naming convention like CamelCase, camelCase, underscores, or however you've chosen to do it, I would highly recommend that you ignore them. There is very little productive argument to be had about this. It's generally just a foolish waste of time.
As long as you're consistent with your conventions, it's probably fine.
Quote:
As long as you're consistent with your conventions, it's probably fine.
I would agree with this. Putting multiple conventions together causes confusion and often causes one to doubt whether or not they understand what they're looking at. Camel case isn't really any better or worse than other naming conventions (I personally use a lot of dot notation for things); just as long as it's clear what it means, and is consistent with how other things are named.
Your blog made front-page on Hacker News!
Do you have any plans on discussing PRG bankswitching techniques from a C context?
Quote:
PRG bankswitching ?
Maybe with MMC3.
And, I plan to add a DMC music/sfx example.
And, I want to do a much better SMB3/Metroid bi-directional platformer with better physics.
mikejmoffitt wrote:
Your blog made front-page on Hacker News!
link
mikejmoffitt wrote:
Your blog made front-page on Hacker News!
Do you have any plans on discussing PRG bankswitching techniques from a C context?
Done.
http://nesdoug.com/2016/01/15/24-mmc3-b ... hing-irqs/
Any more requests?
There are some problems with the example:
- You should save/restore registers in your IRQ handler. But in fact when writing IRQ handlers in C, this is not enough, because the C runtime library's (fairly large) state should also be saved. In your example you got lucky (without the "-Oi" switch the program doesn't work). In addition, some of the runtime library routines are not reentrant. See http://www.cc65.org/faq.php#IntHandlers
- Your example is placing the runtime library code in one of the switchable banks ("CODE"). If any code is placed in one of the other switchable banks (e.g. "CODE1"), and the compiler emits a call to the runtime library routines, you'll get a crash. It'd be better to place the runtime library stuff in the fixed bank, which you can do by simply calling the fixed segment "CODE". (NOTE: In this case the "-Oi" switch and the simplicity of the example cause there to be no runtime library code emitted in the ROM, but this is not something that can be relied on.)
These are kind of annoying problems, because your program might appear to work fine without taking them into consideration, and then stop working one day when the code is modified. Thus, careful testing is needed to make sure that everything works as it should.
Thanks for the response. I hadn't considered the C runtime library being in a non-fixed bank. I'll have to change the example code.
So far, I've only made small technical demos with MMC3. So, I haven't run into any errors...that will certainly happen with more complex code.
Oh, and I see from the link you posted...
Quote:
I do usually suggest to think about writing the interrupt handler completely in assembler.
I was hoping to write the IRQ code in C, so people could read it more easily. But, I guess it should probably be written in ASM, for stability. Otherwise, I would have to copy all the zeropage C variables, at the start of the IRQ and replace them at the end of the IRQ.
dougeff wrote:
I was hoping to write the IRQ code in C, so people could read it more easily. But, I guess it should probably be written in ASM, for stability. Otherwise, I would have to copy all the zeropage C variables, at the start of the IRQ and replace them at the end of the IRQ.
Yeah. An additional problem with IRQ handlers written in C is that you can't rely on the timing of the C code very much.
...
(Random idea: It would be possible to provide a sort of a generic IRQ handler (written in assembly) for the common tasks like setting the scroll and switching the CHR bank. Then one could simply set some variables e.g. scroll and the new CHR bank from C code, and maybe some bitflags to indicate what values were set. Assembly code would then pick up the values and take care of the rest. Might be a reasonable idea for a game engine, but maybe not so good for a tutorial.)
It occurred to me that all of the example code in my tutorial will have problems if the logic runs longer than 1 frame. Here's what the examples all do...
BIG MAIN LOOP {
-wait for NMI-flag
-then update the PPU
-then set the scroll
-then logic
}
NMI {
-set NMI-flag
}
This runs fine for short games / demos, and it allowed me to write nearly everything in C.
But if the logic runs too long, it might be outside of the next V-blank by the time it starts updating the PPU and setting the scroll, etc., misaligning the screen every time lag happens.
How would you guys address this problem? Would you put the PPU update, sprite update, and scroll setting in the NMI handler? And maybe music (because laggy music is annoying)?
What if I put a test for the NMI flag at the end of the main() loop and, if it has already been set, reset it to zero, so that it waits for yet another NMI to occur before updating the PPU? The screen would never misalign, but lag would be more frequent (maybe).
If "set NMI flag" means you increment a counter, and "wait for NMI flag" means you read the counter and spin until it changes, then there should be no problem with scroll corruption or anything; a long frame should just result in slowdown.
My own recommendation is:
1. Put all PPU updates in NMI handler.
2. Guarantee that the PPU updates can't run longer than vblank.
3. Create a variable that controls whether the PPU update happens (i.e. "update ready"). Item 1 should only run when a new set of updates is ready.
4. Run music in NMI handler too, this means that music is not subject to slowdown.
If you have a scroll split, or other raster effect timings to make, ideally these should also be done in the NMI handler, or in an IRQ that's set up by the NMI handler. Basically the idea is that the NMI has reliable timing (at least until it gets to the music), and the main thread does not. The main thread prepares a frame worth of updates (taking as long as it needs), then uses that variable (item 3) to signal the NMI thread to send it to the PPU.
I would even go as far as to leave the NMI on always, so that music can play smoothly during screen transitions, etc. Because NMI PPU updates only happen on request, you can do stuff like load nametables in the main thread safely. (Also, only ever turn rendering on or off in the NMI handler, so you never get a partial screen.)
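The handshake described above might look roughly like this in C. Everything here is a sketch under assumptions: the names (update_ready, frame_count, ppu_buffer, nmi_handler, queue_update) are invented for illustration, copy_to_ppu() and play_music() are stand-ins that only count calls so the logic can be followed on a PC, and in a real cc65 project the handler itself would more likely be written in assembly, as discussed earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PPU_BUFFER_SIZE 32          /* hypothetical size */

volatile uint8_t update_ready;      /* set by the main thread, cleared by NMI */
volatile uint8_t frame_count;       /* incremented on every NMI */
uint8_t ppu_buffer[PPU_BUFFER_SIZE];
uint8_t ppu_buffer_len;

/* Stand-ins for the real PPU upload and music driver (assumptions). */
unsigned ppu_uploads;
unsigned music_ticks;
void copy_to_ppu(const uint8_t *data, uint8_t len) { (void)data; (void)len; ++ppu_uploads; }
void play_music(void) { ++music_ticks; }

/* Runs once per vblank. PPU work happens only on request;
   the frame counter and the music run every time. */
void nmi_handler(void)
{
    if (update_ready) {
        copy_to_ppu(ppu_buffer, ppu_buffer_len);
        update_ready = 0;
    }
    ++frame_count;
    play_music();
}

/* Main thread: prepare a frame's worth of updates, then publish it. */
void queue_update(const uint8_t *data, uint8_t len)
{
    assert(len <= PPU_BUFFER_SIZE);
    assert(!update_ready);          /* previous update must be consumed first */
    memcpy(ppu_buffer, data, len);
    ppu_buffer_len = len;
    update_ready = 1;               /* set the flag last, after the data is complete */
}
```

The main loop would call queue_update() when its logic is finished and then spin until frame_count changes. If the logic overruns a frame, the NMI simply finds update_ready clear, skips the upload, and still plays the music.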
I added a tutorial on 'Importing a MIDI file to Famitracker'...because I still can't find much information on this subject, I made up my own technique. My method is very specific to the program REAPER that I prefer to use. If anyone has a better method, please let me know - or any specifics on how to do this with any other software, thanks.
http://nesdoug.com/2016/01/24/25-import ... mitracker/
dougeff wrote:
I added a tutorial on 'Importing a MIDI file to Famitracker' [...] If anyone has a better method, please let me know - or any specifics on how to do this with any other software, thanks.
When you record from a keyboard, I agree it's a good idea to quantize the notes. I also agree keeping the MIDI channels monophonic is a requirement to get meaningful results in the FamiTracker import.
I don't use Reaper, so I'm not sure how Reaper's "project tempo" and "MIDI track tempo" interact. Can you tell more about why you set the project tempo to 12 BPM before recording, then later set the project tempo to 100 BPM and the MIDI track tempo to 125 BPM? In my import tests, I didn't need to modify the MIDI file tempo, but I did modify the MIDI file resolution (MIDI ticks per quarter note).
I took a look at FamiTracker 0.4.2's MIDI import source code. There are some flaws and limitations in the MIDI import, so there's never really going to be an easy way to use it. Here are some explanations of the flaws and limitations, and alternative ideas I've come up with to work around them until a better MIDI import process is found.
Terminology notes
MIDI file ticks and module ticks are different concepts. For clarity, I call them "MIDI ticks" and "module ticks".
In a MIDI file, the MIDI tick is the smallest time unit you can use to specify the position of an event. The MIDI file resolution is the number of MIDI ticks per quarter note the MIDI file uses, for example 240 MIDI ticks per quarter note or 960 MIDI ticks per quarter note or other values.
In a module file, the smallest time unit you can enter notes with is a row, but effects like arpeggios work on smaller units of module ticks. The module Speed value sets the size of a row in module ticks.
Also, MIDI file tempo (quarter notes per minute) and module Tempo (double-dozen module ticks per minute) are different concepts.
When the module Speed value is the default 6 and a "beat" is 4 rows, then the module Tempo value happens to be in beats per minute. Otherwise, the module Tempo value is really in double-dozen module ticks per minute, and you have to calculate what module Tempo value to set to get the "real tempo" you want. Additionally, the FamiTracker status bar shows a "real tempo" value in beats per minute by using the "Row highlight, 1st" value in the toolbar as the size of a beat in rows.
FamiTracker 0.4.2 MIDI import flaws, limitations, and workarounds
In the MIDI protocol, there are two alternative ways to specify the end of a note: use a Note Off event or use a Note On event with velocity zero. Unfortunately, FamiTracker's MIDI import incorrectly considers a Note On event with velocity zero as a normal Note On event.
If you are using a MIDI editor that specifies the end of notes using a Note On event with velocity zero, here is one way to work around that. Drag the end of each note up to the beginning of the next note, so there are no gaps between the notes. During the import, the end of each note will be converted as a new note, but then get overwritten by the beginning of the next note.
In every MIDI file track, FamiTracker removes any silent part before the first note. In other words, each track is shifted so its first note begins at row 0.
If your MIDI file tracks have first notes that begin at different times, try one of these options:
- Convert the MIDI file to format 0. This creates a MIDI file with everything in one track.
- In each track that has a delayed start, put a dummy note so that all tracks have a note at the same starting point. After the import, erase the dummy notes in the module.
FamiTracker converts a note's position from MIDI ticks into module rows by considering each module row as 24 MIDI ticks in size.
One way to control the MIDI note size to module row size conversion is to use a MIDI editor that lets you change the MIDI file resolution (MIDI ticks per quarter note). As a typical suggestion, if you want a quarter note to be 4 rows, you would change the MIDI file resolution to 96 MIDI ticks per quarter note.
FamiTracker incorrectly sets the number of frames in the song to 1 less frame than needed.
After the import, increase the Frames value by 1 to show the last imported frame.
FamiTracker sets the module Speed and Tempo to strange values.
If you set the MIDI file quarter note to be 4 rows, then I recommend changing the Speed to 6 and changing the Tempo to the same tempo value the MIDI file uses (in quarter notes per minute).
FamiTracker creates a strange instrument.
Delete the strange instrument and create new instruments for each channel.
FamiTracker ignores Note Off events in the MIDI file.
Depending on the final instrument you use, each note will continue sounding up to the start of the next note. If you don't want that, you'll need to manually create note cuts or releases, or use an instrument with a volume that fades out.
MIDI editor Sekaiju
Sekaiju is a MIDI editor for Windows you can use to convert a MIDI file to format 0 and change its resolution.
Sekaiju defaults to Japanese. To change to English:
1. Go to the (S) menu and choose Language.
2. Set "User Interface" to English and "Text Encoding" to "1252-Western Latin-1 [recommended]". (Or if you know you want to interpret the text in a MIDI file using another encoding, choose the encoding you want.)
3. Restart Sekaiju.
After you open a MIDI file in Sekaiju, here's how to convert the format and resolution:
1. Go to the File menu and choose Property.
2. In the SMF Format section, choose SMF Format 0.
3. In the Time Mode Resolution section, "TPQN Base (recommended)" should already be selected. My default suggestion is to change the Resolution value to 96 Ticks / Quarter Note.
Thanks for the tutorials, Doug. I'm developing in assembly and on Linux (which requires liberal use of Wine to run a lot of the tools), but your sample code was exactly what I needed to get started. I ended up only using your HelloWorld example as a base for my code, but I've read the whole series and it's been very helpful. Keep up the good work!
Quote:
Can you tell more about why you set the project tempo to 12 BPM before recording, then later set the project tempo to 100 BPM and the MIDI track tempo to 125 BPM?
I'm still working on ways to import MIDI to Famitracker. My first tests, with BPM at 120 in Reaper, import with WAY too many ticks per note. So, I did a few tests, recording with BPM (in Reaper) set to 12 (while I play the notes at 120 BPM) gets it very close to what I want. With 5 ticks per note in Famitracker. Changing the project BPM to 100 and then forcing the MIDI track to 125 (125/100 = 5/4)...imports with 4 ticks per note.
Obviously, this isn't ideal, and needs to be refined.
The FamiTracker file will still need LOTS of editing (adding note ends, volume column, etc).
Thanks for the input, Bavi_H.
@russellsprouts
Thanks for the positive comments. BTW, there's an error in my button pressing code, I'll update as soon as I get a chance (maybe next week). Sorry.
dougeff wrote:
My first tests, with BPM at 120 in Reaper, import with WAY too many ticks per note. So, I did a few tests, recording with BPM (in Reaper) set to 12 (while I play the notes at 120 BPM) gets it very close to what I want. With 5 ticks per note in Famitracker. Changing the project BPM to 100 and then forcing the MIDI track to 125 (125/100 = 5/4)...imports with 4 ticks per note.
To make sure we're talking about the same thing, perhaps you mean "rows" instead of "ticks"? (That is, your 1st test yielded many rows per note, 2nd test 5 rows per note, and 3rd test 4 rows per note?) In FamiTracker and other trackers, rows are what you see on the screen -- each line is a row. Ticks are sub-units of a row that you hear when you use arpeggios and other effects commands in the last three character columns of each channel. (In FamiTracker, go to the Help menu and choose "Effect table" to see the effects commands.)
[Note: Everywhere else in my comments here, I'm saying "module ticks" to mean ticks in FamiTracker and "MIDI ticks" to mean ticks in a MIDI file, because these two kinds of ticks do not correspond to each other.]
In your screenshot of Reaper's "MIDI item properties" box, it says "Ticks per quarter note: 960". This suggests Reaper is making a MIDI file with 960 MIDI ticks per quarter note, so that would mean a quarter note in the MIDI file would import into FamiTracker as 40 rows.
I can understand setting the tempo to 12 BPM, pressing the record button, then playing along with a 120 BPM metronome. In that case, you're basically recording notes 1/10 of the size -- 10 "played" quarter notes (96 MIDI ticks each) fit into 1 "Reaper" quarter note (960 MIDI ticks). But I think that should import into FamiTracker as 4 rows per "played" quarter note. So I don't understand why it imported as 5 rows per "played" quarter note for you.
However, instead of playing around with tempos further trying to understand that, it might be easier to change the MIDI ticks per quarter note. I took a look at the Reaper manual, and it looks like you can change MIDI ticks per quarter note as follows:
1. Go to the Options menu and choose Preferences.
2. Go to the category Media and the sub-category MIDI.
3. Change the value for "Ticks per quarter note for new MIDI items".
To make a quarter note in the MIDI file become 4 rows in FamiTracker, set Reaper to use 96 MIDI ticks per quarter note.
If you want a different relationship between MIDI quarter notes and FamiTracker rows, figure out the correct MIDI ticks per quarter note to use based on this fact: during the import, FamiTracker considers each row to be 24 MIDI ticks.
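The arithmetic above can be written down as a small helper. This is a sketch that relies only on the 24 MIDI ticks per row figure stated in this post; the function names are invented, and plain integer division is a simplification (how the importer treats notes that don't land exactly on a row boundary is not covered here).

```c
#include <assert.h>

/* Per the post above: FamiTracker 0.4.2 treats each row as 24 MIDI ticks. */
#define MIDI_TICKS_PER_ROW 24u

/* Rows that one quarter note occupies after import,
   given the MIDI file resolution (ticks per quarter note). */
unsigned rows_per_quarter_note(unsigned ticks_per_quarter)
{
    return ticks_per_quarter / MIDI_TICKS_PER_ROW;
}

/* Resolution to set in the MIDI editor so that a quarter note
   imports as the desired number of rows. */
unsigned resolution_for_rows(unsigned rows_per_quarter)
{
    assert(rows_per_quarter > 0);
    return rows_per_quarter * MIDI_TICKS_PER_ROW;
}
```

With Reaper's 960 ticks per quarter note this gives 40 rows per quarter note, and asking for 4 rows gives the suggested resolution of 96.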
Thanks again. I'll give it a try.
I tried it, works perfectly, I'm going to update the MIDI section to indicate the better method, Bavi_H.
Options/Preferences/MIDI/ 96 ticks per quarter note.
You get 4 rows per note in Famitracker 0.4.2, through MIDI import.
Very nice tutorials! I didn't even realize C was an option for NES programming.
You might want to consider adding a slight background to the code examples so it's easier to tell what is code instead of explanation. Otherwise, great stuff!
Thanks, I'll consider making color changes to the code...probably better I should improve the code, maybe.
I'm considering adding a few pages on ASM lessons, but I'm not sure how I should organize it. I don't want to go alphabetically, as some documents do...I wanted to start with simple LDA, STA...but it occurred to me that some of the more complicated LDA modes (indirect) probably shouldn't go on the first page...
Maybe I should start with math and bit shifting, and simple zero page LDA, STA.
dougeff wrote:
I'm considering adding a few pages on ASM lessons, but I'm not sure how I should organize it. I don't want to go alphabetically, as some documents do...I wanted to start with simple LDA, STA...but it occurred to me that some of the more complicated LDA modes (indirect) probably shouldn't go on the first page...
Maybe I should start with math and bit shifting, and simple zero page LDA, STA.
I'd say start with explaining what the Accumulator and index registers are, and then introducing LDA, STA; LDX, STX; LDY, STY. Then explain the difference between an address and an immediate value (With maybe a brief mention of the zero-page). After that, you could gradually introduce new instructions, categorized by math, bitwise operations, logic, etc., and then later on go into more advanced addressing methods like Indexed and Indirect.
Just an idea.
First of all, thanks to dougeff for the work he has done on his blog. I had needed a tutorial like this for a long time. I must also apologize for my bad English; this post was translated with Google Translate.
However, there are many things that escape me despite being fairly well explained.
My main doubt comes from not knowing how to create levels longer than two simple nametables with horizontal scrolling.
How can I make the nametable update little by little while the background advances?
Thank you.
Keep the camera position in a variable. This camera position defines what area is visible. From it, you can calculate the position of the column of tiles that is next to appear. If the camera has not yet advanced to a new column of tiles, there's nothing new to see, so don't update anything. But if it has, then prepare a list of tile numbers, and then copy it into nametable memory during the next vblank.
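A minimal sketch of that check in C, assuming the level scrolls to the right, tiles are 8 pixels wide, and the screen is 256 pixels wide. The names (last_column, column_to_update) are hypothetical, and building the tile list and copying it during vblank are left out.

```c
#include <assert.h>
#include <stdint.h>

#define TILE_WIDTH    8
#define SCREEN_WIDTH  256

/* Last map column that was prepared; 0xFFFF means "none yet" (assumption). */
uint16_t last_column = 0xFFFF;

/* Returns 1 and stores the map column to draw when the column just past
   the camera's right edge has changed; returns 0 when nothing is new.
   Note: on cc65, int is 16 bits, so camera_x must stay below 65280
   for the addition not to overflow. */
uint8_t column_to_update(uint16_t camera_x, uint16_t *column_out)
{
    uint16_t column = (uint16_t)((camera_x + SCREEN_WIDTH) / TILE_WIDTH);
    assert(column_out != 0);
    if (column == last_column) {
        return 0;                   /* still inside the same column */
    }
    last_column = column;
    *column_out = column;
    return 1;
}
```

The game logic would call this once per frame after moving the camera, and only when it returns 1 would it fill a buffer with that column's tiles for the next vblank.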
The camera, represented by a red bracket, has advanced into the column of blocks that need to be updated.
To add a bit...
'camera position' = Horizontal Scroll, the first value written to register $2005
When Horizontal Scroll rolls over (>255) you tell it to change base nametables, controlled by the low two bits of the $2000 register.
xxxx xx00 = nametable #0, left screen
xxxx xx01 = nametable #1, right screen
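As a sketch in C, a 16-bit camera position can be split into those two pieces. This assumes two nametables side by side; the function names are invented, and the base value 0x90 used in the usage note is only an example (NMI enabled, background tiles from the second pattern table).

```c
#include <assert.h>
#include <stdint.h>

/* Low 8 bits of the camera: the first value written to $2005. */
uint8_t scroll_x_for(uint16_t camera_x)
{
    return (uint8_t)(camera_x & 0xFF);
}

/* Bit 8 of the camera selects the left (0) or right (1) nametable,
   which goes into bit 0 of the value written to $2000. */
uint8_t ppu_ctrl_for(uint16_t camera_x, uint8_t ctrl_base)
{
    uint8_t nametable = (uint8_t)((camera_x >> 8) & 1);
    assert((ctrl_base & 0x03) == 0);    /* caller leaves the nametable bits clear */
    return (uint8_t)(ctrl_base | nametable);
}
```

For a camera at 0x01FE with ctrl_base 0x90, this yields 0x91 for $2000 and 0xFE for the horizontal scroll. Once the camera passes 0x01FF, the nametable bit goes back to 0.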
Nerdy Nights also has a tutorial on this...
http://nintendoage.com/forum/messagevie ... adid=36958
The truth is that this is very complicated. In the end I based my work on dougeff's lesson 11 to get scrolling working.
I'm uploading a small demo:
https://www.dropbox.com/home/public/NES ... 022016.nes
Why do these strange artifacts appear in the sky? My nametable doesn't look like that. I don't know why it loads that.
PS: Remember that I'm programming based on dougeff's lessons.
The attribute table definitely has the wrong palette selected for those parts of the screen. Maybe you're writing too many bytes to the PPU, and it's corrupting the attribute table.
dougeff wrote:
The attribute table definitely has the wrong palette selected for those parts of the screen. Maybe you're writing too many bytes to the PPU, and it's corrupting the attribute table.
Hmm, I've been experimenting with the lesson 10 code, and if I change the tiles assigned to the sky, the same fault occurs.
Simply putting tile 1 where the tiles previously had 0 makes the error appear.
const unsigned char METATILES[]={
0, 0, 0, 0, //0 sky
....
const unsigned char METATILES[]={
1, 1, 1, 1, //0 sky
...
Could it be a problem in the code itself? Some misconfiguration?
I'll look at it later today.
dougeff wrote:
I'll look at it later today.
Thanks!!!
Apparently I forgot how to do math when I wrote this one, and it was only by dumb luck that it worked to begin with.
The tile buffering system was buggy...each was overflowing into the next one because...
(in BufferMT.c)...
while (index < 16){
should have been...
while (index < 15){
Since the screen is only 15 metatiles high (*16 pixels = 240)...and it should index from 0-14.
And I also increased the buffer sizes (in lesson11.c)...slightly larger than we need, but now easier to read in the hex editor debugging tool...
unsigned char BUFFER1[32];//left column 1
unsigned char BUFFER2[32];//right column 1
unsigned char BUFFER3[32];//left column 2
unsigned char BUFFER4[32];//right column 2
(before I erroneously had them 26, but was indexing up to 30).
Anyway, the link has been updated...sorry for the trouble.
http://dl.dropboxusercontent.com/s/08oi ... sson11.zip
Ouh yeah!!! Thank you dougeff
Indeed , that was the problem . I keep messing with this demo.
Continuing to study the code, I tried to increase the number of rooms from 4 to 5 (for example).
I created a new CSV called A5.csv.
I added #include "BG/A5.csv".
In the ROOMS[] array I added (int) &A5.
In the move_logic() function, I changed Room &= 3 to use 4.
In the New_Room() function, I changed RoomB &= 3 to use 4.
In the main() function, I changed RoomPlus &= 3 to use 4.
Even so, I can't get the new room to appear.
Am I skipping a step?
It's a problem with powers of two and how bitwise AND works as a fast modulo. As a rule of thumb, "a % b" can be calculated as "a & (b-1)" if b is a power of two (and a is not negative). As 5 is not a power of two, you can't do the trick here.
Always stick to powers of two, otherwise you'll have to use modulus (a no-no as it is slow) or check for boundaries by hand when you increment or decrement the value to adjust.
Long story short: use 8 rooms and calculate the modulus using & 7, or use 16 and use & 15 and so forth. Only valid for powers of two.
Long story long, you'll see this with an example.
7 % 4 = 3.
7 & 3 = 3 -> 00000111 & 00000011 = 00000011.
.
7 % 5 = 2.
7 & 4 = 4!! -> 00000111 & 00000100 = 00000100. Not right.
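The two ways of wrapping a room counter can be put side by side in C. This is a sketch with made-up function names; the mask version checks its own power-of-two precondition, and the compare version is the fallback for counts such as 5.

```c
#include <assert.h>
#include <stdint.h>

/* Mask version: only valid when count is a power of two. */
uint8_t next_room_mask(uint8_t room, uint8_t count)
{
    assert(count != 0 && (count & (count - 1)) == 0);
    return (uint8_t)((room + 1) & (count - 1));
}

/* Compare version: works for any non-zero count. */
uint8_t next_room_compare(uint8_t room, uint8_t count)
{
    assert(count != 0);
    ++room;
    if (room >= count) {
        room = 0;
    }
    return room;
}
```

With 4 rooms both versions step 0, 1, 2, 3, 0. With 5 rooms only the compare version is usable, and masking with the count itself (& 4) collapses every value to 0 or 4, which is why the extra room never showed up.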
I myself tend to use 16 rooms for horizontal scrolling stripes in my maps, as you get a fairly large area to play in, 256 tiles wide. The maths are simple as you are dealing with powers of two, plus you can use fixed-point maths with 4 bits of precision. 16 rooms are 4096 pixels wide. 4 bits of fixed-point precision give you a minimum increment of 1/16 of a pixel, and you can store X coordinates in a simple unsigned int (65536 values = 4096 x 16).
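A sketch of that fixed-point layout in C: the upper 12 bits hold the pixel position (0 to 4095) and the lower 4 bits hold sixteenths of a pixel. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FP_SHIFT 4   /* 4 fractional bits: steps of 1/16 pixel */

/* Convert a whole-pixel position to 12.4 fixed point. */
uint16_t fp_from_pixels(uint16_t pixels)
{
    assert(pixels < 4096);              /* 16 rooms x 256 pixels */
    return (uint16_t)(pixels << FP_SHIFT);
}

/* Drop the fractional bits to get the pixel position back. */
uint16_t fp_to_pixels(uint16_t fp)
{
    return (uint16_t)(fp >> FP_SHIFT);
}
```

A speed of 8 in this format is half a pixel per frame: starting from pixel 100, the on-screen position stays at 100 after one frame and reaches 101 after two.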
I'm away from my computer, but I think I understand you...
So my code does...
++Room;
Room &= 3; to keep it 0-3
Changing to...
Room &= 4;
Won't work. It will merely keep you in room #0.
You need to...
if (Room > 4) Room = 0;
Yes, dougeff
Parameters
Room &= 4;
RoomPlus &= 4;
RoomB &= 4;
etc... not working properly.
Using
if (Room > 4) Room = 0;
if (RoomPlus > 4) RoomPlus = 0;
if (RoomB > 4) RoomB = 0;
etc... and it works as I want.
Once again, thank you so much!
Reviewing the lessons, I noticed there is something I'm not clear about in lesson 1...
How does the program know where the letter tiles are in the CHR? Magic?
Is TEXT[] a predefined array?
If a program uses CHR ROM, and the mapper is NROM or something else incapable of bank switching CHR ROM, then the tiles are already available to the PPU from the moment the power is switched on.
CHR RAM or bank switching would complicate things, as the program would have to do some setup in order to make the tiles available. In the case of bank switching, it would have to write bank numbers to the mapper's ports. In the case of CHR RAM, it would have to copy the tiles from PRG ROM to CHR RAM through the PPU.
tepples wrote:
If a program uses CHR ROM, and the mapper is NROM or something else incapable of bank switching CHR ROM, then the tiles are already available to the PPU from the moment the power is switched on.
CHR RAM or bank switching would complicate things, as the program would have to do some setup in order to make the tiles available. In the case of bank switching, it would have to write bank numbers to the mapper's ports. In the case of CHR RAM, it would have to copy the tiles from PRG ROM to CHR RAM through the PPU.
Ok, thanks for the explanation.
I'm still progressing little by little. Now I'm busy with collisions. So far everything is fine.
I've managed to make my character move a box around the screen. I've also managed to make him shoot.
Now my question: how do I make sprites disappear from the screen? For example, when the shot touches the box, both objects should disappear. How do I delete them?
Diskover wrote:
How do I get sprite disappear from the screen? For example, when shooting touch the box, these two items disappear. How do I delete them?
Normally, game objects are placed into RAM slots. Game engines are made to support a certain number of active objects, and every frame these objects are updated and drawn. Deleting an object is just a matter of freeing up its slot, so the object will cease to exist and will not be updated or drawn anymore. Games with a constant number of objects, such as Pong (which always has 2 paddles and 1 ball), can get away with having hardcoded objects, but any game that involves objects being created and destroyed dynamically needs to have some sort of object management.
When a bullet hits an enemy, the common thing to do is destroy both objects (freeing up their slots) and create a new explosion object.
Also, giving a sprite a Y coordinate between $f0-ff will put it off the screen.
But, I also agree with Tokumaru. You'll need a system that (every frame) looks at which sprite objects are active and how they should be drawn on screen, reinserts those values into the Sprite Buffer...and clears the unused sprites by inserting a Y value of $f0 or higher. Every frame.
Ideally you would also shuffle the sprites 'priority' (position within the buffer) on every pass, (because of the 8 sprite per scanline limit) so that they* flicker rather than disappear.
*in this context I mean the 9th sprite on screen on a scanline will disappear. Unrelated to your question about making sprites disappear.
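A minimal sketch of the slot idea described in the last two posts, written as plain C. All names here (Object, spawn_object, destroy_object, draw_objects, the slot counts) are made up for illustration and are not taken from any of the example code in this thread. Each object takes one sprite, the sprite buffer is rebuilt every frame, and unused sprites are parked at Y = $F8. The `start` argument rotates which slot is drawn first, as a simple stand-in for the priority shuffling mentioned above.

```c
#define MAX_OBJECTS 8
#define MAX_SPRITES 64
#define OFFSCREEN_Y 0xF8  /* any Y from $F0 to $FF hides a sprite */

typedef struct {
    unsigned char active; /* 0 = slot is free */
    unsigned char x;
    unsigned char y;
    unsigned char tile;
} Object;

static Object objects[MAX_OBJECTS];
/* 4 bytes per sprite: Y, tile, attributes, X */
static unsigned char sprite_buffer[MAX_SPRITES * 4];

/* Claim the first free slot. Returns the slot index, or -1 if all are in use. */
static int spawn_object(unsigned char x, unsigned char y, unsigned char tile) {
    int i;
    for (i = 0; i < MAX_OBJECTS; ++i) {
        if (!objects[i].active) {
            objects[i].active = 1;
            objects[i].x = x;
            objects[i].y = y;
            objects[i].tile = tile;
            return i;
        }
    }
    return -1;
}

/* "Deleting" an object is just freeing its slot. */
static void destroy_object(int slot) {
    objects[slot].active = 0;
}

/* Rebuild the sprite buffer from the active objects, starting at slot
   `start`, then hide every unused sprite. Returns the number drawn. */
static unsigned char draw_objects(unsigned char start) {
    unsigned char n, i, used = 0;
    for (n = 0; n < MAX_OBJECTS; ++n) {
        i = (unsigned char)((start + n) % MAX_OBJECTS);
        if (!objects[i].active) continue;
        sprite_buffer[used * 4 + 0] = objects[i].y;
        sprite_buffer[used * 4 + 1] = objects[i].tile;
        sprite_buffer[used * 4 + 2] = 0; /* attributes: palette 0, no flip */
        sprite_buffer[used * 4 + 3] = objects[i].x;
        ++used;
    }
    for (n = used; n < MAX_SPRITES; ++n) {
        sprite_buffer[n * 4] = OFFSCREEN_Y;
    }
    return used;
}
```

When a bullet hits a box, the game would call destroy_object() on both slots (and perhaps spawn_object() for an explosion); the next call to draw_objects() then leaves their old sprites off screen.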
Quote:
Now my question: how do I make sprites disappear from the screen? For example, when the shot touches the box, both objects should disappear. How do I delete them?
Knowing how to program an object system is fundamental on any platform, including the NES, as was graciously explained to me a while back.
Ok, guys. I'll keep your advice in mind. I don't know if I'm doing it right, but at least things are coming along.
Here's a demo of what I have so far, to see what you think.
My character can move a box, and can also shoot to destroy either the box or the enemy.
DEMO alpha 0.06:
https://dl.dropboxusercontent.com/u/319 ... 200.06.nes
Looks good, keep working at it...
I see you're using my old controller-reading code, which is buggy. I've since updated all the links on my blog. See this discussion for the newer code:
viewtopic.php?f=2&t=13796
Thanks!!!
I've updated the controller code.
How did you know I was using the old one?
I do a lot of ROM hacking. Plus, I was worried that people were using the buggy code.
Anyway, I set a breakpoint for reads from $4016.
I've been busy with this for a few days and there has been some progress, such as doors for moving from one screen to another.
Right now I'm working on creating a menu. So far, pressing START freezes the action, switches to the menu screen, and lets you return to the game as it was before.
This morning I created the item selection arrow, and a strange problem has come up. When I draw the arrow, two strange tiles appear at the top of the screen. I don't understand why they appear.
Something similar used to happen when I didn't give a size when declaring a "SPRITES_X[]" array, in this case FLECHA_SPRITES[0x04];. I don't know whether that could be the error here.
Can you think of anything?
ROM:
https://dl.dropboxusercontent.com/u/319 ... 200.08.nes
Without the source code, I won't be able to figure it out. Clearly, 2 incorrect sprites are appearing.
Have the linker make a label file. Find the address of the label that handles sprites. Set a breakpoint in FCEUX for execution at that address (at the moment the problem happens). Step through the code, and see what numbers are being loaded into the sprite area of RAM.
Or, review that RAM in the hex editor tool, and see the exact address of the strange sprite. Set a breakpoint for writes to that RAM area, and try to figure out where those numbers are coming from.
Learning to debug is an important lesson. I spend at least 25% of my time fixing bugs.
OK, sorry for the trouble. I found the fault. When updating the sprite, the FOR loop ran four times instead of once. That's what caused the error.
Thank you for your explanations. Some things are hard for me to understand, but I'm pushing on. I look at your examples a lot to find solutions and, as you can see, everything seems to work fine so far.
What did you think of my alpha 0.08? This afternoon I'll add weapon selection to the menu.
And a technical question: is it better to have a single function with many instructions, or several functions with the instructions split between them? For example, in the function movi_logic() I keep the movement of everything: player, enemies, menu arrow, weapons, etc. Would it be better to give each thing its own movement function? Does it affect performance?
UPDATE: I now have a functional menu, and you can pick up a weapon and then select it.
https://dl.dropboxusercontent.com/u/319 ... 00.09c.nes
Diskover wrote:
And a technical question: is it better to have a single function with many instructions, or several functions with the instructions split between them? For example, in the function movi_logic() I keep the movement of everything: player, enemies, menu arrow, weapons, etc. Would it be better to give each thing its own movement function? Does it affect performance?
It's generally easier and more organized to isolate tasks as much as possible. There is an impact on performance, since each function call needs at least 12 cycles more than inlined code, because of the JSR and the RTS, plus whatever time is needed to handle parameters, return values and such, but depending on the overall complexity of the game, that might not even make a difference.
I suggest you try to make things as organized as you can, so the code is easy to maintain. Use whatever structure you're most comfortable working with. Then later on, IF you detect any performance issues, you do something about it.
cc65 needs inline functions. That would make code both readable and fast in those cases you are using functions that will be called just once in your code (just for organization). I've coded some stuff for the SEGA 8-bit consoles in SDCC and inline functions, which that compiler supports, are a great feature.
In my case, I use a dirty trick (dirty, dirty) if I want to keep things organized but I don't want to spend cycles calling functions (for example, when packing up small tasks from NMI to sprite 0 collision detection for splits): I just put the code in another file and use a #include in the main file. This is ugly C code, but it does the job for me.
Code:
switch (ent) {
case 1:
#include "enems_linear.h"
break;
case 2:
#include "enems_gyrosaw.h"
break;
case 3:
#include "enems_buzzer.h"
break;
case 4:
#include "enems_pezon.h"
break;
case 5:
#include "enems_fanty.h"
break;
case 6:
#include "enems_bloco.h"
break;
case 7:
#include "enems_final_frog.h"
break;
}
You can use a preprocessor macro (#define) as a substitute for inline functions. It's also handy for unrolling loops.
Edit: rainwarrior was faster.
Yeah, inline functions would be good here.
Because for every called subroutine, the assembly code uses a JSR and an RTS. But writing everything in one function could become hard to read.
However, as a workaround, you can use macros:
Code:
#define MovePlayer()\
do\
{\
Task1();\
Task2();\
}\
while (0)
The do while (0) is there so that you don't run into problems in this situation:
Code:
if (a == 3)
MovePlayer();
else
SomethingElse();
If you didn't have the do while (0), the compiler would complain because of the semicolon, since the macro would be translated to this:
Code:
if (a == 3)
{
Task1();
Task2();
}; /* <-- Error if followed by an else. */
else
SomethingElse();
So, you would be forced to write:
Code:
if (a == 3)
MovePlayer()
else
SomethingElse();
which doesn't look good.
That's why do while (0) is the best. (It doesn't add anything to the compiled code. Both versions would create exactly the same output.)
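For reference, here is the whole pattern in one compilable piece. Task1, Task2 and SomethingElse are placeholder functions that only record that they ran; the `update` wrapper and the log array are additions for this sketch.

```c
/* Placeholder tasks: each one just logs an ID so the call order is visible. */
static int task_log[4];
static int task_count;

static void Task1(void) { task_log[task_count++] = 1; }
static void Task2(void) { task_log[task_count++] = 2; }
static void SomethingElse(void) { task_log[task_count++] = 3; }

/* Expands to a single statement, so it behaves like a function call
   when followed by a semicolon. */
#define MovePlayer() \
    do { \
        Task1(); \
        Task2(); \
    } while (0)

static void update(int a) {
    if (a == 3)
        MovePlayer();   /* the semicolon ends the do/while statement */
    else
        SomethingElse();
}
```

Because the macro body is pasted in place, there is no JSR/RTS for MovePlayer itself, only for the tasks it calls.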
I use defines for simple functions I want to inline and of course to unroll loops, but the syntax has its caveats and it's a bit ugly, so when the function is complex, it can be a PITA to maintain if it's written as a define. That's why I use the #include hack for longer, more complex functions I'd rather inline.
I would say if the function is really complex, then an additional JSR, RTS won't really make a difference.
That's why I have a whole bunch of MoveCharacter functions (one for each different type of character) which are only used in one location in the whole program.
So, yes, the algorithms could be written directly since the functions are not reusable.
And yes, using a macro for these long functions would look really ugly.
So, I just use a regular function and call it a day. That's four additional JSR, RTS per frame. If the game will ever lag, I don't think that will be the main issue.
na_th_an wrote:
cc65 needs inline functions. That would make code both readable and fast in those cases you are using functions that will be called just once in your code (just for organization). I've coded some stuff for the SEGA 8-bit consoles in SDCC and inline functions, which that compiler supports, are a great feature.
In my case, I use a dirty trick (dirty, dirty) if I want to keep things organized but I don't want to spend cycles calling functions (for example, when packing up small tasks from NMI to sprite 0 collision detection for splits): I just put the code to another file and use a #include in the main. This is ugly C code, but it does the job for me.
I'll study it. Sometimes I use switch where it suits me.
rainwarrior wrote:
You can use a preprocessor macro (#define) as a substitute for inline functions. It's also handy for unrolling loops.
DRW wrote:
Edit: rainwarrior was faster.
Yeah, inline functions would be good here.
Because for every called subroutine, the assembly code uses a JSR and an RTS. But writing everything in one function could become hard to read.
Yes, in fact I already use macros (#define) quite a lot, mainly to keep things organized and to stop the code from becoming endless. It's more comfortable to work with.
Well, I think it's time to show you some of my progress. Little by little I'm improving things:
- Added an item selection menu.
- Added collision between the box and walls.
- Added lives, energy and points.
- Added Continue and Game Over screens.
- Added a status bar at the top. It shows the score, selected weapon, lives and player energy.
- Added two more maps.
- Added music and sound (in testing).
For some of these things I had to lean a bit on dougeff's Spacy3 demo, especially for the score display and the music. The music is taken directly from that demo. I've been looking into how Famitone works, and honestly it's very complicated. I'll leave it for last.
ROM:
https://dl.dropboxusercontent.com/u/319 ... 00.12b.nes
Looks good.
Maybe now would be a good time to start your own forum thread... perhaps over in the 'homebrew' section.
Pushing the box is a little buggy.
Also, wait till you're in v-blank before turning the screen on and off. You're getting 1 frame of misaligned screens, every time you press 'start' or change rooms.
Example (por ejemplo): you step into a door, which triggers the code to load the next room...
1. Wait for v-blank. As in my example code, wait for NMI_flag != 0.
2. Now, turn the screen 'off': All_Off();
3. Load the next room's data to the PPU.
4. Blank the sprites from the last room (move them off screen): Blank_sprite();
5. Set the position of the hero for the new room, and the position of other sprites, etc.
6. Wait for v-blank: Wait_Vblank();
_Wait_Vblank:
lda $2002
bpl _Wait_Vblank
7. Reset NMI_flag = 0;
8. Turn the screen 'on' and NMIs 'on': All_On();
9. Back to the regular game loop.
Basically I do everything you describe, but I still can't solve the problem.
It seems like a small thing, but that one frame causes problems with the collision map as it loads, and the player has not yet been put in place.
No matter what I try and change, I can't fix the problem.
I don't have your source code, but setting a breakpoint for writes to $2001 and $2005-6 shows me this...while pressing 'Start'. (and viewing the sprite buffer in the hex editor).
-screen is turned off at scanline 79, giving 1 frame that is 1/4 drawn.
-while the sprite buffer is set to have all Y positions at $f8 (off screen), the buffer is not pushed to the OAM, so when the screen is turned back on...(at scanline 241, the start of v-blank)...the OAM is still filled with the last screen's sprites. Sprites are visible for 1 frame.
(Start pressed again)...
-screen is turned off at scanline 24, which shows no visible errors, but is technically not the right time to turn off the screen (outside of v-blank)
-screen is turned on at scanline 241 (the start of v-blank)...but for a whole frame, the scroll position is not set, and $28c0 is still in the PPU address register, which misaligns the screen for 1 frame.
OK, also...you are setting $2000-2001 twice every frame...both during v-blank, so no visible errors, but still not necessary. And, the screen is flip-flopping back and forth between nametable #0 and nametable #1. It's not visible, because both screens are loaded with identical data, but certainly not necessary (as far as I can tell).
So...solutions.
Add...
OAM_ADDRESS = 0;
OAM_DMA = 2;
after you blank the sprites.
Add...
SCROLL = 0;
SCROLL = 0;
at the end of the All_On function
Figure out why $2000/2001 is being written to twice, why the nametables are switching, and how to wait for v-blank before turning the screen off (preferably using the NMI_flag).
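Put together, the two additions might look something like the sketch below. The register names follow the snippets above; the function names, the `#ifdef __CC65__` host fallback and the placement of sprite_buffer at $0200 are assumptions of this sketch, not the actual code from the demo.

```c
#ifdef __CC65__
/* Real hardware registers. */
#define OAM_ADDRESS (*(volatile unsigned char *)0x2003)
#define SCROLL      (*(volatile unsigned char *)0x2005)
#define OAM_DMA     (*(volatile unsigned char *)0x4014)
#else
/* Host-side stand-ins (plain variables) so the sketch runs on a PC. */
static volatile unsigned char OAM_ADDRESS, SCROLL, OAM_DMA;
#endif

/* Assumed to be placed at $0200-$02FF by the linker configuration;
   that placement is not shown here. */
static unsigned char sprite_buffer[256];

/* Move every sprite off screen, then push the buffer to the PPU's OAM
   so last room's sprites are really gone. Do this during v-blank or
   while rendering is off. */
static void blank_and_push_sprites(void) {
    unsigned int i;
    for (i = 0; i < 256; i += 4) {
        sprite_buffer[i] = 0xF8;  /* Y position of each sprite */
    }
    OAM_ADDRESS = 0;
    OAM_DMA = 2;                  /* DMA from page 2, i.e. $0200-$02FF */
}

/* Reset the scroll (X, then Y) after any PPU address writes, just
   before rendering is turned back on. */
static void reset_scroll(void) {
    SCROLL = 0;
    SCROLL = 0;
}
```

The scroll reset has to come last because loading a room leaves the PPU address register pointing wherever the final write ended.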
I've added a few pages on 6502 ASM coding to my blog.
http://nesdoug.com/2016/03/10/26-asm-basics/
If anyone out there (who knows this stuff) is willing to look it over, and make sure I didn't make any errors, I would appreciate it.
Although, no rush, really. My statistics page says I only get about 40 viewers a day, and almost none of them have looked at the ASM pages. I just don't like to post errant information. Speaking of...
This document, which I had on my computer, and referenced when looking up ASM instructions...
http://nesdev.com/6502.txt
incorrectly lists PLA as setting no flags. (PLA actually updates the N and Z flags.)
Hey, I made my first YouTube instructable video today. It's not very good (I did about 5 minutes of prep). It's just a quick how-to on using NES screen tool.
https://youtu.be/BtV_NCwWAqs
I have plans to make a few more on FCEUX debugging tools, and such.
(Interesting side note, I modified the palette of my NES Screen Tool and replaced it with Kizul's palette
viewtopic.php?f=21&t=13555
The palette was easy to find {hex editor} since Shiru posted the source code).
I saw your video. Should be useful for beginners, I did find the loud clicking a bit distracting, I'm not sure if the mic was picking that up, or if it is a part of your recording software.
I made a video about using the debugging tools of FCEUX. I probably forgot about a dozen things to mention, but it was getting pretty long.
https://youtu.be/d2XkJQFs0OQ
I've updated every example code (except for the Spacy Shooty game code, which I plan to rewrite from scratch).
Most of the changes are cosmetic (make comments easier to read), or just trying to make the code more stable.
-added a second v-blank wait in the startup code, before writing to PPU. Which I must have accidentally removed and never put back in.
-moved 'things that need to be done every frame' (like sprite DMA) to NMI code
-added a write to a000 (mirroring) to the init code for my MMC3 examples
-removed the copy of nes.lib from every zip, which apparently was completely unnecessary to keep a copy of, since cc65 is able to find its own copy
At some point, I will also update the Spacy Shooty example code.
Great!
I've decided to finally do some testing with structs vs cc65.
If you declare the actual struct in the global space, it puts them in the BSS, and actually takes about as much time to access as any other variable...
Code:
struct foo {
unsigned char X;
int Y;
int Z;
};
struct foo B;
void main (void){
B.X = 4;
B.Y = 5;
}
compiles to ...
Code:
lda #$04
sta _B
ldx #$00
lda #$05
sta _B+1
stx _B+1+1
If, however, you put the struct in the local space, it puts them in the C stack.
Code:
void main (void){
struct foo C;
C.X = 3;
C.Y = 4;
}
compiles to...
Code:
jsr decsp5
lda #$03
ldy #$00
sta (sp),y
iny
lda #$04
sta (sp),y
lda #$00
iny
sta (sp),y
Conclusions: just like variables are faster in cc65 if declared globally, structs seem to also be faster if declared globally. And, much better than I thought.
Yeah, structs by themselves are not too bad. It's arrays of structs indexed by non-constants that can be problematic, e.g.:
Code:
struct foo {
unsigned char X;
int Y;
int Z;
};
struct foo B[5];
unsigned char i;
void main (void){
for ( i = 0; i < 5; ++i ) {
B[i].X = 4;
B[i].Y = 123;
}
}
The above code has to generate code to multiply the index by the struct size (5) to index the array.
"Structs of arrays" is better:
Code:
struct foo {
unsigned char X[5];
int Y[5];
int Z[5];
};
struct foo B;
unsigned char i;
void main (void){
for ( i = 0; i < 5; ++i ) {
B.X[i] = 4;
B.Y[i] = 123;
}
}
However, this code still has the problem that the 16-bit Y and Z need a multiplication by 2 to access them. Splitting them into separate byte-sized YLo, YHi, ZLo, ZHi members could generate more optimal code, but that in turn would complicate the actual use of those members (say, if you want to add or assign a value to "YLo, YHi").
Things I figured out this weekend, and will be corrected on my blog.
I wrote the code for most of the pages very quickly, and occasionally I would get error messages from the cc65 compiler. "converting pointer to int without a cast" "incompatible pointer type" etc...and I didn't know what caused them, but I slapped an (int) on there, and the error message went away. But, it didn't look right to me, and I could never find any example code that required type casting to fix error messages...so it bothered me a bit.
Well, the reason I never found example code to match what I was doing, was because I was doing things wrong. The ASM code was correct, so I assumed I had correctly addressed the issue, but once I saw the correct answer...I see that I hadn't.
example...
Code:
const int AllBackgrounds[] = {(int) &n1,(int) &n2,(int) &n3,(int) &n4 };
I slapped some (int)'s on there because it gave me error messages...but what I really wanted was this...
Code:
const unsigned char * const All_Backgrounds[]={n1,n2,n3,n4};
and the companion piece...
Code:
UnRLE(BGDaddress);
I believe the error was that my prototype said this...
Code:
void __fastcall__ UnRLE(int data);
and, what I really wanted was this...
Code:
void __fastcall__ UnRLE(const unsigned char *data);
because, what I'm really doing with the code, is an array of constant pointers to an array of constant characters. And, what I'm really passing to the function is a pointer to an array.
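Here is a small self-contained illustration of those two declarations working together without any casts. The run-length format below, with (count, value) pairs ended by a count of 0, is invented for this sketch and is not the format the blog's UnRLE actually reads; unrle_demo and the `out` buffer are likewise hypothetical.

```c
#include <stddef.h>

/* Hypothetical compressed data: (count, value) pairs, ended by count 0. */
static const unsigned char n1[] = { 3, 0x10, 2, 0x20, 0 };
static const unsigned char n2[] = { 4, 0x30, 0 };

/* An array of constant pointers to constant bytes. Array names decay
   to pointers, so no (int) casts and no & are needed. */
static const unsigned char * const All_Backgrounds[] = { n1, n2 };

static unsigned char out[16];

/* Takes a pointer to constant data, matching the corrected prototype
   style. Writes the expanded bytes to dest and returns how many. */
static size_t unrle_demo(const unsigned char *data, unsigned char *dest) {
    size_t written = 0;
    unsigned char count;
    unsigned char value;
    while ((count = *data++) != 0) {
        value = *data++;
        while (count--) {
            dest[written++] = value;
        }
    }
    return written;
}
```

A call then reads naturally as unrle_demo(All_Backgrounds[room], out), and the compiler can check that the argument really is a pointer to constant bytes.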
This will be fixed soon on the blog example code.
Further, I don't think I've fully tested 'controller 2' input code. All my example code only tests 'controller 1'. I will have to do that as well.
EDIT: I tested it. Works fine.
I've updated every example code on the blog. As usual, if anyone spots any outrageous bugs or bad programming practices, let me know. Thanks.
Here's a quick link to the Spacy Shooty source code...
http://dl.dropboxusercontent.com/s/70f8 ... Spacy4.zip
Update (10-17-2016): I updated reset.s in every file, to make sure that initlib and copydata were included. I also added Wait_Vblank(); to several files, just before rendering is turned on, to fix 1 frame of misaligned screens. Finally, I changed the .cfg file on the MMC3 examples to include the missing segments that I had deleted.
See here for further discussion on missing 'copydata'...causing errors.
viewtopic.php?f=10&t=14947
Update Feb 9, 2017
I changed every .cfg file to include a "ONCE" segment, so it will compile with the latest version of cc65.
I added a makefile for Linux users, and people who prefer Gnu Make to .bat files. Well, Linux users will have to edit the makefile slightly. I originally wrote them on a Linux computer, but then brought them over to my Windows computer, and edited them to work there...
...anyway, Linux users will have to uncomment the lines rm *.o and comment out the lines del *.o. (etc for .nes files under CLEAN:)
UNRELATED SIDENOTE:
I wrote a 6502 disassembler in python. I might post it in a few weeks.
If you are using Make from MSYS, you'll probably have GNU Coreutils, which includes rm. For other things that tend to vary, such as the presence or absence of .exe in the name of a native executable produced by the linker, you can use the presence or absence of the environment variable COMSPEC to set makefile variables.
See Writing portable makefiles.
thefox wrote:
However, this code still has the problem that the 16-bit Y and Z need a multiplication by 2 to access them.
Isn't multiplication of 2, 4, 8, 16 etc. unproblematic since the compiler can turn it into a simple bit shift? So, an array of ints shouldn't be that much of an issue. At least it's not comparable to the access complexity of an array of a struct.
thefox wrote:
Splitting them into separate byte-sized YLo, YHi, ZLo, ZHi members could generate more optimal code, but that in turn would complicate the actual use of those members (say, if you want to add or assign a value to "YLo, YHi").
Yeah, I would highly advise against that. If you happen to need an integer in an NES game (which should be more the exception than the rule) let the compiler handle it. Don't fiddle around with two byte values if they are supposed to represent a single number.
DRW wrote:
thefox wrote:
However, this code still has the problem that the 16-bit Y and Z need a multiplication by 2 to access them.
Isn't multiplication of 2, 4, 8, 16 etc. unproblematic since the compiler can turn it into a simple bit shift? So, an array of ints shouldn't be that much of an issue. At least it's not comparable to the access complexity of an array of a struct.
Yeah it's not a huge problem, but non-optimal nevertheless.
DRW wrote:
thefox wrote:
However, this code still has the problem that the 16-bit Y and Z need a multiplication by 2 to access them.
Isn't multiplication of 2, 4, 8, 16 etc. unproblematic since the compiler can turn it into a simple bit shift? So, an array of ints shouldn't be that much of an issue. At least it's not comparable to the access complexity of an array of a struct.
It's not just a bit shift. If you can do an array access with an 8-bit index, it can just go into X or Y. If the index is wider, it can't do that anymore, and you get a 16-bit shift plus a 16-bit add operation on a temporary pointer, and then on top of that the array access becomes indirect.
Here's an example:
Code:
unsigned char ac[35];
unsigned int ai[35];
void index_test()
{
static unsigned char i;
// 1.
i = index();
ac[i] = 5; // 8-bit index on 8-bit array
// 2.
i = index();
ac[i*2] = 6; // index is promoted to 16-bit int
// 3.
i = index() * 2;
ac[i] = 7; // index was implicitly cast back to 8-bit before use
// 4.
i = index();
ai[i] = 8; // index is promoted to 16-bit int by implicit multiplication by 2
}
And the generated assembly:
Code:
; 1.
; i = index();
jsr _index
sta L0017
; ac[i] = 5;
ldy L0017
lda #$05
sta _ac,y
; 2.
; i = index();
jsr _index
sta L0017
; ac[i*2] = 6;
ldx #$00
lda L0017
asl a
bcc L3763
inx
clc
L3763: adc #<(_ac)
sta ptr1
txa
adc #>(_ac)
sta ptr1+1
lda #$06
ldy #$00
sta (ptr1),y
; 3.
; i = index() * 2;
jsr _index
asl a
sta L0017
; ac[i] = 7;
ldy L0017
lda #$07
sta _ac,y
; 4.
; i = index();
jsr _index
sta L0017
; ai[i] = 8;
ldx #$00
lda L0017
asl a
bcc L3764
inx
clc
L3764: adc #<(_ai)
sta ptr1
txa
adc #>(_ai)
sta ptr1+1
lda #$08
ldy #$00
sta (ptr1),y
iny
lda #$00
sta (ptr1),y
The difference between examples 2 and 3 especially shows how helpful it can be to undo integer promotion before accessing the array with it. With example 4, once you use arrays of 16-bit (or larger) types all indexed access becomes full 16-bit indirection, and you can't really do anything to stop that.
So... not as bad as a multiplication, but if you're looking to reduce some of your overhead, it's actually not a terrible idea to "manually" pack striped arrays. A syntax vs convenience tradeoff, though you could simplify the syntax with macros.
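As a rough sketch of what "manually" striped arrays with macro helpers could look like: the names YLo, YHi, SET_Y, GET_Y and fill_y are made up for this example. The intent is that every access is an 8-bit index into a byte array, as in case 1 above, but the generated assembly should be checked rather than assumed.

```c
#define COUNT 5

/* One 16-bit value per object, split into two byte arrays. */
static unsigned char YLo[COUNT];
static unsigned char YHi[COUNT];

/* Hypothetical helper macros that hide the split from the caller. */
#define SET_Y(i, v) \
    do { \
        YLo[(i)] = (unsigned char)((v) & 0xFF); \
        YHi[(i)] = (unsigned char)(((v) >> 8) & 0xFF); \
    } while (0)

#define GET_Y(i) ((unsigned int)(YLo[(i)] | ((unsigned int)YHi[(i)] << 8)))

/* Assign the same 16-bit value to every entry using an 8-bit index. */
static void fill_y(unsigned int value) {
    static unsigned char i;
    for (i = 0; i < COUNT; ++i) {
        SET_Y(i, value);
    }
}
```

Assignment stays readable this way; arithmetic such as adding to a value still needs a GET_Y followed by a SET_Y, which is the convenience cost mentioned above.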
O.k., yeah, that makes sense. Maybe I could have optimized some stuff with this knowledge in my game because the x position of each character was an integer.
(y was a byte because I had a status bar at the top, so I could simply declare that every sprite that has a position within the status bar is declared as out of screen and I didn't render these sprites at all.)
I added a few more pages to my blog, using the neslib.
Also, the last one is a very, very simple PONG example game (with no scoreboard, since I wanted it to be as simple as possible).
https://nesdoug.com/2017/08/09/sprite-b ... sion-pong/
Great addition, thanks.
Have you considered Patreon? Your input is really valuable.
Great contribution. I'm impatiently awaiting an article explaining how to do large scrolling with neslib.
I've been meaning to program an all direction scrolling game. Like Crystalis. My main stumbling block, is I can't think of how to compress each 'room' (16x15 set of metatiles, which expands to 256x240 pixel area).
In an ideal world, the game should only need to find the exact metatiles for the next 16 pixels to the right, but once it's compressed, there's no easy way to do that...to uncompress just the exact metatiles I need.
So, in my head, I'm going to need to uncompress 4 full rooms into RAM at any given time (since you can stand in the corner of 4 different rooms at the same time). That's going to eat up $400 bytes of RAM, plus some bytes for a buffer of tiles to push to the PPU during the next v-blank.
Anyway. It's not very simple.
And, without compression, the game might be very small, or need lots of bank switching, which is a whole other level of complexity, that might not suit a beginning tutorial.
Edit: on second thought, I should be able to fit 64 rooms (8x8) in an NROM-sized game: 242x64 = 15488 bytes (240 bytes per room, plus 2 extra bytes for a pointer to the start of each room). Maybe I won't compress for the example code.
Edit 2: and if all the rooms were uncompressed, they wouldn't have to be loaded into RAM. I'll think about it some more.
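To make the uncompressed option concrete, here is a rough sketch of reading one metatile straight out of the room data with no RAM copy. All names (`room_data`, `metatile_at`, the `ROOM_*` constants) are hypothetical. On a cartridge the table would be `const` data in ROM; it is left writable here only so the sketch can be tried on a PC.

```c
#include <stdint.h>

#define ROOM_W        16
#define ROOM_H        15
#define ROOM_SIZE     (ROOM_W * ROOM_H)            /* 240 bytes per room */
#define MAP_ROOMS_W   8
#define MAP_ROOMS_H   8
#define ROOM_COUNT    (MAP_ROOMS_W * MAP_ROOMS_H)  /* 64 rooms */
/* 240 data bytes + 2 pointer bytes per room, matching the estimate above. */
#define MAP_ROM_BYTES ((ROOM_SIZE + 2) * ROOM_COUNT)

/* Would be const ROM data in a real build (assumption: one byte per metatile). */
uint8_t room_data[ROOM_COUNT][ROOM_SIZE];

/* Fetch the metatile at column tx (0-15), row ty (0-14) of room (room_x, room_y). */
uint8_t metatile_at(uint8_t room_x, uint8_t room_y, uint8_t tx, uint8_t ty)
{
    const uint8_t *room = room_data[room_y * MAP_ROOMS_W + room_x];
    return room[ty * ROOM_W + tx];
}
```

Because any cell can be read directly, fetching "just the next 16 pixels to the right" becomes 15 calls with a fixed `tx`, which is exactly the random access that whole-room compression takes away.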
There are different types of compression that allow almost random access to individual metatiles. I particularly use metatiles (256x256) of metatiles (128x128) of metatiles (64x64) of metatiles (32x32) of metatiles (16x16). Traversing the metatile structure until the 16x16 ones isn't particularly slow. Of course this scheme isn't ideal for all kinds of maps, but I'm sure you can use something more accessible than "RLE/LZ one whole room/screen into a solid binary block".
Talk to na_th_an
He programmed Sir Ababol, where he must have run into the same problem and solved it... I believe.
A quadtree-style scheme similar to what tokumaru proposes would take 5440 bytes:
- Grid of 8x8 top-level metatiles, each 256x256: 64 bytes
- Top left 128x128 metatile in each 256x256 metatile: 256 bytes
- Top right 128x128 metatile in each 256x256 metatile: 256 bytes
- Bottom left 128x128 metatile in each 256x256 metatile: 256 bytes
- Bottom right 128x128 metatile in each 256x256 metatile: 256 bytes
- Top left 64x64 metatile in each 128x128 metatile: 256 bytes
- Top right 64x64 metatile in each 128x128 metatile: 256 bytes
- Bottom left 64x64 metatile in each 128x128 metatile: 256 bytes
- Bottom right 64x64 metatile in each 128x128 metatile: 256 bytes
- Top left 32x32 metatile in each 64x64 metatile: 256 bytes
- Top right 32x32 metatile in each 64x64 metatile: 256 bytes
- Bottom left 32x32 metatile in each 64x64 metatile: 256 bytes
- Bottom right 32x32 metatile in each 64x64 metatile: 256 bytes
- Top left 16x16 metatile in each 32x32 metatile: 256 bytes
- Top right 16x16 metatile in each 32x32 metatile: 256 bytes
- Bottom left 16x16 metatile in each 32x32 metatile: 256 bytes
- Bottom right 16x16 metatile in each 32x32 metatile: 256 bytes
- Top left 8x8 tile in each 16x16 metatile: 256 bytes
- Top right 8x8 tile in each 16x16 metatile: 256 bytes
- Bottom left 8x8 tile in each 16x16 metatile: 256 bytes
- Bottom right 8x8 tile in each 16x16 metatile: 256 bytes
- Attribute of each 16x16 metatile: 256 bytes
Some of these tables can be made shorter based on how much repetition is in your actual map.
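As a rough illustration of how such tables could be traversed, here is a sketch that finds the 16x16 metatile under a pixel position. The table names and the `[level][quadrant][id]` layout are my own assumptions, not tokumaru's or tepples's actual data format. The 8x8 tile and attribute tables from the list would be one more lookup on the returned id and are left out.

```c
#include <stdint.h>

/* 8x8 grid of 256x256 metatile ids: 64 bytes. */
uint8_t top_grid[64];

/* child[level][quadrant][parent id] -> child id.
   level 0: 256->128, 1: 128->64, 2: 64->32, 3: 32->16.
   quadrant 0 = top left, 1 = top right, 2 = bottom left, 3 = bottom right.
   That is 16 tables of 256 bytes, matching the list above. */
uint8_t child[4][4][256];

/* Return the id of the 16x16 metatile covering pixel (x, y) of a 2048x2048 map. */
uint8_t metatile16_at(uint16_t x, uint16_t y)
{
    uint8_t id = top_grid[((y >> 8) & 7) * 8 + ((x >> 8) & 7)];
    uint16_t bit = 128;   /* size of the child at the current level */
    uint8_t level;

    for (level = 0; level < 4; ++level) {
        /* One coordinate bit per axis selects the quadrant at this level. */
        uint8_t quadrant = (uint8_t)(((y & bit) ? 2 : 0) | ((x & bit) ? 1 : 0));
        id = child[level][quadrant][id];
        bit >>= 1;
    }
    return id;
}
```

Each lookup is one top-grid read plus four table reads, so fetching a column of 16x16 metatiles for scrolling stays cheap without decompressing a whole room first.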
Another option is an object-based map, similar to how the Super Mario Bros. and Animal Crossing series represent maps. Represent the map as an (X, Y, thing) list, where the renderer calculates which objects overlap the column to be scrolled onto the screen. For an 8-way scroll, you'd probably need to sort this by (Y screen, X, Y within screen) so that you have only a couple 256-pixel-tall rows of objects to search. The advantage of objects over a quadtree is that repeated objects can be placed at arbitrary 16x16 tile offsets.
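A very stripped-down sketch of the object-list idea, for a single screen-tall strip and without the sorting described above. The `MapObject` layout, the rectangle-shaped "thing", and all names are assumptions for illustration only. Real games like Super Mario Bros. use more compact encodings and object-specific drawing routines.

```c
#include <stdint.h>
#include <string.h>

#define COLUMN_H 15   /* metatile rows in one screen */

/* Hypothetical object: a filled rectangle of one metatile, in 16x16 units. */
typedef struct {
    uint8_t x, y;   /* top-left corner */
    uint8_t w, h;   /* width and height */
    uint8_t tile;   /* metatile id to draw */
} MapObject;

/* Small demo list; a real map would keep this in ROM. */
const MapObject demo_objects[2] = {
    { 2, 10, 4, 2, 0x21 },   /* a platform */
    { 3,  6, 1, 4, 0x30 }    /* a pillar overlapping column 3 */
};

uint8_t column[COLUMN_H];

/* Fill out[] with the metatiles of column col; later objects overwrite earlier ones. */
void build_column(const MapObject *objs, uint8_t count, uint8_t col, uint8_t *out)
{
    uint8_t i;
    unsigned row;

    memset(out, 0, COLUMN_H);             /* 0 = empty background */
    for (i = 0; i < count; ++i) {
        const MapObject *o = &objs[i];
        if (col < o->x || col >= (unsigned)o->x + o->w)
            continue;                      /* object does not touch this column */
        for (row = o->y; row < (unsigned)o->y + o->h && row < COLUMN_H; ++row)
            out[row] = o->tile;
    }
}
```

Calling `build_column(demo_objects, 2, 3, column)` produces the pillar in rows 6 to 9 and the platform in rows 10 and 11. This linear scan is the part that sorting by (Y screen, X, Y within screen) is meant to shorten.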
Another option:
Have a seamlessly repeating background picture (the size of a screen, larger, or smaller, depending on the style and how well it repeats without getting worn out), keep it RLE-compressed in ROM, and keep it relatively simple so it doesn't get too big. Or, if you can afford it, keep it uncompressed. Write it to the screen first (in slices as you scroll).
Then have a number of objects (kind of like Metroid, or what tepples said) which overwrite the basic background. You get the creative freedom/small file size compromise of Metroid, but with a good-looking background rather than an empty black void.
Bug fix.
I forgot to disable the APU frame counter IRQ in my neslib example code, in crt0.s:
ldx #$40
stx $4017
Files have been updated.
I could be mistaken, but Shiru's samples were not disabling it. This is one thing I added to my own code after comparing my original init code with the one in the samples.
Nice addition!
You might want to mention that the reason the Zapper is read from the NES controller II port in licensed Zapper games is that this is how the original Famicom light gun works (it uses the same pins in the expansion port). If a game requires reading from the controller I port (like Chiller does when using two Zappers), people with a Famicom, like me, will be unable to play your game, as those pins are not available on the Famicom. In other words, if a game only supports one Zapper, it is always best to read from port 2 (or both ports) for compatibility reasons.
For the Power Pad / Family Fun Fitness, it's just incompatible with the Famicom (due to some NES pins not being available in the Famicom expansion port) IIRC. Edit: I remembered poorly. The NES version of the mat seems to use the same pins as the Zapper according to the wiki, so the same thing applies to it. It is best read from either port 2 or both ports for full compatibility.