Hey all - trying out a handful of things that rock the current boat least. I have lateral scrolling working visually, and I have no problem getting collision data scroll offsets. But...just wondering about people's methods of updating tile collision data when scrolling (I know there are a few schools of thought on this). Currently, the engine loads 240 bytes' worth of collision data into a RAM table, and those table values are read. What would you guys suggest? Get the new-screen table, bump the column, write every value column by column, shifted against the scroll direction, then write the new column in the scroll direction? It seems like that would be a lot of values to write at a time, constantly.
Any other good suggestions? Or would this be the most sensible method?
For my ninja game, I based it vaguely on SMB, and had 2 full arrays of 16x13=208 (32 pixels at the top for HUD didn't need data)...and collision data doubled as metatiles for updating the PPU as I scrolled.
So 2 arrays, because 2 nametables. And filled as I scrolled.
Right, but you had to move column 1 values to column 0, then 2 to 1, then 3 to 2...and so on...filling the new column to the right...correct?
No.
I didn't shift any values. Once it was in the array it stayed in place.
You need more than a screen's worth of data in RAM or collision gets weird near the edges. I used 32x32 metatiles (made of 16x16 metatiles) to fit 4 screens worth of tiles in a page of RAM. From there you can update the RAM column by column or whatever. It just gives you an offscreen buffer.
If you only scroll in one direction you end up with a HUGE collision buffer relative to the size of the screen. You could just opt for 512 pixels in either direction and then use the other half of the page for something else.
Kasumi - Yeah, I have a column's worth of RAM space for leading offscreen buffering (my intention) and will likely reserve one for *trailing* too, for things like monsters moving to edges and whatever.
But this is what you'd suggest knowing my current setup? Effectively shifting the thing column by column once the 16px 'metatile' boundary that I use is reached, and pushing the new column into the new place on the right? Do you recommend splitting this out over frames to make it sort of regulated and dependable, or just firing away? The 16 reads + 16 writes x 16 columns plus new data fetch seems long-ish, but I guess not so bad?
Suppose you have exactly one screen (256x256 for ease, rather than 256x240) of data that wraps. 16x16 tiles. (Edit: I guess we're also assuming the data is stored such that byte 0 is top left and byte 1 is the tile below that, rather than to the right of it, since that's the code I wrote.)
Which byte is the tile at the top of the column data for any given position?
Code:
lda posx ;X position in pixels, 0-255
and #%11110000 ;drop the low 4 bits: column * 16
tay
lda tileram,y ;tile at the top of that column
The and is effectively a modulus, divide, and subtract of the scroll position all in one.
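In case the bit trick is easier to follow outside of assembly, here is a rough C model of that lookup. It assumes the same layout as the code above (one 256x256 screen, 16x16 tiles, stored column by column). The `tile_index` helper and its Y handling are my own illustration, not part of the posted code.

```c
#include <stdint.h>

/* Model of "lda posx / and #%11110000 / tay".
   Layout assumption: 16 columns of 16 bytes, column-major,
   so byte 0 is top left and byte 1 is the tile below it. */
uint8_t column_top_index(uint8_t posx)
{
    /* posx / 16 picks the column, * 16 skips to its first byte.
       Masking off the low nibble does both at once. */
    return (uint8_t)(posx & 0xF0);
}

/* Hypothetical extension: index of the tile under pixel (posx, posy). */
uint8_t tile_index(uint8_t posx, uint8_t posy)
{
    return (uint8_t)((posx & 0xF0) | (posy >> 4));
}
```

The wrap at 256 pixels needs no code at all, because an 8-bit position is already modulo 256.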
Suppose you have exactly two screens of data that wrap. Which byte is the tile at the top of the column data for any given position?
Code:
lda posxhigh
ror a;Which screen in RAM is now in the carry
lda posxlow
and #%11110000;column * 16 within that screen
tay
bcs screen2
lda tileram1,y
bcc byteloaded;This could be a jmp (carry is still clear here)
screen2:
lda tileram2,y
byteloaded:
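The same two-screen lookup, sketched in C under the same assumptions (two column-major 256-byte screens, 16-bit X position). The array names mirror the assembly labels; everything else is illustrative.

```c
#include <stdint.h>

uint8_t tileram1[256]; /* screen used when the high byte of X is even */
uint8_t tileram2[256]; /* screen used when the high byte of X is odd */

/* Top-of-column tile for a 16-bit X position, two wrapping screens. */
uint8_t column_top_tile(uint16_t posx)
{
    uint8_t low = (uint8_t)(posx & 0xFF);
    uint8_t high = (uint8_t)(posx >> 8);
    uint8_t y = (uint8_t)(low & 0xF0);   /* and #%11110000 / tay */

    /* ror a: bit 0 of the high byte selects the screen */
    return (high & 1) ? tileram2[y] : tileram1[y];
}
```

Positions two screens apart land on the same byte, which is exactly the wrap the buffer relies on.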
If you just add one offscreen column in each direction, you can no longer take those shortcuts because there are 18 columns instead of 16 or 32. And obviously with 32x32 metatiles (or some other power of two thing) the math changes slightly, but it'll still remain bitwise in a way that you can't with 18 columns.
This is why dougeff and I don't need to shift data once it's in the array. You should absolutely find a way to not have to do that. A lookup table might get you results slightly similar to the above, if I'm understanding correctly and the plan is a sliding window of 18 columns.
As far as whether to split the update across frames, that's up to you. If you don't have to move the array, you'll only ever need to write 16 bytes every frame, and that's assuming you scroll 16 pixels every frame. (I guess really 15 bytes assuming columns are 240 as is traditional.)
Edit2: I guess the thought to take with you if nothing else is that power of two data has a way of unlocking random access, because of how simple the math is. If your data is screens, you can still only load the next column. It's easy to find which screen (horizontally) because that's a power of two, and then it's easy to find a column within that screen. So you never need 16 reads and 16 writes for 16 columns. Just 15 reads and 15 writes for one column. (Assuming... you only scroll in one direction. Also assuming no compression beyond screens/metatiles.)
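A sketch of what "only load the next column" could look like, assuming a two-screen buffer with 16-byte columns of which 15 rows are used. `level_tile` is a made-up stand-in for whatever decodes the level data; the point is only that one column's slot gets overwritten and nothing else moves.

```c
#include <stdint.h>

#define COLUMN_HEIGHT 15   /* 240 / 16 rows actually used */

uint8_t ring[2][256];      /* two screens, column-major, 16 bytes per column */

/* Hypothetical stand-in for the level decoder: any function of
   (world column, row) will do for the sketch. */
uint8_t level_tile(uint16_t world_col, uint8_t row)
{
    return (uint8_t)(world_col * 3 + row);
}

/* Overwrite the one slot the new column maps to. Nothing else moves. */
void load_column(uint16_t world_col)
{
    uint8_t screen = (uint8_t)((world_col >> 4) & 1);   /* which of the two screens */
    uint8_t base = (uint8_t)((world_col & 15) << 4);    /* first byte of the column */

    for (uint8_t row = 0; row < COLUMN_HEIGHT; row++)
        ring[screen][base + row] = level_tile(world_col, row);
}
```

World columns 5 and 37 share a slot, so loading column 37 simply replaces column 5 once it is 32 columns behind.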
You normally don't have to shift the data to implement a sliding window like this, you just treat the structure that holds the data as a ring buffer. This is easier to implement with sizes that are powers of 2, because you can simply mask the upper bits when indexing the data and you're guaranteed to wrap around the ring properly, but it can be done with other sizes too, with look-up tables and/or comparisons.
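For the non-power-of-two case, a minimal C sketch of a ring of 18 columns that wraps with a comparison instead of a mask. The 18 (16 visible plus one offscreen on each side) and all names here are assumptions for illustration.

```c
#include <stdint.h>

#define RING_COLUMNS 18   /* assumed: 16 visible + 1 offscreen on each side */

/* Slot of the leftmost buffered column; advances as the window slides right. */
uint8_t head_slot = 0;

/* Step the window one column right. Returns the slot to overwrite
   with the newly exposed column (the old leftmost slot). */
uint8_t slide_right(void)
{
    uint8_t freed = head_slot;
    head_slot++;
    if (head_slot == RING_COLUMNS)   /* compare instead of a bit mask */
        head_slot = 0;
    return freed;
}

/* Slot holding the column `offset` columns right of the window's
   left edge. offset must be less than RING_COLUMNS. */
uint8_t slot_for(uint8_t offset)
{
    uint8_t slot = (uint8_t)(head_slot + offset);
    if (slot >= RING_COLUMNS)
        slot -= RING_COLUMNS;
    return slot;
}
```

On the 6502 the compare-and-subtract costs a few more cycles per lookup than a plain `and`, which is the trade-off being described.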
What are your reasons for loading collision data into RAM?
It makes sense for Mario since collision data can be modified by breaking bricks, and the stage layout is loaded using individual structures and algorithms as far as I hear.
But if you store your levels in an easily unpackable format, I find it much easier to just look up collisions directly based on the object's world coordinates. Personally, I store collision data in its own table independently from the level data (allows me to create hidden passages, and invisible platforms etc.), and it doesn't take up a lot of space.
Sumez wrote:
But if you store your levels in an easily unpackable format, I find it much easier to just look up collisions directly based on the object's world coordinates.
That's my preferred method too. I don't like to impose a hard limit on how far from the screen the action can go (after all, there are certain game mechanics that require objects to be active even when far away from the camera). One problem though is that "easily unpackable formats" tend to not compress as well as other methods, or result in more repetitive game worlds, so not all kinds of games can afford that.
Quote:
Personally, I store collision data in its own table independently from the level data (allows me to create hidden passages, and invisible platforms etc.), and it doesn't take up a lot of space.
I personally prefer to have the collision data packed with the metatiles. If your collision data is any more complex than solid/empty (i.e. 1 bit per block), then yeah, the data can in fact occupy a lot of space (e.g. water, ice, breakable, hazards, slopes, solidity per side, etc.). I do have 2 sets of collision data per metatile though, so that levels can have 2 layers of solidity, allowing for loops and other crazy structures like those in Sonic games to be implemented using layer switchers. These same switchers could be used to make the same metatile behave differently in different places. I personally find it confusing when games have things that look the same but behave differently, so my games would probably have very little of that stuff, so I think I can afford a few "cloned" metatiles.
Yeah, you can basically save the collision data at whatever level of detail is most fitting for your engine. I'm using 2 bits per 16x16 metatile, so it does eat up a bit of data (eg. 1200 bytes for a large stage of 20 full "screens"), but until it becomes a concern, I don't see any reason to change it.
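To make the arithmetic concrete: 16x15 metatiles at 2 bits each is 60 bytes per screen, so 20 screens is 1200 bytes. A rough C sketch of reading such a table follows; the packing order (row-major, lowest bits first) is my assumption, not something stated above.

```c
#include <stdint.h>

#define SCREEN_W 16
#define SCREEN_H 15
#define BYTES_PER_SCREEN (SCREEN_W * SCREEN_H * 2 / 8)  /* 60 */

/* Read the 2-bit collision type of one metatile.
   Packing order (row-major, lowest bits first) is an assumption. */
uint8_t collision_at(const uint8_t *map, uint16_t screen, uint8_t col, uint8_t row)
{
    uint16_t tile = (uint16_t)(row * SCREEN_W + col);             /* 0..239 */
    uint16_t byte = (uint16_t)(screen * BYTES_PER_SCREEN + (tile >> 2));
    uint8_t shift = (uint8_t)((tile & 3) * 2);

    return (uint8_t)((map[byte] >> shift) & 3);
}
```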
For my purpose it would definitely be more sensible to store collisions with my 32x32 sized tiles in the long run, but most people would just store it along with the 16x16 blocks. Either way it's super fast to look up for the engine.
Sumez wrote:
1200 bytes for a large stage of 20 full "screens"
Which isn't large at all compared to a Sonic level. Even Green Hill Zone act 1, the first level in the first game, is apparently 40x5 "screens" large, or 200 "screens" (many don't have any actual content though, so there's potential for optimization there). Yeah, I tend to compare everything to Sonic, because that's my reference when it comes to platformers, as that's the first one I ever played.
Sonic's stages are famous for being absurdly large though.
I'm talking about a game comparable to something like Ninja Gaiden, where every step is a new challenge, and not just scenery that you zoom through.
Of course this only reinforces the obvious conclusion that how you do these kinds of things depends entirely on your game's design.
Curiously the Sonic games famously have a ton of secrets hidden by using tiles with collisions that are different than they appear. How does the game store these? Does it just have duplicates (ie. a wall tile with collision, and a wall tile without it)?
In regards to my own game, storing collisions with the existing 32x32 metatiles is one solution, but it's worth noting that collision maps are even more prone to repeat patterns than the background graphics, and I might just as well end up keeping it stored separately, but compressed on its own. I think that would save a lot more space in the long run.
Yeah - this is the method I use with nametable data (loading the next column, based on 64 possible columns...two nametables' worth). I get the concept. But...ugh. I don't have the RAM space allocated for two collision tables...or rather, I sort of do, if I load collision tables as nibbles rather than full bytes...but that would mean a rewrite of a lot of collision detection, and losing the ability to use the upper bits as flags for each screen tile (used for various purposes depending on the module...again, with this I need to think about variable usage, which tends to complicate things!). Hm.
And yeah, due to possible changing collision data (destructible terrain, animating tiles, etc) it really has to be loaded into RAM.
Some good things to think about, though, so thanks!
A game that scrolls only horizontally might use a circular buffer 32 metatiles wide by 12 to 15 metatiles high to cache the map for collision and partial updates. At one byte per 16x16-pixel metatile, this should take just shy of one-fourth of the CPU RAM.
A wholly different collision topology is that of Metroid and Project Blue (sorry for all the self-referencing lately). Both are per-tile based, in different ways. Some other ideas are mentioned under Variants.
In Metroid, collision is implied by ranges of tile IDs. I don't remember the exact ranges and properties, but say $00-$BF are solid, $C0-$EF are nonsolid, $F0-$F8 are hurting nonsolid, and $F9-$FF are breakable. Metroid uses most of its tilespace for solids.
Even in a scroller, detecting a hit is just a matter of checking screen position / 8 against the NT cell tile ID (in the relevant nametable) every 8th pixel (you can AND the position with #7 to branch past redundant checks).
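A small C sketch of the range scheme, using the example ranges from the post (which are explicitly not Metroid's real ones). The function names are made up for illustration.

```c
#include <stdint.h>

typedef enum { TILE_SOLID, TILE_EMPTY, TILE_HURT, TILE_BREAKABLE } TileKind;

/* Example ranges from the post above, not Metroid's real ones. */
TileKind classify_tile(uint8_t tile_id)
{
    if (tile_id <= 0xBF) return TILE_SOLID;
    if (tile_id <= 0xEF) return TILE_EMPTY;
    if (tile_id <= 0xF8) return TILE_HURT;
    return TILE_BREAKABLE;
}

/* Nametable cell for a pixel inside one 32x30 nametable. */
uint16_t nametable_cell(uint8_t px, uint8_t py)
{
    return (uint16_t)((py >> 3) * 32 + (px >> 3));
}

/* Nonzero only when the position sits on an 8-pixel boundary,
   i.e. when a new tile check is actually needed. */
int needs_check(uint8_t pos)
{
    return (pos & 7) == 0;
}
```

On the NES the tile IDs would have to come from a copy the CPU can read cheaply or from the level data itself; that part is left out here.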
Project Blue differs in that it lets you set what tiles have what properties freely and combinable with the help of separate property arrays.
Pros:
-Easy/Requires no collision map sliver updates/collision is automatically derived from NT content.
-Promotes big worlds / levels
-Tile-granularity collision (not that Metroid ever uses it - but the engine is perfectly capable of doing so!)
-(valid for metroid) no significant ROM storage cost whatsoever
-(valid for project blue) relatively low ROM storage cost per level
-(valid for metroid) Doesn't need RAM.
-(valid for project blue) overlapping, combinable physics attributes
-(valid for project blue) no need to organize your tiles, other than for convenience.
Cons:
-If you plan on using the same tiles multiple times for solid/nonsolid, you either need to
a)waste some tile space with duplicates (this is the metroid method)
b)make collisions conditional on PPU attribute as well (this is part of the project blue method), which as a downside locks in the function of subpalettes pretty tight. Anyway, I've found that this is pretty natural and not much of a hindrance since you want to keep solids and far background differently coloured in a platformer anyway. An isometric game or the like would fare worse.
-(valid for metroid): you need to organize your tiles in accordance to the physics attribute ranges you've defined.
-(valid for metroid, but not necessarily for you): you may need different ratios of differently propertized tiles in different levels/stages/boards. Of course, you can just keep different definitions or routines for different levels.
Variants:
-Going with the metroid "range scheme", you can of course have overlapping ranges. So maybe there's a hurt property that overlaps with both solid and nonsolid.
-Something like project blue could instead bitpack 8 properties in a byte corresponding to each tileID. It would be free of RAM, small in ROM.
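The bitpacked variant could be sketched in C like this. The flag names and the table are hypothetical; on the NES the table would be a 256-byte constant in ROM, and it is writable here only so the sketch can be poked at.

```c
#include <stdint.h>

/* Hypothetical property flags; up to 8 fit in one byte. */
enum {
    PROP_SOLID = 1 << 0,
    PROP_HURT  = 1 << 1,
    PROP_WATER = 1 << 2
};

/* One property byte per tile ID (a const ROM table on real hardware). */
uint8_t tile_props[256];

/* Nonzero if the tile has the given property bit set. */
int tile_has(uint8_t tile_id, uint8_t prop)
{
    return (tile_props[tile_id] & prop) != 0;
}
```

Because each property is its own bit, overlapping combinations such as solid and hurting come for free.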
edited to be less confusing.
It's all very interesting to consider the various ways I could approach this now, with some thought. Thinking this out here....
Choice 1: Keep collision data as full bytes, still loading into RAM. Rearrange memory to try to free up RAM space for second collision table, and just mirror exactly how the nametable updates function, but on a metatile basis.
Pro: Probably how it SHOULD be done, as the infrastructure is already laid for when to update and column checking.
Con: Would have to sacrifice some of the RAM space allocated to many of the customizable things inherent. This might be hugely detrimental.
Choice 2: Use only 4 bit nibbles for each collision tile. That way, two screens worth of data could be packed in to the same space as one screen worth of full bytes. I could actually pretty easily determine which nametable is being looked at (2000 or 2400); if it's the former, read the first nibble, if it's the latter, read the second. That's a quick enough read to essentially behave the same way. And then on updates, it would blank that nibble and ora in new data, placing it in the proper nibble dependent on which of the two 'screens' is being updated.
Pro: Wouldn't have to reorganize RAM map at all.
Con: Any potential uses for those extra bits in real time game play would be gone, and the logic is a little bit more convoluted.
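Choice 2 as a rough C model, assuming one shared 240-byte table with nametable $2000 in the low nibble and $2400 in the high nibble (which nibble goes with which nametable is an arbitrary pick on my part).

```c
#include <stdint.h>

uint8_t collision[240];  /* one byte per metatile position, two screens packed */

/* Nametable 0 ($2000) uses the low nibble, nametable 1 ($2400) the
   high nibble. That assignment is an arbitrary choice. */
uint8_t read_collision(uint8_t index, uint8_t nametable)
{
    uint8_t b = collision[index];
    return nametable ? (uint8_t)(b >> 4) : (uint8_t)(b & 0x0F);
}

/* Blank the target nibble, then OR in the new value. */
void write_collision(uint8_t index, uint8_t nametable, uint8_t value)
{
    if (nametable)
        collision[index] = (uint8_t)((collision[index] & 0x0F) | ((value & 0x0F) << 4));
    else
        collision[index] = (uint8_t)((collision[index] & 0xF0) | (value & 0x0F));
}
```

The high-nibble read costs four shifts on the 6502, so the two screens are not quite symmetric in speed.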
Choice 3: Shifting columns. Use a 16 byte buffer (this is clean, since the screen has 240 collision bytes...one RAM page fits that + the buffer) for the next potential column of collision. When the threshold is reached, shift each column, column by column, filling the last with the buffer, and filling the buffer with the next potential column.
Pro: No reorganization of RAM map, OR loss of extra "real time" collision bits.
Con: Very complex relatively, and I'd imagine terribly slow.
Out of the three, I think the second may keep the current skeleton of things intact the most and might be the simplest to implement. What do you guys think?
Sumez wrote:
What are your reasons for loading collision data into RAM?
In Lizard's case, there are several reasons. Here's 3 big ones I can think of immediately:
- Uncompressed 1-bit per tile collision data for the world would have been an extra 100k+ of data for the game (which is a pretty full 512k already).
- RAM means it can be modified (doors and platforms can be created and destroyed), though I wish I'd used this feature more.
- RAM also means I don't have to bankswitch that data in to do a collision test. (Was able to do interesting stuff like colliding snow particles partly because of this.)
There are some good reasons to use ROM collision tables too though. I'm certainly not saying that there's only one way to do it. Part of why it'd be so much data for Lizard is that it has 8x8 tiles instead of 16x16.
An advantage to Lizard though, is that as far as I recall, no room uses more than two nametables? So scrolling is less of an issue, if any at all.
Quote:
no room uses more than two nametables?
The river (surfing) and the frog boss scroll seamlessly.
I'd wager collisions also work quite differently in those two examples?
Yes, those two are special.
...but there's no fundamental incompatibility with RAM collision and scrolling, either, and compression is still a bonus there.
I don't know if I'm too late, but tepples gave me an excellent idea for scrolling horizontally using two 8-bit variables, visible_left and valid_left. After lots of rereading, prayer, and effort, his method works excellently for me!!
tepples' visible_left valid_left scrolling advice (bottom of page 69)
So, trying some things. Give or take, got a few things *sort of* working the way I want. Graphically, it's all good (and has been). But collisions...oh collisions. Going to sort of do some public rubber duck debugging here....
Something simple enough like just determining which collision table to pull from (one loading for when NT 1 is showing, one loading for when NT 2 is showing) I think I'm overcomplicating terribly.
So, essentially, I have a columnTracker variable that is keeping track of the left side of the scroll camera (and in tandem what column to update in the opposite nametable). The player has about an 80 px padding area in the center (which can be adjusted by changing a constant) where the scroll doesn't kick in. So now I have to figure out what nametable the player "is in". Fairly simple, xScroll + playerX...if the carry is set, we're in the second nametable....
But wait, no. That's not right. If columnTracker is a value of 00-15 AND xScroll+playerX yields a set carry, we've moved into the second nametable (because that means the left of our camera window is still in the first screen). If columnTracker is 16-31 in the same condition, it means we've cycled back around to the first (because that means the left of our camera window is now in the second screen).
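The rule worked out in the two paragraphs above boils down to an XOR. This is a C sketch under the assumptions that columnTracker runs 0-31 across both nametables, xScroll is the low 8 bits of the scroll, and playerX is the on-screen X; the function name is made up.

```c
#include <stdint.h>

/* columnTracker: 0-31, metatile column of the camera's left edge
   across both nametables. xScroll: assumed to be the low 8 bits of
   the scroll. playerX: on-screen X, 0-255. Returns 0 or 1. */
uint8_t player_nametable(uint8_t columnTracker, uint8_t xScroll, uint8_t playerX)
{
    /* columns 0-15 mean the camera's left edge is in table 0, 16-31 in table 1 */
    uint8_t camera_table = (uint8_t)((columnTracker >> 4) & 1);
    /* carry out of the 8-bit add xScroll + playerX */
    uint8_t carry = (uint8_t)(((uint16_t)xScroll + playerX) >> 8);

    /* a carry means the player is in the table opposite the camera's left edge */
    return (uint8_t)(camera_table ^ carry);
}
```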
Alright, so now I could go three ways...keep track also of "object's" column, which would be different than his position (his position only returns values of 00-ff, but column could keep track of which column he's in and mirror the columnTracker function, just tracking the object, not the left of screen...), or I could create an overflow byte for position, sort of reading world coordinates, for which the last bit would determine whether collisions are in collision table one or two....or I could flip an arbitrary bit somewhere that an object could read to make that determination.
And in any case, I preemptively wonder how ugly collision detection will end up getting when I have to read left points of collision from one collision table, but right points of collision from the next.
These are my fleeting thoughts. You guys are awesome for your responses, as always.
JoeGtake2 wrote:
And in any case, I preemptively wonder how ugly collision detection will end up getting when I have to read left points of collision from one collision table, but right points of collision from the next.
It doesn't have to get super ugly. So far I use one function to check 4 points. Before calling that function I have my assembly code set PointX, PointY, PointXX, and PointYY to different values depending on the purpose for jsr-ing the function (i.e. falling, left or right movement, etc.).
An outstanding Kasumi post explaining points checking. I also have a variable FORWARD_last that holds the last left (#$00) or right (#$01) direction pressed. Then for left or right it looks something like this:
Code:
ldx FORWARD_last ;#$00 = left was pressed last, #$01 = right
beq +ei
ldx #xx ;a value for PointX and PointXX when pressing right
bne + ;acts as a jmp, as long as the value just loaded is nonzero
+ei ldx #$xx ;a value for PointX and PointXX when pressing left
+ lda #01
ldy #30
stx PointX
sta PointY
stx PointXX
sty PointYY
jsr CheckPoints ;hypothetical label for the function that checks the points (see link to Kasumi's post)
;other code specific to my sister's game here
If you always calculate your character's position using its left side then PointX would be #$00 when pressing left and #(decimal value of your character's width) when pressing right. That way, inside your points checking function, you can simply always add PointX to your character's X value, check the appropriate collision value, store the result, then always add PointXX to your character's X value, check, and store the result. Then include appropriate code that processes the results that were stored inside of your points checking function. It's not ugly at all, to me at least.
p.s.
a.) I'm using #01 and #30 because my sister's game runs much better that way... even though Kasumi recommends using #00 and #31.
b.) To prevent my screen from flashing I was very blessed with an understanding that it is extremely important to always order groups of stores (i.e. stx PointX sta PointY stx PointXX sty PointYY) from lowest memory address to highest memory address.
edit: my main loop skips the jsr LRfootCollision..., which looks something like the code section above, if input is not currently left or right, so that FORWARD_last will never be used inappropriately.
final edits.
Determining which nametable any given point is in is three instructions.
Code:
lda highxbyte
ror a
bcs nametable1
nametable0:
The scroll doesn't even need to factor into the decision. That's the beauty of having exactly 32 columns in the buffer.
The scroll only determines what's in the buffer/when the buffer gets updated. If you do have 32 columns, you have a buffer of 8 on each side of the screen.
Edit: Similarly, if the left and right points are in different nametables, if you're checking the left point, and you add the width and that sets the carry, now you're in the opposite nametable, otherwise you're still in the same one.
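Both points from this post in one C sketch, assuming 16-bit world X positions and a 32-column (two-nametable) buffer. Function names are illustrative.

```c
#include <stdint.h>

/* Nametable (0 or 1) for a 16-bit world X: bit 0 of the high byte. */
uint8_t nametable_of(uint16_t world_x)
{
    return (uint8_t)((world_x >> 8) & 1);
}

/* Nametable of the right-hand collision point, derived the way the
   edit above describes: add the width to the low byte and flip the
   nametable only if that add carries. */
uint8_t right_point_nametable(uint16_t left_x, uint8_t width)
{
    uint8_t low = (uint8_t)(left_x & 0xFF);
    uint8_t carry = (uint8_t)(((uint16_t)low + width) >> 8);

    return (uint8_t)(nametable_of(left_x) ^ carry);
}
```

The carry shortcut gives the same answer as doing the full 16-bit add and testing bit 0 of the new high byte.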
I usually don't need collisions for entities which are off-screen, so I use a plain 16x12 circular buffer (16x16 for vertical scrolling games).
Kasumi - I assume your xhi var is tracking a 16bit positioning, meaning even=table 1, odd=table 2. Am I understanding this right? If so that’s not exactly how my movement update/positioning code works. I have 16 bit positioning, but the low is just for holding finer speed adjustment. In this example, would right movement keep track of x + xScroll in a 16 bit variable, called posX? Something like this maybe?
And unregistered - yep...that’s pretty much exactly my method.
***EDIT***
I seem to have worked out a semi-passable method based on the influx of input...thanks everyone!
You're understanding that right. I really recommend adding that extra byte for each object for scrolling, I imagine it will save you a lot of trouble down the road.
If you have an active area wider than 256 pixels (and you really do need one if you scroll horizontally) you absolutely need 16-bit coordinates to keep track of positions properly. Anything else will be a hack, and probably not worth the 1 byte saved per object.
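One place the extra byte pays off is getting from world position to screen position. A minimal C sketch, assuming `camera_x` is the world X of the screen's left edge; the names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 and writes the on-screen X if the object is inside the
   256-pixel view, otherwise returns 0. camera_x is the world X of
   the screen's left edge. Names are illustrative. */
int world_to_screen(uint16_t object_x, uint16_t camera_x, uint8_t *screen_x)
{
    /* 16-bit subtract that wraps, like sbc on the low and high bytes */
    uint16_t delta = (uint16_t)(object_x - camera_x);

    if (delta > 0xFF)
        return 0;            /* high byte nonzero: offscreen */

    *screen_x = (uint8_t)delta;
    return 1;
}
```

An object left of the camera wraps to a large delta, so the same high-byte test rejects it without a separate comparison.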