This is mostly going to be sprite-related, just so you know. The first thing I want to say isn't very amazing: I figured that if DMAing sprites is such a big issue, you could take 16 pixels off the top and 16 off the bottom, and you would get a bit more DMA bandwidth (I don't remember the formula, so I can't tell you how much). If your TV is able to vertically stretch the display, you can view the game at 4:3 without cropping out any of the picture. Second and more importantly, I know sprites only being able to access 16KB of VRAM is a bit of a problem, so now that it has been discovered, you could have 3 sprite banks with the 2nd one being shared by 1 and 3. 224 - 32 = 192, and 192 / 64 = 3, and there are 3 sprite banks, so you could have it divide the display nicely. You would have the first half of the screen use banks 1 and 2, and the second half of the screen use 2 and 3. Here's a picture to show what I mean.
Attachment: Example.png [ 567 Bytes ]
The only problem is that I'm not exactly sure how to change which tiles a sprite is using. (If that bit were located in hi OAM, this would be a heck of a lot easier...) You would have to check whether there are open spots in both sprite banks (1 and 2, or 2 and 3) that correspond to each other, so there is a seamless transition between the top and bottom parts of the screen. If you are going to animate a sprite in the top third of the screen, you look for a spot in the 1st sprite bank and, if it has none free, the 2nd. If the sprite is in the middle, you look in the 2nd sprite bank and then the 1st or the 3rd, depending on whether you are nearer the top or the bottom. In the bottom third, you look at the 3rd bank, and then the 2nd. I hope what I said makes even the slightest bit of sense.
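The search order described above can be sketched roughly as follows. This is only an illustration in Python, assuming a 192-line display split into three 64-line strips; the function name `bank_search_order` and the 1-based bank numbering are made up for this sketch and do not come from any real engine.

```python
def bank_search_order(y, strip_height=64, visible_lines=192):
    """Return the sprite banks (1-3) to search, in order, for a sprite at scanline y."""
    if not 0 <= y < visible_lines:
        raise ValueError("y is outside the visible area")
    strip = y // strip_height            # 0 = top, 1 = middle, 2 = bottom third
    if strip == 0:
        return [1, 2]                    # top third: bank 1 first, then shared bank 2
    if strip == 2:
        return [3, 2]                    # bottom third: bank 3 first, then shared bank 2
    # Middle third: shared bank 2 first, then the outer bank on the nearer side
    # of the mid-screen split.
    near_top = y < visible_lines // 2
    return [2, 1] if near_top else [2, 3]


print(bank_search_order(10))    # top third
print(bank_search_order(70))    # middle third, nearer the top
print(bank_search_order(100))   # middle third, nearer the bottom
print(bank_search_order(191))   # bottom third
```

A real allocator would still have to find matching free slots in both banks it returns, which is the hard part mentioned above.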
Espozo wrote:
if your TV is able to vertically stretch the display
I assume you mean "the input is letterboxed, remove the letterboxing" with a proportional scaling. Tepples has pointed out in the past that the SNES/NES could have an active area of ≈256x164 to be 16:9 friendly.
That said, 160 pixels is kinda painfully low.
Is this a game that you are working on? What's the idea for it?
Espozo wrote:
you could take 16 pixels off the top and 16 off the bottom, and you would get a bit more DMA bandwidth (I don't remember the formula, so I can't tell you how much)
You'd almost double it, at least on an NTSC deck. The approximate formula is (262-N)*165.5, where N is the number of scanlines in active display. The exact value depends on when your interrupt fires and what else is going on during VBlank.
The formula comes from the fact that a frame on an NTSC SNES is 262 scanlines, most scanlines are 341 dots, each dot is four master cycles, and DMA goes at eight master cycles per byte except during the 40-cycle DRAM refresh in the middle of each scanline. So you get ([341*4]-40)/8 bytes per scanline.
PAL is 312 scanlines per frame, but everything else is the same. DMA bandwidth is usually not an issue with PAL...
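To put numbers on the formula above, here is a small Python sketch. It only evaluates the approximation given in this post, so treat the results as upper bounds: it ignores interrupt latency, DMA setup overhead, and anything else running during VBlank. The function name `vblank_dma_bytes` is just for illustration.

```python
def vblank_dma_bytes(active_lines, total_lines=262):
    """Approximate DMA bytes available per frame outside active display."""
    # 341 dots * 4 master cycles, minus the 40-cycle DRAM refresh,
    # at 8 master cycles per DMA byte.
    bytes_per_line = (341 * 4 - 40) / 8          # 165.5
    return (total_lines - active_lines) * bytes_per_line


ntsc_224 = vblank_dma_bytes(224)                 # standard 224-line display
ntsc_192 = vblank_dma_bytes(192)                 # 16 lines cut from top and bottom
pal_224 = vblank_dma_bytes(224, total_lines=312)

print(ntsc_224, ntsc_192, pal_224)               # 6289.0 11585.0 14564.0
print(round(ntsc_192 / ntsc_224, 2))             # 1.84
```

So cropping to 192 lines gives roughly 1.84 times the NTSC budget, which is where "almost double" comes from.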
Espozo wrote:
you could have 3 sprite banks with the 2nd one being shared by 1 and 3. 224 - 32 = 192, and 192 / 64 = 3, and there are 3 sprite banks, so you could have it divide the display nicely. You would have the first half of the screen use banks 1 and 2, and the second half of the screen use 2 and 3. Here's a picture to show what I mean.
[...]
The only problem is that I'm not exactly sure how to change the position for what tiles a sprite is using
I don't think what you're attempting is possible. According to Anomie's document, OBSEL can only be written during V-Blank or Force-Blank. I remember reading somewhere that it might be possible to modify the OAM Base Address mid-frame on a 1-Chip SNES. I do remember it is not possible on my old first-gen PAL SNES (or maybe I was doing it wrong, I don't have it anymore).
Espozo wrote:
I figured that if DMAing sprites is such a big issue, you could take 16 pixels off the top and 16 off the bottom, and you would get a bit more DMA bandwidth.
Most games cope with the limited DMA bandwidth by either:
- Only DMAing the Player Character
- Delaying a frame change if the DMA buffer is full (Secret of Mana)
- Alternate which sprites have their frames updated in a circular buffer (RARE Games).
Donkey Kong Country uses a circular buffer to allocate 16 CGRAM colours and 2 VRAM rows (32 8x8 tiles) for each entity on the screen. The animation routine processes 2 entities per frame[1], requiring only 2048 bytes[2] to be transferred to the OAM tiles per frame.
If you need bigger characters, Killer Instinct (NTSC) uses a similar method but allocates 6 VRAM rows per entity, and only processes 1 character animation per frame. However, the metasprite animation frames rarely require more than 4 VRAM rows for its characters.
We kinda need to know how big the characters are and how many you expect to see on the screen before I can help anymore.
---
[1] With a limit of 7 characters on the screen at any given time (1 palette is reserved for the bananas and GUI) we get a maximum of 15 FPS (60 / 4) for sprite animation.
[2] Each animation frame consists of a MetaSprite Table, 2 DMA transfers (address and size) for the two VRAM rows and an index to the next frame.
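The arithmetic behind those figures can be checked with a short Python sketch. It assumes 4bpp tiles (32 bytes each) and 16 tiles per VRAM row, which is what "2 VRAM rows (32 8x8 tiles)" implies; `animation_budget` is a made-up helper name, and the Killer Instinct call assumes two fighters on screen, which is my assumption rather than something stated above.

```python
import math

TILE_BYTES = 32        # one 8x8 tile at 4 bits per pixel
TILES_PER_ROW = 16     # one VRAM "row" of 8x8 sprite tiles


def animation_budget(entities, rows_per_entity, entities_per_frame, refresh_hz=60):
    """Return (tile bytes DMAed per frame, worst-case animation rate in FPS)."""
    bytes_per_frame = entities_per_frame * rows_per_entity * TILES_PER_ROW * TILE_BYTES
    # Every entity gets its turn once per cycle through the circular buffer.
    frames_per_cycle = math.ceil(entities / entities_per_frame)
    return bytes_per_frame, refresh_hz / frames_per_cycle


print(animation_budget(entities=7, rows_per_entity=2, entities_per_frame=2))
# (2048, 15.0)  - the DKC-style numbers quoted above
print(animation_budget(entities=2, rows_per_entity=6, entities_per_frame=1))
# (3072, 30.0)  - KI-style allocation, assuming two fighters
```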
Transferring sprite graphics should not be much of an issue actually, because it's not that common to have all or even most of them updating simultaneously (remember, the same graphic will be shown for several consecutive frames), and this is without even factoring in sharing (e.g. several beat'em-ups just have a single kind of enemy at any given moment, which lets them just store every frame for those enemies).
The real problem will be memory usage. How many tiles do you think you'll need? Account for both players and the maximum number of enemies that could possibly show up on screen at the same time, and remember you'll need some room for items and scenery as well.
As for chopping 16 pixels off the top and bottom: it's not as bad as it sounds, especially since part of that is already off-screen in the first place - it's more like losing a tile row than two. The resolution you end up with is 256×192, and your vblank bandwidth in NTSC is nearly doubled. I'd avoid it if possible, but if you run out of bandwidth and can't work around it, that may be an option. If your HUD is opaque, you could even pass some of the clipping as part of the HUD background.
UnDisbeliever wrote:
Espozo wrote:
you could have 3 sprite banks with the 2nd one being shared by 1 and 3. 224 - 32 = 192, and 192 / 64 = 3, and there are 3 sprite banks, so you could have it divide the display nicely. You would have the first half of the screen use banks 1 and 2, and the second half of the screen use 2 and 3.
I don't think what you're attempting is possible. According to Anomie's document, OBSEL can only be written during V-Blank or Force-Blank. I remember reading somewhere that it might be possible to modify the OAM Base Address mid-frame on a 1-Chip SNES. I do remember it is not possible on my old first-gen PAL SNES (or maybe I was doing it wrong, I don't have it anymore).
It works on my NTSC SNES, and on every emulator I tried from ZSNES to bsnes (except the accuracy cores; bsnes v072 accuracy reverses the colours on the top scanline, and higan v094 accuracy reverses them everywhere else):
viewtopic.php?f=12&t=12305&start=30#p141770
Anomie doesn't list something as possible when he's not sure. And to be fair it isn't obvious what should happen if the size setting changes during a frame - I only tried changing the name base.
Does this work on anyone else's machine? Mine's a replacement, and I haven't checked the version; I assumed all the large-form-factor units worked pretty much the same.
EDIT: higan v094 any core reverses the colours. WTH?
UnDisbeliever wrote:
We kinda need to know how big the characters are and how many you expect to see on the screen before I can help anymore.
Sik wrote:
The real problem will be memory usage. How many tiles do you think you'll need? Account for both players and the maximum number of enemies that could possibly show up on screen at the same time, and remember you'll need some room for items and scenery as well.
I think the average size for characters would be about 48x80, give or take a bit.
Attachment: Untitled.png [ 572 Bytes ]
I think 2 players with 4 enemies would be all right (and maybe an oil drum or something, but I'd really be risking problems with overdraw). The heads for the characters should be thinner, so if all the characters are on a line together, there will be some of what I like to call "the cheese grater" (don't ask), but if they are at slightly different positions from each other, it should be all right. What I really wanted was for a sprite's character bank to change halfway down the screen, with 2 changed to 1, so sprites in the middle of the screen wouldn't need to have character data in 2 different positions, which would cut DMA bandwidth in half. I'm not sure how worthwhile this would be, because you cannot change a sprite's last character bit during H-blank. (I would have much preferred it if the character bit was in hi OAM and the X bit wasn't.)
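As a rough budget for those numbers, here is a Python sketch. It assumes 4bpp 8x8 tiles (32 bytes each), the 16 KB of sprite VRAM mentioned earlier in the thread, no tile sharing between characters, and the commonly documented limit of 34 8-pixel sprite slivers per scanline; the helper name `character_cost` is made up for this example.

```python
import math

TILE_BYTES = 32               # one 8x8 tile at 4 bits per pixel
OBJ_VRAM_BYTES = 16 * 1024    # two 8 KiB sprite tile tables
SLIVERS_PER_LINE = 34         # assumed per-scanline limit of 8-pixel sprite slivers


def character_cost(width_px, height_px):
    """Return (tiles, bytes) for one frame of a character stored as 8x8 tiles."""
    tiles = math.ceil(width_px / 8) * math.ceil(height_px / 8)
    return tiles, tiles * TILE_BYTES


tiles, size = character_cost(48, 80)
characters = 2 + 4                                    # 2 players, 4 enemies
print(tiles, size)                                    # 60 1920
print(characters * size)                              # 11520
print(OBJ_VRAM_BYTES - characters * size)             # 4864
# Worst case: all six characters overlap the same scanline at full width.
print(characters * math.ceil(48 / 8), SLIVERS_PER_LINE)   # 36 34
```

Under these assumptions, six full-width characters on one line need 36 slivers against a limit of 34, which matches the worry about overdraw when everyone lines up.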
lidnariq wrote:
the SNES/NES could have an active area of ≈256x164 to be 16:9 friendly.
That said, 160 pixels is kinda painfully low.
It wasn't painful on the Game Boy Advance, which had 240x160 square pixels. And even if you find it painful, might this be one of the cases where interlaced mode would help? That'd give 256x320 pixels for sprites, though they would be really wide (16:7) pixels.
The GBA also had a small screen.
I thought DKC had up to 14 objects on-screen, and 4 objects animated per frame. Maybe they improved it for DKC2.
psycopathicteen wrote:
I thought DKC had up to 14 objects on-screen, and 4 objects animated per frame. Maybe they improved it for DKC2.
Agreed. 7 seems awfully low... Do things like smoke clouds and debris from barrels breaking count toward the limit? If so, I definitely know 7 is not right.
The case has been solved. DKC does indeed support more than 7 objects (but doesn't run very fast... I wonder what the main thing is that's causing it to go so slow). I edited the ROM with the Donkey Kong Country Resource Editor by Simion32.
Espozo wrote:
Agreed. 7 seems awfully low... Do things like smoke clouds and debris from barrels breaking count toward the limit? If so, I definitely know 7 is not right.
I'm sure there are rooms with dozens of bananas on screen simultaneously. They don't move, but they are animated and collide with the player.
Bregalad wrote:
Espozo wrote:
Agreed. 7 seems awfully low... Do things like smoke clouds and debris from barrels breaking count toward the limit? If so, I definitely know 7 is not right.
I'm sure there are rooms with dozens of bananas on screen simultaneously. They don't move, but they are animated and collide with the player.
In DKC, bananas don't count as objects. They count as their own separate thing, except for banana bunches, which do count as objects. It's neat, because there is a place in the game where you can make a pattern of bananas on a grid, and you can just place the pattern of bananas in the level instead of making a bunch 1 by 1. I think a whole banana grid only counts as 1 banana toward the banana memory limit for the level.
The bonus levels of DKC1 are far more "banana-" [actually, a substitute: miniature animal-buddy statue-] intensive, so one grid = one object makes sense. (DYDDY on save screen to have easy access).
So are bananas essentially treated like coins in the Super Mario Bros. games?
I'm surprised you can't change the sprite VRAM bank address during hblank.
psycopathicteen wrote:
I'm surprised you can't change the sprite VRAM bank address during hblank.
Like I said, you can on my SNES, unless I've gravely misunderstood something.
Anyone who wants to can download my test code and try it.
PLEASE DO, as this is critical information for my game, and I can't tell for sure if my console is a 1CHIP or not...
...it's got a printed EJECT label, a molded FCC label, and two feet on a heavily yellowed bottom half; there's no locking tab hooked to the power button but there is a warning sticker, the gap beside the expansion port shows two dots and the edge of a ring on a green background, and the serial number is UN231084565...
...
Also, if you think this test is a softball - ie: that it could be giving me a false positive - please speak up. I have limited hobby time, but if a better test is necessary, so be it. I need to know this.
Sorry if I was giving misleading information, but the banana bunches count as objects, whereas the banana groups and single bananas don't count toward the object limit at all.
This is a banana bunch:
Whereas a banana group would be like if a bunch of singular floating bananas were arranged into a "3" shape or something. When I was talking about the banana limit, I was talking about the entire level, (which I think I remember changes level to level) not just how many bananas appear on screen. Like I said, I think a banana group only counts as one banana toward the level limit, so I'd imagine it's the same way for what's on screen. I remember I was goofing off and I filled an entire banana grid and filled the whole screen with them, and not all of them would appear (possibly sprite limit) or even be able to interact with the player, so they really didn't even spawn.
93143 wrote:
Anyone who wants to can download my test code and try it.
PLEASE DO, as this is critical information for my game, and I can't tell for sure if my console is a 1CHIP or not...
I've tested this on two SNES consoles (I can open them up to verify the PCB model if needed, but I believe neither is a 1CHIP): a UN24xx and a UN25xx. The ROM was run via an SD2SNES adapter. I'm not sure what exactly I'm looking for, other than a sprite changing colour (as it moves down the screen), gradually from red to green, starting at a specific screen scanline. I see no other behaviour and no oddities, no matter where I place the sprite.
P.S. -- I'm happy to alpha/beta test anything you need, just lemme know in a thread somewhere or drop me a PM if you need for something to remain private (incl. special ROM builds, etc.).
koitsu wrote:
93143 wrote:
Anyone who wants to can download my test code and try it.
PLEASE DO, as this is critical information for my game, and I can't tell for sure if my console is a 1CHIP or not...
I've tested this on two SNES consoles (I can open them up to verify the PCB model if needed, but I believe neither is a 1CHIP): a UN24xx and a UN25xx. The ROM was run via an SD2SNES adapter. I'm not sure what exactly I'm looking for, other than a sprite changing colour (as it moves down the screen), gradually from red to green, starting at a specific screen scanline. I see no other behaviour and no oddities, no matter where I place the sprite.
That's exactly what's supposed to happen, and if it worked on two consoles, great!
(If you open it with a vram viewer, you can see 2 different colored squares, so it isn't a palette swap.) Do you know what version your consoles are? You can see if you have the SNES Lion King game. (There's some sort of secret menu, but I don't remember how to access it. It should be easy to look up.)
I put it in my SNES PowerPak on a launch console (CPU:1 PPU1:1 PPU2:1), and it showed a red square that turns into a half red and half green square and then a green square as I move it up and down.
From The Lion King on TCRF: Open the options menu and press the B, A, R, R, Y buttons.
The red/green sprite thingy works on my SFC-TV.
Red on top half, green on bottom half of screen.
1 : 1 : 1 USA Launch Console.
Thanks for all the help! This is looking very promising.
By way of explanation: the first entry in OAM is the only one on the screen, and it uses palette #0. There are two sprite tiles in VRAM; a red one (referencing palette #0) at $0000 and a green one at $2000. An HDMA channel is set to write to OBSEL; at the top of the screen it writes $00, and halfway down it writes $01. Only the bottom two bytes in OAM are ever changed, and at no point is CGRAM touched. The idea is that when the square turns green, it is switching from the primary sprite table being at $0000 to the primary sprite table being at $2000.
So I'm not testing the behaviour of the second table's offset, just the first table's position. I should probably tweak it to test the offset too. But it's still encouraging that the name base seems to behave as expected on multiple revisions - if I had to guess, I'd say both my scheme and Espozo's idea should work.
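For anyone following along, the table positions implied by an OBSEL value can be worked out like this. It's a Python sketch based on the usual register descriptions (bits 0-2 as the name base in $2000-word steps, bits 3-4 as the extra gap to the second table in $1000-word steps); the function name and the wrap to 32K words of VRAM are my own assumptions.

```python
def obj_table_word_addresses(obsel):
    """Return the VRAM word addresses of the (first, second) sprite tile tables."""
    base = (obsel & 0x07) << 13             # name base, in $2000-word steps
    gap = ((obsel >> 3) & 0x03) << 12       # name select, extra $1000-word steps
    first = base & 0x7FFF                   # VRAM is 32K words, so wrap
    second = (base + 0x1000 + gap) & 0x7FFF
    return first, second


for value in (0x00, 0x01):
    first, second = obj_table_word_addresses(value)
    print(f"OBSEL=${value:02X}: first table ${first:04X}, second table ${second:04X}")
# OBSEL=$00: first table $0000, second table $1000
# OBSEL=$01: first table $2000, second table $3000
```

With those values, the write of $01 halfway down the screen moves the first table from the red tile at $0000 to the green tile at $2000, as described.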
Does anyone see any holes in my test protocol, other than what I just noted?
...
It does still bug me that higan doesn't work... have we really stumbled onto something ZSNES gets right and bsnes doesn't?
...
Re: Lion King - if I'm not mistaken, a 1CHIP will still report itself as a 2/1/3. There are physical telltales, but I don't know of a way to find out for sure without taking it apart.
93143 wrote:
It does still bug me that higan doesn't work... have we really stumbled onto something ZSNES gets right and bsnes doesn't?
I have a version of BSNES that said it was made on 6/26/2010 and it works, but it doesn't give a product version when I look under "properties".
I have bsnes v072 (10/25/2010) and higan v094 (the latest, 1/20/2014). The former works (except for the top scanline in accuracy core); the latter doesn't (it behaves exactly like v072 but with the colours flipped).
93143 wrote:
I should probably tweak it to test the offset too.
...well, that didn't take very long. I should have just done it earlier.
New feature - pressing one of the four face buttons will switch sprite #0 between the main and secondary sprite tables, by writing the appropriate value to OAM during VBlank. The HDMA now sets OBSEL to $08 at the top of the screen, meaning it picks up the green square at $2000 when it's using the secondary table above the split. But it still sets it to $01 halfway down, which means it picks up a yellow square at $3000 when using the secondary table below the split.
Works on my SNES. And this time it works in higan too (except for that top scanline in the accuracy core)...
EDIT: As might have been expected, the mid-screen split also happens one scanline lower in the accuracy core. So it's a timing issue...? Does higan cache the sprite graphics one scanline early as well as the OAM data? I just stared at my TV at close range for a bit, and it looks like there are indeed 8 scanlines worth of red tile when the program first starts up...
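For reference, the kind of HDMA table this test needs can be sketched in Python as below. It assumes direct-mode HDMA writing one byte to one register, with the table stored as (line count, data) pairs ending in a zero count, and it keeps every count at or below $7F so the repeat flag in bit 7 is never set. The split at scanline 112 is my guess at "halfway down" a 224-line screen, and `build_hdma_table` is a hypothetical helper, not code from the actual test ROM.

```python
def build_hdma_table(splits, max_count=0x7F):
    """Build a direct-mode HDMA table from (start_scanline, value) pairs."""
    if not splits or splits[0][0] != 0:
        raise ValueError("the first split must start at scanline 0")
    table = bytearray()
    for i, (start, value) in enumerate(splits):
        if i + 1 < len(splits):
            remaining = splits[i + 1][0] - start
            if remaining <= 0:
                raise ValueError("split scanlines must be strictly increasing")
        else:
            remaining = 1        # the register keeps the last value written
        while remaining > 0:
            # Long gaps are split into several entries that rewrite the same value.
            count = min(remaining, max_count)
            table += bytes([count, value & 0xFF])
            remaining -= count
    table.append(0x00)           # a zero line count terminates the table
    return bytes(table)


table = build_hdma_table([(0, 0x08), (112, 0x01)])
print(table.hex(" "))            # 70 08 01 01 00
```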
On the subject of fiddling with stuff midframe, has anybody actually tried to update the OAM during hblank? I know the oam address gets left at the hioam byte of the last sprite rendered, but can you manually change the oam address to where you want it? Sorry if this has already been tested, I am just frustrated with the way SNES sprites are laid out. I just want to use a huge load of 8x8s, like an NES on steroids.
psycopathicteen wrote:
I just want to use a huge load of 8x8s, like an NES on steroids.
I agree with you 1,000%. The main thing I don't like is how a 16x16 object with only 8x16 of the object filled in will count as 16 pixels instead of 8 toward the sprite-pixels-per-scanline limit. People always complain about how "slow" the SNES's CPU is (the people who do don't even know what a bit is. They just compare 7.6 with 3.58, and if both processors were exactly the same, the Genesis's CPU would blow the SNES's out of the water, but they aren't), but the sprite-pixels-per-scanline limit is a far greater problem to me. (I'd gladly get rid of BG 3 and transparency and mosaic and window layers and the useless 64x64 sprites and hi-res mode and all that jazz for 2x the amount of overdraw any day of the week.)
psycopathicteen wrote:
I know the oam address gets left at the hioam byte of the last sprite rendered, but can you manually change the oam address to where you want it?
You mean kind of like say you try to store a number at $0010, but it instead stores it at $0020, so you store it at $0000, knowing it will turn into $0010? I've wondered the same thing.
(You know, maybe we could get our good buddy 93143 to "help" us.)
Okay, so... I'm pretty sure it's possible to change sprite table locations mid-screen via OBSEL with zero artifacting regardless of how many sprites are onscreen or where they are. Just not the way I was doing it before...
The trick is apparently to do the write during active display, when the sprite system is busy with OAM, rather than during HBlank when VRAM fetch is going on.
First Try changes OBSEL with HDMA, which is probably why it's glitchy. In other words, my earlier attempt was a softball, and only worked because I wasn't loading it down heavily. (Fortunately the actual application is a situation where I've already used up all the HDMA channels and need to use an IRQ anyway...)
I wrote a test program that loads up the whole screen with unique sprites, plus an additional 16x16 (actually a quad of 8x8s - long story) that can be moved around by the user and changes colour and shape halfway down due to the table location change. No glitches, whether the OBSEL write happens fairly early in the scanline or very late. I hacked it up to have 32 sprites per line in the area of the switch, and a Mode 7 layer mathed with the whole screen. Still no glitches (though you wouldn't know it, looking at the mess I made of the display).
Attachment: spritetest2.7z [96.08 KiB]
There are two programs: "spritetest2" is the nice one, and "spritetest2s" is the hacked-up one. The hacked-up one has two 8x8 sprites deliberately missing from each of the two rows bracketing the switch, so as to allow the movable shape to be visible without clipping, and below those rows there are no sprites. I'd make the hacked-up one look nice too, but I've spent too long on this already...
EDIT: Just so it's clear, the name "spritetest2" never appeared in my internal workflow before now. The fact that the earlier test has the same name is an accident; I must have changed it for the forum, to distinguish it from the first version of "spritetest"...
Espozo wrote:
(You know, maybe we could get our good buddy 93143 to "help" us.)
Maybe someday. I only did this because it was easy, and I had a bit of breathing room...
That is a really nice find.
I already tried to do that on the Mega Drive to bypass the limit of 80 sprites per screen (in case we use many small sprites), and in fact it just doesn't work. Relocating the sprite table during display is not enough to get the sprites correctly updated on screen. The problem is that the MD VDP has an internal cache which stores part of the sprite table, and this cache is updated only on VRAM write operations (the VDP detects that you are writing to the sprite table area and updates the internal cache accordingly). So if you just change the sprite table VRAM address without actually rewriting the sprite table itself, the VDP's internal cache is not updated. That also leads to weird behavior, as some of the sprite attributes are always fetched from VRAM while others are read from the internal cache (Castlevania: Bloodlines uses that behavior in the water level for the sprite reflection effect).
I think he's talking about switching pattern banks midscreen, not doing sprite multiplexing.
Yeah, I think there may be a terminology issue here.
On MD, the "sprite table" is presumably the SAT, equivalent to OAM on the SNES. OAM is not in VRAM; it's separate memory, so you can't change where it is. Currently no one knows if it's possible to multiplex sprites during a frame on SNES, but if you can it's probably very difficult because of how the internal OAM addressing works.
What I meant by "sprite table" in the context of the SNES is the OBJ data, equivalent to... well, VRAM on the MD. One of the major issues with the SNES sprite system is the 16 KB sprite graphics data limit imposed by the fact that there are only two 8 KB OBJ tables available. What I have just shown is that it is possible to bust that limit under certain circumstances, by changing which two 8 KB areas of VRAM are considered sprite data partway through a frame. Unfortunately each scanline is still limited as before, which means this works best when the vertical position of the sprites is well controlled, such as when they are being used as a second BG in Mode 7, or maybe in a fighting game with very large characters (you can set it up so that either table can be relocated while leaving the other where it is).
So yeah - I haven't (yet?) figured out how to have more than 128 sprites on SNES without rendering them down on a coprocessor like the Yoshi's Island title screen does (or on the S-CPU like psycopathicteen's bullet hell tests do). I've just managed to allow the PPU to access more than 16 KB of graphics data for sprites in a single frame.
"Sprite VRAM bank switching" might have expressed this intent better.
Code:
7654 3210 OBSEL ($2101): Sprite size and bank select
|||| |+++- Address of first 8K sprite bank ($0000, $2000, $4000, or $6000)
|||+-+---- Offset from first to second bank ($1000, $2000, $3000, or $4000)
+++------- Sprite sizes (useful values: 0 for 8/16 and 3 for 16/32)
Super NES VRAM can be thought of as divided into eight banks, each $1000 words (8 KiB) in size. Each bank can hold up to 256 16-color tiles. The S-PPU's sprite unit has two windows, which can be set to one of the eight banks. (The first window can use only even banks.) Bit 0 of byte 3 of each OAM entry determines whether the first or second is used:
Code:
7654 3210 OAM[4x+3]: Sprite attributes
|||| |||+- Sprite tile window to use
|||| +++-- Palette to use ($80, $90, $A0, ..., $F0)
|||| Only $C0-$F0 are eligible for blending
||++------ Priority (3 in front)
|+-------- Horizontal flip
+--------- Vertical flip
A game can use the first window for sprites used throughout the screen and the second for things limited to the top or the bottom of the screen. This may require careful layout of the screen, not unlike the division into horizontal strips that Atari 2600 games rely on.
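As a quick illustration of the OAM attribute layout above, this Python sketch unpacks one attribute byte. The field and function names are made up for the example; the bit positions follow the diagram, and the CGRAM base is derived from the "$80, $90, ..., $F0" palette list.

```python
def decode_oam_attr(attr):
    """Unpack byte 3 of an OAM entry into named fields."""
    palette = (attr >> 1) & 0x07
    return {
        "window": attr & 0x01,                  # 0 = first tile window, 1 = second
        "palette": palette,
        "cgram_base": 0x80 + palette * 0x10,    # first CGRAM index of the palette
        "blendable": palette >= 4,              # only $C0-$F0 take part in blending
        "priority": (attr >> 4) & 0x03,
        "hflip": bool(attr & 0x40),
        "vflip": bool(attr & 0x80),
    }


fields = decode_oam_attr(0b0011_1001)
print(fields["window"], fields["palette"], hex(fields["cgram_base"]), fields["priority"])
# 1 4 0xc0 3
```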
For comparison, the Genesis VDP drops one priority bit and one palette bit to gain two more tile number bits, allowing use of all eight banks:
Code:
7654 3210 SAT[8x+4]: Sprite attributes
|||| |+++- Sprite tile bank
|||| +---- Horizontal flip
|||+------ Vertical flip
|++------- Palette to use
+--------- Priority (1 in front)
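The effect of those two extra tile number bits is easy to quantify. The Python sketch below just counts addressable tile data, assuming 32-byte 16-color tiles on both machines and an 8-bit tile number in the low byte; the function name is only for illustration.

```python
def addressable_tile_bytes(tile_number_bits, tile_bytes=32):
    """Bytes of tile data reachable with a tile number of the given width."""
    return (1 << tile_number_bits) * tile_bytes


snes = addressable_tile_bytes(8 + 1)       # 8-bit tile number + 1 window bit
genesis = addressable_tile_bytes(8 + 3)    # 8-bit tile number + 3 bank bits
print(snes, genesis)                       # 16384 65536
```

That is two 8 KiB banks on the Super NES versus all eight on the Genesis.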
93143 wrote:
On MD, the "sprite table" is presumably the SAT, equivalent to OAM on the SNES.
Yeah, SAT = Sprite Attribute Table. But people always drop the "Attribute" part.
It still looks like it would be crazy complicated to juggle sprites between pages.
psycopathicteen wrote:
It still looks like it would be crazy complicated to juggle sprites between pages.
Depends what you're doing, I guess. I'm using this technique for my shmup port, because a rendered layer over a Mode 7 backdrop eats a lot of OBJ memory. Sprite paging is relatively simple to handle in this case, since the rendered layer never moves, and the actual game sprites are mostly small and easy to hardcode.
It seems to me that a fighting game could be a plausible application too, as long as non-player sprites like projectiles are either locked to the player's vertical position or small enough to duplicate between tables without killing the space available for BG data.
I can see how this idea might be complicated to make use of in a beat 'em up game engine, though, or a platformer or a run 'n' gun or something like that...
Oh yeah, sorry, I misunderstood: only the sprite tile bank is switched here... I guess the sentence "it's possible to change sprite table locations mid-screen via OBSEL with zero artifacting" misled me because I'm too used to how the Mega Drive works. Indeed, on SNES, OAM is outside VRAM and cannot be relocated. Still, I think surpassing that 8/16 KB limit is interesting depending on the game situation... but bypassing the OAM limit would be even better, I believe.
What I tried to do on MD is to have 2 SATs (Sprite Attribute Table, or OAM for SNES): one for all sprites with Y position in [-31...128] and a second SAT for sprites with Y position in [96...223] (32 pixels of overlap are needed, so sprites in [96...127] are duplicated). That way I could get more than 80 sprites (the more you split the SAT, the more you can get), but unfortunately my method couldn't work :-/ I guess updating OAM during active display on SNES is even trickier than doing it on MD (from what I read, OAM cannot be written at all during active display, or you get corrupted results).
On MD I found a simple trick to display 128 sprites (it works in both H32 and H40 modes). You just need to arrange the sprites in the SAT so they are linked according to Y coordinate (while of course still respecting inter-sprite priority), and then you just need to modify the link field of sprite #0 during active display to bypass already-rendered sprites. You can modify it once or twice per frame if you really want to exploit the maximum of 128 sprites. This works because internally the link field is 7 bits, so the SAT size is 128 entries even if the VDP cannot parse more than 80 (linked) entries per scanline.
Imagine a separate SAT for every 16 scanlines.
"Holey DMA, Batman! That sounds like an Atari 7800!"
tepples wrote:
Imagine a separate SAT for every 16 scanlines.
I once considered porting Illumination Laser and legitimately thought about this >.> The idea would have been to reserve 16 entries to use for bullets, then every 16 scanlines this part of the SAT would be rewritten (since I'd only change the Y coordinate as all bullets would be 16×16 and link would remain the same as well, multiplexing would take up only like 3 scanlines, overhead included). Then at least there'd be less risk of running out of sprites when the game goes bullet happy. I'm not sure how feasible the extra complexity is in practice though =/
I just read an article about Cho Ren Sha for the Sharp x68000, and according to the developer, he used both sprite multiplexing, and dynamic animation, to have up to 512 sprites onscreen, and have unlimited animation frames. I'm not exactly sure what type of dynamic animation he used, but it sounded like he checked every individual 16x16 sprite if it was in VRAM or not, and looked for new slots when an existing slot was not found. I don't know if there were any memory bandwidth limitations for the x68000. It appears that sprites on the x68000 use a separate memory chip, so it might use an alternating bus system between the CPU and sprite chip.
psycopathicteen wrote:
I'm not exactly sure what type of dynamic animation he used, but it sounded like he checked every individual 16x16 sprite if it was in VRAM or not, and looked for new slots when an existing slot was not found.
psycopathicteen wrote:
I just read an article about Cho Ren Sha for the Sharp x68000, and according to the developer, he used both sprite multiplexing, and dynamic animation, to have up to 512 sprites onscreen, and have unlimited animation frames.
Yeah, 128 sprites per quarter screen, and in fact Illumination Laser is using the same code (a large part of the reason it slows down is because it's an amalgamation of tons of 3rd party libraries glued together lol, lots of syscalls going on)
psycopathicteen wrote:
I don't know if there were any memory bandwidth limitations for the x68000.
I couldn't find any sort of information on DMA to transfer data into sprite memory. Now, I may have missed something (I don't really know how to program the X68000 =P), but if that's indeed the case then it'd mean the CPU has to stream the graphics on its own, which is probably a cap in itself (although later X68k models use faster CPUs that don't waste as many cycles when doing bus accesses, and as I said this game is quite prone to slow down)
psycopathicteen wrote:
It appears that sprites on the x68000 use a separate memory chip
There are 32KB reserved just for sprite graphics, yeah.
Espozo wrote:
psycopathicteen wrote:
I'm not exactly sure what type of dynamic animation he used, but it sounded like he checked every individual 16x16 sprite if it was in VRAM or not, and looked for new slots when an existing slot was not found.
16×16 is literally the only sprite size on the X68000, so that probably makes things simpler =P
If it allows VRAM access anytime, it has the benefit of not having to delay an animation when the current frame is too DMA busy.
The Sega Genesis VDP's background rendering access pattern offers a few time slots for the CPU to access VRAM during the scanline. VRAM can be written at any time, but more than four writes in quick succession outside vertical or forced blanking will block the CPU until the next slot, causing slowdown. Does X68000's video chip have the same time slot limit?
In addition, if your sprite cels are compressed, you might need to delay an animation until all 16x16 pixel tiles in the cel have been decompressed.
Stef wrote:
I guess updating the OAM during active display on SNES is even tricker than doing it on MD (from what i read OAM could not be write at all during active display or you get corrupted result).
...yeah, if I've understood correctly, writing OAM during an active line would require ridiculously precise timing, since it's not dual-ported and there's no FIFO or anything, so the write goes to wherever the PPU is looking at the moment*. According to nocash, behaviour during HBlank is unknown and may depend on how many sprites/tiles are on the scanline being prepared. According to byuu, accesses during HBlank go to the HiOAM byte corresponding to the last sprite on the line, and rewriting the address doesn't help. Uniracers actually does this - possibly to prevent sprites from spilling past the screen split in 2-player mode, but I don't think anyone knows for sure...
I may poke at this problem a bit in the near future. Turns out it's actually quite easy to get the SNES to display the contents of OAM, which should help me figure out what's going on.
* The same is true of CGRAM. I managed to exploit this in my DMA direct colour demo, but unfortunately it's nowhere near as useful as the MD version because (a) the SNES has better colour depth normally, so the improvement isn't as noticeable, and (b) the 8-bit DMA cuts the resolution in half, to 64 "pixels" per line. It looks like a really high-colour Atari 2600 game (and when you think about it, it works a fair bit like one too)... on the flip side, since the trick doesn't involve forced blank, BG3 and sprites should still work, as long as they're kept clear of whichever screen the loading pattern is on (main or sub depending on the results of a DMA/PPU alignment test)...
Attachment:
dmac_oam.sfc [ 64 KiB | Downloaded 100 times ]
Quote:
On MD I found a simple trick to display 128 sprites (work both in H32 and H40 mode). You just need to arrange the sprite in SAT so they are linked somehow by Y coordinate (of course still respecting inter sprite priority correctly) so you just need to modify the sprite #0 link field during active display to bypass already rendered sprites. You can modify it once or twice per frame if you really want to exploit the maximum of 128 sprites. This work as internally the link field is 7 bits so the SAT size is 128 entries even if VDP cannot parse more than 80 (linked) entries per scanline.
Sneaky...
Yeah, I remember those DMA direct color demos.
Nice accomplishment; to be honest, I didn't expect to see it on SNES. The result is a bit less interesting because of the resolution, but just for fun it's cool to see it on SNES as well :p And honestly, even on MD it's not really useful, as you don't even have enough memory to display a complete screen... It's best to use Sega CD for full bitmap mode, or half screen on Mega Drive.
Stef wrote:
Nice accomplishment; to be honest, I didn't expect to see it on SNES.
Thanks! I didn't either, but then byuu said he'd "never once seen a CGRAM write fail", that it would just go to the wrong address during rendering, and it got me thinking.
Sprite multiplexing is of course a very different case. Not to mention that it would be really nice if it were useful as a drop-in enhancement, as your link field trick sounds like it is, rather than just a heavy duty tech demo that requires the rest of the system to stay out of the way. I guess I'll see what I can figure out next time I have the time and inclination...
Actually now that I think on it, Stef's idea wouldn't work because the sprite cache can't hold more than 80 sprites so any attempt to link beyond that will result in complete garbage (the VDP tries to use noise instead).
Also, if somebody figures out how to do multiplexed sprites on the SNES, there would be no way to control sprite priorities.
How so? Objects get drawn in order, (or at least for me they do) and each section will have its own counter for how many sprites are being drawn.
What if you have sprites A, B and C, and you want A to be in front of B, in front of C, but A and C are placed directly below a region boundary. How would this work?
Where is B? I'm not getting this.
If B is a little higher than A and C but still overlapping, you'd either need to write A and C around B, or include a second copy of B.
psycopathicteen wrote:
or include a second copy of B.
That's what I would have thought you'd do...
That's exactly what you'd do. On Atari 7800, you write a sprite to the display list of all 16-pixel-tall zones it crosses.
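As a sketch of the "second copy of B" idea: each sprite is added, in priority order, to the list of every zone its rows overlap, so a sprite that straddles a boundary keeps its place relative to A and C in both zones. The 16-line zone height follows the Atari 7800 example; the function names and the `(name, y, height)` tuple layout are hypothetical.

```python
ZONE_HEIGHT = 16  # zone height in scanlines, as in the Atari 7800 example


def zones_for_sprite(y, height, zone_height=ZONE_HEIGHT):
    """Return the indices of every zone that the sprite's rows overlap."""
    first = y // zone_height
    last = (y + height - 1) // zone_height
    return list(range(first, last + 1))


def build_zone_lists(sprites, num_zones, zone_height=ZONE_HEIGHT):
    """sprites: (name, y, height) tuples, already sorted by priority.

    A sprite crossing a boundary is written to both zones, and the
    priority order is preserved inside every zone list.
    """
    zones = [[] for _ in range(num_zones)]
    for name, y, height in sprites:
        for z in zones_for_sprite(y, height, zone_height):
            if 0 <= z < num_zones:      # ignore parts that are off screen
                zones[z].append(name)
    return zones
```

With A and C starting right below a boundary and B starting 8 lines higher, B ends up in both lists while A and C appear only in the lower one.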
Sik wrote:
Actually now that I think on it, Stef's idea wouldn't work because the sprite cache can't hold more than 80 sprites so any attempt to link beyond that will result in complete garbage (the VDP tries to use noise instead).
To be honest, I never tested the idea (changing the link mid-frame) on real hardware, but are you sure the internal cache is only 80 sprites long? I'm almost certain I remember games (such as Contra: Hard Corps) having sprites located in slots #80 to #127 and displaying them by shortcutting with the link field. In which case it would mean the sprite cache can actually hold all 128 entries...
Note to self: test that on real hardware ^^
Yeah, we checked that during Overdrive's development... Sorry =/ (at first it looks like it sorta works because the VDP doesn't apply any boundaries so it reads high impedance garbage which usually returns back the same old values, but it becomes pretty obvious it's broken not long after)
Argh, what a shame :'( The fact that it reads back old values for some time before reading garbage makes it even more confusing to understand what's happening...
tepples wrote:
That's exactly what you'd do. On Atari 7800, you write a sprite to the display list of all 16-pixel-tall zones it crosses.
What if there is a sprite D that gets cut off between the first DMA set and the second DMA set?
The zone list on the 7800 is a software-defined counterpart to the sprite evaluation in the NES PPU or any other TMS9918-style VDP. Because it's software-defined, it's not as efficient, but it's more flexible. For example, if you want sprite D to show up in one zone but not the adjacent one, such as if you're making a split screen or a "behind the background" type thing, then don't write the sprite to the other zone.
Re: SNES, if you can guarantee that there are no sprites on a particular line, you should be able to just force blank during the HBlank preceding that line and go nuts (unless there's a substantial delay before the OAM address becomes usable - I haven't tested this). Obviously that's not representative of the general case...
tepples wrote:
The zone list on the 7800 is a software-defined counterpart to the sprite evaluation in the NES PPU or any other TMS9918-style VDP. Because it's software-defined, it's not as efficient, but it's more flexible. For example, if you want sprite D to show up in one zone but not the adjacent one, such as if you're making a split screen or a "behind the background" type thing, then don't write the sprite to the other zone.
But between the "zones" you need several lines to fill up the OAM, which must be arranged carefully to avoid flickering and priority issues in the transitional region.
That is, unless you can write to oam before hblank starts.
The 7800 has 4K of unified memory. The zone list and the display lists it points to are inside main RAM, and MARIA (the video chip) pauses the CPU (by pulling RDY low) while it fetches display lists and patterns. CHR can be pointed at RAM or ROM.
But on the SNES, you'd probably only be able to update 8 sprites per scanline, and you'd probably want at least 16 per zone.
lidnariq wrote:
Espozo wrote:
if your TV is able to vertically stretch the display
I assume you mean "the input is letterboxed, remove the letterboxing" with a proportional scaling. Tepples has
pointed out in the past that the SNES/NES could have an active area of ≈256x164 to be 16:9 friendly.
That said, 160 pixels is kinda painfully low.
With 192 lines of vertical resolution you would have an extra 5KB of bandwidth. In total, almost 11KB of data transfer (10.89KB), plus the bit you would gain from compression (let's say 20%, that is 2.17KB).
In total that is 13.06KB. The resolution can be sacrificed if you don't need to use the whole surface of the screen (usually you have big areas of sky where nothing happens), and in exchange you get some truly amazing animation.
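The arithmetic behind the 192-line figure can be sketched roughly as follows. This assumes an NTSC console, about 165.5 bytes of DMA per scanline (1364 master clocks per line, minus 40 for DRAM refresh, at 8 clocks per byte), and the 5.72KB per-frame baseline quoted elsewhere in this thread for a 224-line display. The function names are made up for illustration, and per-transfer setup overhead is ignored.

```python
MASTER_CLOCKS_PER_LINE = 1364   # assumed NTSC scanline length in master clocks
REFRESH_CLOCKS_PER_LINE = 40    # assumed DRAM refresh pause per scanline
CLOCKS_PER_DMA_BYTE = 8         # assumed cost of one DMA byte
BASELINE_KB = 5.72              # per-frame figure quoted in this thread (224 lines)


def bytes_per_scanline():
    """Approximate number of bytes block DMA can move in one scanline."""
    return (MASTER_CLOCKS_PER_LINE - REFRESH_CLOCKS_PER_LINE) / CLOCKS_PER_DMA_BYTE


def dma_budget_kb(visible_lines, full_height=224):
    """Estimated DMA budget per frame when only visible_lines are displayed.

    Every line cut from the display is treated as one more line of
    forced blank available for DMA.
    """
    extra_lines = full_height - visible_lines
    return BASELINE_KB + extra_lines * bytes_per_scanline() / 1024
```

Under these assumptions `dma_budget_kb(192)` comes out at about 10.89, i.e. roughly 5.17KB on top of the baseline. This covers raw DMA only; it says nothing about compression.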
Edit:
Just a question...
If I'm transferring tiles from ROM to VRAM, can the DMA transfer tiles from WRAM to VRAM simultaneously?
That way, maybe WRAM could be used as a kind of cache for background animations (people walking, things exploding, etc.).
Señor Ventura wrote:
the bit you would gain from compression (let's say 20%, that is 2.17KB).
You can't send compressed graphics to VRAM, so you're stuck with 11KB. This is really more than enough though, sprites only have access to 16KB of VRAM and they're going to be what you're updating the most, and probably somewhere around 30fps, so you could probably get away with a vertical resolution of 208 which would give you 8.5KB per frame. I don't know what animation scheme you want to use, but I imagine you're not updating all of the VRAM available to sprites anyway.
Señor Ventura wrote:
If I'm transferring tiles from ROM to VRAM, can the DMA transfer tiles from WRAM to VRAM simultaneously?
No. DMA transfers happen one at a time.
Señor Ventura wrote:
That way, maybe WRAM could be used as a kind of cache for background animations (people walking, things exploding, etc.).
Why not just use DMA to VRAM from a different location in ROM?
Espozo wrote:
Señor Ventura wrote:
If I'm transferring tiles from ROM to VRAM, can the DMA transfer tiles from WRAM to VRAM simultaneously?
No. DMA transfers happen one at a time.
But you can set up eight of them (minus however many HDMA channels you're using) before vblank and then have them all run consecutively.
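A small model of that setup step, as a sketch rather than real SNES code: pending transfers are assigned to whichever of the eight channels are not reserved for HDMA, and a single enable mask is built so they can all be started at once. As far as I know, on hardware that mask is what gets written to the DMA enable register, and the channels then run one after another starting with the lowest number. The function name and the `(label, byte_count)` tuple shape are hypothetical.

```python
NUM_CHANNELS = 8


def assign_channels(transfers, hdma_channels=()):
    """Pair each pending transfer with a free DMA channel.

    transfers: list of (label, byte_count) tuples, in the order they should run.
    hdma_channels: channel numbers already reserved for HDMA.
    Returns (assignments, enable_mask), where assignments is a list of
    (channel, transfer) pairs and enable_mask has one bit set per used channel.
    """
    free = [ch for ch in range(NUM_CHANNELS) if ch not in hdma_channels]
    if len(transfers) > len(free):
        raise ValueError("more transfers than free DMA channels")
    assignments = list(zip(free, transfers))
    enable_mask = 0
    for ch, _ in assignments:
        enable_mask |= 1 << ch
    return assignments, enable_mask
```

For example, with channels 0 and 7 reserved for HDMA, two queued transfers land on channels 1 and 2 and the mask is `0b00000110`.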
Espozo wrote:
Señor Ventura wrote:
That way, maybe WRAM could be used as a kind of cache for background animations (people walking, things exploding, etc.).
Why not just use DMA to VRAM from a different location in ROM?
If the background animations are compressed, you need to DMA them from WRAM.* I seem to remember reading that when Super Mario World shows "Nintendo Presents" before the title screen, it's decompressing a lot of graphics to WRAM.
* Unless you're using a decompression ASIC like S-DD1, and you probably aren't.
tepples wrote:
But you can set up eight of them (minus however many HDMA channels you're using) before vblank and then have them all run consecutively.
So it's the same thing, except that it's done automatically.
The reason for my question is whether the tile graphics bandwidth always stays at 5.72KB while samples are being transferred to the SPC700's RAM.
But now I see that while graphics are being transferred to VRAM, nothing is going to the SPC700's RAM... and vice versa.
tepples wrote:
If the background animations are compressed, you need to DMA them from WRAM.* I seem to remember reading that when Super Mario World shows "Nintendo Presents" before the title screen, it's decompressing a lot of graphics to WRAM.
That sounds logical, because the CPU can't address VRAM.
Señor Ventura wrote:
tepples wrote:
But you can set up eight of them (minus however many HDMA channels you're using) before vblank and then have them all run consecutively.
So it's the same thing, except that it's done automatically.
And therefore without losing bandwidth to the CPU setting up the registers for another DMA transfer.
Señor Ventura wrote:
The reason for my question is whether the tile graphics bandwidth always stays at 5.72KB while samples are being transferred to the SPC700's RAM.
But now I see that while graphics are being transferred to VRAM, nothing is going to the SPC700's RAM... and vice versa.
Correct. As I understand it, the vast majority of Super NES games send audio data to the S-SMP using closed-loop programmed input and output (PIO). Block DMA works for graphics, but it's too fast for the S-SMP to receive. There are modern-day demos of using HDMA to send a few bytes each scanline open-loop, but I'm not aware of that being done in the Super NES's original commercial era (pre-1999). It's also possible to alternate between sending a few bytes of audio using PIO during draw time and sending 5.something kB of graphics using block DMA during the following vblank; I think this is what Jurassic Park does outside.
To emphasize: there is no reason at all why audio data would need to be transferred during VBlank. The APU doesn't even know about VBlank; it's on a completely separate clock and cannot receive interrupts. You can transfer data to it at any time during a frame.
The CPU can, in principle, poke data into VRAM manually, since there's a direct access port. It's just way faster to use DMA for anything more than a few sequential bytes.
You can also DMA from the cartridge to WRAM during active display - UNLESS you're using HDMA, because the earliest version of the S-CPU had a bug whereby DMA and HDMA could trip over each other and lock up the system. Very aggravating...
93143 wrote:
You can also DMA from the cartridge to WRAM during active display - UNLESS you're using HDMA, because the earliest version of the S-CPU had a bug whereby DMA and HDMA could trip over each other and lock up the system. Very aggravating...
Attached are official details on that bug (because I'm certain someone will want details on it).
Summarized in my own words:
Problem: If a block DMA finishes immediately before HDMA starts, S-CPU version 1 might freeze. This may occur especially in the first 2.24 microseconds (48 master clocks) after the start of horizontal blanking.
Workarounds:
- Beginner: Don't enable block DMA while HDMA is enabled. If you use HDMA, do your block DMA during vertical blanking, or do it on a scanline when HDMA is inactive.
- Advanced: Carefully time your block DMA to finish outside horizontal blanking. Wait for an H-count interrupt, and adjust the start time until it works without crashing on a 1/1/1 (the yellowest of the yellow).
tepples wrote:
Advanced: Carefully time your block DMA to finish outside horizontal blanking. Wait for an H-count interrupt, and adjust the start time until it works without crashing on a 1/1/1 (the yellowest of the yellow).
Timed code after an interrupt? It seems like just regularly copying data would be faster at this point, although I don't have a clue how long HBlank actually is.
Does anyone know the prevalence of the different SNES revisions? Mine is a 2/1/3. Kind of a silly idea, but if you're really pressed for CPU time, you could check which version the CPU is, and only if it's 1, use the less conventional methods that ensure the CPU doesn't freeze. Those could potentially lead to slowdown, which isn't good, but it's obviously better than a lockup. I really don't get the point of console revisions where something like this is purposely ironed out, because it's tempting to disregard the quirks of older versions, and I bet companies want games to work on all systems. (I think I heard something about this with the original PlayStation.) I can't actually think of any games that only work on one system revision though, and not just on the SNES. (I'm not counting something like Game Boy games that only work on the Game Boy Color, because the Game Boy Color is advertised as being a different system.)
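The version check could look something like this sketch. It assumes the CPU revision can be read from the low four bits of the NMI flag register at $4210, and that only revision 1 has the DMA/HDMA bug; the function names and the strategy labels are invented for illustration.

```python
def cpu_version(rdnmi_value):
    """Extract the CPU revision, assumed to be the low 4 bits of $4210."""
    return rdnmi_value & 0x0F


def pick_dma_strategy(rdnmi_value, hdma_enabled):
    """Choose when block DMA is allowed to run.

    Only the combination of S-CPU revision 1 and active HDMA needs the
    cautious path; everything else is assumed free of the bug.
    """
    if hdma_enabled and cpu_version(rdnmi_value) == 1:
        return "dma_in_vblank_only"
    return "dma_anytime"
```

A game would read the register once at boot and keep the result, so the slower path only costs anything on the affected consoles.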
I'm not sure if this is the right thread or not, but since this is a "beat 'em up" thread I think it's okay.
I've been trying to investigate the reason why the Street Fighter series has such thick letterboxing. I'm thinking that Street Fighter 2 might be uploading every 16x16 sprite pattern separately, since the frames use a lot of recycled parts.