Have you spent time in an emulator's real-time debugger? Stepping through the code there will almost certainly tell you what is going on and why your code is "going crazy". Chances are you have a loop that never ends (so things go off into la-la land, address-wise) or some other buggy code (duh).
I don't see any PHX/PHY/PLY/PLX code in these. And because of the naming convention + lack of a good description of what you changed and where, it takes me a lot of time to sift through this to see what you did that might be causing the problem. I'll try to focus on what looks like it may be a problem:
Code:
Sprites.asm --
.DEFINE SpriteBuf1 $0400
.DEFINE SpriteBuf2 $0600
MetaspriteTest2.asm --
.DEFINE TempX $1800
.DEFINE TempY $1802
.DEFINE MapX $1804
.DEFINE MapY $1806
InfiniteLoop:
WAI
ldy #$00
ldx #MetaspriteTable
sty TempY
stx TempX
jsr start_metasprite
Metasprite2.asm --
start_metasprite:
php
sep #$10
rep #$10
ldy TempY
ldx TempX
build_metasprite:
lda $00,x
beq metasprite_done
inx
lda $00,x
sta SpriteBuf1,y
inx
lda $00,x
sta SpriteBuf1+1,y
iny
inx
iny
iny
iny
bra build_metasprite
metasprite_done:
plp
rts
MetaspriteTable:
.DB $01,$00,$00,$01,$10,$00,$00
1. Based on past conversation, I believe the indexes are 16-bit at the time the LDX/LDY statements in InfiniteLoop are executed. I'm just noting this here because it matters.
2. The SEP #$10/REP #$10 pair in the start_metasprite routine makes no sense -- you're setting 8-bit indexes, followed immediately by setting 16-bit indexes. Neither instruction touches the accumulator size (that's bit $20 of P, not bit $10), so I'm not sure what the accumulator size is at the time the later code is run. I've covered why this matters to the assembler (it tracks the register sizes as best as it can -- you could verify it if WLA DX wasn't a pile of junk with listing files... sigh...), but it also matters when your code is being run in real-time.
To me, based on the code in build_metasprite, it clearly looks like you want an 8-bit accumulator because you're doing stuff like this:
Code:
lda $00,x
beq metasprite_done
inx
lda $00,x
sta SpriteBuf1,y
inx
lda $00,x
sta SpriteBuf1+1,y
Let's assume X=$8F00 and Y=$0000 when we enter this code, just for understanding what is going on here. Again, I'm discussing this because your goal is trying to convert a routine to use 16-bit values. The code then gets executed like this in real-time (see comments) -- I'm excluding the BRA statement because it's implied/unconditional:
Code:
lda $00,x ; Load accumulator from $0000,x (effective address = $8F00)
beq metasprite_done ; If zero, branch -- value is not zero
inx ; X = $8F01
lda $00,x ; Load accumulator from $0000,x (effective address = $8F01)
sta SpriteBuf1,y ; Store at $0400,y (effective address = $0400)
inx ; X = $8F02
lda $00,x ; Load accumulator from $0000,x (effective address = $8F02)
sta SpriteBuf1+1,y ; Store at $0400+1,y (effective address = $0401)
iny ; Y = $0001
inx ; X = $8F03
iny ; Y = $0002
iny ; Y = $0003
iny ; Y = $0004
lda $00,x ; Load accumulator from $0000,x (effective address = $8F03)
beq metasprite_done ; If zero, branch -- value is not zero
inx ; X = $8F04
lda $00,x ; Load accumulator from $0000,x (effective address = $8F04)
sta SpriteBuf1,y ; Store at $0400,y (effective address = $0404)
inx ; X = $8F05
lda $00,x ; Load accumulator from $0000,x (effective address = $8F05)
sta SpriteBuf1+1,y ; Store at $0400+1,y (effective address = $0405)
iny ; Y = $0005
inx ; X = $8F06
iny ; Y = $0006
iny ; Y = $0007
iny ; Y = $0008
Do you now see -- when considering the offsets (effective addresses) discussed -- how/why 8-bit accumulator vs. 16-bit accumulator matters here? A 16-bit accumulator, during your lda $00,x instructions, is going to load two bytes starting at that address -- e.g. $8F00 and $8F01 in a single load. Yet your INX statements seem to indicate you're working with things at a byte level.
So here's the thing -- and I have not checked to see if this is the case (this is for you to do!): the wrong accumulator size could in fact account for this loop never ending and a lot of weird things going on (depends on what the NMI handler is doing). The only way the loop ends is if the accumulator, when loading from $0000,X, is zero. With a 16-bit accum that means the value has to be $0000, with an 8-bit accum that means the value has to be $00.
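If your debugger doesn't make the accumulator size obvious, one low-tech way to check is to have the code itself save the processor status somewhere you can look at in a memory viewer. A rough sketch, not tested against your project: it assumes native mode, that the data bank can reach low RAM (your own stores to $0400 already assume that), and that $1808 is a free byte. DebugP is a name I made up; it isn't in your source. Drop the body right before the build_metasprite label:

```asm
.DEFINE DebugP $1808    ; hypothetical scratch byte -- pick any free RAM

    php                 ; push P as it is right now (the value we want to inspect)
    rep #$20            ; 16-bit A so both bytes of A get saved
    pha                 ; stack now: 1,s = A low, 2,s = A high, 3,s = saved P
    sep #$20            ; 8-bit A so we move exactly one byte
    lda 3,s             ; read the saved P without disturbing the stack
    sta DebugP          ; bit $20 set = 8-bit A, bit $10 set = 8-bit X/Y
    rep #$20            ; back to 16-bit A for the pull
    pla                 ; restore the full accumulator
    plp                 ; restore P exactly as it was
```

After it runs, look at $1808: if bit $20 is clear, the accumulator was 16-bit when the loop was entered. Keep in mind the assembler only sees the last REP/SEP, so if immediate loads follow this snippet you may need an explicit .ACCU 8 or .ACCU 16 after the PLP to tell WLA DX which size applies.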
So let's take a look at what would happen with a 16-bit accumulator (again, not sure if this is the case! -- for you to figure out!), showing the actual values that are getting read/written. Again, assuming X=$8F00, Y=$0000 when hitting this code. And remember: the 65816 is little-endian (in case you wonder why some of the "values loaded" are "reversed"):
Code:
lda $00,x ; Effective address $8F00, A=$0001 (bytes 0,1 of MetaspriteTable)
beq metasprite_done ; Not zero, don't branch
inx ; X=$8F01
lda $00,x ; Effective address $8F01, A=$0000 (bytes 1,2 of MetaspriteTable)
sta SpriteBuf1,y ; Effective address $0400, 16-bit store writes $00,$00 to $0400-$0401
inx ; X=$8F02
lda $00,x ; Effective address $8F02, A=$0100 (bytes 2,3 of MetaspriteTable)
sta SpriteBuf1+1,y ; Effective address $0401, 16-bit store writes $00,$01 to $0401-$0402
iny ; Y=$0001
inx ; X=$8F03
iny ; Y=$0002
iny ; Y=$0003
iny ; Y=$0004
lda $00,x ; Effective address $8F03, A=$1001 (bytes 3,4 of MetaspriteTable)
beq metasprite_done ; Not zero, don't branch
inx ; X=$8F04
lda $00,x ; Effective address $8F04, A=$0010 (bytes 4,5 of MetaspriteTable)
sta SpriteBuf1,y ; Effective address $0404, 16-bit store writes $10,$00 to $0404-$0405
inx ; X=$8F05
lda $00,x ; Effective address $8F05, A=$0000 (bytes 5,6 of MetaspriteTable)
sta SpriteBuf1+1,y ; Effective address $0405, 16-bit store writes $00,$00 to $0405-$0406
iny ; Y=$0005
inx ; X=$8F06
iny ; Y=$0006
iny ; Y=$0007
iny ; Y=$0008
lda $00,x ; Effective address $8F06, A=$??00 (bytes 6,? of MetaspriteTable)
beq metasprite_done ; ??? does this branch? Depends on if the byte "past the end of the table" is zero!
Make sure you note the last couple of lines. Compare that to what's in your table. I simply do not know what WLA DX is storing "after" the last byte you define in MetaspriteTable. But you want my guess? I'd be willing to bet you that it's got a non-zero value there -- for example it could be the first byte of .\\GamePictures\\hovertransport.map. Again (I'm like a broken record): a listing file would tell you this. And thus if your accumulator is 16-bit when that code runs, the loop would blow right past the end of the table, and you'd end up with a loop that essentially writes all over RAM indefinitely. SpriteBuf1 points to $0400, but in an infinite loop with indexes being used, you would end up writing to all of memory space within $0000-FFFF ($0000-1FFF is RAM, the other writes would essentially do nothing right now), and reading from all over the bank the code is executing in. The result would be a total utter mess and almost certainly cause bizarre behaviour program-wise and visually.
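If the debugger confirms the accumulator really is 16-bit at that point, one possible fix is to have start_metasprite set both register sizes itself instead of inheriting whatever the caller happened to have. This is a sketch of what I mean, not a verified drop-in: I'm assuming 8-bit A with 16-bit X/Y is what you want (based on the byte-at-a-time INX pattern), and everything else is your routine as posted:

```asm
start_metasprite:
    php                 ; save the caller's register sizes and flags
    rep #$10            ; X/Y = 16-bit (table address and buffer offset)
    sep #$20            ; A = 8-bit (the table is read one byte at a time)
    ldy TempY
    ldx TempX
build_metasprite:
    lda $00,x           ; first byte of the entry: $00 = end of table
    beq metasprite_done
    inx
    lda $00,x
    sta SpriteBuf1,y
    inx
    lda $00,x
    sta SpriteBuf1+1,y
    inx                 ; X advances 3 bytes per entry
    iny
    iny
    iny
    iny                 ; Y advances 4 bytes per entry
    bra build_metasprite
metasprite_done:
    plp                 ; restore the caller's sizes
    rts
```

With the sizes set inside the routine, the BEQ tests a single byte, so the lone $00 at the end of MetaspriteTable is enough to stop the loop no matter what follows it in ROM.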
P.S. -- This does not take someone "half a minute" to fix. My posts to you have actually taken hours of time as an aggregate (this one took almost 2 hours). Debugging someone else's code is always time-consuming because the person has to reverse-engineer it every single time (esp. when there's no clear explanation of what changed, i.e. "this is what it was before, and this is what I changed it to"). The author of the code should know what their program is doing better than someone who's being asked for help. Just a general rule, haha. :-)
And no, the developers manual would not help you here. This is purely a programming thing. I think you need to get more familiar with using a real-time debugger, because it could save you a lot of time assuming you know what it is you're looking at. Yes, the tools (emulator debuggers) tend to suck and they usually lack source code integration (so what you see in the debugger doesn't correlate with the code you wrote; all values are pre-assembled, again why a listing file is helpful...), so I understand the pain. But once you go through it a few times you'll get in the habit of knowing what you're seeing.
The one thing I wish emulators had was tie-ins for forcing a break/drop-to-debugger using native code. On the 65816, honestly I feel the best way to do this would be to use either BRK (but some games might use this legitimately?), COP, or WDM opcodes (the latter is certainly never used -- opcode $42 is the one leftover byte in the opcode table that is supposed to act as a NOP when run, so it'd work great for this, IMO). BRK and COP at least have 1 byte operands ("signature bytes"). It'd be really convenient to tell someone "Just put brk $ff where you want the debugger to kick in so you can step through your code", rather than futz with all of this.
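Something like this is what I have in mind for the WDM variant. Purely hypothetical: DebugBreak is a made-up macro name, and whether anything actually stops on it depends entirely on the emulator offering such a hook. I'm emitting raw bytes with .DB so it doesn't depend on how the assembler spells WDM:

```asm
.MACRO DebugBreak
    .DB $42, $00        ; WDM opcode plus the byte it skips; runs as a 2-byte NOP
.ENDM

InfiniteLoop:
    wai
    DebugBreak          ; a debugger with a WDM hook would stop here
    ldy #$00
    ldx #MetaspriteTable
    sty TempY
    stx TempX
    jsr start_metasprite
```

On real hardware (or an emulator without the hook) the two bytes just execute as a NOP, so leaving one in by accident costs a couple of cycles and nothing else.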