16bit table indexing problem
by Drew Sebastino on 2015-01-17 (#139483)

I have no idea why, but some code I am trying to write doesn't work. I think there is a problem with me loading values from a table using a 16bit accumulator, because the first value I'm loading is $FFFF and beq-ing it, and for some reason, it thinks it is 0 and branches when it is not supposed to. (Either that, or my code is just really messed up. :oops: ) I tried setting up the table like .DB $FF,$FF,$00,$00 (and so on) because WLA said I couldn't do .DB $FFFF,$0000.

Here's my code:

Attachment:

Metasprite Demo.rar [222.56 KiB]

The code is just the metasprite one I had made, but I tried to make it work with 16bit values. A sprite still appears on the screen, but only because I have it set to using hioam.

Re: 16bit table indexing problem
by koitsu on 2015-01-17 (#139485)

You want .dw $ffff,$0000, by the way. That means "define word", and a word on 65816 is 16-bit. .dw means define-word, .db means define-byte.
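
As a rough illustration (the label names are made up for this example), the two tables below should come out as the same four bytes, since the 65816 stores a word low byte first:

Code:

; Hypothetical labels, for illustration only.
TableAsWords:
   .DW $FFFF,$0000          ; two 16-bit words

TableAsBytes:
   .DB $FF,$FF,$00,$00      ; the same four bytes written out by hand, low byte first

The byte order only becomes visible when the two halves differ: .DW $1234 is stored as $34,$12.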

I wouldn't be surprised if .db $ff,$ff caused the assembler to generate odd code when using 16-bit lda/etc. statements that reference them. Did you generate an assembly listing (I explained how to do this in another thread recently -- the procedure is the same but use wla-65816.exe for assembling of course) and see if the code being generated is what you expect?

P.S. -- No I haven't looked at the source.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-17 (#139488)

I fixed the table, but sadly, it still doesn't work :(. Also, about the assembly listing, do you create it by writing -i between wla and the filename? Will it work with the wla.bat file?

Re: 16bit table indexing problem
by koitsu on 2015-01-17 (#139492)

Espozo wrote:

I fixed the table, but sadly, it still doesn't work :(. Also, about the assembly listing, do you create it by writing -i between wla and the filename? Will it work with the wla.bat file?

You have to use -i as a flag to both the assembler and the linker. And of course it will work in a batch file -- there's nothing different about one of those compared to, say, doing it manually. I.e.:

Code:

wla-65816 -i -o %1.asm %1.obj
wlalink -i -vr temp.prj %1.fig

(Who uses FIG files in this day and age?!? Good lord, talk about an uncommon ROM format even back in the 90s...)

Also, could you at least give some relevant code bits/lines (and please reference the filename) where something like lda #$ffff / beq {label} is actually branching? (It should not -- ever. lda #$ffff would guarantee the zero flag in the CPU is not set, thus the beq would never happen). I have a feeling the lda statement you're using is actually loading a zero value from somewhere you're not expecting it, hence why the branch works.
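
To spell out the flag behaviour I mean, here's a minimal sketch (the IsZero label is hypothetical, and it assumes a 16-bit accumulator at this point):

Code:

   rep #$20          ; 16-bit accumulator (assumed for this sketch)
   lda #$ffff        ; A = $FFFF, so Z = 0 and N = 1
   beq IsZero        ; never taken: Z is clear
   lda #$0000        ; A = $0000, so Z = 1
   beq IsZero        ; always taken: Z is set
IsZero:

If a beq after a load really is being taken, the value that was loaded was zero, whatever the source code appears to say.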

There are 7 assembly-related files in this RAR, so being more specific would be quite helpful. I assume you know because you've run this through a debugger (ex. Geiger's SNES9x build with debugging) to see what the behaviour is.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-17 (#139493)

Oops...

MetaspriteTest.asm is the main file, which jumps to Metasprite. (don't worry about all the other files.) The part where it jumps to Metasprite is under "InfiniteLoop" and the table is right near the end of MetaspriteTest. You're going to have to change .db to .dw and combine the two bytes together.

Don't hesitate to ask any questions. I appreciate you trying to help me.

Quote:

(Who uses FIG files in this day and age?!? Good lord, talk about an uncommon ROM format even back in the 90s...)

The .bat file.

(I've honestly never heard of .fig files before.)

Edit: I just looked up and downloaded Snes9x debugger and wow does it look useful. (I currently use bsnes debugger)

Re: 16bit table indexing problem
by koitsu on 2015-01-17 (#139495)

I finally got it to assemble and link (I haven't bothered running it yet).

Wow, these listing files are absolute utter garbage. I don't even know what to say about this.

Code:

                                        
                                        ;=== Include MemoryMap, VectorTable, HeaderInfo ===
$78                                     .INCLUDE "Header.inc"
$18                                     
$FB                                     ;=== Include Library Routines & Macros ===
                                        .INCLUDE "LoadGraphics.asm"
$C2 $38                                 .INCLUDE "InitSNES2.asm"
                                        .INCLUDE "2input.asm"
$A9 $80                                 .INCLUDE "Sprites.asm"
$A9 $A0                                 .INCLUDE "Metasprite.asm"
$A9 $00                                 
$A9 $90                                 ;==============================================================================
$8D $21 $21                             ; main
$8D $21 $21                             ;==============================================================================
$A2 $FF $1F                             
$8D $21 $21                             .DEFINE MapX      $18
$8D $21 $21                             .DEFINE MapY      $1A
$A9 $01                                 .DEFINE XPosition   $1C
$A9 $01                                 .DEFINE YPosition   $1E
$9A                                     
$A9 $01                                 .BANK 0 SLOT 0
$A9 $01                                 .ORG 0
$A2 $24 $92                             .SECTION "WalkerCode"

This doesn't even make any sense no matter how you try to interpret it. All I've managed to determine is this: the assembler and/or linker (I don't know which) is outputting actual bytes that correlate with code of the actual program, but the lines it outputs do not correlate with the actual bytes shown -- e.g. assembler directives/pseudo-ops end up getting intermixed with actual assembly code.

For example:

Code:

$78                                     .INCLUDE "Header.inc"
$18                                     
$FB                                     ;=== Include Library Routines & Macros ===

$78 is sei, $18 is clc, $FB is xce. These are the first 3 instructions in InitSNES2.asm, so what the hell are they doing interspersed/intermixed with directives in MetaspriteTest.asm? I also noticed that each .asm file gets its own .lst file, except those ALSO look wrong! What this means is that this thing is designed completely wrong when it comes to use of .INCLUDE directives -- i.e. listing files cannot be reliably generated with this assembler.

Plain and simple: this assembler is a piece of shit. Who uses this thing?! Good god. It's no wonder nobody can get any programming done if the tools are in this kind of shape. x816 for DOS worked better than this.

I can tell this whole thing is solely focused around UNIX development environments, and reliance on actual GNU Makefiles. You know how I know that? Because there, you wouldn't be using .INCLUDE -- you'd have your Makefile reference every individual assembly file, and call wla-65816 on each one individually (i.e. 7 assembly files, 7 calls to wla-65816). I'm going to modify your wla.bat (heavily) to do this, to work around the obvious brokenness with listing file generation.

Consider me disgusted.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-17 (#139497)

koitsu wrote:

Plain and simple: this assembler is a piece of shit. Who uses this thing?! Good god. It's no wonder nobody can get any programming done if the tools are in this kind of shape. x816 for DOS worked better than this.

Are you referring to WLADX? What do you use then? Also, if I remember correctly, the .bat file came with SNESstarterkit.

Re: 16bit table indexing problem
by tepples on 2015-01-17 (#139498)

koitsu wrote:

I can tell this whole thing is solely focused around UNIX development environments, and reliance on actual GNU Makefiles.

You don't need UNIX to run GNU makefiles. They run just fine in MSYS, a port of Bash, GNU Make, and GNU Coreutils to Windows.

One advantage of setting up a makefile (compared to a batch file that just reassembles everything all the time) is that if you change (say) an image file, Make will reconvert it to tiles, recompress it, reassemble the one source code file that contains an .incbin referencing the compressed tiles, and relink it. In large, complicated projects, this can save a lot of CPU and disk time over reconverting, recompressing, and reassembling everything in your project. This in turn means less time spent between Ctrl+R and the emulator opening, especially on an Atom netbook, though you can still force a full rebuild (make clean && make).

Quote:

You know how I know that? Because there, you wouldn't be using .INCLUDE -- you'd have your Makefile reference every individual assembly file, and call wla-65816 on each one individually (i.e. 7 assembly files, 7 calls to wla-65816).

The ca65 toolchain does the same thing, with ca65 generating .o files that ld65 combines into an executable .sfc, .nes, or whatever.

Re: 16bit table indexing problem
by koitsu on 2015-01-17 (#139501)

I'm not going to tell someone to install MSYS or Cygwin or other environments just to make a SNES ROM, Tepples. It's completely unnecessary and overkill. You should know better than to try and go that route with me. ;-)

My point stands: listing files with WLA DX are worthless if .INCLUDE is used anywhere.

I should note I tried separating this project up into how it "should" be done (using the project file correctly, assembling things as libraries, etc. then putting it all together during link-time) but gave up after repeatedly banging my head against a wall with this error message from the assembler: INTERNAL_PASS_1: A section must be open before any code/data can be accepted. (The file in question only contained code within .SECTION and .ENDS blocks (I even moved the macros into their own), so the message makes no sense). The documentation is just as abysmal as it was the last time I looked at it (months ago).

I guess I'll say fuck listing files for now and I'll just go look at the thing in a real-time debugger and focus on the real problem.

Re: 16bit table indexing problem
by Sik on 2015-01-17 (#139502)

koitsu wrote:

I'm not going to tell someone to install MSYS or Cygwin or other environments just to make a SNES ROM, Tepples. It's completely unnecessary and overkill. You should know better than to try and go that route with me. ;-)

To be fair, you don't need any of those to use make (you just don't get *nix commands, but if your makefile doesn't use them then that doesn't matter - yes, that'd make sense with SNES development since you'll be using uncommon tools anyway). MinGW already comes with its own make (no, MSYS isn't needed), so one really could just take the make from MinGW (literally just one exe) and use that.

I guess one may want to distribute make on its own instead of telling people to get MinGW, but that'd work.

Re: 16bit table indexing problem
by 93143 on 2015-01-17 (#139504)

Espozo, are you still using the super old version of WLA DX, or have you managed to compile the latest one?

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-17 (#139505)

I honestly have no idea how to compile it. (If someone could post it already compiled in a rar, I would greatly appreciate it.)

Re: 16bit table indexing problem
by koitsu on 2015-01-17 (#139507)

Okay, I haven't really sifted through the entire program/code to figure out what all is going on here, but let's talk about the code and what it's actually doing. First, MetaspriteTest.asm has this around label InfiniteLoop. Apologies for formatting mistakes, as this code uses hard tabs rather than spaces (but not consistently):

Code:

InfiniteLoop:

   WAI

   ldy #$00
   ldx #MetaspriteTable
   jsr start_metasprite

...

MetaspriteTable:
   .DB $FF,$FF,$00,$00,$00,$00,$00,$00,$FF,$FF,$00,$10,$00,$00,$00,$00,$00,$00

This code gets turned into the below, effectively due to run-time register sizes and so on:

Code:

  wai
  ldy #$0000
  ldx #$8405
  jsr $820e

Here it is in the debugger (this took me a bit to do). The best place to set an exec breakpoint is at $8312.

Code:

Disassembly:
$00/8311 CB          WAI                     A:0000 X:0000 Y:0000 P:envmxdIZC
$00/8312 A0 00 00    LDY #$0000              A:0000 X:0000 Y:0000 P:envmxdIZC
$00/8315 A2 05 84    LDX #$8405              A:0000 X:0000 Y:0000 P:envmxdIZC
$00/8318 20 0E 82    JSR $820E  [$00:820E]   A:0000 X:0000 Y:0000 P:envmxdIZC

$8405 happens to be the 16-bit address (pre-calculated during assembly-time) of the memory location of MetaspriteTable. Let's start by asking: is that the correct address? Let's find out. And of course SNES9x won't let me copy/paste the Hex Editor portion... wonderful, so I get to type all of this in manually:

Code:

008400 8D 10 21 28 60 FF FF 00 00 00 00 00 00 FF FF 00
008410 10 00 00 00 00 00 00 FF FF FF FF FF FF FF FF FF
008420 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

That looks correct. So with that in mind, let's go look at start_metasprite and see what it's doing with the X register.

Code:

start_metasprite:
   php
   rep #$10
   sep #$30

build_metasprite:
   lda $0000,x
   beq metasprite_done
   inx
   lda $0000,x
   clc

Initially this looks right, but I can already see multiple catastrophic bugs given the assumptions of the programmer vs. what the processor will do. Let's see what the real-time debugger has to say:

Code:

SNES reset.
$00/823C 78          SEI                     A:0000 X:0000 Y:0000 P:EnvMXdIZC
$00/8312 A0 00 00    LDY #$0000              A:00FF X:0000 Y:1000 P:envMxdiZC
$00/8315 A2 05 84    LDX #$8405              A:00FF X:0000 Y:0000 P:envMxdiZC
$00/8318 20 0E 82    JSR $820E  [$00:820E]   A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/820E 08          PHP                     A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/820F C2 10       REP #$10                A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8211 E2 30       SEP #$30                A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8213 B5 00       LDA $00,x  [$00:0005]   A:00FF X:0005 Y:0000 P:eNvMXdizC
$00/8215 F0 23       BEQ $23    [$823A]      A:0000 X:0005 Y:0000 P:envMXdiZC

Your REP/SEP are in the wrong order. By doing REP #$10 / SEP #$30, you are setting 16-bit indexes, then setting 8-bit accumulator and 8-bit indexes. There are two ways to solve this, but one is wrong and the other is right. These are your options:

Code:

  rep #$30   ; A=16, X/Y=16
  sep #$20  ; A=8

Or:

Code:

  sep #$30  ; A=8, X/Y=8 (upper byte of X/Y is zeroed here)
  rep #$10  ; X/Y=16

Guess which one is the correct way? The first one. The 2nd one will introduce a horrible bug: you'll lose the upper byte of the X/Y index registers -- it'll be zeroed. This happens on the 65816 and ONLY with the index registers. You can swap between 16-bit and 8-bit accumulator without the full 16-bit contents being affected, but with indexes, upon going to 8-bit you lose the upper byte. In fact, that's happening with your code already. Look closely at the contents of X:

Code:

$00/8211 E2 30       SEP #$30                A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8213 B5 00       LDA $00,x  [$00:0005]   A:00FF X:0005 Y:0000 P:eNvMXdizC

See how it goes from $8405 to $0005, all because you set 8-bit indexes?
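
Here's a small sketch of the difference between the two register types (the values are arbitrary, picked only for the example):

Code:

   rep #$30          ; 16-bit accumulator and indexes
   ldx #$8405        ; X = $8405
   lda #$1234        ; A = $1234
   sep #$30          ; 8-bit A and X/Y: X becomes $0005, A's upper byte is kept (hidden)
   rep #$30          ; back to 16-bit: X is still $0005, A reads $1234 again

Switching back to 16-bit indexes does not bring the upper byte of X back; it has to be reloaded.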

So let's go with REP #$30 / SEP #$20 and see how things look after that:

Code:

$00/8318 20 0E 82    JSR $820E  [$00:820E]   A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/820E 08          PHP                     A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/820F C2 30       REP #$30                A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8211 E2 20       SEP #$20                A:00FF X:8405 Y:0000 P:eNvmxdizC
$00/8213 B5 00       LDA $00,x  [$00:8405]   A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8215 F0 23       BEQ $23    [$823A]      A:00FF X:8405 Y:0000 P:eNvMxdizC

Much better.

I should also point out here: you should not be using .dw $ffff,... like we discussed earlier. You are using an 8-bit accumulator, not a 16-bit accumulator (despite what you said earlier). And your build_metasprite routine is coded to use 8-bit accumulators as well. If you were to turn on 16-bit accumulator, your routine would break (look closely at where you're storing the results in SpriteBuf1 and what hard-coded math you're using there!).

I hope this has been a lesson in why having an assembler that generates proper/decent listings is VERY IMPORTANT. The fact WLA DX can't do this sanely/correctly is ridiculous.

I see other bugs in this program though, depending on how intelligent WLA DX is about knowing about addressing modes and banks, and when those crop up you're going to be crying big tears. Case in point: lda $0000,x right now is getting assembled into $b5 $00 (LDA directpage,X), which only works because your MetaspriteTable data happens to be within the same bank as the code that's running (in bank $00).

Eventually you're going to have to break outside of that (dealing with multiple banks); for example if MetaspriteTable was in a different bank (say bank $01 or $81 (same thing in this memory mode)), then the above code would be wrong and manifest itself by misbehaving in real-time: you'd be loading the wrong data: it'd be coming from bank $00 (where direct page is hard-coded to live) rather than where B was. In other words:

Code:

  rep #$10
  ldx #MetaspriteTable
  sep #$20
  lda #$01
  pha
  plb
  lda $0000,x

WLA DX may end up screwing you by optimising lda $0000,x into $b5 $00 (LDA directpage,X), rather than $bd $00 $00 (LDA absolute,X). The former would get you whatever bytes happened to be in bank $00 address $0000 + X, the latter would get you whatever bytes happened to be in bank $01 address $0000 + X.

The only way to "force" the assembler into knowing this is to use the .w modifier, i.e.:

Code:

lda.w $0000,x

Which will ALWAYS assemble to the 16-bit absolute address + opcode ($bd $00 $00).

Alternately you could use full 24-bit addressing ("long addressing") on all of your stuff that's not explicitly in direct page, through the .l (dot-ELL) modifier. Be aware that 24-bit addressing takes up 1 more byte, and takes 1 more cycle than absolute. This added cycle can add up real fast when doing loops, which is why before many loops you'll find people changing what B points to and then using absolute addressing.
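
A rough sketch of that "change B, then use absolute addressing" pattern follows. OtherBankTable, ReadLoop and ReadDone are hypothetical names, and bank $01 is just the example bank from above:

Code:

   phb                        ; save the current data bank register
   rep #$10                   ; 16-bit indexes
   sep #$20                   ; 8-bit accumulator
   lda #$01
   pha
   plb                        ; B = $01
   ldx #$0000
ReadLoop:
   lda.w OtherBankTable,x     ; absolute,X: reads from the bank in B
   beq ReadDone               ; a zero byte ends the table in this sketch
   inx
   bra ReadLoop
ReadDone:
   plb                        ; restore the original data bank

Writing lda.l OtherBankTable,x instead would work without touching B, at the cost of the extra byte and cycle on every pass through the loop.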

Food for thought.

Re: 16bit table indexing problem
by koitsu on 2015-01-17 (#139508)

Why are people bothering to discuss compiling WLA DX? The binaries are already available per the home page:

http://www.villehelin.com/wla.html

Quote:

Binaries (UNSUPPORTED):

Win32 (link)

Go ahead and get the ones labelled version 9.5 / 02-Nov-2013.

Re: 16bit table indexing problem
by 93143 on 2015-01-17 (#139510)

...eh?

I was sure I got it from there...

...but I misremembered. I'm using Neviksti's SNES starter kit, which is not the same as that one tutorial that said to download WLA DX separately. The starter kit includes binaries from 2003.

Okay, that straightens that out, at least on my end. Sorry, Espozo.

And yes, I will probably end up switching to ca65, mostly because it has Super FX support.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-17 (#139511)

I really had no idea I was using an 8 bit accumulator. :oops:

I really just don't understand what REP and SEP mean. I actually had this code working originally, (see SNES Programing Help) but I wanted to use 16bit values because it is apparently easier to use the 9th x bit. I tried this code now, but the screen now turns black and all memory seems to get erased. (I think this uses a 16bit accumulator.)

Code:

.BANK 0 SLOT 0
.ORG 0
.SECTION "MetaspriteCode"

start_metasprite:
   php
   rep #$30
   sep #$10

_build_metasprite:
lda $0000,x
beq _metasprite_done
inx
lda $0000,x
clc
adc XPosition
and #$FF00          ;I switched it from #$00FF, to #$FF00
sta SpriteBuf1,y
inx
lda $0000,x
clc
adc YPosition
and #$FF00
sta SpriteBuf1+1,y
inx
iny
iny
iny
iny
bra _build_metasprite

_metasprite_done:
rts

metasprite_done:
   plp
   rts

.ENDS

Now the Table:

Code:

MetaspriteTable:
   .DW $FFFF,$0000,$0000,$FFFF,$0010,$0000,$0000

I know that the upper byte of x and y is being erased, but it shouldn't cause the game to crash I don't think. (how would you keep the upper byte? having the 16 bit accumulator ready before you load the table? and go to the Metasprite routine?)

Also, the reason for the formatting of the code to not be consistent is because of copy and pasting.

Quote:

Okay, that straightens that out, at least on my end. Sorry, Espozo.

No harm done.

Quote:

And yes, I will probably end up switching to ca65, mostly because it has Super FX support.

Will you have to rewrite the code to make it work on ca65?

Re: 16bit table indexing problem
by koitsu on 2015-01-17 (#139515)

REP and SEP allow you to set and unset (adjust) the bits of P (processor status). REP stands for "REset P" (set bits to 0) and SEP stands for "SEt P" (set bits to 1). Any bit you set to 1 in the operand means "affect that bit". That should help explain why something like SEP #$20 sets bit 5 to 1 (8-bit accumulator), and why REP #$20 would set bit 5 to 0 (16-bit accumulator).

The bits in P are ones you already rely on heavily anyway, but the 65816 allows you to tweak those directly (rather than through explicit opcodes, e.g. CLC) if you wish. These are mostly compatible with the 6502 and 65c02, but not entirely:

Code:

Bit 7 (n) = negative flag (1 = most-significant bit (MSB) set, 0 = MSB unset)
Bit 6 (v) = overflow flag (1 = two's complement error, 0 = two's complement OK)
Bit 5 (m) = accumulator register size flag (1 = 8-bit, 0 = 16-bit)
Bit 4 (x) = index register size flag (1 = 8-bit, 0 = 16-bit)
Bit 3 (d) = decimal mode flag (1 = BCD enabled, 0 = BCD disabled)
Bit 2 (i) = IRQ flag (1 = disable IRQ, 0 = enable IRQ)
Bit 1 (z) = zero flag (1 = last result was zero)
Bit 0 (c) = carry flag (1 = last result required carry)

The letters in parentheses are the common abbreviation for the individual bit in P. For the register size flags, in some emulators/tools, when they're capitalised it means the bit is unset (e.g. x means 8-bit, X means 16-bit). Others do the opposite: in the SNES9x debugger traces earlier in this thread, a capital letter means the bit is set (so M and X mean 8-bit there). The capitalisation method is not universal, but the letters are.
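
To tie the operand to the bit numbers above, here's a short sketch with the masks written out in binary. The last two lines are only there to show the mechanism, since normally you'd use SEC/CLC for carry:

Code:

   rep #$30          ; $30 = %00110000: clear m and x -> 16-bit A, 16-bit X/Y
   sep #$20          ; $20 = %00100000: set m         -> 8-bit A, X/Y stay 16-bit
   sep #$01          ; $01 = %00000001: set c, same effect as SEC
   rep #$01          ; clear c, same effect as CLC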

There's also the emulation bit (e) that isn't part of P (it's a separate bit), which is what defines if the CPU is in 65c02 emulation mode or not (e=0 means native 65816 mode, e=1 means 65c02 emulation mode). Please don't think about this though, you're doing real 65816 (thankfully) and don't get hung up on this paragraph.

Your question makes me wonder if you have a 65816 reference book at all. If not, I'd recommend getting the copy we keep of the one Western Design Center had up publicly for a while. It's at the bottom of this page, and contains actual real-world code examples with thorough explanations (not just in-line comments) of how things work, along with full opcode descriptions, charts, and so on:

http://wiki.nesdev.com/w/index.php/Programming_guide

Before WDC got their hands on it, it was a book created by Ron Lichty and David Eyes, and was the #1 resource in the 90s for 65816 programming (and works quite well for learning 6502/65c02 too, actually). I thankfully have a real paperback copy and have had it since I was a kid doing 65816. I've used it so much the cover has been torn off.

Anyway, I simply got in the habit of remembering the following "chart" by heart:

Code:

REP #$10 = 16-bit indexes
REP #$20 = 16-bit accumulator
REP #$30 = 16-bit accumulator and 16-bit indexes
SEP #$10 = 8-bit indexes (upper byte of X/Y lost/zeroed)
SEP #$20 = 8-bit accumulator
SEP #$30 = 8-bit accumulator and 8-bit indexes

You therefore cannot, in a single SEP or REP statement, do something like "use 8-bit accumulator and 16-bit indexes" -- it requires two opcodes given the nature of how the processor bits work and how SEP/REP work.

Screwing with any of the other bits of P directly through REP/SEP is considered "extremely uncommon" (don't let anyone tell you otherwise). For example: if you look at the InitSNES macro, for whatever reason someone did REP #$38, rather than CLD / REP #$30. They are the same thing, and whoever did that just wanted to save 2 cycles + 1 byte, even though it doesn't really matter (the routine is only called once during reset). As I've stated in the past, I consider this kind of optimisation completely unnecessary, and it is often done by people (with the context in question kept in mind) just to "show off". I don't think it makes for good learning material.

Espozo wrote:

I really had no idea I was using an 8 bit accumulator. :oops: I really just don't understand what REP and SEP mean. I actually had this code working originally, (see SNES Programing Help) but I wanted to use 16bit values because it is apparently easier to use the 9th x bit. I tried this code now, but the screen now turns black and all memory seems to get erased. (I think this uses a 16bit accumulator.)

Code:

.BANK 0 SLOT 0
.ORG 0
.SECTION "MetaspriteCode"

start_metasprite:
   php
   rep #$30
   sep #$10
...

This code sets 16-bit accumulator and 16-bit indexes, then proceeds to set 8-bit indexes. This means you've lost the upper byte of the X/Y index registers, which is the entire problem I told you about.

Furthermore, using 16-bit accumulator, as I told you, would break your program/misbehave because of how it's written.

You have to change your actual code in build_metasprite, and the changes you did do are flat out incorrect -- you need to think about it a bit more (pay close attention to what you're doing with SpriteBuf1 and *WHERE* (offset-wise) within SpriteBuf1 you're writing data, how much data (in bytes) you're writing compared to previously, and how you're later using SpriteBuf1. All of that matters!). Pay close attention to how many times you're calling INX/INY (make sure that's correct vs. how you use those indexes both for reading and writing). And don't forget that YPosition and XPosition will probably need to increase in size as well (I think these are simply memory locations that are read, and so you'll need to make sure those are words not bytes). I feel I've spent enough time on this already, especially for one day. :P
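
To give an idea of the kind of bookkeeping I mean, here is a generic sketch of a word-at-a-time loop. It is NOT your routine and not a drop-in fix: Dest, ReadEntry and EntriesDone are made-up names, and it assumes every table entry is a full word with a $0000 word ending the list.

Code:

.DEFINE Dest $0400    ; made-up output buffer address, for illustration only

   rep #$30          ; 16-bit accumulator and indexes
ReadEntry:
   lda.w $0000,x     ; reads two bytes: low byte at X, high byte at X+1
   beq EntriesDone   ; a $0000 word ends the list
   sta.w Dest,y      ; writes two bytes: Dest+Y and Dest+Y+1
   inx
   inx               ; step X by 2, because each entry is a word
   iny
   iny               ; same for the output offset
   bra ReadEntry
EntriesDone:

Note how every load and store now touches two bytes, so a single INX between reads would make consecutive reads overlap.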

Quote:

I know that the upper byte of x and y is being erased, but it shouldn't cause the game to crash I don't think. (how would you keep the upper byte? having the 16 bit accumulator ready before you load the table? and go to the Metasprite routine?)

I explained how you "keep" the upper byte of X/Y when using REP/SEP, and why you do it a certain way. :/ If you really need to save this, then you need to store the 16-bit X value somewhere temporary (or push it onto the stack) before you change its size, and when you want to restore the value, you need to switch back to 16-bit indexes and then restore it (or pull it off the stack). Just don't forget that if you push a 16-bit value onto the stack, you need to make sure to pull it off later also as 16-bit, otherwise the stack ends up unbalanced (one leftover byte each time), and things will eventually go wrong.

Temporary variable method:

Code:

  ;
  ; Assume 16-bit indexes here
  ;
  stx TempX  ; Store 16-bit X in TempX variable (should be in direct page or RAM, needs to be a word/2 bytes)
  sty TempY  ; Store 16-bit Y in TempY variable (should be in direct page or RAM, needs to be a word/2 bytes)
  sep #$10   ; 8-bit indexes -- top byte of X and Y are lost
  ;
  ; Your code here does something with 8-bit indexes...
  ;
  ; Now we're done and need to go back to 16-bit indexes and restore what X/Y were...
  ;
  rep #$10   ; 16-bit indexes
  ldy TempY  ; Load 16-bit Y with content of TempY variable
  ldx TempX  ; Load 16-bit X with content of TempX variable

Stack method:

Code:

  ;
  ; Assume 16-bit indexes here
  ;
  phx       ; Push 16-bit X onto stack
  phy       ; Push 16-bit Y onto stack
  sep #$10  ; 8-bit indexes -- top byte of X and Y are lost
  ;
  ; Your code here does something with 8-bit indexes...
  ;
  ; Now we're done and need to go back to 16-bit indexes and restore what X/Y were...
  ;
  rep #$10  ; 16-bit indexes
  ply       ; Pull 16-bit value off stack and put into Y
  plx       ; Pull 16-bit value off stack and put into X
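
And a quick sketch of the size-mismatch pitfall with the stack, next to the balanced version:

Code:

   ; Unbalanced: pushed as 16-bit, pulled as 8-bit.
   rep #$10          ; 16-bit indexes
   phx               ; pushes 2 bytes
   sep #$10          ; 8-bit indexes
   plx               ; WRONG: pulls only 1 byte, so 1 byte stays on the stack

   ; Balanced: switch back to 16-bit before pulling.
   rep #$10          ; 16-bit indexes
   phx               ; pushes 2 bytes
   sep #$10          ; 8-bit indexes (upper byte of X zeroed)
   ; ... 8-bit index work goes here ...
   rep #$10          ; back to 16-bit indexes
   plx               ; pulls 2 bytes, X is restored in full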

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-18 (#139578)

I was going to come back here complaining about how I tried the stack method on the metasprite routine and it didn't work but I just realized that I didn't pull in the reverse order pushed. Heh... (I've reverted back to the original 8 bit code I made, but I figured I could work on it a little more to prepare myself to make it 16 bit) The main reason I wanted to post here again is to ask if there is a way to view PDF files on Windows 8 not in the obnoxious full screen "app mode" (or whatever) because I kind of want to look at the SNES programming manual and try to program something without having to keep minimizing the window. (In my opinion, Windows 8 seems like a step back in many respects, but I know support for Windows 7 is eventually going to be dropped...)

This is what I mean:

Attachment:

Screenshot (140).png [ 33.71 KiB ]

Re: 16bit table indexing problem
by lidnariq on 2015-01-18 (#139582)

Install any non-"metro" PDF reader. Foxit seems to be popular.
("Metro" being the codename for the awful full-screen tablet windows 8 UI thing.)

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-18 (#139586)

Thank You! Windows 8 seems way more "tablet oriented". (Basically, make it stupidly easy to use so a kindergartener can play on it by stripping features from it.)

Re: 16bit table indexing problem
by nicklausw on 2015-01-18 (#139589)

Espozo wrote:

I honestly have no idea how to compile it. (If someone could post it already compiled in a rar, I would greatly appreciate it.)

Visual Studio and a bunch of other crap is needed. Don't use the binaries given on Ville's website, as those are somewhat outdated (and some bug fixes along with improvements have been made since then).

By the assumption that you're using Windows, here's my binaries. Fresh off GitHub. Made them last month as I recall.

EDIT: just remembered I also compiled them for Ubuntu, so I'll put those there as well. Zip = windows, Tar = Ubuntu.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-19 (#139621)

Espozo wrote:

I was going to come back here complaining about how I tried the stack method on the metasprite routine and it didn't work but I just realized that I didn't pull in the reverse order pushed. Heh...

Of course, I didn't even leave it on for a whole minute to see the game crash... I honestly have no clue what the problem is. I have a file in the folder named MetaspriteTest that works perfectly fine, and MetaspriteTest2 that crashes after a minute. (Sprites go all over the place, then VRAM gets totally wiped out and a whole bunch of nonsense is written to WRAM) The only difference between these two codes is that I wrote PHY and PHX before I jumped to Metasprite2 and then wrote PLX and PLY when I got there. I swear, do I have a brain disorder or something? Because I write something that makes perfect sense to me, it doesn't work in the slightest, and then someone fixes it in under half a minute when I've been staring at it for an hour.

Attachment:

Metasprite Demo.rar [237.96 KiB]

By the way, I haven't read all of the SNES programming manual (obviously), but I'm not sure how much it would help me with this particular problem.

Re: 16bit table indexing problem
by koitsu on 2015-01-19 (#139633)

Have you spent time in an emulator's real-time debugger? This will almost certainly, 100%, tell you what is going on / why your code is "going crazy". Chances are you have a loop that is either never ending (and things are going off into lala land, address-wise) or you have some buggy code (duh).

I don't see any PHX/PHY/PLY/PLX code in these. And because of the naming convention + lack of good description of what you changed and where, it takes me a lot of time to sift through this to see what all you did that might be causing the problem. I'll try to focus on what looks like it may be a problem:

Code:

Sprites.asm --

.DEFINE SpriteBuf1   $0400
.DEFINE SpriteBuf2   $0600

MetaspriteTest2.asm --

.DEFINE TempX      $1800
.DEFINE TempY      $1802
.DEFINE MapX      $1804
.DEFINE MapY      $1806

InfiniteLoop:

   WAI

   ldy #$00
   ldx #MetaspriteTable
   sty TempY
   stx TempX
   jsr start_metasprite

Metasprite2.asm --

start_metasprite:
   php
   sep #$10
   rep #$10
   ldy TempY
   ldx TempX

build_metasprite:
   lda $00,x
   beq metasprite_done
   inx
   lda $00,x
   sta SpriteBuf1,y
   inx
   lda $00,x
   sta SpriteBuf1+1,y
   iny
   inx
   iny
   iny
   iny
   bra build_metasprite

metasprite_done:
   plp
   rts

MetaspriteTable:
   .DB $01,$00,$00,$01,$10,$00,$00

1. Based on past conversation, I believe indexes to be 16-bit at the time the LDX/LDY statements in InfiniteLoop are executed. I'm just noting this here because it matters.

2. The SEP #$10/REP #$10 statement in the start_metasprite routine makes no sense -- you're setting 8-bit indexes, followed immediately by setting 16-bit indexes. That means I'm not sure what the accumulator size is at the time the later code is run, and I've covered why this matters with the assembler (since it tracks what the register sizes are as best as it can -- you could verify it if WLA DX wasn't a pile of junk with listing files... sigh...), but it also matters when your code is being run in real-time.

To me, based on the code in build_metasprite, it clearly looks like you want an 8-bit accumulator because you're doing stuff like this:

Code:

   lda $00,x
   beq metasprite_done
   inx
   lda $00,x
   sta SpriteBuf1,y
   inx
   lda $00,x
   sta SpriteBuf1+1,y

Let's assume X=$8F00 and Y=$0000 when we enter this code, just for understanding what is going on here. Again, I'm discussing this because your goal is trying to convert a routine to use 16-bit values. The code then gets executed like this in real-time (see comments) -- I'm excluding the BRA statement because it's implied/unconditional:

Code:

        lda $00,x               ; Load accumulator from $0000,x (effective address = $8F00)
        beq metasprite_done     ; If zero, branch -- value is not zero
        inx                     ; X = $8F01
        lda $00,x               ; Load accumulator from $0000,x (effective address = $8F01)
        sta SpriteBuf1,y        ; Store at $0400,y (effective address = $0400)
        inx                     ; X = $8F02
        lda $00,x               ; Load accumulator from $0000,x (effective address = $8F02)
        sta SpriteBuf1+1,y      ; Store at $0400+1,y (effective address = $0401)
        iny                     ; Y = $0001
        inx                     ; X = $8F03
        iny                     ; Y = $0002
        iny                     ; Y = $0003
        iny                     ; Y = $0004

        lda $00,x               ; Load accumulator from $0000,x (effective address = $8F03)
        beq metasprite_done     ; If zero, branch -- value is not zero
        inx                     ; X = $8F04
        lda $00,x               ; Load accumulator from $0000,x (effective address = $8F04)
        sta SpriteBuf1,y        ; Store at $0400,y (effective address = $0404)
        inx                     ; X = $8F05
        lda $00,x               ; Load accumulator from $0000,x (effective address = $8F05)
        sta SpriteBuf1+1,y      ; Store at $0400+1,y (effective address = $0405)
        iny                     ; Y = $0005
        inx                     ; X = $8F06
        iny                     ; Y = $0006
        iny                     ; Y = $0007
        iny                     ; Y = $0008

Do you now see -- when considering the offsets (effective addresses) discussed -- how/why 8-bit accumulator vs. 16-bit accumulator matters here? A 16-bit accumulator, during your lda $00,x instruction, is going to load two bytes from that address -- e.g. $8F00 and $8F01 in a single load. Yet your INX statements seem to indicate you're working with things at a byte level.

So here's the thing -- and I have not checked to see if this is the case (this is for you to do!): the wrong accumulator size could in fact account for this loop never ending and a lot of weird things going on (depends on what the NMI handler is doing). The only way the loop ends is if the accumulator, when loading from $0000,X, is zero. With a 16-bit accum that means the value has to be $0000, with an 8-bit accum that means the value has to be $00.

So let's take a look at what would happen with a 16-bit accumulator (again, not sure if this is the case! -- for you to figure out!), showing the actual values that are getting read/written. Again, assuming X=$8F00, Y=$0000 when hitting this code. And remember: the 65816 is little-endian (in case you wonder why some of the "values loaded" are "reversed"):

Code:

        lda $00,x               ; Effective address $8F00, A=$0001 (bytes 0,1 of MetaspriteTable)
        beq metasprite_done     ; Not zero, don't branch
        inx                     ; X=$8F01
        lda $00,x               ; Effective address $8F01, A=$0000 (bytes 1,2 of MetaspriteTable)
        sta SpriteBuf1,y        ;
        inx                     ; X=$8F02
        lda $00,x               ; Effective address $8F02, A=$0100 (bytes 2,3 of MetaspriteTable)
        sta SpriteBuf1+1,y      ;
        iny                     ; Y=$0001
        inx                     ; X=$8F03
        iny                     ; Y=$0002
        iny                     ; Y=$0003
        iny                     ; Y=$0004

        lda $00,x               ; Effective address $8F03, A=$1001 (bytes 3,4 of MetaspriteTable)
        beq metasprite_done     ; Not zero, don't branch
        inx                     ; X=$8F04
        lda $00,x               ; Effective address $8F04, A=$0010 (bytes 4,5 of MetaspriteTable)
        sta SpriteBuf1,y        ;
        inx                     ; X=$8F05
        lda $00,x               ; Effective address $8F05, A=$0000 (bytes 5,6 of MetaspriteTable)
        sta SpriteBuf1+1,y      ;
        iny                     ; Y=$0005
        inx                     ; X=$8F06
        iny                     ; Y=$0006
        iny                     ; Y=$0007
        iny                     ; Y=$0008

        lda $00,x               ; Effective address $8F06, A=$??00 (bytes 6,? of MetaspriteTable)
        beq metasprite_done     ; ??? does this branch?  Depends on if the byte "past the end of the table" is zero!

Make sure you note the last couple lines. Compare that to what's in your table. I simply do not know what WLA DX is storing "after" the last byte you define in MetaspriteTable. But you want my guess? I'd be willing to bet you that it's got a non-zero value there -- for example it could be the first byte of .\GamePictures\hovertransport.map. Again (I'm like a broken record): a listing file would tell you this. And thus if your accumulator is 16-bit when that code runs, things would never end, and you'd end up with a loop that essentially writes all over RAM indefinitely (SpriteBuf1 points to $0400, but in an infinite loop with indexes being used, you would end up writing to all of memory space within $0000-FFFF ($0000-1FFF is RAM; writes above that would land on hardware registers such as the PPU's at $2100 and up, or on ROM where they do nothing)), and reads from all over the bank the code is executing in -- the result would be a total utter mess and almost certainly cause bizarre behaviour program-wise and visually.
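
To make the terminator problem concrete, here is a rough Python model of just the lda $00,x / beq check. It is a sketch, not 65816 code. The table bytes come from the .DB line above; the $7F padding byte and the helper names load and entries_until_terminator are made up for illustration, since nobody in the thread knows yet what actually follows the table in ROM.

```python
# Rough model of the terminator check in build_metasprite. Not real 65816 code.
TABLE = bytes([0x01, 0x00, 0x00, 0x01, 0x10, 0x00, 0x00])  # the .DB line from the thread


def load(mem, addr, acc_bits):
    """Model LDA: one byte with an 8-bit accumulator, two little-endian bytes with 16-bit."""
    if acc_bits == 8:
        return mem[addr]
    return mem[addr] | (mem[addr + 1] << 8)


def entries_until_terminator(mem, acc_bits, limit=16):
    """Count table entries consumed before the leading load reads zero.

    Returns None if no terminator shows up within `limit` entries.
    """
    x = 0
    for count in range(limit):
        if load(mem, x, acc_bits) == 0:
            return count
        x += 3  # the loop executes INX three times per entry
    return None


nonzero_after = TABLE + bytes([0x7F] * 64)  # hypothetical: non-zero data follows the table
zero_after = TABLE + bytes(64)              # hypothetical: zeros follow the table

print(entries_until_terminator(nonzero_after, 8))   # 2
print(entries_until_terminator(nonzero_after, 16))  # None
print(entries_until_terminator(zero_after, 16))     # 2
```

With an 8-bit accumulator the loop stops after two entries no matter what follows the table. With a 16-bit accumulator it only stops if the byte after the table happens to be zero, which is exactly the "depends on what's past the end" case in the trace above.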

P.S. -- This does not take someone "half a minute" to fix. My posts to you have actually taken hours of time as an aggregate (this one took almost 2 hours). Debugging someone else's code is always time-consuming because the person has to reverse-engineer it every single time (esp. when there's no clear explanation of what changed, i.e. "this is what it was before, and this is what I changed it to"). The author of the code should know what their program is doing better than someone who's being asked for help. Just a general rule, haha. :-)

And no, the developers manual would not help you here. This is purely a programming thing. I think you need to get more familiar with using a real-time debugger, because it could save you a lot of time assuming you know what it is you're looking at. Yes, the tools (emulator debuggers) tend to suck and they usually lack source code integration (so what you see in the debugger doesn't correlate with the code you wrote; all values are pre-assembled, again why a listing file is helpful...), so I understand the pain. But once you go through it a few times you'll get in the habit of knowing what you're seeing.

The one thing I wish emulators had was tie-ins for forcing a break/drop-to-debugger using native code. On the 65816, honestly I feel the best way to do this would be to use either BRK (but some games might use this legitimately?), COP, or WDM opcodes (the latter is certainly never used -- opcode $42 is the one leftover byte in the opcode table that is supposed to act as a NOP when run, so it'd work great for this, IMO). BRK and COP at least have 1 byte operands ("signature bytes"). It'd be really convenient to tell someone "Just put brk $ff where you want the debugger to kick in so you can step through your code", rather than futz with all of this.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-19 (#139637)

Wow, I'm a dumbass, I accidentally uploaded the wrong file. (That's why there is no phy, and phx because it is in the wrong folder.)

Attachment:

REAL Metasprite Demo.rar [238.46 KiB]
Downloaded 176 times

(no one needs to look at this, I'm just saying I had it.)

Edit: Wait! I'm even more of an idiot than I thought... I accidentally made it to where metaspritetest2 was going to metasprite1 instead of metasprite2 and I have metasprite2 at rep $20 (8bit?) instead of rep $10. Of course now, the code doesn't even take a minute to crash... (I think the problem was that I pushed to infinity because I didn't jump to the code that would pull.)

Attachment:

MORE REAL Metasprite Demo.rar [238.63 KiB]
Downloaded 180 times

Edit: (Again.) I solved the problem. I wrote sep #$10, rep #$20 instead of rep #$10, sep #$20. After I changed that, I ran it, used the restroom, then grabbed a soda and it was still running so I think it's safe to say it's alright. :wink:

(By the way, I guess you should never rep right after you sep?)

Also, I guess the fact that I'm only writing inx once is one of the problems? (I thought it would automatically increment by 2 bytes instead of one.) The one thing I really do have to look at (which should be in the book) is the processor status bit things, because I don't have the slightest idea of what I'm doing.

The reason why I always complain and come here for help is because I don't have the slightest clue as to what any of this means...

Attachment:

SNES9X Debugger.png [ 37.78 KiB | Viewed 3828 times ]

Maybe I should have tried something easier than the SNES for programming first... (I have no background knowledge.)

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-19 (#139641)

Actually, just forget everything I just said. I accidentally replaced the file that was correct with one that was broken, and I tried to fix it, but I couldn't. I found out that the file that was working was different than the second one I posted. I'm hopelessly confused.

Re: 16bit table indexing problem
by koitsu on 2015-01-19 (#139643)

Espozo wrote:

Edit: (Again.) I solved the problem. I wrote sep #$10, rep #$20 instead of rep #$10, sep #$20. After I changed that, I ran it, used the restroom, then grabbed a soda and it was still running so I think it's safe to say it's alright. :wink: (By the way, I guess you should never rep right after you sep?)

I covered REP/SEP earlier, including what each bit represents. So let's cover it again, more verbosely. Remember: SEP/REP only touch (i.e. change) the bits you set to 1 in the operand -- all other bits are left alone. (The operand is essentially a mask of sorts). Below is a "chart" for you to use -- I know how SEP/REP work, but honestly I just remember this chart by heart because these are the most common operations you'll see:

Code:

  rep #$10   ; X/Y = 16-bit (set bit 4 of P to 0).  $10 hex = %00010000 binary
  sep #$10   ; X/Y = 8-bit  (set bit 4 of P to 1).  $10 hex = %00010000 binary

  rep #$20   ; M = 16-bit (set bit 5 of P to 0).  $20 hex = %00100000 binary
  sep #$20   ; M = 8-bit  (set bit 5 of P to 1).  $20 hex = %00100000 binary

  rep #$30   ; M = 16-bit, X/Y = 16-bit (set bit 4 and 5 of P to 0).  $30 hex = %00110000 binary
  sep #$30   ; M = 8-bit,  X/Y = 8-bit  (set bit 4 and 5 of P to 1).  $30 hex = %00110000 binary

Using this "chart", you should be able to determine what you did, and why, if you want something like 8-bit accumulator + 16-bit indexes, you need to use a combo of SEP and REP -- it cannot be done in a single instruction.
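
The chart boils down to two bit-mask operations on P, which can be sketched in Python. This is a model only: it ignores emulation mode and every flag other than bits 4 and 5, and the names M_FLAG, X_FLAG, sep, rep and widths are made up for the example.

```python
M_FLAG = 0x20  # bit 5 of P: accumulator/memory width (1 = 8-bit, 0 = 16-bit)
X_FLAG = 0x10  # bit 4 of P: index register width (1 = 8-bit, 0 = 16-bit)


def sep(p, mask):
    """SEP: set every bit of P that is 1 in the operand; leave the rest alone."""
    return (p | mask) & 0xFF


def rep(p, mask):
    """REP: clear every bit of P that is 1 in the operand; leave the rest alone."""
    return p & ~mask & 0xFF


def widths(p):
    """Return (accumulator bits, index bits) for a given P value."""
    return (8 if p & M_FLAG else 16, 8 if p & X_FLAG else 16)


p = 0x30  # hypothetical starting state: accumulator and indexes both 8-bit
print(widths(sep(rep(p, 0x10), 0x20)))  # rep #$10 / sep #$20 -> (8, 16)
print(widths(rep(sep(p, 0x10), 0x20)))  # sep #$10 / rep #$20 -> (16, 8)
print(widths(rep(sep(p, 0x10), 0x10)))  # sep #$10 / rep #$10 -> (8, 16)
```

The second line is the swapped pair from the quoted post: it gives a 16-bit accumulator with 8-bit indexes, the opposite of what the routine wants. The third line shows why sep #$10 followed by rep #$10 is pointless: the rep wins for bit 4 and the accumulator bit is never touched.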

Espozo wrote:

Also, I guess the fact that I'm only writing inx once is one of the problems? (I thought it would automatically increment by 2 bytes instead of one.)

Okay, two things here:

1. The increment opcodes (INC, INY, INX) only increment the register by 1. The "size" of the register has no bearing on "how much" is incremented. Rephrased: INC is equivalent to A++ or A = A +1, INY is equivalent to Y++ or Y = Y+1, INX is equivalent to X++ or X = X+1 (assuming you have some programming knowledge of other languages). The only bearing the register size (8 vs. 16-bit) has on this is when it comes time to increment from $FF to a new value (8-bit would go from $FF to $00, 16-bit would go from $FF to $100) or when incrementing a memory address (more on that at the end).

If you want to increment something by more than 1, you have a couple choices. For the X/Y registers, repeated calls to INX/INY would work (e.g. incrementing X by 2 would be INX / INX), but you should know that from your repeated INY statements within build_metasprite. :/

For incrementing the accumulator, you can use CLC / ADC #value (e.g. incrementing accumulator by 7 would be CLC / ADC #7).

There is no CLC/ADC equivalent that affects X or Y directly. If you wanted to increment X or Y by a larger value and don't want to have to call INX/INY repeatedly, then you have to transfer X or Y into the accumulator, use CLC/ADC, then transfer the accumulator back into X or Y. Example:

Code:

  ldx #4    ; X=$0004
  txa       ; Transfer X into accumulator (A=$0004)
  clc       ; Clear carry
  adc #7    ; Add 7 to accumulator (A=$000B (11 decimal))
  tax       ; Transfer accumulator into X (X=$000B)

You will probably ask: "wouldn't doing TXA mean whatever was (previously) in the accumulator is lost?" -- yup, it does. That's why you'll find people doing PHA/PLA before/after that type of operation quite often, or using a temporary variable.

There is a point where repeated INY/INXs become space-wasting and time-wasting though (i.e. it's better to do something like TXA/CLC/ADC/TAX), but right now I really don't want to cover that because it'll just confuse you more. For now, if you wanted to increment X by, say, 12, then go ahead and do 12 INX instructions. If anyone comes along and says "this is inefficient blah blah blah", tell them you're still learning and to deal with one thing at a time. :-)

Also with INC (but not INX/INY), you can also increment (by 1) a value in memory without having to load it into the accumulator / write it back out. The active size (8 vs. 16-bit) of the accumulator matters here too (and can often cause people confusion, similar to what's happening presently). But again I don't want to cover that because it'll just add more confusion right now. Just know that it's available.

2. Terminology nitpick: you are not "incrementing by 2 bytes". You are incrementing an index register which you are using as an offset during your LDA.
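
A quick way to convince yourself of both points is to model the register as a plain number that wraps at its current width. The sketch below is Python rather than 65816 code, and the helper name inc and the sample word table are hypothetical.

```python
def inc(value, bits):
    """Model INX/INY/INC: add exactly 1, wrapping at the register's current width."""
    return (value + 1) & ((1 << bits) - 1)


print(hex(inc(0xFF, 8)))   # 0x0   (8-bit register wraps around)
print(hex(inc(0xFF, 16)))  # 0x100 (16-bit register carries into the high byte)

# Walking a table of 16-bit words: X is a byte offset, so each entry needs two INXs.
table = bytes([0xFF, 0xFF, 0x34, 0x12, 0x00, 0x00])  # like .dw $FFFF,$1234,$0000
x = 0
words = []
while True:
    word = table[x] | (table[x + 1] << 8)  # 16-bit little-endian load at offset X
    if word == 0:
        break
    words.append(word)
    x = inc(inc(x, 16), 16)  # INX / INX
print([hex(w) for w in words])  # ['0xffff', '0x1234']
```

The register width only decides where the wrap happens. Stepping over a 16-bit table entry is still two separate increments of the offset.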

Espozo wrote:

The one thing I really do have to look at (which should be in the book) is the processor status bit things, because I don't have the slightest idea of what I'm doing.

Well I've explained them twice now, including giving you a chart of what essentially the most-commonly-used opcode+operand combinations are (I hope that helps you the most -- as I said, I understand how the opcodes work, but I honestly just remember the chart by heart. :-) ), so if you're still having problems then maybe referring to the WDC book is a better choice. There's a chapter on it, I believe.

One thing I should note: the learning curve you're going through here, with regards to 8-bit vs. 16-bit, is incredibly common, so don't feel alone! A lot of people used to the original 6502 and 65c02 (both 8-bit CPUs), when the 65816 came out, had very similar complexities/difficulties as what you're going through now. Doing 16-bit "stuff" on the 6502/65c02, however, is a lot more painful (IMO). The 65816 really spoils you -- it's one reason I have a harder time doing 6502, because I did a *lot* of my coding on the 65816 so using a native 8-bit CPU often makes me grumble. :-)

As for your SNES9x debugger shot: by default the SNES9x debugger doesn't execute any code, and instead stops at the very first instruction the processor wants to run. Step back for a moment and think about how the CPU "starts up". It starts executing code where the RESET vector points to, right? You should be able to determine where that is easily, and the debugger makes it even easier (there's a "Vector Info" button). The first instruction your program executes when the system starts is SEI. You should be able to find that in your code and go from there.

What you need to do is what I described earlier -- create a breakpoint on the execution of code, essentially right before you do your jsr start_metasprite. But as I said earlier, the debuggers do not currently have a way to "connect" source code to actual running code, so what you're seeing is quite literally what the assembled results are + what the CPU is doing. This is why I said it'd be very useful if emulators actually had a way to drop to the debugger when encountering an instruction (e.g. brk $ff), because then you could just add that line and run the code and then bam, continue from there.

A listing generation from your assembler would tell you exactly what memory address the jsr start_metasprite instruction is at, but WLA DX (as I've said a billion times now) is completely broken in this regard, which makes your job a lot harder. Essentially what you need to do right now is step through (run) every single line of code until you see code that looks very similar to the code around/right before jsr start_metasprite. If you aren't sure, do something that will clue you in: put a bunch of nop statements right before your jsr. Once you see those you'll know where they are, then you'll get an idea of what memory location the JSR is at, then can throw that into the debugger as an execution breakpoint, reset the system (there's a Reset button), click Run, and wait until it drops to the debugger and then you can step through your code instruction by instruction and see what it's doing, how the registers are affected, and so on. :-)

Learning how to use a debugger is a tedious process (about as tedious as programming) because each debugger behaves differently. There is no standard. I'm certain the bsnes debugger is substantially different from the SNES9x debugger, for example. The only one I have familiarity with is the SNES9x debugger, although as discussed in other threads NO$SNS has a debugger as well (although by default that emulator shows opcodes in a non-65xxx-syntax format -- you have to turn that feature off before you get something that's actually sane ;-) ).

Re: 16bit table indexing problem
by koitsu on 2015-01-19 (#139644)

nicklausw wrote:

Visual Studio and a bunch of other crap is needed. Don't use the binaries given on Ville's website, as those are somewhat outdated (and some bug fixes along with improvements have been made since then).

By the assumption that you're using Windows, here's my binaries. Fresh off GitHub. Made them last month as I recall.

EDIT: just remembered I also compiled them for Ubuntu, so I'll put those there as well. Zip = Windows, Tar = Ubuntu.

The Windows binaries you provide in the zip are native 64-bit binaries -- they will not work on 32-bit operating systems. This is unlike the "unofficial" binary builds which are 32-bit. I can't use these binaries (I do not run a 64-bit OS). There's really no reason (that I can think of) for native 64-bit binaries for WLA DX on Windows (on *IX it's a different situation, as not everyone's OSes have 32-bit compatibility shims; for example my FreeBSD boxes are all pure 64-bit with absolutely no 32-bit binary support). Windows 64-bit OSes provide 32-bit compatibility shims by default, so 32-bit is a better choice there for something non-memory-intensive like 65xxx assemblers.

The only reason I'm bothering to try your binaries is to see if the listing generation stuff has been fixed. "And some bug fixes along with improvements have been made since then" isn't precise enough -- this is exactly what a ChangeLog or commit history is for. I wouldn't need to try the binaries if I could see that + know exactly what commit and/or branch your binaries were based off of. :-)

Re: 16bit table indexing problem
by nicklausw on 2015-01-19 (#139646)

koitsu wrote:

The Windows binaries you provide in the zip are native 64-bit binaries -- they will not work on 32-bit operating systems. This is unlike the "unofficial" binary builds which are 32-bit. I can't use these binaries (I do not run a 64-bit OS). There's really no reason (that I can think of) for native 64-bit binaries for WLA DX on Windows (on *IX it's a different situation, as not everyone's OSes have 32-bit compatibility shims; for example my FreeBSD boxes are all pure 64-bit with absolutely no 32-bit binary support). Windows 64-bit OSes provide 32-bit compatibility shims by default, so 32-bit is a better choice there for something non-memory-intensive like 65xxx assemblers.

The only reason I'm bothering to try your binaries is to see if the listing generation stuff has been fixed. "And some bug fixes along with improvements have been made since then" isn't precise enough -- this is exactly what a ChangeLog or commit history is for. I wouldn't need to try the binaries if I could see that + know exactly what commit and/or branch your binaries were based off of. :-)

I just did the classic ol' https://github.com/vhelin/wla-dx/tarball/master, last day of last year. Also was not aware the binaries were 64 bit...will see if I can compile 32-bit.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-19 (#139647)

Right now, the main problem isn't trying to convert the code to 16bit, because I need to first carry the information so when I do convert the code (If I can) part of the memory won't be erased. I'm about to show you this (don't worry! This doesn't take two hours.) because it makes 0 sense to me. I have the normal code:

Attachment:

Metasprite1.png [ 22.87 KiB | Viewed 3791 times ]

and I have the code that is exactly the same, except I added phy, phx, plx, and ply:

Attachment:

Metasprite2.png [ 23.53 KiB | Viewed 3791 times ]

The results...:

Attachment:

Results....png [ 24.59 KiB | Viewed 3791 times ]

This has made me extremely frustrated. I 1. made sure I pushed and pulled the same amount of times and 2. pulled in the reverse order I pushed, but it doesn't work.

Sorry for relying on you so heavily, koitsu, it's just that, like I said, I looked over it and still don't see any problems. One thing I wonder is if those php and plp things have to do with it... (They both start with P like phx or ply?)

Re: 16bit table indexing problem
by Movax12 on 2015-01-19 (#139648)

Pushing and pulling use the stack, and so does JSR. (Unless there is something I don't know about the 65816, you are also doing PHP, then PLX.)

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-19 (#139650)

How would JSR affect the stack and why does it even use the stack? Does it store the address it jumps back to?

Re: 16bit table indexing problem
by Movax12 on 2015-01-19 (#139651)

Exactly.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-19 (#139652)

I know I'm not "thinking for myself" but what should I do differently now?

Re: 16bit table indexing problem
by Movax12 on 2015-01-19 (#139653)

Don't use the stack to store your values, or use stack relative addressing to access those values and remember to fix the stack up when you are done.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-19 (#139656)

Because using the stack has pretty much been eliminated, I tried using registers like I tried originally, but I still can't get it to work.

This is the code for the main file (TempX and TempY are where X and Y are being stored.):

Attachment:

MetaspriteTest2.png [ 23.66 KiB | Viewed 4002 times ]

And this is the code for the metasprite drawing routine:

Attachment:

Metasprite2.png [ 14.62 KiB | Viewed 4002 times ]

And this is the result...(The desired outcome is on my earlier post.):

Attachment:

Result....png [ 19.64 KiB | Viewed 4002 times ]

The result is actually the exact same as skipping the metasprite routine, leading me to believe that it jumps to the metasprite routine, but the first number in the table somehow turns into a 0. Because of this, right when the metasprite routine starts, it ends, because it saw a 0, making it think it was done. (This is how I tell it how many sprites to draw for the metasprite.) The problem is (obviously) I don't know how it got 0...

Re: 16bit table indexing problem
by koitsu on 2015-01-19 (#139663)

Movax12 is partially right and wrong, but I'll cover it all.

I was going to write up a very long explanation, but it would involve me converting all the assembly code into machine language (raw bytes) -- especially important to show you what JSR is doing under the hood (specifically how the value the CPU pushes onto the stack is the address of the last byte of the JSR instruction, not the address of the next instruction).

The root problem has to do with stack operations and order-of-operation. You're not correctly keeping track of what was last pushed on the stack.

A real-time debugger would show you this -- you'd see "weird values" show up in the X and Y registers, followed by when your code hit the PLP and RTS, things freaking out horribly bad.

So let's go over the code and how it executes -- again, a listing here would be SUPER helpful. Let's assume S (stack pointer) is at $03FF when this code starts, and that both indexes and accumulator are 16-bit. Also assume MetaspriteTable happens to be at memory location $9150, just for the hell of it (you'll see why this is important in a moment).

Code:

  ldy #$00
  ldx #MetaspriteTable
  phy
  phx
  jsr start_metasprite
  ...
start_metasprite:
  php
  ;
  ; At this point, the following values are on the stack (in memory):
  ;
  ; $03FF = $00 (high byte of Y, from PHY)
  ; $03FE = $00 (low byte of Y, from PHY)
  ; $03FD = $91 (high byte of X, from PHX)
  ; $03FC = $50 (low byte of X, from PHX)
  ; $03FB = high byte of address of last byte of operand of jsr start_metasprite call -- can't tell you what this is without a listing
  ; $03FA = low byte of address of last byte of operand of jsr start_metasprite call -- can't tell you what this is without a listing
  ; $03F9 = whatever P was when PHP was called
  ;
  ; S should be $03F8 at this point (S points at the next free byte, one below the last value pushed).
  ;
  rep #$10
  sep #$20
  plx        ; BUG BUG BUG
  ply        ; BUG BUG BUG
  ;
  ; X low byte = whatever $03F9 contained -- this would be the value pushed by PHP
  ; X high byte = whatever $03FA contained -- see above
  ;
  ; Y low byte = whatever $03FB contained -- see above
  ; Y high byte = whatever $03FC contained -- this would be $50, low byte of X when PHX was called
  ...
  ...
  plp        ; BUG BUG BUG
  ;
  ; P = whatever $03FD contained -- this would be $91, high byte of X when PHX was called
  ;
  rts        ; BUG BUG BUG
  ;
  ; The PC (program counter, i.e. where the CPU is currently executing) is going to be set to $0001 here.
  ; The value comes from $03FE and $03FF.  On an RTS, the CPU also increments the address it pulled
  ; by one (this would be "the instruction at the location after the last byte of the operand" -- what I
  ; was describing above). $0000 + 1 = $0001.
  ;
  ; Program at this point goes utterly bonkers, trying to execute (as code) whatever values are at $0001
  ; and onward.
  ;
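
The same bookkeeping can be replayed in a few lines of Python, for anyone who wants to check it without an emulator. This is only a model of the push/pull order described above. The class name Stack, the return address $8123 and the P value $30 are made-up placeholders, since the real values would come from a listing.

```python
class Stack:
    """Tiny model of the 65816 stack: S points at the next free byte and decrements on push."""

    def __init__(self, s=0x03FF):
        self.s = s
        self.mem = {}

    def push8(self, value):
        self.mem[self.s] = value & 0xFF
        self.s -= 1

    def push16(self, value):
        # High byte first, so the low byte ends up at the lower address.
        self.push8(value >> 8)
        self.push8(value)

    def pull8(self):
        self.s += 1
        return self.mem[self.s]

    def pull16(self):
        # Low byte comes off first, then the high byte.
        lo = self.pull8()
        return lo | (self.pull8() << 8)


RETURN_ADDR = 0x8123  # hypothetical address of the last byte of the JSR instruction
P_VALUE = 0x30        # hypothetical value of P when PHP runs

st = Stack()
st.push16(0x0000)       # PHY (Y = $0000)
st.push16(0x9150)       # PHX (X = MetaspriteTable = $9150)
st.push16(RETURN_ADDR)  # JSR start_metasprite
st.push8(P_VALUE)       # PHP

x = st.pull16()                  # PLX: gets P and the low byte of the return address
y = st.pull16()                  # PLY: gets the high byte of the return address and $50
p = st.pull8()                   # PLP: gets $91
pc = (st.pull16() + 1) & 0xFFFF  # RTS: pulls the old Y value and adds 1

print(f"X=${x:04X} Y=${y:04X} P=${p:02X} PC=${pc:04X}")
# X=$2330 Y=$5081 P=$91 PC=$0001
```

Every pull is off by the three bytes that JSR and PHP put on top of the pushed X and Y, which is why none of the four values ends up where it was meant to go.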

I think that's all the time I'm going to spend today on this. I think so far I've spent around 5-6 hours of my time just on this one thread. This is about the point where I would start discussing with someone financial reimbursement for time + training sessions.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-19 (#139664)

(You don't have to respond today, but...) I guess using the stack for this is generally a no no? Like I said, I did try it by loading the values in registers, but it didn't work either for some reason. (Granted, not nearly as bad.) Also, If you use SNES9X debugger, do you think you could point me to a tutorial or something that can teach you how to use it? :oops:

Mainly, what I gather is that instead of pulling
; $03FC = $50 (low byte of X, from PHX) and
; $03FD = $91 (high byte of X, from PHX) into X and pulling
; $03FE = $00 (low byte of Y, from PHY) and
; $03FF = $00 (high byte of Y, from PHY) into Y, it's pulling

; $03F9 = (whatever P was when PHP was called) and
; $03FA = (low byte of address of last byte of operand of jsr start_metasprite call) into X and pulling
; $03FB = (high byte of address of last byte of operand of jsr start_metasprite call) and
; $03FC = $50 (low byte of X, from PHX) into Y, effectively screwing everything up.
(Metasprite information is wrong, it's writing to the OAM table offset by whatever, and it's returning to a totally wrong location.)

It seems that the stack is a bit more finicky than I had originally thought.

It seems easier to just load values into registers. (After all, you do have 128KB of RAM.)

Re: 16bit table indexing problem
by koitsu on 2015-01-19 (#139666)

You can use temporary variables or the stack. Temporary variables tend to be easier to comprehend, and the SNES (IMO) has a lot of RAM. The choice is yours. There's no wrong way. I think at this point though using temporary variables in direct page/RAM would be easier for you.

The issue with the stack is that because of the use of a subroutine via JSR, the last 2 bytes pushed onto the stack are going to be the address of the last byte of the operand of the JSR, which isn't what you're wanting via PLX/PLY.

That said, your code still had a bug relating to doing PHP followed by a (16-bit) PLX, so that's responsible for one of two issues.

Movax12 mentioned stack-based addressing, which is something the 65816 has (6502/65c02 does not have this). I don't know what the exact syntax is in WLA DX for it, but the common syntax for addressing is n,s where n is an "offset" that starts at the active stack pointer (S) and works backwards -- it's kinda like indexing using n,x or n,y just "via the stack" (which works backwards due to the fact that S decrements, not increments). Stack-relative stuff consists of a 2-byte instruction, which means it can only address up to 256 bytes "backwards from S". It also does not modify the stack, so it's still your responsibility to pull data off (to keep from having a stack overflow) later. Some example code:

Code:

  rep #$20
  lda #$1234
  pha
  jsr myroutine
  ...
  ...
  pla      ; Pull previous data (PHA) off stack (keep stack overflow from happening)
loop:
  bra loop

myroutine:
  php
  rep #$10
  lda $4,s   ; Load 16-bit data pushed onto the stack via PHA earlier, into accumulator
             ; $3,s would be one of the bytes of the JSR address
             ; $2,s would be one of the bytes of the JSR address
             ; $1,s would be the value of P pushed via PHP
  ...
  plp
  rts

The thing to remember about stack-based addressing is that the "last byte pushed on the stack" would be $1,s, because S always points to the "next" available spot. You shouldn't ever do something like $0,s.
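To make the offsets in the example above concrete, here is a small Python model (not 65816 code) of a descending stack where S points at the next free byte. The class, the starting value of S, and the pushed return address and status byte are all made up for illustration; only the push order and the "effective address is S + n" rule come from how the CPU works.

```python
# Toy model of the 65816 stack: S points at the next free byte and
# decrements on every push. All names and values here are illustrative.
class Stack:
    def __init__(self, s=0x1FFF):
        self.mem = {}
        self.s = s

    def push8(self, value):
        self.mem[self.s] = value & 0xFF
        self.s -= 1

    def push16(self, value):
        # High byte is pushed first, so the low byte ends up at the lower address.
        self.push8(value >> 8)
        self.push8(value)

    def read8(self, n):
        # Stack-relative addressing: the effective address is S + n.
        return self.mem[self.s + n]

    def read16(self, n):
        return self.read8(n) | (self.read8(n + 1) << 8)


stack = Stack()
stack.push16(0x1234)   # pha with a 16-bit accumulator
stack.push16(0x8004)   # jsr pushes a 2-byte return address (made-up value)
stack.push8(0x30)      # php pushes the 1-byte status register (made-up value)

value_at_4 = stack.read16(4)   # what "lda $4,s" would load
status_at_1 = stack.read8(1)   # what "$1,s" refers to
```

In this model `value_at_4` is the $1234 pushed by PHA and `status_at_1` is the byte pushed by PHP, which matches the comments in the assembly example.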

It's believed that the main reason this mode was implemented was solely for things like subroutines that mimic C or other languages where arguments passed to a function are pushed on to the stack prior to the function (subroutine) being used.

One downside to stack-based addressing (and this upset a lot of people at the time): the number of opcodes supporting this addressing mode is very small, and all of them relate to the accumulator. You thus cannot do something like ldx $2,s (there is no such opcode supporting that addressing mode).

But as I said near the start: I think right now, using temporary variables would keep things simpler for you. No judgement there whatsoever -- you're still learning, so it's perfectly fine. :-)

P.S. -- I myself have never used the stack-relative addressing mode. This is the first time I've actually taken the time to fully look at it and take a shot at some example code using it. Here be dragons.

Re: 16bit table indexing problem
by Movax12 on 2015-01-20 (#139670)

koitsu wrote:

Movax12 is partially right and wrong.

Which part was wrong?

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-20 (#139684)

Just wondering because, like I said, the loading and storing approach didn't work, I wonder, how does the table actually get loaded into ram? The table is 7 bytes long, so does it fill the register and 7 after it with information? How can the processor even hold the table if it can only hold 16 bits? (I'm sure my way of thinking things is way off.)

Re: 16bit table indexing problem
by koitsu on 2015-01-20 (#139686)

Movax12 wrote:

koitsu wrote:

Movax12 is partially right and wrong.

Which part was wrong?

The post implied that the use of JSR was "responsible" for the bug -- only partially true (the PHP + PLX bug is the other half). That's how I read it anyway.

Re: 16bit table indexing problem
by koitsu on 2015-01-20 (#139687)

Espozo wrote:

Just wondering because, like I said, the loading and storing approach didn't work, I wonder, how does the table actually get loaded into ram? The table is 7 bytes long, so does it fill the register and 7 after it with information? How can the processor even hold the table if it can only hold 16 bits? (I'm sure my way of thinking things is way off.)

What do you think the loop in build_metasprite is for? I'm starting to get the impression someone gave you a bunch of code you don't understand (how it works). Did someone give you this code? If so, it might be better to discuss with them this type of question, since they're the author. This is exactly why taking code snippets from people + using them (without understanding them) is a Bad Practise(tm) (that's my opinion).

Regarding "tables of data" -- a series of bytes (8-bit values) is the same thing as a series of words (16-bit values). They're just bytes of data in memory. How they're accessed/used is up to the actual program/code (in real-time). I explained this in an earlier post, re: how you were incrementing indexes by 1 and then accessing the table + offset that way (i.e. a byte at a time), rather than a word at a time. (I still think that code is buggy/broken, but I am intentionally not fixing it for you because teaching you how to fix it yourself is more important)

The .db and .dw assembler directives simply tell the assembler "stick some raw data in right here". The only difference is that .dw allows for convenient storing of data in little endian format (low byte first, high byte second). E.g. .dw $1234 would be the same as .db $34,$12, assuming one is accessing the data via a 16-bit read in the code.
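The byte-level equivalence can be checked outside the assembler. The following Python sketch uses the standard struct module to lay out the same values the way .dw and .db would; the variable names are illustrative, and it models only the byte order, not any WLA DX behaviour.

```python
import struct

# ".dw $1234" stores one 16-bit word, low byte first (little endian).
dw_bytes = struct.pack("<H", 0x1234)
# ".db $34,$12" stores the same two bytes one at a time.
db_bytes = bytes([0x34, 0x12])

# A 16-bit read at offset 0 reassembles low byte + high byte.
word = struct.unpack_from("<H", db_bytes, 0)[0]

# The table from the first post, written both ways.
table_dw = struct.pack("<2H", 0xFFFF, 0x0000)
table_db = bytes([0xFF, 0xFF, 0x00, 0x00])
```

Both spellings produce identical bytes, so a 16-bit load sees the same value either way.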

Re: 16bit table indexing problem
by Movax12 on 2015-01-20 (#139690)

koitsu wrote:

The post implied that the use of JSR was "responsible" for the bug -- only partially true (the PHP + PLX bug is the other half). That's how I read it anyway.

I mentioned the PHP+PLX pair as well.

Re: 16bit table indexing problem
by psycopathicteen on 2015-01-20 (#139695)

When it goes into the "infinite loop" part, are the index registers in 16-bit mode? Also, I think "ldy #$00" might have issues with assemblers if it is used in 16-bit mode, since it is written as an 8-bit number. Better write it as "ldy #$0000" to make sure it assembles correctly.

Code:

InfiniteLoop:
wai

php  ;;I don't know what mode it was in, but this saves it on stack

rep #$20
ldy #$0000
ldx #MetaspriteTable
sty YTemp
stx XTemp
jsr start_metasprite

plp  ;;This returns to whatever mode it was, assuming that it is in the correct mode for the rest of the code

Re: 16bit table indexing problem
by 93143 on 2015-01-20 (#139698)

psycopathicteen wrote:

When it goes into the "infinite loop" part, are the index registers in 16-bit mode? Also, I think "ldy #$00" might have issues with assemblers if it is used in 16-bit mode, since it is written as an 8-bit number. Better write it as "ldy #$0000" to make sure it assembles correctly.

I think WLA DX is probably smart enough to figure it out, but if it's not, I doubt that will fix it. I've had trouble with this sort of thing when using absolute addressing; lda $0001 assembles as "A5 01" because the assembler defaults to the smallest byte count capable of expressing the number. To get "AD 01 00", I had to specify the operand length as word, as recommended in the readme: lda $0001.w. I suspect lda $01.w would have worked too. (The length suffix can also be put on the opcode instead of the operand; e.g. lda.w - in some cases I found appending ".l" to the opcode to be the only way to get the assembler to use the full 24-bit value of a label.)

Now, I was using the version from 2003, and it's possible the thing is smarter now, but the readme on the website still recommends using ".b", ".w", or ".l" to resolve ambiguity. The example given is an "and" instruction with an immediate value, which implies that it's not just addresses that might need this treatment.

Re: 16bit table indexing problem
by koitsu on 2015-01-20 (#139700)

Re: concerns over ldy #$00:

It's being assembled correctly, or was at the time of that post. You can clearly see the results being $A0 00 00 which is correct (16-bit load of $0000 into Y).

Once again (is this the 5th or 6th time? I've lost track) we're back at the need for generated assembly listings and why using an assembler that generates ones properly is important, especially when learning.

Furthermore, and I don't know why this isn't being discussed: forcing the size using .w does not guarantee that the code will function correctly. Telling someone to "just use .w" is just as error-prone (if not more so!) as letting the assembler make its own educated guess. For example, take this code (and I'll include the machine language for it because it's important here):

Code:

E2 20     sep #$20
          ;
          ; Programmer does something else here, using 8-bit accumulator, and forgets that it's 8-bit.
          ; Programmer has been taught "to use .w when referring to 16-bit values or addresses"
          ;
A9 00 00  lda.w #$0000
8D 28 EA  sta.w $ea28

But during run-time, this code gets executed as the following:

Code:

E2 20     sep #$20
A9 00     lda #$00
00 8D     brk $8d
28        plp
EA        nop

I don't think WLA DX is going to dynamically change the size of the operands ($00 vs. $00 00) if you do lda #$00 vs. lda #$0000 or even lda #0 (that's a good example of an ambiguous one) -- that's what using .b/w is for -- instead it (very likely) keys off of what the accumulator size is per SEP/REP. I'd check, but there's no way in hell I'm going to bother trying to use assembly listings in WLA DX after what I've experienced this past week.
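The mismatch between the assembled bytes and what actually executes can be reproduced with a tiny decoder. This Python sketch handles only the five opcodes that appear in the example and takes the accumulator width as a parameter; the function name and tuple format are made up, and it ignores the fact that a real PLP could change the width again.

```python
def decode(code, a_is_8bit):
    """Decode only the handful of opcodes used in the example above."""
    out, i = [], 0
    while i < len(code):
        op = code[i]
        if op == 0xA9:      # lda #imm: operand size depends on the M flag
            n = 1 if a_is_8bit else 2
            out.append(("lda #", int.from_bytes(code[i + 1:i + 1 + n], "little")))
            i += 1 + n
        elif op == 0x8D:    # sta abs: always a 2-byte address
            out.append(("sta", int.from_bytes(code[i + 1:i + 3], "little")))
            i += 3
        elif op == 0x00:    # brk: opcode plus one signature byte
            out.append(("brk", code[i + 1]))
            i += 2
        elif op == 0x28:    # plp: no operand
            out.append(("plp", None))
            i += 1
        elif op == 0xEA:    # nop: no operand
            out.append(("nop", None))
            i += 1
        else:
            raise ValueError(f"opcode ${op:02X} not modelled")
    return out


# The bytes produced by "lda.w #$0000" followed by "sta.w $ea28".
code = bytes([0xA9, 0x00, 0x00, 0x8D, 0x28, 0xEA])
as_16bit = decode(code, a_is_8bit=False)
as_8bit = decode(code, a_is_8bit=True)
```

With a 16-bit accumulator the bytes decode as the intended lda/sta pair; with an 8-bit accumulator the same bytes decode as lda, brk, plp, nop.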

TL;DR -- Listings generated by the assembler: important. Programmer understanding what their code does, and understanding the processor: equally as important. Forcing sizes using .b/w/l can result in bad learned behaviour. In other words: aren't sure if your code is generated correctly? Look at the listing. Aren't sure if it's executing correctly? Use a debugger. (And for those wondering how we used to debug code on the SNES during the early 90s when there was no such thing as an emulator -- we stared at our code until we found the bug. Really. At least on the IIGS we had GSBug which is still one of my favourite debuggers, only rivalled by SoftIce).

Re: 16bit table indexing problem
by 93143 on 2015-01-20 (#139701)

I didn't mean to imply that it should be standard practice in all cases. For absolute or long addresses with small numerical values, it might be wise, because you know what addressing mode you want and the assembler is likely to screw it up.

But if you're using immediate values, yes, it's probably better to hold off on using length-forcing suffixes until it's clear you need them.

Re: 16bit table indexing problem
by koitsu on 2015-01-20 (#139702)

Nod, understood, and agreed. My post was mainly for psycopathicteen.

The issues here are compounded by the fact that WLA DX has awful documentation (both in organisation and grammar/phrasing), which just makes things even harder for someone new. I often brag about ORCA/M (Apple IIGS assembler) because its documentation is remarkably good (I can't tell you how many times during my IIGS days the docs saved me pain). Merlin 8/16's documentation (also Apple IIGS) is pretty good too. And so was x816s'.

All this makes me wonder if I could find Norman Yen (x816 author) and ask him to release the source for it so Win32 (rather than DOS) binaries could be built. It was written in Turbo Pascal, which could very easily be ported to C.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-20 (#139706)

koitsu wrote:

Espozo wrote:

Just wondering because, like I said, the loading and storing approach didn't work, I wonder, how does the table actually get loaded into ram? The table is 7 bytes long, so does it fill the register and 7 after it with information? How can the processor even hold the table if it can only hold 16 bits? (I'm sure my way of thinking things is way off.)

What do you think the loop in build_metasprite is for? I'm starting to get the impression someone gave you a bunch of code you don't understand (how it works). Did someone give you this code? If so, it might be better to discuss with them this type of question, since they're the author. This is exactly why taking code snippets from people + using them (without understanding them) is a Bad Practise(tm) (that's my opinion).

I don't remember exactly how the code came to be, but I know the answer lies in the SNES Programming Help thread. If I remember correctly, it's part of a much more complicated (and useful) code psycopathicteen wrote that I edited a bit for me. I don't know exactly what's going on, but I think I have a fairly good idea. (Accumulator should be 8 bit, and X and Y should be 16 bit.)

Code:

   ldy #$00      ;says 0 sprites have been made yet
   ldx #MetaspriteTable   ;where MetaspriteTable is in Bank0?
   sty YTemp      ;store y to YTemp
   stx XTemp      ;store x to XTemp
   jsr start_metasprite   ;jump to start_metasprite to build a metasprite

Code:

.BANK 0 SLOT 0
.ORG 0
.SECTION "MetaspriteCode"

start_metasprite:
   php      ;no clue
   rep #$10   ;prepares to make the accumulator 8bit by clearing the processor status bits?
   sep #$20   ;makes the accumulator 8bit?
   ldy YTemp   ;load #$00 into Y, as this offsets the Sprite Buffer and no sprites have been made yet
   ldx XTemp   ;the value of where the table starts is being loaded into X

build_metasprite:
   lda $00,x      ;I'm not sure why we are loading $00, (I looked in a memory editor and it is 0) but we are offseting $00 by X, to see where we are in the table
   beq metasprite_done   ;If the number in the table was 0, (designed to mean there are no more sprites in the metasprite) then jump to metasprite_done to jump back to the main code.
   inx         ;increase x by 1 to look at the next byte in the table (sprite x position)
   lda $00,x      ;we are offseting $00 by X, to see where we are in the table
   sta SpriteBuf1,y   ;storing the byte (for sprite x position) in SpriteBuf, offset by y (y gets incremented by 4 because each sprite has 4 bytes)
   inx         ;increase x by 1 to look at the next byte in the table (sprite y position)
   lda $00,x      ;we are offseting $00 by X, to see where we are in the table
   sta SpriteBuf1+1,y   ;storing the byte (for sprite y position) in SpriteBuf, offset by y (y gets incremented by 4 because each sprite has 4 bytes)
   inx         ;increase x by 1 to look at the next byte in the table (to see if there is another sprite in the metasprite)
   iny         ;y gets incremented by 4 because each sprite has 4 bytes
   iny
   iny
   iny
   bra build_metasprite   ;go back to build_metasprite to potentially build another sprite in the metasprite

metasprite_done:
   plp      ;no clue
   rts      ;jump back to the main code

.ENDS

I think the problem resides with the fact of how I'm storing X and Y when I jump to the metasprite building code and how I'm loading it when I get there, because like I said, if I just get rid of the storing and loading into XTemp and YTemp, it works fine. (X and Y are supposed to be 16 bit.)

(I know I'm probably frustrating you by this point...)

Re: 16bit table indexing problem
by psycopathicteen on 2015-01-20 (#139711)

It looks like it should work. Post the latest ROM so I can trace it in a debugger and figure out what's wrong.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-20 (#139712)

Thank you. I'm literally dumbfounded. MetaspriteTest1 yields the correct result, while MetaspriteTest2 doesn't, which is the one that uses XTemp and YTemp. The metasprite routine for both is Metasprite and Metasprite2 respectively. If you have any questions, don't hesitate to ask.

Attachment:

MetaspriteDemo.rar [3.83 MiB]
Downloaded 409 times

Edit: I accidentally increased Y by one more than I should have in Metasprite2, so just get rid of one of them after you download it. It still doesn't work though...

Re: 16bit table indexing problem
by psycopathicteen on 2015-01-21 (#139715)

I just learned something new. WLA is a total piece of crap.

Quote:

0082ea ldy #$0000 A:0000 X:0000 Y:0000 S:1fff D:0000 DB:00 nvMxdiZC V:230 H: 30
0082ed ldx #$8401 A:0000 X:0000 Y:0000 S:1fff D:0000 DB:00 nvMxdiZC V:230 H: 54
0082f0 sty $1808 [001808] A:0000 X:8401 Y:0000 S:1fff D:0000 DB:00 NvMxdizC V:230 H: 78
0082f3 stx $180a [00180a] A:0000 X:8401 Y:0000 S:1fff D:0000 DB:00 NvMxdizC V:230 H: 118
0082f6 jsr $820e [00820e] A:0000 X:8401 Y:0000 S:1fff D:0000 DB:00 NvMxdizC V:230 H: 158
00820e php A:0000 X:8401 Y:0000 S:1ffd D:0000 DB:00 NvMxdizC V:230 H: 204
00820f rep #$10 A:0000 X:8401 Y:0000 S:1ffc D:0000 DB:00 NvMxdizC V:230 H: 226
008211 sep #$20 A:0000 X:8401 Y:0000 S:1ffc D:0000 DB:00 NvMxdizC V:230 H: 248
008213 ldy $08 [000008] A:0000 X:8401 Y:0000 S:1ffc D:0000 DB:00 NvMxdizC V:230 H: 270
008215 ldx $0a [00000a] A:0000 X:8401 Y:0000 S:1ffc D:0000 DB:00 nvMxdiZC V:230 H: 302
008217 lda $00,x [000000] A:0000 X:0000 Y:0000 S:1ffc D:0000 DB:00 nvMxdiZC V:230 H: 334
008219 beq $822e [00822e] A:0000 X:0000 Y:0000 S:1ffc D:0000 DB:00 nvMxdiZC V:230 H: 364
00822e plp A:0000 X:0000 Y:0000 S:1ffc D:0000 DB:00 nvMxdiZC V:230 H: 386
00822f rts A:0000 X:0000 Y:0000 S:1ffd D:0000 DB:00 NvMxdizC V:230 H: 414

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-21 (#139728)

Wait a minute, did it load from the completely wrong register from what I stored in? Both were defined as XTemp and YTemp... I guess I'm not completely crazy after all! :roll:

I guess this would explain how it got the number 0 and exited the routine. (I guess it's time to switch to a new assembler?)

Re: 16bit table indexing problem
by Sik on 2015-01-21 (#139729)

That's about as bad as that time I tried to use asmx and it somehow managed to rotate the order of the bytes in an entire 16 byte block (inevitably resulting in a crash for obvious reasons).

Re: 16bit table indexing problem
by tomaitheous on 2015-01-21 (#139732)

psycopathicteen wrote:

I just learned something new. WLA is a total piece of crap.

It's a secret to everybody.

Re: 16bit table indexing problem
by koitsu on 2015-01-21 (#139734)

Espozo wrote:

Wait a minute, did it load from the completely wrong register from what I stored in? Both were defined as XTemp and YTemp... I guess I'm not completely crazy after all! :roll: I guess this would explain how it got the number 0 and exited the routine. (I guess it's time to switch to a new assembler?)

The root cause of the bug is unknown (I have a couple theories -- I'll spend a little time testing them out tonight), and I could do some testing to see what the heck is going on, but I'm choosing not to. Purely for educational purposes for you -- the code:

Code:

  rep #$10
  sep #$20
  ldy $1808
  ldx $180a

Should have assembled into:

Code:

C2 10     rep #$10
E2 20     sep #$20
AC 08 18  ldy $1808
AE 0A 18  ldx $180a

But instead the assembler somehow decided the index register size was 8-bit (I think?) and came out with this, silently discarding the upper byte of the address entirely, and thus chose to "optimise" by using direct page addresses:

Code:

C2 10     rep #$10
E2 20     sep #$20
A4 08     ldy $08     ; Effectively same as ldy $0008
A6 0A     ldx $0a     ; Effectively same as ldx $000a

I can tell it's using direct page because the opcodes+operands according to psycopathicteen's debugger output take up only 2 bytes each (if they were 16-bit absolute addresses, but wrong, they'd be 3 bytes).

Needless to say, $08 != $1808 and $0a != $180a. You didn't store your temporary values in direct page, you stored them in RAM (as the STX/STY lines clearly indicate), so the LDY/LDX statements here would result in incorrect values being loaded. Who knows what's in direct page at that point!
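As a rough illustration of the two encodings involved, the Python sketch below builds the bytes for ldy/ldx in their direct-page and absolute forms. The opcode values are the standard 65816 ones; the function name and the "use the short form when the address fits in 8 bits" rule are assumptions for the sake of the example, not a description of WLA DX's actual logic. The addresses $1808 and $180a are the ones the sty/stx lines in the trace wrote to.

```python
def encode_load(mnemonic, addr, force_word=False):
    """Return the bytes for ldx/ldy with a direct-page or absolute operand."""
    direct = {"ldx": 0xA6, "ldy": 0xA4}
    absolute = {"ldx": 0xAE, "ldy": 0xAC}
    if addr <= 0xFF and not force_word:
        return bytes([direct[mnemonic], addr])
    return bytes([absolute[mnemonic], addr & 0xFF, addr >> 8])


# What the source asked for: loads from the same addresses sty/stx wrote to.
expected = encode_load("ldy", 0x1808) + encode_load("ldx", 0x180A)
# What the trace shows: the high byte is gone, so the short form comes out.
observed = encode_load("ldy", 0x1808 & 0xFF) + encode_load("ldx", 0x180A & 0xFF)
# Forcing word size keeps the absolute opcode even for a small address.
forced = encode_load("ldy", 0x0008, force_word=True)
```

The expected sequence is six bytes (AC 08 18 AE 0A 18) while the observed one is four (A4 08 A6 0A), which is the 2-byte instruction spacing visible in the trace.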

What's funny about all of this: this might explain why long, LONG ago, when I looked at some code someone wrote in WLA DX, I'd see what I considered "intermittent" usage of the .w modifier (meaning use at times where the assembler obviously should have gotten the right thing, thus was implied). That never sat well with me. Christ, I'm having horror flashbacks...

Re: 16bit table indexing problem
by tepples on 2015-01-21 (#139736)

[benefit-of-the-doubt]
Was there lda #$1800 tad beforehand? Perhaps it's attempting to track D the same way it tracks bits 5-4 of P.
[/benefit-of-the-doubt]

Re: 16bit table indexing problem
by koitsu on 2015-01-21 (#139737)

You mean tcd? Nope, nowhere in that code. What psycopathicteen posted here is what literally got run. And if there was a tcd (earlier in the program), it wouldn't explain why STY/STX used absolute addressing while LDY/LDX didn't (since neither used the .w to force 16-bit addresses).

Good theory/thinking nonetheless!

Re: 16bit table indexing problem
by nicklausw on 2015-01-21 (#139741)

koitsu wrote:

nicklausw wrote:

Visual Studio and a bunch of other crap is needed. Don't use the binaries given on Ville's website, as those are somewhat outdated (and some bug fixes along with improvements have been made since then).

By the assumption that you're using Windows, here's my binaries. Fresh off GitHub. Made them last month as I recall.

EDIT: just remembered I also compiled them for Ubuntu, so I'll put those there as well. Zip = windows, Tar = Ubuntu.

The Windows binaries you provide in the zip are native 64-bit binaries -- they will not work on 32-bit operating systems. This is unlike the "unofficial" binary builds which are 32-bit. I can't use these binaries (I do not run a 64-bit OS). There's really no reason (that I can think of) for native 64-bit binaries for WLA DX on Windows (on *IX it's a different situation, as not everyone's OSes have 32-bit compatibility shims; for example my FreeBSD boxes are all pure 64-bit with absolutely no 32-bit binary support). Windows 64-bit OSes provide 32-bit compatibility shims by default, so 32-bit is a better choice there for something non-memory-intensive like 65xxx assemblers.

The only reason I'm bothering to try your binaries is to see if the listing generation stuff has been fixed. "And some bug fixes along with improvements have been made since then" isn't precise enough -- this is exactly what a ChangeLog or commit history is for. I wouldn't need to try the binaries if I could see that + know exactly what commit and/or branch your binaries were based off of. :-)

Update: Decided to download the newest sources and try out a 32-bit distribution of cygwin to see if it could compile it. It did, and I'd like you to test these binaries out.

Re: 16bit table indexing problem
by tepples on 2015-01-21 (#139742)

koitsu wrote:

You mean tcd?

Yes, a well-known synonym which I seem to remember came from WDC's datasheet. I use tad because it doesn't confuse "C" (BA) with "C" (bit 0 of P). Given the existence of xce (exchange carry with emulation); tcd looks like it could stand for "copy carry to decimal mode".

Is Cygwin needed, or would MinGW work as well? MinGW doesn't need quite as big of a C runtime DLL because it doesn't try to implement a huge swath of POSIX. Instead, it uses MSVC 6's DLL.

Re: 16bit table indexing problem
by koitsu on 2015-01-21 (#139745)

nicklausw wrote:

Update: Decided to download the newest sources and try out a 32-bit distribution of cygwin to see if it could compile it. It did, and I'd like you to test these binaries out.

I can verify the binaries work on a 32-bit XP box (and I personally have no issues with having cygwin1.dll included with it, but absolutely understand tepples's question/point too). Thanks a ton!

@Espozo, can you please give me a copy of your current source code (and tell me which filename I should be using for testing) so I can poke around and see if I can figure out this WLA DX bug? I want to see if .INCLUDE + listings are fixed (I've reviewed the commit history on github and I doubt it, but the commit messages are not always that great).

Re: 16bit table indexing problem
by koitsu on 2015-01-21 (#139746)

tepples wrote:

Yes, a well-known synonym which I seem to remember came from WDC's datasheet. I use tad because it doesn't confuse "C" (BA) with "C" (bit 0 of P). Given the existence of xce (exchange carry with emulation); tcd looks like it could stand for "copy carry to decimal mode".

Maybe it has to do with age or era, but I was taught that on the 65816, the accumulator is represented by 3 separate letters depending on usage context: A represented the lower byte (of the 16-bit value), B represented the upper byte (of 16-bit value), and C represented the full 16-bit (of B+A). A, in 16-bit mode, could also refer to the same thing as C -- and mainly because it's compatible with the 65c02, otherwise they would have had to extend the opcode range from a single byte ($00-FF) to something larger, simply to have separate opcodes like lda vs. ldb vs. ldc. If you want proof of the A/B nomenclature, I point you to the xba and tcs/tsc opcodes. I believe the Eyes/Lichty/WDC book goes over this naming convention too. But as I've said before I spent more time doing 65816 than I did 6502/65c02, but the conventions made perfect sense to me.

tcd to me always meant "transfer (copy) C into D" where D = direct page (and its inverse, tdc), simply because there's no purpose I can think of for a person wanting to transfer the carry bit (c of P) into the decimal mode bit (d of P). Likewise, opcodes named cl* and se* tend to refer to bits of P, not registers.

So all that said: what does tsb make you think of? Test Stack pointer against B (accumulator)? Test Snakes against b of P (only applicable to emulation mode)? Same for trb. Same for stp (which I actually use semi-often).

I think our experiences/minds just differ here. I basically just remember what an opcode does and remember what opcode correlates with that task -- my brain is just a big lookup table/chart. That's just how my brain works. Maybe yours is different. For example, you know the clever little mnemonics used for memorisation of Perl variables? I find them ridiculous and confusing (the little mnemonics at the end of each definition). I just simply remember what variable does what, and if I can't remember, I refer to documentation.

Footnote: the more I review this WDC PDF, the more massive/major typos I find. For example, the opcode definition for TAY says TAX. *sigh* Maybe that's why they pulled the PDF from their site -- they have too many typos. I really need to spend the money and just get a hard copy of their manual from them. I'd really love a 3-ring-binder version. But the actual Eyes/Lichty book doesn't have any of these mistakes, so they're purely something WDC induced during the PDF conversion (possibly OCR mistakes -- and if so, then whoever did the proofreading should be fired. This is like the 6th or 7th mistake I've found).

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-21 (#139747)

Sure thing! Here it is:

Attachment:

MetaspriteDemo.rar [1.04 MiB]
Downloaded 404 times

koitsu wrote:

(and tell me which filename I should be using for testing)

You're just talking about the main file for the demo that's not working, right? If so, it's MetaspriteTest2 (No names have been changed since the last couple of times I've posted.) Also, the new WLADX that nicklausw posted (thank you!) is in there and I assembled the file using it, but it still didn't work so I guess the bug is still present. Disgraceful!

Also, I know I'm going to get hammered for asking this but, what is direct paging? :oops:

Isn't it just 24 bit addressing? (Can't you access the entire SNES memory space with 24 bits and a single bank with 16?)

Edit: According to Jay's ASM tutorial on superfamicom.org, it says:

Quote:

The direct page register is a pointer that points to a region within the first 64k of memory. This register is used to access memory in direct addressing modes. In direct addressing mode, a 8-bit value (0-255) is added to the direct page address, which will form an effective address.

Doesn't that mean you can only access half of the largest rom size available? (128k I think?)

Re: 16bit table indexing problem
by tepples on 2015-01-21 (#139749)

A "bank" is 65536 bytes. "Direct page" is the ability to relocate the 65816's counterpart to the 6502's zero page anywhere in bank $00. Direct page addressing modes end up behaving more like a frame pointer (like EBP on x86), even if 6502 fans set it to $0000 for familiarity.

Re: 16bit table indexing problem
by 93143 on 2015-01-21 (#139752)

Espozo wrote:

Quote:

The direct page register is a pointer that points to a region within the first 64k of memory. This register is used to access memory in direct addressing modes. In direct addressing mode, a 8-bit value (0-255) is added to the direct page address, which will form an effective address.

Doesn't that mean you can only access half of the largest rom size available? (128k I think?)

Okay, first off, 128 kB is nowhere near the largest ROM size available (official games got as big as 6 MB, and I believe the largest maps offered by Nintendo were 8 MB, but neither of those represented a practical limit - there's a hack of Star Ocean that removes the S-DD1 dependency at the cost of doubling the game's size to 12 MB, and as the linked post says you can exceed that if you try). The CPU's work RAM is 128 kB, though... and it's true that bank $00 is what you might call a LoROM bank, in which you can typically only access half a bank worth of ROM regardless of memory mode because the bottom half is mirrored RAM and system registers and reserved areas... and it's also true that the gross size of the memory map is 128 Mbit, disregarding mirrors and such...

Second, yes, this means that you can only set up direct page to access the first bank. If you set D to a 16-bit value, then the direct page is the 256 bytes of memory starting at that absolute address in bank $00. If, for instance, D is $0000, the direct page is the first 256 bytes of WRAM (since the first 8 kB of WRAM are mirrored to the bottom of banks $00-$3F and $80-$BF). If D is $2100, the direct page corresponds to the B bus address range and can be used to access the PPU and such.

Direct page allows you to use 8-bit addressing, which saves a cycle when loading DP-addressed instructions. Unfortunately, if the pointer in D has a nonzero low byte, you lose a cycle again adding it to the instruction address, and it's no faster than just using absolute addresses. So if you're going to use direct page, it's best to keep D to values of $xx00 unless you're using it for reasons other than speed.

Re: 16bit table indexing problem
by nicklausw on 2015-01-21 (#139754)

tepples wrote:

Is Cygwin needed, or would MinGW work as well? MinGW doesn't need quite as big of a C runtime DLL because it doesn't try to implement a huge swath of POSIX. Instead, it uses MSVC 6's DLL.

I suppose it could be used, but it didn't work well with the batch file given with the sources. Tomorrow I'll try out Cygwin's distribution of it.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-21 (#139755)

93143 wrote:

Okay, first off, 128 kB is nowhere near the largest ROM size available

Oops...

(I was thinking megabits, not kilobits.) I thought the uncompressed Star Ocean hack was 128 megabits, not 96. (12x8=96, obviously)

About the direct page and stuff, isn't the assembler (supposed to) deal with this stuff automatically? The only manual input I remember is when it told me that bra didn't work because the number was too high or something, so I switched to brl and it worked fine. (bra is 8 bit, while brl is 16 bit?)

Also, I guess it would probably be good to learn about direct paging and stuff?

(You know, in case the assembler freaks out... :roll:)

Re: 16bit table indexing problem
by 93143 on 2015-01-21 (#139759)

Espozo wrote:

About the direct page and stuff, isn't the assembler (supposed to) deal with this stuff automatically?

No, for two reasons. First, if the direct-page pointer D isn't zero, there will be a constant offset between the direct-page and absolute addressing modes; this is by design and should not be compensated for. Second, absolute addressing works within the bank selected by the data bank register, while direct-page always works in bank $00, so in the general case the data accessed won't be the same even if D is zero.

Suppose you had D set to $2100, for fast PPU access (not something the assembler can do automatically, BTW). Now suppose you told the assembler to lda $001E or something (let's assume for simplicity that you're in bank $00). In WLA, since that address fits in 8 bits, it would probably assemble into "A5 1E", and instead of loading the value from $00001E as desired, the system would try to read $00:(D+$1E) = $00211E, which is a Mode 7 matrix register and also write-only, and the result would be open bus (ie: garbage).

Now, that case would work fine if you had D set to $0000. But suppose you had the data bank register set to $5A? Now you're trying to access a HiROM bank, so there's no RAM mirror there and the data at $001E will be different from the same address in bank $00. Even with D at zero, you're going to get $00001E instead of $5A001E, because direct-page doesn't care what the data bank is.
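The two cases above come down to simple address arithmetic, sketched here in Python. The function names are made up; the rules they encode (direct page is D plus the operand, always in bank $00, and absolute uses the data bank register as the bank byte) are the ones described in this post.

```python
def direct_page_address(d, operand):
    # Direct page always lands in bank $00; the sum wraps at 16 bits.
    return (d + operand) & 0xFFFF


def absolute_address(dbr, operand):
    # Absolute addressing uses the data bank register as the bank byte.
    return (dbr << 16) | (operand & 0xFFFF)


# Case 1: D = $2100, and "lda $001E" gets assembled as direct page "A5 1E".
ppu_case = direct_page_address(0x2100, 0x1E)
wanted = absolute_address(0x00, 0x001E)

# Case 2: D = $0000 but the data bank register is $5A.
hirom_dp = direct_page_address(0x0000, 0x1E)
hirom_abs = absolute_address(0x5A, 0x001E)
```

In the first case the read goes to $00211E instead of $00001E; in the second, direct page reads $00001E while absolute would read $5A001E.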

Quote:

The only manual input I remember is when it told me that bra didn't work because the number was too high or something, so I switched to brl and it worked fine. (bra is 8 bit, while brl is 16 bit?)

That's a lot like DP in some ways, except that since the opcode names aren't the same it's impossible for the assembler to bork it up.

A closer comparison to direct-page vs. absolute might be between bra and jmp, since one is relative to a given position within the program bank and the other is not.
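As a rough illustration of why the assembler complained about bra: its operand is a signed 8-bit offset measured from the instruction that follows it, while brl carries a 16-bit offset. The Python below is only a sketch of that range check; `pick_branch` is a hypothetical helper, not how WLA actually decides.

```python
def pick_branch(pc: int, target: int) -> str:
    """Return the shortest branch mnemonic that can reach target.

    pc is the address of the branch instruction itself. BRA is two bytes
    long, so its offset is relative to pc + 2 and must fit in a signed byte.
    """
    offset = target - (pc + 2)
    if -128 <= offset <= 127:
        return "bra"
    return "brl"  # 16-bit relative offset, three bytes long


print(pick_branch(0x8000, 0x8010))  # short hop
print(pick_branch(0x8000, 0x8200))  # too far for an 8-bit offset
```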

Quote:

Also, I guess it would probably be good to learn about direct paging and stuff?

Yes. At the very least you need to be able to keep the assembler from doing weird things. And once you understand this stuff, you can exploit it for speed and convenience...

Re: 16bit table indexing problem
by koitsu on 2015-01-21 (#139760)

Well damn, I had a huge and well-written-out post that compounded 93143's post, but then I went messing about trying to find old posts to reference and ended up losing my stuff by accidentally clicking "Edit" on someone's post. Doh. :(

One thing I did want to mention here: I've tried to figure out (per the docs) how WLA DX decides whether to use direct page or absolute addressing opcodes (more specifically: if it tracks tcd like Tepples asked, or if there's an explicit syntax modifier for forcing one or the other (common in IIGS assemblers)), and it isn't mentioned in the docs anywhere that I can find, nor in the examples. I read the Assembler Syntax and 65816 sections and neither states anything useful (sigh). I did find this, however, describing .struct:

Code:

A WORD OF WARNING: Don't use labels b, B, w and W inside a struct as e.g.,
WLA sees enemy.b as a byte sized reference to enemy. All other labels should
be safe.

lda enemy1.b  ; load a byte from zeropage address enemy1 or from the address
              ; of enemy1.b??? i can't tell you, and WLA can't tell you...

This is a great example of WLA DX's horrible documentation. Honestly I've been sitting here for a good 5 full minutes trying to work out exactly what the author is trying to convey. It's specifically referring to 6502 here, but seriously, my brain is a circular bunch of mush:

I start thinking: "okay, he means that within a .struct block, you should not use labels named b or w or else the parser won't know if you're..... wait, no, that makes no sense: .b and .w are used to specify the "size" of an address or immediate value at expansion-time, what does that have to do with the actual label itself?"

But then I read the code comment and I think "oh wait, I see, lda enemy1.b (which should syntactically be the same as lda.b enemy1) doesn't allow the assembler to determine if you want to expand enemy1 into an 8-bit address for zero page access, or refer directly to the (presumably) 16-bit address of label enemy1.b... except what does that have to do with structs and b and w? There's no . (period/dot), so what the heck is going on? What is the parser doing?!"

My gut feeling is that in 65816 mode, it probably uses absolute addresses all the time unless specifically told not to... except for the beautiful bug that psycopathicteen pointed out and I expanded upon, where the assembler is ridiculously choosing to start referring to 8-bit addresses in direct/zero page for no reason. (I haven't had time to look into that, but I still have those couple of gut feelings...)

The only thing I could find in the WLA DX docs about that assemble-time decision is with regards to using the .b modifier, or the .8bit directive:

Code:

For example:

LSR 11       ; $46 $0B
LSR $A000  ; $4E $00 $A0

The first one could also be

LSR 11       ; $4E $0B $00

.8BIT is here to help WLA to decide to choose which one of the opcodes it
selects. When you give .8BIT (default) no 8bit address/value is expanded
to 16bits.

By default WLA uses the smallest possible size. This is true also when WLA
finds a computation it can't solve right away. WLA assumes the result will
be inside the smallest possible bounds, which depends on the type of the
mnemonic.

The last paragraph is awfully damning, but the phrase "can't solve right away" makes my brain explode. Right away? So, what, it can figure it out later? What does that even mean?

With regards to the bug psycopathicteen pointed out, it's literally like the assembler's internal logic is somehow suddenly deciding to use direct page/zero page addressing when it most definitely shouldn't. But that last paragraph implies that some kind of "computational mistake" caused it to do this, and the code (to me) doesn't give any indication that the assembler would have any difficulty.

I even bothered to check in the IIGS mini-assembler writing the same code -- and it does the right thing.

Re: 16bit table indexing problem
by Sik on 2015-01-22 (#139765)

koitsu wrote:

The last paragraph is awfully damning, but the phrase "can't solve right away" makes my brain explode. Right away? So, what, it can figure it out later? What does that even mean?

Taking a guess, it probably means an expression that can't be solved in the first pass but can be solved in the second (obvious example: referring to a label that's defined later in the code - in the first pass the assembler won't have seen it yet, but by the second pass it will have). This can actually be quite a problem if something has to be guessed since usually assemblers determine the opcode to use in the first pass (so they can know the addresses of all instructions by the second pass).

Why the heck it decides that the smallest size possible (i.e. the most prone to break) should be the default in case of guessing, I don't know. Practically every time I see an assembler stumble upon a situation like this, they either default to the largest size (suboptimal but bound to work always), choose a default size and then throw an error if it doesn't work (e.g. 68000 defaulting to word opcodes), or just throw an error immediately. They don't let it pass if it can break.
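Here's a toy Python model of that guess, assuming a classic two-pass design where operand sizes get fixed in pass 1. It is not WLA's actual logic (nobody in this thread has confirmed what that is), and `choose_size`, `assemble` and the tuple-based program format are all invented for the example.

```python
def choose_size(operand, symbols, guess_smallest):
    """Pass 1: operand size in bytes (1 = direct page, 2 = absolute)."""
    if isinstance(operand, int):
        return 1 if operand <= 0xFF else 2
    if operand in symbols:
        return 1 if symbols[operand] <= 0xFF else 2
    # Forward reference: the label hasn't been seen yet, so we must guess.
    return 1 if guess_smallest else 2


def assemble(program, guess_smallest, origin=0x8000):
    """program is a list of ("label", name) or ("op", operand) tuples.

    Returns one (operand, size, value, fits) tuple per instruction.
    """
    symbols, sizes, pc = {}, [], origin
    for kind, arg in program:                  # pass 1: sizes and addresses
        if kind == "label":
            symbols[arg] = pc
        else:
            size = choose_size(arg, symbols, guess_smallest)
            sizes.append(size)
            pc += 1 + size                     # opcode byte + operand bytes
    ops = [arg for kind, arg in program if kind == "op"]
    results = []
    for arg, size in zip(ops, sizes):          # pass 2: resolve values
        value = arg if isinstance(arg, int) else symbols[arg]
        results.append((arg, size, value, value < (1 << (8 * size))))
    return results


program = [("op", "table"), ("label", "table")]
print(assemble(program, guess_smallest=True))   # 1-byte operand can't hold the address
print(assemble(program, guess_smallest=False))  # 2-byte operand fits
```

With the "smallest" guess the forward reference ends up with a one-byte operand for a 16-bit address, which is the kind of silent breakage described above; an assembler that defaults to the larger size, or that errors out when `fits` is false, avoids it.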

Re: 16bit table indexing problem
by psycopathicteen on 2015-01-23 (#139797)

You might as well download bass.exe, since it's the most current assembler.

Re: 16bit table indexing problem
by tomaitheous on 2015-01-23 (#139799)

tepples wrote:

A "bank" is 65536 bytes. "Direct page" is the ability to relocate the 65816's counterpart to the 6502's zero page anywhere in bank $00. Direct page addressing modes end up behaving more like a frame pointer (like EBP on x86), even if 6502 fans set it to $0000 for familiarity.

I think I remember reading that if you set the DP register to anything other than default, that there would be a +1 cycle penalty for all ZP addressing modes.

Re: 16bit table indexing problem
by KungFuFurby on 2015-01-23 (#139800)

I think I recall that the penalty only applies if the direct page register uses a non-zero value for the low byte (high byte can be any value and no penalty will apply).

If you're looking for the latest version of bass (I think it's v14), I have a copy of v14 on my end (normally, it's an .xz file, but I can make a zip version of it).

Re: 16bit table indexing problem
by koitsu on 2015-01-23 (#139805)

Re: D register and cycle penalties: covered earlier by 93143 (last paragraph). And yes, the +1 cycle penalty only applies if the low byte of register D is non-zero, e.g. lda #$1e20 / tcd would induce a +1 cycle penalty for all subsequent DP access, while lda #$1e00 / tcd wouldn't.
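Spelled out as a tiny Python check (just restating the rule above; `dp_cycle_penalty` is a made-up name):

```python
def dp_cycle_penalty(d: int) -> int:
    """Extra cycles per direct-page access for a given 16-bit D value.

    Only the low byte of D matters; the high byte can be anything.
    """
    return 1 if (d & 0x00FF) != 0 else 0


print(dp_cycle_penalty(0x1E20))  # low byte $20 is non-zero
print(dp_cycle_penalty(0x1E00))  # low byte is zero
```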

Re: 16bit table indexing problem
by tomaitheous on 2015-01-23 (#139807)

Ahh, that's what I remember: if DP register was anything other than '0', then apply said penalty. The document I had must have not bothered to mention this was strictly the case of the low byte of DP register.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-25 (#139905)

I know I'm bumping again, (I have had a break from SNES development because WLA has left me a bit frustrated with its bugs like hirom not working correctly and the newer problem) and I decided to try another assembler. I found a download for the bass assembler, (after looking at a bunch of SNES bass fishing games and bass from mega man :roll:) but I haven't found how to use it, because I don't really know what to do here:

Attachment:

bass command line.png [ 7.67 KiB | Viewed 6253 times ]

I typed in bass and then the name of the file, but it seemed to disregard what I wrote. I know I'm also probably going to have to change my code to make it work on bass, but I didn't see any documentation on the assembler so I have no clue what to do.

Re: 16bit table indexing problem
by tokumaru on 2015-01-25 (#139906)

Espozo wrote:

I typed in bass and then the name of the file, but it seemed to disregard what I wrote.

Read the "usage" section: apparently there are a bunch of optional parameters, and then you have to use -o followed by the name of the output file, and finally the input files.

Re: 16bit table indexing problem
by koitsu on 2015-01-25 (#139907)

metaspritetest2, by the way, is not the name of the file. metaspritetest2.asm is the name of the file. The wla.bat file you've been using "hides" use of extensions by (blindly/idiotically) appending .asm to the argument you pass the script.

bass -o metaspritedemo2.bin metaspritedemo2.asm is probably what you're looking for here, but I also don't know for sure because I'd have to read the assembler documentation. There is documentation that comes with it (the UNIX tarball has doc/bass.html).

I'd honestly suggest just asking someone like Tepples to write you a ca65 template and use that instead.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-25 (#139908)

By template, I'm guessing you mean something like the walker code in SNES Starter kit? It would probably be easier to start from scratch, seeing that I didn't do too much... Also, does anyone know where to download ca65? I've looked for one, but instead I found a bunch of downloads for something called cc65.

Re: 16bit table indexing problem
by DoNotWant on 2015-01-25 (#139909)

cc65 is the tool-chain with c compiler and all. The assembler comes with it.
Bass is easier to use tho.
http://board.byuu.org/phpbb3/viewforum.php?f=16

Re: 16bit table indexing problem
by tepples on 2015-01-26 (#139946)

koitsu wrote:

I'd honestly suggest just asking someone like Tepples to write you a ca65 template and use that instead.

Did someone say template?

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-26 (#139950)

What am I supposed to do now? I looked at all the different files, but none of them had the source code. I opened README.md, and it looks like it makes me want to get a bunch of stuff to compile something to make the file. Why doesn't it just come together already? (I remember someone giving me an explanation, I just forgot.)

Quote:

Building this demo requires that the following software be installed on your computer first:

* ca65 and ld65, the assembly language tools that ship with the cc65 C compiler
* Python, a programming language interpreter
* Python Imaging Library, a Python extension to read and write bitmap images
* GNU Make, a program to calculate which files need to be rebuilt when other files change
* GNU Coreutils, a set of simple command-line utilities for file management and text processing

I know that ca65 is an assembler.

By the way, what in the world is a PKCS #7 Certificate?

Re: 16bit table indexing problem
by nicklausw on 2015-01-26 (#139958)

I still don't get why it'd be necessary to install all of that just to have a good SNES environment. A makefile for the SNES is....eh.

Re: 16bit table indexing problem
by tepples on 2015-01-26 (#139960)

Espozo wrote:

What am I supposed to do now? I looked at all the different files, but none of them had the source code.

Did you look in the src folder?

Quote:

I opened README.md, and it looks like it makes me want to get a bunch of stuff to compile something to make the file. Why doesn't it just come together already?

Because I don't know what operating system people run before they download my template. Nor do I want to have to include copies of image and audio conversion tools for Windows (32-bit), Windows (64-bit), OS X, Linux (32-bit), and Linux (64-bit), 80 percent of which will be useless to each individual user and building 20 percent of which requires a particular brand of computer.

Quote:

By the way, what in the world is a PKCS #7 Certificate?

The .spc file extension can refer to either SPC700 save states (used for Super NES music) or software publisher certificates. PKCS ("Public-Key Cryptography Standards") refers to the latter.

Quote:

I still don't get why it'd be necessary to install all of that just to have a good SNES environment. A makefile for the SNES is....eh.

If you've made a sprite sheet, you need tools to convert it to a format that the S-PPU can use. If you've recorded audio, you need tools to convert it to a format that the DSP can use. So you need at least some tools other than an assembler.

And if you're building something with hundreds of source code files, sprite sheets, musical instrument samples, etc., you usually don't want to have to reconvert every single image file and every single audio file and reassemble every single source code file whenever you make one small change. A makefile allows rebuilding only what has changed. Besides, a lot of that stuff already comes standard on Debian-style distributions if you sudo apt-get install build-essential python-imaging.
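The "rebuild only what has changed" part boils down to comparing file timestamps. A minimal Python sketch of that idea follows; it is a simplification of what make does, and `needs_rebuild` plus the file names are hypothetical.

```python
import os


def needs_rebuild(target, sources):
    """True if target is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_time for src in sources)


# Hypothetical usage: only reconvert the sprite sheet if the image changed.
if needs_rebuild("sprites.chr", ["sprites.png"]):
    print("sprites.chr is out of date; run the converter")
```

A real makefile also chains these checks, so that changing one image reconverts that image and relinks the ROM but leaves everything else alone.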

Re: 16bit table indexing problem
by koitsu on 2015-01-26 (#139963)

nicklausw wrote:

I still don't get why it'd be necessary to install all of that just to have a good SNES environment. A makefile for the SNES is....eh.

Simply put: you don't. Tepples' recommendations stem from his own development of tools and other things that tend to "relate" to SNES development (graphics creation/conversion, etc.). Tepples is kind of an interesting fellow because he tends to use large toolsuites (Python, GNU tools on Windows which often require Cygwin or MinGW, etc.) and requires non-bare-bones PLs for development, with a strong Linux/UNIX mindset (the command-line there blows Windows out of the water), yet the platform being developed for is extremely bare-bones. There's nothing wrong with his approach, but it's a method/approach that doesn't work for me.

I happen to be of a different mentality (and from a different era), mainly KISS principle: all you need is the assembler, some documentation, and some general pre-made tools. You don't need tons of disk space for all of this stuff, nor do you need to "install" anything (most KISS/bare-bones tools are self-contained). Most of my SNES coding was done with a single x816.exe binary and only a couple random tools (single .exes) available at the time. That's how it should be, IMO. The way Espozo is already operating is under this mindset (single .rar file contains all his tools, code, etc.), so IMO he should continue with that.

For ca65, you just simply need to download cc65. The only tool you really need from that suite is ca65 (the assembler), and extract it into some place like C:\cc65 and fix your PATH (you can do that from the Windows GUI, although a lot of people do this wrong/put things in the wrong place) to refer to that directory. Or you can just extract ca65.exe (assembler) and ld65.exe (linker) from the archive and use those directly, though if the templates or some other code refer to some special stuff cc65 includes (cfgs, libs, etc.) then you'd need those too. I do love the fact cc65 doesn't have an installer -- I LOVE programs that you just simply unpack somewhere and use them. It also makes clean-up much easier (delete the directory and they're gone).

GNU Make etc. are nice for creating an actual build command that works, e.g. you just type "make" after doing your code changes and everything builds. It's like a Windows .bat file but significantly improved (batch is horrible). On the downside, GNU tools tend to require "special environments" to work on Windows, and the two are either MinGW/MSYS or Cygwin, neither of which I particularly like. There may be a purely standalone gmake.exe somewhere but I'd be surprised. You can accomplish the same task in a Windows batch file, but it's just more painful (and batch lacks things like dependency checking).

I think Espozo is learning just how completely awful the tool and development situation is these days, both on classic consoles and even present-day systems, and I feel really sorry for him. :( Something very very bad happened beginning in the early 2000s where morons getting into CS started pushing out awful designs and crappy tools simply because "open source!" and "WHEEEEEE UNICORNS!" and whatever other madness, while older folks from previous eras stuck with extremely minimal sets of tools and environments that they could practically carry around with them on a floppy disk in their pocket. It's almost like the driving force is glitter and "how fast can I push this out + how quickly can I get distracted by the latest new feature or thing rather than actually fix bugs", rather than creating something of actual use. The number of crazies in "development" has grown to a gargantuan size, and it's awfully depressing how few of those people actually understand how a computer (or even the underlying libraries their own PL relies on) actually works. I'm really sure those are the type of people I want developing or designing the tools *I* need, but I'm also the crotchety old man that sticks to what works because I understand how it works.

Don't worry, I'm sure someone will come along and make a SNES assembler written in Ruby and put it on rubygems named something pretentious like chariots_of_fire or super_duper_builder or trampoline_hexify, all very well-thought-out names. *rolls eyes*

Re: 16bit table indexing problem
by tepples on 2015-01-26 (#139966)

My mentality is that once you learn what the tools are doing, they'll scale up to bigger projects. I understand some people prefer minimalism, where the whole thing gets reassembled every time. I used to do that in x816 until I upgraded to an operating system that could no longer run it reliably.

There is a standalone gmake.exe; it's called mingw32-make.exe. It depends only on the MSVC 6 runtime. Sometime in 2005, I ended up making my own distribution of portable cc65 for Windows to use on computers in a college computer lab; I forget whether I used mingw32-make or just batch files. It became less important to me after I got a laptop on which I carry my entire dev environment.

But even then, you still need to turn your graphics, audio, and other assets (components of a game other than code) into a form that your program can use. What's the minimalist way to go about this? Tile editor? Waiting until you get back to your home computer to work on graphics and audio?

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-26 (#139974)

Quote:

I happen to be of a different mentality (and from a different era), mainly KISS principle: all you need is the assembler, some documentation, and some general pre-made tools. You don't need tons of disk space for all of this stuff, nor do you need to "install" anything (most KISS/bare-bones tools are self-contained).

It really kind of surprises me how people can have 1TB hard drives and still ask for more room, while my hard drive looks like this:

Attachment:

Disk Space.png [ 4.15 KiB | Viewed 6153 times ]

And about downloading an installer, I really don't see the point. To me, it seems like just downloading something twice (the installer and the actual file) for no real reason. Whenever I have the option of a torrent or a direct download, I always just use the direct download.

Quote:

It's almost like the driving force is glitter and "how fast can I push this out + how quickly can I get distracted by the latest new feature or thing rather than actually fix bugs"

I'm guessing this is referring to wla. (Seriously, the bug that's been causing me problems on XTemp and YTemp, present in the version released in 2003, still hasn't been fixed. :roll:)

Also, I still don't see how people can "code" in high level languages, because it seems like there are a lot of different instructions and things to keep track of. I've been looking at the GBA, and it looked interesting to try to make something on it when I saw how little, but useful, amount of instructions there are, until I found out that there are a ton of different little things that you can add to the end of each opcode for way more actual different opcodes than it would seem. For example, I looked at a code and it said subs. What is that supposed to mean? I replaced the subs with sub and it worked perfectly fine, so I don't get it. I find it funny that the GBA is supposed to use a Reduced-Instruction-Set CPU, when it has about a zillion possibilities for you to do. (I don't even want to know what assembly looks like on a CISC CPU.)

I really hope that some day, I won't have to ask a bunch of questions here, but I don't see myself, or anyone for that matter, really learning ASM anymore, especially for the SNES. There is a programming class at my school, but you first have to go through another class that has nothing to do with it, as you just do dumb stuff like create business cards. I'm 99.9999% sure that the "programming" class doesn't even cover any asm, or any non high level languages for that matter. I guarantee that there will be people who finish that class with a 100 average and still not even know what a byte is. (And this is no exaggeration.) Seriously, many people's lack of any computer knowledge astonishes me, and I sadly think it will only worsen over time. A kid at my school literally thought that iPhones (which I despise) became slower over time because the processor started to "rot" or something. Some people at my school didn't even know what a CPU was! :shock:

Re: 16bit table indexing problem
by tepples on 2015-01-26 (#139976)

Espozo wrote:

I've been looking at the GBA, and it looked interesting to try to make something on it when I saw how little, but useful, amount of instructions there are, until I found out that there are a ton of different little things that you can add to the end of each opcode for way more actual different opcodes than it would seem. For example, I looked at a code and it said subs. What is that supposed to mean? I replaced the subs with sub and it worked perfectly fine, so I don't get it.

subs: subtract and set flags
sub: subtract and don't set flags
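A small Python model of the difference, assuming the usual ARM convention that carry means "no borrow" after a subtraction. `arm_sub` and the flags dictionary are invented for the example; they aren't taken from any emulator.

```python
MASK = 0xFFFFFFFF


def arm_sub(rn, op2, flags, set_flags):
    """Model of ARM sub (set_flags=False) and subs (set_flags=True).

    rn and op2 are unsigned 32-bit values. Returns (result, flags); the
    flags are returned untouched unless set_flags is True.
    """
    result = (rn - op2) & MASK
    if set_flags:
        flags = {
            "N": bool(result >> 31),                        # result is negative
            "Z": result == 0,                               # result is zero
            "C": rn >= op2,                                 # no unsigned borrow
            "V": bool(((rn ^ op2) & (rn ^ result)) >> 31),  # signed overflow
        }
    return result, flags


old = {"N": False, "Z": False, "C": False, "V": False}
print(arm_sub(5, 5, old, set_flags=False))  # same result, flags unchanged
print(arm_sub(5, 5, old, set_flags=True))   # same result, Z and C now set
```

Both forms produce the same number, so swapping subs for sub only changes behaviour when a later conditional instruction reads the flags. That is presumably why the swap "worked perfectly fine" in that piece of code.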

Quote:

I find it funny that the GBA is supposed to use a Reduced-Instruction-Set CPU, when it has about a zillion possibilities for you to do.

Reduced instruction set computing (RISC) largely refers to how "orthogonal" the instruction set is. You can add the same things "to the end of each opcode" no matter what the opcode is. You don't end up with crap like shifting only working with one register and indexing only being allowed with another. (See "Orthogonality" on Wikipedia and Stack Overflow.)

Quote:

(I don't even want to know what assembly looks like on a CISC CPU.)

Among CISC CPUs, the 68000 in the Genesis, Neo Geo, Amiga, and original Mac is pretty sane. If you want insane, look at recent x86.

Quote:

A kid at my school literally thought that iPhones (which I despise) became slower over time because the processor started to "rot" or something.

It's not the processor that rots; it's the storage. As the internal flash fills up, the memory controller has to work harder to spread out the writes to different flash cells so that one part of the flash doesn't rot prematurely. Also new versions of the OS can be more demanding of RAM, causing the memory manager to have to kick things out more often.

Re: 16bit table indexing problem
by nicklausw on 2015-01-26 (#139977)

http://www.network54.com/Forum/56397/page-9 Looks like for a while Ville was just continuously releasing new versions of WLA blindly. Took some time for the updates to get heavy and for people to notice them (and for him to stop scrambling around compiling it on a bunch of OS's and just make it open source).

And now the source code is a gigantic mess. What a lovely turnout.

Re: 16bit table indexing problem
by thefox on 2015-01-27 (#139993)

tepples wrote:

There is a standalone gmake.exe; it's called mingw32-make.exe. It depends only on the MSVC 6 runtime.

True, but the problem is that most Makefiles depend on the standard *nix toolset. Many of those tools are natively available with GnuWin32, but that doesn't take care of the many incompatibilities with cmd.exe (e.g. MKDIR is a built-in command in cmd.exe). These differences can be compensated for in the Makefile, but that's not really what you want to be doing when creating a build system.

Personally I prefer using a "Makefile generator" like CMake, because CMake is able to take care of the platform differences. On top of that, CMake has builtin tools for writing and running tests, and packaging the build results.

Re: 16bit table indexing problem
by tepples on 2015-01-27 (#139995)

thefox wrote:

Personally I prefer using a "Makefile generator" like CMake, because CMake is able to take care of the platform differences. On top of that, CMake has builtin tools for writing and running tests, and packaging the build results.

But then you have to include CMake in the list of dependencies you have to download.

Re: 16bit table indexing problem
by Sik on 2015-01-27 (#139996)

How is having MinGW as a dependency any better? (given it's only being used for mingw32-make)

Re: 16bit table indexing problem
by thefox on 2015-01-27 (#139997)

tepples wrote:

thefox wrote:

Personally I prefer using a "Makefile generator" like CMake, because CMake is able to take care of the platform differences. On top of that, CMake has builtin tools for writing and running tests, and packaging the build results.

But then you have to include CMake in the list of dependencies you have to download.

If your opinion is that the disadvantage of adding one more dependency (possibly removing a bunch of other ones at the same time) balances out the advantages of platform compatibility, and ease of writing the build scripts, and the n+1 other useful features provided by CMake, then certainly that's a valid complaint.

BTW, I'm not saying CMake is perfect. I've used it enough to know it has a number of problems, but I would choose it over handwritten makefiles every time. But to each his own.

Re: 16bit table indexing problem
by koitsu on 2015-01-31 (#140215)

tepples wrote:

koitsu wrote:

I'd honestly suggest just asking someone like Tepples to write you a ca65 template and use that instead.

Did someone say template?

I finally got around to looking at this code. Ho-ly-shit. Really? God dude, I don't even know where to begin. The code itself is fine (sort of -- very bad init routine from the look of it), but it's nearly impossible to follow given formatting, things in files that don't make any sense (like why is the reset vector code in snesheader.s), and a ton of other things. Can you honestly read this given the formatting and almost "inline" comments without any actual structure?

I at least got part of Espozo's code assembling (writing the Windows batch file was about 70% of the work), but I'm having to go through one thing at a time and it's very very painful. I had no idea ca65 would be this... I don't know... powerful yet absolutely and totally ridiculous. There's even some WLA DX code (a macro) that makes zero sense to me at this time and not having a listing file from WLA DX makes me sit here going "how the hell does this even work?"

What the hell happened to SNES development? How is it we had more or less better assemblers and sane tools than now? Wow.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-31 (#140225)

I felt the same way you do, I just didn't want to say anything. I think tepples' brain is wired differently from ours. :roll:

I don't think my work environment is way too complex.

Edit: What follows is my work environment, which I think is easier to understand. I originally forgot to mention this, so my post made no sense.

Attachment:

Work Environment.png [ 2.93 KiB | Viewed 15702 times ]

In this, 2input does something with the controllers (I honestly don't know, but it works, so...), Header is the header (no duh), InitSNES2 just sets all the registers back to 0, LoadGraphics is a macro that makes it easy to DMA graphics to vram, Metasprite2 is a metasprite creating routine, Metasprite Test2 is the main file (I always specify this), and Sprites sets all the sprites off screen.

Re: 16bit table indexing problem
by koitsu on 2015-01-31 (#140266)

My comment was with regards to tepples' "template" ca65 example.

The one piece of WLA DX code I cannot figure out is this:

Code:

;============================================================================
; LoadBlockToVRAM -- Macro that simplifies calling LoadVRAM to copy data to VRAM
;----------------------------------------------------------------------------
; In: SRC_ADDR -- 24 bit address of source data
;     DEST -- VRAM address to write to (WORD address!!)
;     SIZE -- number of BYTEs to copy
;----------------------------------------------------------------------------
; Out: None
;----------------------------------------------------------------------------
; Modifies: A, X, Y
;----------------------------------------------------------------------------

;LoadBlockToVRAM SRC_ADDRESS, DEST, SIZE
;   requires:  mem/A = 8 bit, X/Y = 16 bit
.MACRO LoadBlockToVRAM
    lda #$80
    sta $2115
    ldx #\2         ; DEST
    stx $2116       ; $2116: Word address for accessing VRAM.
    lda #:\1        ; SRCBANK
    ldx #\1+\4      ; SRCOFFSET
    ldy #\3         ; SIZE
    jsr LoadVRAM
.ENDM

What is \4 here? It refers to the 4th argument to the macro, but the comment preceding the macro doesn't mention it. The code clearly uses it (note 2nd SpriteTiles load; it's the only call where it's non-zero)

Code:

        LoadBlockToVRAM SpriteTiles, $0000, $0040, $0000
        LoadBlockToVRAM SpriteTiles, $100, $0040, $0040
        LoadBlockToVRAM BackgroundPics, $2000, $3620, $0000     ; 384 tiles * (8bit color)= 0x6000 bytes
        LoadBlockToVRAM BackgroundMap, $7000, $1000, $0000      ; 64x64 tiles = 4096 words = 8192 bytes

I fully understand what LoadVRAM does -- the contents of X make it into $4302, which is the 16-bit portion of the 24-bit address that DMA channel #0 is going to read from when populating VRAM, but I do not understand why the logic in the macro is to add argument 1 and argument 4 together to make up the 16-bit base address of where the source data is. Argument 1 = SRC_ADDR, which should be a full 24-bit address (according to the comments).

The WLA DX docs, as I expected, don't shed any light on this either.

Code:

Also, if you want to use macro arguments in e.g., calculation, you can
type '\X' where X is the number of the argument. Another way to refer
to the arguments is to use their names given in the definition of the
macro (see the examples for this).

To me, ldx #\1+\4 when doing SpriteTiles, $100, $0040, $0040 (assume SpriteTiles' full 24-bit address is $02f3f0) would result in ldx #($02f3f0 + $0040) (the bank byte is effectively stripped), thus ldx #$f430. What I don't understand is what the purpose of the 4th argument actually is. If I had seen ldx #\1 I might think "the lower 16-bits of the 1st argument", but again the WLA DX docs don't shed any light on this, going back to the need for a listing generation.

I think it's used as an "additional offset" within whatever you provide in argument 1, i.e. SpriteTiles+$0040, but the fact the macro doesn't properly document the use of the 4th argument worries me.
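For what it's worth, here is that reading of the macro written out in Python. It assumes WLA keeps only the low 16 bits for the ldx immediate and that #:\1 yields the bank byte; both are my interpretation rather than something the docs state, and `load_block_operands` is a made-up name.

```python
def load_block_operands(src_addr, extra_offset):
    """Values the macro would load for a 24-bit source label.

    Returns (bank, offset):
      bank   -> what "lda #:\\1" is assumed to produce (bits 16-23)
      offset -> what "ldx #\\1+\\4" is assumed to produce (low 16 bits of the sum)
    """
    bank = (src_addr >> 16) & 0xFF
    offset = (src_addr + extra_offset) & 0xFFFF
    return bank, offset


# SpriteTiles assumed to sit at $02F3F0, 4th macro argument = $0040
bank, offset = load_block_operands(0x02F3F0, 0x0040)
print(f"bank=${bank:02X} offset=${offset:04X}")
```

If that reading is right, the 4th argument is just a byte offset into the source data, and a sum that runs past $FFFF would wrap around inside the same bank rather than carrying into the next one.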

Edit: and your code isn't very well organised either. :-) But it's at least something I can follow a lot easier than the ca65 lorom example.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-31 (#140267)

Oops, I was saying that I think he should set his code up to be a little more organized like mine, but I forgot to say anything about that and just said how my code was set up. :oops:

Quote:

I think it's used as an "additional offset" within whatever you provide in argument 1, i.e. SpriteTiles+$0040, but the fact the macro doesn't properly document the use of the 4th argument worries me.

Correct! I was tired of making 3,000 different pictures for a single frame and whatnot, so I just made that. Pretty nifty, don't you think? :wink:

(I'll use comments on it next time I upload a file.)

Quote:

Edit: and your code isn't very well organised either. :-) But it's at least something I can follow a lot easier than the ca65 lorom example.

Do you know what I could do better? I upload stuff enough, so I just want to have it as organized as possible so I don't frustrate people.

Re: 16bit table indexing problem
by tepples on 2015-01-31 (#140271)

koitsu wrote:

My comment was with regards to tepples' "template" ca65 example.

I want to make it less "Ho Lee Fuk". I'd appreciate your thoughts here.

Re: 16bit table indexing problem
by DoNotWant on 2015-01-31 (#140273)

koitsu wrote:

Code:

;============================================================================
; LoadBlockToVRAM -- Macro that simplifies calling LoadVRAM to copy data to VRAM
;----------------------------------------------------------------------------
; In: SRC_ADDR -- 24 bit address of source data
;     DEST -- VRAM address to write to (WORD address!!)
;     SIZE -- number of BYTEs to copy
;----------------------------------------------------------------------------
; Out: None
;----------------------------------------------------------------------------
; Modifies: A, X, Y
;----------------------------------------------------------------------------

;LoadBlockToVRAM SRC_ADDRESS, DEST, SIZE
;   requires:  mem/A = 8 bit, X/Y = 16 bit
.MACRO LoadBlockToVRAM
    lda #$80
    sta $2115
    ldx #\2         ; DEST
    stx $2116       ; $2116: Word address for accessing VRAM.
    lda #:\1        ; SRCBANK
    ldx #\1+\4      ; SRCOFFSET
    ldy #\3         ; SIZE
    jsr LoadVRAM
.ENDM

Where did you get this code? With WLA DX? I can't find it with either 9.2 or 9.4.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-31 (#140277)

It's from the LoadGraphics code on the SNES starterkit. Well, mostly. I took some stuff bazz did to the palette uploading macro (it was truly useless beforehand) and I added a 4th argument on the LoadBlockToVram that serves as an offset in the picture.

SNES starterkit here:

http://wiki.superfamicom.org/snes/show/ ... nvironment

Custom LoadGraphics code here:

Attachment:

LoadGraphics.asm [4.86 KiB]

Re: 16bit table indexing problem
by koitsu on 2015-01-31 (#140280)

tepples wrote:

koitsu wrote:

My comment was with regards to tepples' "template" ca65 example.

I want to make it less "Ho Lee Fuk". I'd appreciate your thoughts here.

Are you available online for real-time chat somewhere? I need to talk to you about some syntax errors with macros I'm getting in ca65 which aren't making any sense to me.

Re: 16bit table indexing problem
by DoNotWant on 2015-01-31 (#140282)

Espozo wrote:

It's from the LoadGraphics code on the SNES starterkit. Well, mostly. I took some stuff bazz did to the palette uploading macro (it was truly useless beforehand) and I added a 4th argument on the LoadBlockToVram that serves as an offset in the picture.

SNES starterkit here:

http://wiki.superfamicom.org/snes/show/ ... nvironment

Custom LoadGraphics code here:

Attachment:

LoadGraphics.asm

Oh, you added that. Yeah, then I know where it is from, thanks.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-31 (#140287)

Don't mention it! :twisted:

Re: 16bit table indexing problem
by koitsu on 2015-01-31 (#140288)

Well shit, I'll just post it here. Maybe thefox or someone else knows.

Code:

Main.asm(266): Error: `:' expected
Main.asm(266): Error: Unexpected trailing garbage characters

This line correlates with the following line that uses a macro (and I'll include the segment it's in, etc.). This is in the CODE segment:

Code:

LoadPalette BG_Palette, 0, 16

BG_Palette comes from the same file, but in segment RODATA3:

Code:

BG_Palette:
  .incbin "GamePictures\hovertransport.clr"

And finally the macro, which comes from a different file called Macros.asm and is assembled first (NOTE: ca65 assembles this just fine):

Code:

.macro LoadPalette src, start, size
.if .PARAMCOUNT <> 3
.error "LoadPalette: incorrect number of parameters"
.endif
  lda #.LOBYTE(start)
  sta $2121                  ; Start at START color
  lda #.BANKBYTE(src)        ; Bank byte of src (upper 8-bits of 24-bit address)
  ldx #.LOWORD(src)          ; Address of src   (lower 16-bits of 24-bit address)
  ldy #(size*2)              ; 2 bytes for every color
  jsr DMAPalette             ; see main.asm
.endmacro

DMAPalette comes from Main.asm and is just a .proc within the CODE segment.

The ca65 error, given the complaint about a colon, seems to imply it's having issues comprehending labels.

Re: 16bit table indexing problem
by thefox on 2015-01-31 (#140290)

koitsu wrote:

The ca65 error, given the complaint about a colon, seems to imply it's having issues comprehending labels.

I'd guess that it doesn't know LoadPalette is a macro, and then assumes it must be a label, and then craps out because there's no ":". Is the macro included into the file, or assembled separately? If assembled separately, it won't be visible in other modules.

Re: 16bit table indexing problem
by koitsu on 2015-01-31 (#140291)

thefox wrote:

I'd guess that it doesn't know LoadPalette is a macro, and then assumes it must be a label, and then craps out because there's no ":". Is the macro included into the file, or assembled separately? If assembled separately, it won't be visible in other modules.

It's assembled separately, because I figured the assembler would be smart enough to deal with that. Obviously I need to use .include instead. Yup, that fixed it. *sigh* Thanks.
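
For anyone else who trips over this, a minimal sketch of the fix in ca65. It assumes Macros.asm holds only macro definitions (nothing that should be emitted as code), that BG_Palette and DMAPalette are defined as described above, and that the CPU-mode directives below match how Main.asm is really set up.

Code:

```asm
; Main.asm (sketch, not the full file)
.p816                     ; 65816 instruction set
.a8                       ; ASSUMPTION: accumulator is 8-bit at this point
.i16                      ; ASSUMPTION: X/Y are 16-bit at this point

.include "Macros.asm"     ; pulls the macro text into THIS file, so the
                          ; assembler knows LoadPalette is a macro

.segment "CODE"
  LoadPalette BG_Palette, 0, 16   ; now expands instead of being parsed as a label
```

Macros.asm then doesn't need to be assembled as its own object file at all.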

Re: 16bit table indexing problem
by koitsu on 2015-01-31 (#140303)

Code comment in passing: this 2input.asm file is utter shit. I seriously want to punch whoever decided to name the equates and the variable names the same thing (JOY1 vs. Joy1). This is just awful.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-31 (#140304)

Thank bazz!

I have the original here, if this is any better:

Attachment:

InitSNES.asm [7.25 KiB]

Re: 16bit table indexing problem
by koitsu on 2015-01-31 (#140305)

Here you go. To build, just run build. You don't need to specify any arguments; just run "build" from within the directory where everything is. You will find that assembly listings for most of the files are generated (.lst files). map.txt will contain a "map" of what memory (ROM) vs. file offset locations contain what "segments" of stuff (I found this extremely useful, as did tepples apparently). Built object files (which the linker put together) are stored in the dir objs\.

You can move some of the .proc routines into their own files if you wish, then make appropriate changes to the build.bat script.

Two things to note, bugs which need to be fixed:

1. Sprites do not appear. Unsure why, did not investigate.
2. "Main ship" location appears incorrectly offset somehow; maybe BG X/Y offset issue? Unsure why, did not investigate.

Hope someone can delve into those + figure them out. It's probably something obscure. This was a "fun" learning experience in the sense that 80% of my time I spent cursing loudly. "Fun"...

I'm bowing out of this thread entirely at this point. I've helped out as much as I think I can bear at this point.

Footnote: I'm kinda disgusted how ca65 has no real concept of 65816 direct page (only 6502 zero page). The best you can do is declare zero page in $0000-00ff and then the rest as BSS. I kept rolling my eyes over this. But hey, at least the assembler can generate sane listings. :-)

Re: 16bit table indexing problem
by Drew Sebastino on 2015-01-31 (#140306)

Thank you!

(Seriously, it got to the point where I just wrote out registers. At that point, I might as well have just used machine code. :roll:)

koitsu wrote:

1. Sprites do not appear. Unsure why, did not investigate.
2. "Main ship" location appears incorrectly offset somehow; maybe BG X/Y offset issue? Unsure why, did not investigate.

I think this has something to do with it...

Attachment:

Oam....png [ 30.59 KiB ]

(Why it's acting like this is beyond me.)

I want to try and figure out what's up, which leads me to my main question. What is the main file? Is it Main, or is it MetaspriteTest2, because they both have a lot of the same code.

Edit: Oh, wait, I'm stupid. Main is just saying what is going on and is not actually assembled. (I need to pay attention more often. :oops:)

Re: 16bit table indexing problem
by Drew Sebastino on 2015-02-01 (#140340)

After poking at the code for two more seconds, I found the reason the BG was off:

Attachment:

BG scroll.png [ 31.92 KiB ]

I had thought it would be a problem with where the map data was, but I made it so that you can move the camera vertically too, and the full ship is intact. I wonder how that happened.

One thing I find peculiar about it is that the BG1 vertical position (not BG0) has all 16 bits set, while the X position doesn't. (I guess there's a whole bunch of junk before all that which doesn't affect anything, but it just stops right there.) I still don't know what's up with the sprites, but moving around the d-pad doesn't affect them, which means it probably doesn't have to do with the metasprite routine, but with the routine at the beginning that puts them all off screen.

Quote:

Edit: Oh, wait, I'm stupid. Main is just saying what is going on and is not actually assembled. (I need to pay attention more often. :oops:)

Apparently, I'm even dumber than I thought, because metaspritetest2 doesn't actually do anything. main does.

Re: 16bit table indexing problem
by koitsu on 2015-02-01 (#140352)

Sorry, I thought I deleted all those damn files before I made the .rar. Delete MetaspriteTest2.asm -- it does not get used as you've found.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-02-01 (#140361)

A quick little discovery... I found that not transferring MapX and MapY to the BG scrolling registers causes the BG to act normally, so there must be a problem with MapX and MapY. Did you move all the "defined" registers from their original positions? I guess MapX and MapY already have a value loaded in them from something else, but it only gets loaded before the infinite loop, because it still scrolls normally. Otherwise, the BG wouldn't move.

Oh, and another thing. OAM is completely filled with #$55, which is kind of odd... (I made it so that I completely skipped the Metasprite routine and the SpriteInit, and nothing changed.) I got rid of the OAM uploading code, and it acted normally, with everything being #$00. I then thought of the similarities between the BGs and sprites in my code and remembered that both are transferring data in NMI, so maybe the problem resides there? If someone, besides koitsu, would mind helping me, I would greatly appreciate it. :oops:

Re: 16bit table indexing problem
by koitsu on 2015-02-02 (#140428)

I can try to take a look at it later tonight, assuming I don't want to murder someone after work (Ruby code pisses me off). I don't think anyone else will step up to the plate.

I would not be surprised if I didn't "port" something over correctly or made a mistake with a minor change somewhere -- the latter is likely, because I remember when first doing this, I had the background up on the screen correctly but no sprites.

Kudos to you for giving the debugging a shot and narrowing it down possibly to the NMI routine. *thumbs up* :-)

Re: 16bit table indexing problem
by Drew Sebastino on 2015-02-02 (#140447)

koitsu wrote:

Kudos to you for giving the debugging a shot and narrowing it down possibly to the NMI routine. *thumbs up* :-)

Well, I was "sort of" right. The data that was getting uploaded to OAM was not coming from the right place, because SpriteBuf1 was no longer at $0400 (it's now at $0200), so I changed every reference using $0400 to #SpriteBuf1. The main problem here, however, is that for some reason, RAM is flooded with #$55 in a lot of places, including MapX and MapY, and SpriteBuf1. I stored 0 in MapX and MapX+1 and also did the same thing for MapY, and the BG is now fixed. For the sprites, I made a loop where SpriteBuf1 is filled with 0's. (There is a 1 in the X position of each sprite to prevent a special bug or something.) This now causes a sprite to be on screen when you mess with SpriteBuf2, but it appears the metasprite routine isn't working properly, because everything appears in the upper left hand corner. You said it was messed up beforehand, so switching the code with one that works (which I have since made) should fix it. My problem with this whole thing is knowing why WRAM is flooded with #$55 in many areas, even if I got it to work.
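
A loop along these lines would do the SpriteBuf1 fill. This is a simplified ca65-style sketch, not a copy of what's in the attachment: the name ClearSpriteBuf1 is made up, SpriteBuf1 is assumed to be a 512-byte buffer declared (or imported) elsewhere, and it has to be reachable through the current data bank.

Code:

```asm
.p816
.segment "CODE"
.proc ClearSpriteBuf1       ; hypothetical routine name
  rep #$30                  ; A and X/Y 16-bit
  .a16
  .i16
  ldx #$0000
loop:
  lda #$0001
  sta SpriteBuf1,x          ; byte 0: X position = 1, byte 1: Y position = 0
  stz SpriteBuf1+2,x        ; byte 2: tile number = 0, byte 3: attributes = 0
  inx
  inx
  inx
  inx                       ; step to the next 4-byte sprite entry
  cpx #512                  ; 128 sprites * 4 bytes
  bne loop
  sep #$20                  ; back to 8-bit A
  .a8
  rts
.endproc
```

Doing the stores with a 16-bit accumulator covers each sprite in two writes instead of four.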

Attachment:

Sort of Fixed.rar [266.2 KiB]

Re: 16bit table indexing problem
by koitsu on 2015-02-02 (#140461)

Emulator may possibly pre-fill an entire area with certain values, hard to say. Easy enough to check...

I'm going to go back "from scratch" (sort of -- I'm going to use some of the stuff I gave you, but I'm going to re-do the porting of all the code and touch/change literally none of it, and only do little bits at a time, that way I can do a binary ROM comparison between the two and see what's changed (if some assembler mistakes, etc.)) and see if I can get this done.

I only got done with work a couple hours ago and since have been interrupted by too many people unrelated to all this, keeping me from helping. Grmf...

Re: 16bit table indexing problem
by koitsu on 2015-02-03 (#140476)

Figured out root cause of lots of things (incl. BG offset being wrong). Root cause was pretty funny (I missed 1 single line of code). I'll have a whole new set of code ready for you, and one which conforms more to your original filenames/etc..

The only difference is that InitSNES2.asm is gone -- that thing is ridiculous, and it's all macros now in Macros.asm (for damn good reasons -- I commented why in the code). I also used my own initialisation routine which I've been using for, what, 20 years? I WISH PEOPLE WOULD STOP SCREWING AROUND WITH THE INIT ROUTINES: THERE IS NOTHING TO FIX/OPTIMISE IN THEM. THEY ARE RUN *ONCE* DURING RESET/POWER-ON. JUST USE THE VALUES NINTENDO GIVES YOU IN THE OFFICIAL DOCS AND BE DONE WITH IT. YOU DO NOT NEED LOOPS ETC. (THOSE ARE JUST SLOWER THAN UNROLLED) AND ALL IT DOES IS OBFUSCATE THE CODE. PLEASE STOP WRITING INIT ROUTINES OR "OPTIMISING THEM".

Sprites still don't work, but I have a pretty damn good idea why now after looking at the DMA routine.

And there was so much wrong in this thing, like accidents waiting to happen all over the place, etc.. *sigh*

Re: 16bit table indexing problem
by koitsu on 2015-02-03 (#140484)

Wonderful IBS screwing up my sleep schedule. Work is going to love me tomorrow. *sigh* But I figured while I was up dealing with dumb health problems, I might as well try to figure out the rest -- and I did, so sprites work now.

I fixed/simplified your start_metasprite routine as well (I'm pretty sure it had bugs in it vs. what the comments claimed) + I clarified things in the comments. The counting should make more sense to you now, but whether or not that's really how you want to implement it is up to you. Honestly that routine would, I think, greatly benefit from indirect addressing (so you could have multiple metasprite tables, effectively, and change which one you're referring to by a single load/store pair -- yes really!). I got rid of YTemp/XTemp too (once you see the changes you'll understand why).
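
To give an idea of what I mean by indirect addressing, here's a rough ca65 sketch. MetaspritePtr, ShipMetasprite, TankMetasprite and SelectAndRead are made-up names, the table contents are placeholders, and it assumes the tables live in the bank the data bank register points at.

Code:

```asm
.p816
.segment "ZEROPAGE"
MetaspritePtr: .res 2               ; hypothetical: address of the table in use

.segment "RODATA"
ShipMetasprite: .word $0000, $0000  ; placeholder data
TankMetasprite: .word $0010, $0010  ; placeholder data

.segment "CODE"
.proc SelectAndRead                 ; hypothetical routine name
  rep #$30                          ; A and X/Y 16-bit
  .a16
  .i16
  lda #.LOWORD(TankMetasprite)      ; one load/store pair picks the table
  sta MetaspritePtr
  ldy #$0000
  lda (MetaspritePtr),y             ; first word of whichever table is selected
  rts
.endproc
```

The routine that walks the table never needs to know which metasprite it is drawing; only the pointer changes.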

I also added a crappy something if you press/hold the Y button, just for sprite movement testing.

There were other bugs/things I had to fix too but at this point I'm so tired and feeling so ugh that I can't remember.

Your whole metasprite + sprite concept here seems very... I don't know. I sort of get the impression you don't understand what SpriteBuf1/SpriteBuf2 do (meaning what they're for and how the SNES uses them), but there's a 50% chance I'm completely wrong and it's just that you haven't written the rest of the code but have designed it somewhere (in which case I apologise + ignore me). But I now understand why you want to "move to 16-bit" -- because the MSB (9th bit) of the Y position of a sprite is separate from the remaining 8 bits. I now actually understand what's needed to make all of that work, but I wish I had a better idea ultimately what your goal was.

And 2input.asm is just... I don't even know what to do with that mess. ;-)

And this time I used .zip just to make Tepples happy (BTW the reason I used RAR: because Espozo did in his posts, so why you pickin' on me brah? j/k).

P.S. -- One of these days I wanna see what psychopathicteen has been working on for like the past 8 billion years.

Re: 16bit table indexing problem
by tepples on 2015-02-03 (#140492)

koitsu wrote:

I WISH PEOPLE WOULD STOP SCREWING AROUND WITH THE INIT ROUTINES: THERE IS NOTHING TO FIX/OPTIMISE IN THEM. THEY ARE RUN *ONCE* DURING RESET/POWER-ON. JUST USE THE VALUES NINTENDO GIVES YOU IN THE OFFICIAL DOCS AND BE DONE WITH IT..

I wonder if people screw with them in order to make it clear that they have intentionally avoided official docs.

I'll look at the zipfile sometime today.

Re: 16bit table indexing problem
by koitsu on 2015-02-03 (#140516)

tepples wrote:

I wonder if people screw with them in order to make it clear that they have intentionally avoided official docs.

The official docs don't include any code -- they just tell you what each register needs to be set to value-wise on reset. So no, it's simply people being ridiculous and for some reason thinking that this one-time-called routine deserves loops and other nonsense (like "don't bother initialising some registers because we set them in the near future anyway" -- WHO CARES, do the init exactly like Nintendo says, do it one time, and stop worrying about the rest!)

Quote:

I'll look at the zipfile sometime today.

The latest version I posted is worse (formatting-wise) than the previous one, because I went back and started from scratch (thinking I had accidentally lost some code or changed something without fully testing each step of the way, thus causing the BG and sprite problems) and was doing a large amount of copy-pasting. I also kept a lot of the original labels and variable names and filenames from Espozo's original code (while the better-formatted one I released with BG/sprite issues had some variables and labels and files renamed).

Re: 16bit table indexing problem
by tepples on 2015-02-03 (#140522)

koitsu wrote:

The official docs [...] just tell you what each register needs to be set to value-wise on reset.

I figured as much. I imagine that these values are $00 for most writable registers, and double-$00 for any register that takes double-writes, with a few exceptions such as the first and last mode 7 matrix registers that need to be $0100. So is there a reliable list of what these values are other than the official docs? If there is a list, an emulator developer could add a feature to warn on writes of incorrect values before all registers have the correct starting values.
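
As an illustration of the double-write case, this is roughly what setting the mode 7 matrix to identity looks like. The registers ($211B through $211E, each written low byte then high byte) are the usual ones; the values are my guess from above, not a quote from the official docs, and the sketch assumes 8-bit A with the PPU registers visible in the current data bank.

Code:

```asm
.p816
.a8                 ; ASSUMPTION: accumulator is 8-bit here
  stz $211b         ; M7A low byte  = $00
  lda #$01
  sta $211b         ; M7A high byte = $01, so M7A = $0100 (1.0)
  stz $211c         ; M7B = $0000 (low byte, then high byte)
  stz $211c
  stz $211d         ; M7C = $0000
  stz $211d
  stz $211e         ; M7D low byte  = $00
  sta $211e         ; M7D high byte = $01 (A still holds $01), so M7D = $0100
```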

Re: 16bit table indexing problem
by Drew Sebastino on 2015-02-03 (#140538)

koitsu wrote:

Your whole metasprite + sprite concept here seems very... I don't know. I sort of get the impression you don't understand what SpriteBuf1/SpriteBuf2 do (meaning what they're for and how the SNES uses them), but there's a 50% chance I'm completely wrong and it's just that you haven't written the rest of the code but have designed it somewhere (in which case I apologise + ignore me). But I now understand why you want to "move to 16-bit" -- because the MSB (9th bit) of the Y position of a sprite is separate from the remaining 8 bits. I now actually understand what's needed to make all of that work, but I wish I had a better idea ultimately what your goal was.

???

SpriteBuf1 has 4 bytes per sprite (X and Y positions, character data, and extra) for a total of 512 bytes, and SpriteBuf2 has each sprite's most significant X bit and sprite size selection bit, and it is 32 bytes long. SpriteBuf1 and 2 then get DMAed to OAM during VBlank.
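
In ca65 terms, that layout and upload would look something like the sketch below. It's an illustration under assumptions, not the exact code from the project: UploadOAM is a made-up name, it uses DMA channel 0, expects 8-bit A and 16-bit X/Y on entry, and relies on SpriteBuf2 sitting directly after SpriteBuf1 so both tables go out in one transfer.

Code:

```asm
.p816
.segment "BSS"
SpriteBuf1: .res 128 * 4   ; 512 bytes: X, Y, tile, attributes per sprite
SpriteBuf2: .res 128 / 4   ; 32 bytes: 2 bits per sprite (X bit 8, size select)

.segment "CODE"
.a8
.i16
.proc UploadOAM            ; hypothetical name; call during VBlank
  stz $2102                ; OAM address low byte = 0
  stz $2103                ; OAM address high bit = 0
  stz $4300                ; DMA mode 0: CPU -> PPU, one register, increment source
  lda #$04
  sta $4301                ; B-bus destination $2104 (OAM data write)
  ldx #.LOWORD(SpriteBuf1)
  stx $4302                ; source address, low 16 bits
  lda #.BANKBYTE(SpriteBuf1)
  sta $4304                ; source bank
  ldx #512 + 32
  stx $4305                ; byte count: low table plus high table
  lda #$01
  sta $420b                ; start DMA on channel 0
  rts
.endproc
```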

koitsu wrote:

P.S. -- One of these days I wanna see what psychopathicteen has been working on for like the past 8 billion years.

I don't know. He's probably making something insane. :roll:

By the way, I made the metasprite routine use 16 bits (I haven't implemented the 9th X, not Y, bit yet, as it currently just gets erased when storing the number in SpriteBuf1) and it worked flawlessly, but I tried to make it so that you add XPosition and YPosition to the number in the table that gets stored in SpriteBuf1, and the assembler didn't like it.

Edit: I forgot to CLC before ADCing, but it didn't make a difference. (I'm really not sure why it would, but I tried.)

Attachment:

Code.png [ 26.16 KiB ]

Attachment:

Code Extended.png [ 31.61 KiB ]

Attachment:

Assember Problem.png [ 11.78 KiB ]

Re: 16bit table indexing problem
by koitsu on 2015-02-03 (#140558)

tepples wrote:

koitsu wrote:

The official docs [...] just tell you what each register needs to be set to value-wise on reset.

I figured as much. I imagine that these values are $00 for most writable registers, and double-$00 for any register that takes double-writes, with a few exceptions such as the first and last mode 7 matrix registers that need to be $0100. So is there a reliable list of what these values are other than the official docs? If there is a list, an emulator developer could add a feature to warn on writes of incorrect values before all registers have the correct starting values.

The init routine in my code (see .zip file, or my old sndoc230.zip archive (see test.lzh/test.zip within that zip)) has all the correct values per official docs. I don't know of anywhere else that lists them off. Many are zero, yes, but not all.

Re: 16bit table indexing problem
by koitsu on 2015-02-03 (#140560)

@Espozo: assembler error in question is because you forgot to .exportzp the XPosition and YPosition variables within MetaspriteTest2.asm. You have to have both .importzp (in the file where you want to use those variables) and an .exportzp (in the file where you declared them) for it to work.

Procs (e.g. subroutines) work the same way (using .export/.import).

Macros, however, don't work that way -- they're not procs, they're just raw code, hence why there's Macros.asm and why I put all macros in there. And I should note that for Macros.asm, it only needs to be .include'd ONCE for the macros to be universally available, which is convenient.

But back to variables: there is also a .globalzp ca65 directive which I didn't have time to explore or tinker with, but I wouldn't get too hung up on that right now.

I'll use this opportunity to point out how annoying the ca65 "zp" stuff is, specifically because the ZEROPAGE segment is only 256 bytes long, while on the 65816 you have direct page which is up to 65536 bytes long (depending on where you place it using lda/tcd). And ca65 absolutely requires a ZEROPAGE segment (incl. in the config template). For 6502/65c02 this works wonderfully, but for 65816 it's irritating. Maybe tepples/thefox have some ideas on how to tweak the template to work better with it, but I simply don't. So basically once you exhaust the ZEROPAGE segment (i.e. you exceed 256 bytes there), it's gonna start throwing errors during link-time, in which case your variables need to end up in the BSS segment (like where the SpriteBufs are).
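
To make the .exportzp/.importzp pairing described above concrete, here's a minimal two-file sketch. The 2-byte sizes and the name of the second file are assumptions for illustration; the two halves must be separate source files, since exporting and importing the same symbol in one file is an error.

Code:

```asm
; --- MetaspriteTest2.asm: the file that DECLARES the variables ---
.segment "ZEROPAGE"
XPosition: .res 2                 ; ASSUMPTION: 16-bit position
YPosition: .res 2                 ; ASSUMPTION: 16-bit position
.exportzp XPosition, YPosition    ; make them visible to the linker

; --- some other file (hypothetical) that USES the variables ---
.importzp XPosition, YPosition    ; tell the assembler they are zero-page symbols
.segment "CODE"
  lda XPosition                   ; assembles with direct-page addressing
```

Procs follow the same pattern with .export and .import.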

Re: 16bit table indexing problem
by psycopathicteen on 2015-02-04 (#140620)

Aside from Bad Apple, I've been working on a game called Alisha's Adventure. As to why I keep giving up on projects and starting new ones, it's because there is always that one little architectural problem that makes it difficult to program anything without rewriting a whole bunch of code.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-02-04 (#140622)

You seem to really like female characters, but I guess I'm not one to talk? I've already shown this on the SNES Pixel Art thread, but I fixed it up a bit more. (I'm way too nitpicky for my own good. :oops:)

Attachment:

player2.png [ 1.2 KiB ]

I think it's really funny how when you hold down the dash button against a wall how the dashing animation still plays even when you aren't moving. Also, about the plasma Grinch thing, does it use real-time sprite rotation for its limbs?

Re: 16bit table indexing problem
by psycopathicteen on 2015-02-04 (#140628)

It actually rotates everything during the "LEVEL 1" screen.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-02-04 (#140630)

No wonder it takes a while to load... (I assumed it was graphics decompression or something) How do you store all the animation for it moving? WRAM?

Re: 16bit table indexing problem
by psycopathicteen on 2015-02-04 (#140631)

The rotating limbs take up half of WRAM. The dynamic animation system is kind of like what you described in that other thread, but instead of looking for a box for every individual sprite, it looks for a box for every metasprite.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-02-04 (#140632)

psycopathicteen wrote:

The rotating limbs take up half of WRAM. The dynamic animation system is kind of like what you described in that other thread, but instead of looking for a box for every individual sprite, it looks for a box for every metasprite.

Then in that case, it would be way easier, just like calculating x and y because you wouldn't need to have the "second y" that I made. You know psychopathic teen, have you worked on the Genesis before, because something about your art style and the fact that you are pretty much only using half of wram kind of reminds me of the Genesis. I guess the advantages to both my code that looks for tiles in vram and yours is that mine is a bit more flexible, meaning it doesn't assume every metasprite uses the same size, but I imagine is a bit more costly. (Only the sprites that are undergoing any kind of animation change are going to go through, so it shouldn't be that big of a deal.)

Re: 16bit table indexing problem
by psycopathicteen on 2015-02-04 (#140634)

It doesn't assume every metasprite is the same size. It just assumes that all metasprites are rectangular.

Re: 16bit table indexing problem
by Drew Sebastino on 2015-02-05 (#140643)

So it allocates different sized metasprites to different sized slots?