The ca65 assembler comes with a setting for no mnemonics (.setcpu "none") and a rich macro system. I plan to use this to reimplement 6502 instruction syntax the hard way, making each instruction a macro that outputs .byte and .word instructions. You might wonder why one would try this, given that ca65 already supports the 6502. I intend for it to act as an example of how ca65 would be adapted to other instruction sets, such as Z80, GBZ80, SPC700, and MC68000, without a recompile. Or 6502/65816 using x86-style mnemonics (hello nocash) or SPC700 or MC68000 using 6502-style mnemonics (hi byuu).
I started the project yesterday, and I've already got the ALU block (%xxxxxx01) done along with bits and pieces of the control block (%xxxxxx00) and the unofficial RMW+ALU combined block (%xxxxxx11). The biggest hurdle I've found so far is in branches, to determine whether the distance to a future label is more than 129 bytes. The long branch macro pack that comes with ca65 just punts on the issue, making all forward branches long.
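For readers unfamiliar with the block names above: they refer to the low two bits of each opcode. The following is a minimal Python sketch, not part of the macro pack. The function name is hypothetical, and the "RMW" label for the %xxxxxx10 block is an assumption added for completeness, since the post does not name that block.

```python
# Classify a 6502 opcode by its low two bits, using the block names
# from the post. "RMW" for %xxxxxx10 is an assumed label.
BLOCKS = {
    0b00: "control",
    0b01: "ALU",
    0b10: "RMW",
    0b11: "unofficial RMW+ALU",
}

def opcode_block(opcode):
    """Return the block name for an 8-bit opcode value."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    return BLOCKS[opcode & 0b11]

print(opcode_block(0x10))  # BPL
print(opcode_block(0xA5))  # LDA zero page
print(opcode_block(0xFF))  # ISC absolute,X (unofficial)
```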
ca65 normally chokes on branches that are too far. It is one-pass, so I don't think you can do much about that.
I was interested in .feature ubiquitous_idents documented here:
http://www.cc65.org/snapshot-doc/ca65-11.html#ss11.42
It says it allows overloading, but it seems to do the same thing as .setcpu "none".
For your goal, maybe it is worth looking at the C source?
ASM6 was a three-pass assembler just because of the variable lengths of instructions. Variable length instructions screw over any assembler.
Wow, that certainly is meta Tepples
I can see how with such a system in place you could develop a very tightly integrated HLA package.
Movax12 wrote:
ca65 normally chokes on branches that are too far.
So if I'm coding a branch's signed relative operand as a .byte, how do I express to ca65 that it is a branch so that it can choke only when too far instead of choking with "Constant expression expected" on every forward branch? Remember that the branch opcodes in one or more of the CPUs that I plan to support aren't necessarily the same values as on a 6502. For example, how would I tell ca65 to range-check forward BBC and BBS (in the x3 column of the SPC700 opcode matrix) branches?
qbradq wrote:
I can see how with such a system in place you could develop a very tightly integrated HLA package.
That's sort of what Movax12's IF macro package is supposed to do.
Tepples, if I understand you right, you want to encode a signed byte containing the difference between two addresses, and want an error when the difference can't be represented in a signed byte. If so, can you just put a .if that checks this condition? As I remember, ca65 allows almost anything in a .if, and its evaluation is deferred until link time.
blargg wrote:
Tepples, if I understand you right, you want to encode a signed byte containing the difference between two addresses, and want an error when the difference can't be represented in a signed byte.
Correct. An explicit check with .if works for backward references, but not for forward references. As far as I can tell, the only forward references that get range-checked in this way are the branch instructions of the assembler's predefined instruction sets.
Quote:
As I remember, ca65 allows almost anything in a .if, and its evaluation is deferred until link time.
I thought .if required a constant expression. From the manual:
Quote:
An expression used in the .IF command cannot reference a symbol defined later, because the decision about the .IF must be made at the point when it is read.
tepples wrote:
blargg wrote:
Tepples, if I understand you right, you want to encode a signed byte containing the difference between two addresses, and want an error when the difference can't be represented in a signed byte.
Correct. An explicit check with .if works for backward references, but not for forward references. As far as I can tell, the only forward references that get range-checked in this way are the branch instructions of the assembler's predefined instruction sets.
Quote:
As I remember, ca65 allows almost anything in a .if, and its evaluation is deferred until link time.
I thought .if required a constant expression. From the manual:
Quote:
An expression used in the .IF command cannot reference a symbol defined later, because the decision about the .IF must be made at the point when it is read.
Try using .assert. If needed, its evaluation will be deferred until link time.
Yes; if you are looking to output friendly errors, rather than solve forward branching, you can do so with an .assert statement.
I'm having a lapse because I'm not imagining the problem with this:
Code:
.byte target-*
...
target:
That encodes the branch offset, then you use an assert to verify that it's in range.
I think that should work.
Thanks Movax12 and blargg. This is what I came up with so far for branches:
Code:
.macro NONE02_branch inst, target
  .local distance
  distance = (target) - (* + 2)
  .assert distance >= -128 && distance <= 127, error, "branch out of range"
  .byte inst, <distance
.endmacro
.macro bpl target
  NONE02_branch $10, {target}
.endmacro
.macro bmi target
  NONE02_branch $30, {target}
.endmacro
Right now I'm 25% of the way through the opcode matrix. I'll upload it for all of you to bang on once I reach ISC $FFFF,x.
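To make the arithmetic in NONE02_branch concrete, here is a minimal Python sketch of the same calculation. It is only a model of what the macro asks the assembler and linker to do; the function name and the example addresses are hypothetical.

```python
def encode_branch(opcode, branch_addr, target_addr):
    """Return the two bytes of a relative branch located at branch_addr.

    The offset is measured from the address just past the 2-byte
    instruction, which is why the macro subtracts (* + 2).
    """
    distance = target_addr - (branch_addr + 2)
    if not -128 <= distance <= 127:
        raise ValueError("branch out of range")
    # Keep the low byte of the two's-complement distance,
    # like the < operator in the macro.
    return bytes([opcode, distance & 0xFF])

print(encode_branch(0x10, 0x8000, 0x8010).hex())  # bpl, forward
print(encode_branch(0x30, 0x8010, 0x8000).hex())  # bmi, backward
```

Measured from the branch opcode itself, the farthest reachable forward target is 129 bytes away, which matches the figure mentioned at the start of the thread.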
I was considering doing this with my macro code, since I like to represent code like this:
Code:
lda foo+$10,x
as:
Code:
lda foo[ $10 + x]
So I would be interested to know what the performance ends up being like when building a large project. I might look into doing the same thing. Or borrowing your code, tepples. I felt this would be better implemented in source code as an alternative branch of ca65, but if performance is okay, there is no need.
I started this project for at least two reasons.
One is that I wanted to see what other 8- and 16-bit assembly languages would look like in 6502 drag. I have an idea of how to do 68K, and it starts by renaming D0-D7 to A-H and A0-A7 to Z-S. This preserves Y (A1) and X (A2) as performing indexing-related operations, and S is conveniently the stack pointer.
The other is that I'm following up on my promise from two years ago. I didn't want to have to require people who build on my libraries to install both ca65 and bass for Super NES projects, and I thought there'd be resistance to implementing other 8- and 16-bit CPUs' assembly languages directly in the source code of ca65. I found a bunch of mentions of a Sunplus CPU whose instruction set is "proprietary and confidential". SPC doesn't stand for "Sunplus chip", does it? Did Sony and Sunplus collaborate?
Wow, this is going to be real. I had taken a stab a while back at gb-z80 but got hung up on the foundation macros. I hope we can get gb-z80 and spc-700 into this. No more assembler cocktail.
Code:
.macro nop
.byte $EA ; If it's in the game, it's in the game.
.endmacro
Took me a moment. When I was little and saw $EA in disassemblies on the Apple //, I thought it was because it was an EA game and they plastered their initials in unused areas of memory.
EDIT: gave this a try and it assembled its test opcodes so well I had to try it on my library code and a test. After lots of fixes it assembled it as-is and ran fine. It looks like the only issue was that variables in the .zp segment used absolute addressing. I'm seeing whether there is a way to determine this in a macro.
A few places needed an additional .const condition to avoid trying to compare a non-constant.
Many places needed @ prefixes on the .local symbols (you'd think that .local was enough), otherwise they broke @ symbols in the using code. I only changed those needed to compile my code, so there are more (I tried a global search-and-replace, but got some weird errors so backed off).
Very impressed!
Code:
.if (arg) < $100
; .out .sprintf("ZP %x", argvalue)
.byte $04 | (inst), argvalue
These kinds of checks won't work if arg is a forward reference or if it is imported from another module. This is the reason why ca65 has separate .import and .importzp control commands (it always needs to know the address size of the symbols as it's generating the code). I guess in theory it could solve the .if foo < $100 as a special case if the symbol has been imported as zero page, but I don't think it's smart enough to do that (yet?). And I don't think there's any way to check from code if a symbol has been defined as being on zeropage or not.
If ca65 doesn't know the address size of a symbol (i.e. it's a forward reference), it uses absolute addressing for it, and if it later realizes the address would've fit in 8 bits, it spits out a warning:
Code:
lda foo
foo = 123
>cl65 -t none forward-ref.s
forward-ref.s(1): Warning: Didn't use zeropage addressing for `foo'
To solve zeropage you need one of these:
Identifier must be constant at assembly time. (.res doesn't get you this)
Explicit zeropage syntax specified in code and evaluated by the macro
You could declare your zeropage identifiers as:
foo = $00
And so on and there would be no issue.
You could also take advantage of .struct
Code:
.struct
foo .byte
bar .byte
.endstruct
This would work well for single module projects, but won't work for multiple modules.
For syntax:
ca65 supports (by default) a z: prefix to force zeropage, or the lowbyte operator < will work as well. Maybe copy that idea?
So is there a way to fix up the macros, or would one have to give up and implement other assembly languages directly in the C source code of ca65?
What else needs to be fixed up? I think this zeropage issue is the only thing that is an issue and I think it can be worked around. I suppose you lose compatibility with regular ca65 source files.
GB-Z80 has as I remember a special syntax for its direct page ($FFxx). And SPC-700 ! before absolute addresses.
In my code I always use < where timing is important, since you can never trust the assembler to use zero-page anyway. So it's only an optimization issue in my eyes. Compared to the other assemblers I'm using, I'd take this to have ca65's goodness (like great macros that don't cower at a task like this) any day.
I see what you mean: ca65's .if statement isn't smart enough to determine that a label explicitly imported as zero page is less than $100.
Code:
; Import tests
.importzp boo
lda boo
produces
Code:
ca65none.s(835): Error: Constant expression expected
ca65none.s(304): Note: Macro was defined here
ca65none.s(247): Note: Macro was defined here
So how should I change the macros to incorporate useful workarounds?
tepples wrote:
So how should I change the macros to incorporate useful workarounds?
If you don't mind losing ca65 compatibility, you could make it work similar to NESASM, where < in the beginning would use zero page addressing, and otherwise it would always use absolute addressing. When you use absolute addressing, you could of course also set up asserts (maybe user configurable?) so that the user gets a warning if absolute addressing is used when it's not needed. Naturally assert could also be useful if user specifies zero page addressing but the address ends up being >= $100.
Or you could coerce somebody to add a new pseudo function to ca65 that returns the address size of a symbol.
That might be reasonable if this is indeed the only problematic part of the implementation.
This may not be of interest to you, but from a HLL language implementer's perspective having to explicitly specify that a reference is in zero-page is a bummer. That said I already have to do that for my language that targets ca65 due to how imports and exports work.
thefox wrote:
..you could coerce somebody to add a new pseudo function to ca65 that returns the address size of a symbol.
This is the best idea, I don't know how open the new maintainer is to ideas like this though.
Another idea (workaround with nice syntax):
Code:
; define macros
.macro setZP ident
  .scope ident
    isZP = 1
  .endscope
.endmacro
.macro myImportZP ident
  .importzp ident
  setZP ident
.endmacro
; ... somewhere else in source:
.zeropage
foo: .res 1
bar: .res 2
setZP foo
setZP bar
; inside tepples' macro code:
.ifdef arg::isZP
  ; implement ZP addressing
.else
  ; implement other
.endif
Movax12 wrote:
thefox wrote:
..you could coerce somebody to add a new pseudo function to ca65 that returns the address size of a symbol.
This is the best idea, I don't know how open the new maintainer is to ideas like this though.
He's been accepting quite a number of pull requests on github:
https://github.com/oliverschmidt/cc65/p ... ate=closed
On the one hand, having an operator that allows implementing the instruction set entirely with macros seems like a worthwhile thing, just because it means that its facilities are "complete" in some way (assuming there aren't other problems as well). On the other hand, this is only needed to implement the 65xx instruction set, which the assembler already implements.
blargg wrote:
On the other hand, this is only needed to implement the 65xx instruction set, which the assembler already implements.
It's also needed for SPC700, which the assembler does not already implement.
If you fork ca65, implement the feature, then demonstrate the 6502 reimplementation can work with the fork, I think there's a good chance it will be accepted into the main branch. I don't think it's a good idea to bring it to the main branch before it has been demonstrated to be useful.
The missing feature is a function to get a label's address size. I think the place to start is to use .referenced(label) or something as a model to make .addrsize(label), which queries SymEntry::AddrSize defined in symentry.h. A quick search of the source code reveals that changes would be needed in expr.c (translate TOK_REFERENCED into calls to FuncReferenced), scanner.c (translate ".REF" and ".REFERENCED" into TOK_REFERENCED), and pseudo.c (handle unexpected occurrences of ".REFERENCED"). Who has more experience hacking ca65?
I decided to try and add this. I added .addrsize(ident), and it works... for an identifier, but it doesn't solve the problem, since there is often code with an offset added, such as:
Code:
lda foo + 3
To decide what addressing mode that example is would take a lot more macro code and searching for identifiers etc. You could assume a simple case like that, and get away with it most of the time, but using scopes and more complex expressions won't work. I'm not sure how ca65 decides by default what to do actually.
To really do this right I think it should be implemented in the ca65 source.
BTW files changed:
expr.c
pseudo.c
scanner.c
token.h
A lot of the macros in this pack already make a new local label. So long as the "guessed address size" of an expression like foo + 3 is reasonable (and constant enough to get used in .if), it should still work.
Code:
.importzp foo
argvalue = foo + 3
.if .addrsize(argvalue) = NONE02_SIZE_ZEROPAGE
.out .sprintf("%d", .addrsize(argvalue))
.endif
As for implementing new ISAs directly in instr.c: There might be little resistance to implementing SPC700, but Z80 and 68000 might meet more. Working in macros also allows more experimentation with the syntax because there is no waiting for the assembler itself to compile and link.
Actually, you are correct, it does guess well enough, I think. I got things working with my build environment.
Updated source:
https://github.com/Movax12/cc65/tree/master/src/ca65
Note: .addrsize() will return a value from 1 to 4 for addressing size (number of bytes needed for the address). It will return 0 for unknown rather than error out. Makes things easier.
I edited the ca65none.s: All labels are now cheap locals. I used .feature ubiquitous_idents due to the fact that ca65 complains about the use of a forced :absolute in my code (says not valid for cpu type). Since you have defined all the instructions, it won't matter in this case. (main.c has the fix for ubiquitous_idents).
Please let me know if anything doesn't work.
tepples wrote:
So long as the "guessed address size" of an expression like foo + 3 is reasonable (and constant enough to get used in .if), it should still work.
As I understand it, if "foo" is an imported zero page variable, ca65 will assume that "foo + 3" fits in zero page as well. If it doesn't, it will give a fatal "Range error" at link time. You may want to duplicate this behavior with an assert. (I assume Movax's .addrsize returns 1 for a symbol constructed from the expression "foo + 3".)
I.e.
Code:
; foo.s
.importzp foo
lda foo + 3
; bar.s
.exportzp foo = 255
; > cl65 -t none foo.s bar.s
ld65.exe: Error: Range error in module `foo.s', line 3
thefox wrote:
..ca65 will assume that "foo + 3" fits in zero page as well. If it doesn't, it will give a fatal "Range error" at link time. You may want to duplicate this behavior with an assert.
I tested that scenario. There is no need for an assert; ca65 still complains with "Range error". I suppose that is because the value is out of range for the .byte statement. What needs to be added, however, is an assert for when a forward reference is made:
Code:
.if .addrsize(@argvalue) = 1
  .byte $04 | (inst), @argvalue
.else
  .byte $0C | (inst)
  .word @argvalue
  .assert @argvalue >= $100, warning, "zeropage addressing could have been used here"
.endif
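The decision made by the .if/.else above can be modeled in a few lines of Python. This is a sketch under stated assumptions rather than a description of ca65 internals: the function name is hypothetical, addr_size stands in for whatever .addrsize reported when the macro ran, and the base value $A1 in the usage lines is chosen only because $A1 | $04 = $A5 (LDA zero page) and $A1 | $0C = $AD (LDA absolute).

```python
def encode_operand(inst, value, addr_size):
    """Model of the zero page / absolute choice.

    inst      -- base opcode bits for the instruction
    value     -- final address of the operand
    addr_size -- 1 if the symbol is known to be zero page,
                 anything else (0 = unknown) selects absolute
    Returns (encoded bytes, warning text or None).
    """
    if not 0 <= value <= 0xFFFF:
        raise ValueError("address does not fit in 16 bits")
    if addr_size == 1:
        if value > 0xFF:
            # Mirrors the linker's complaint when .byte cannot hold it.
            raise ValueError("Range error")
        return bytes([0x04 | inst, value]), None
    warning = None
    if value < 0x100:
        warning = "zeropage addressing could have been used here"
    # .word stores the address low byte first.
    return bytes([0x0C | inst, value & 0xFF, value >> 8]), warning

print(encode_operand(0xA1, 0x10, 1))    # known zero page
print(encode_operand(0xA1, 0x10, 0))    # forward reference, warns
print(encode_operand(0xA1, 0x0300, 2))  # absolute
```

The foo + 3 case with foo = 255 corresponds to calling this with value 258 and addr_size 1, which raises the range error.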
I love how attempts are being made to closely duplicate ca65's behavior with its macros. I guess next you'll need to reimplement its macro package with itself. ca65inception.s.
I never metacircular evaluator I didn't like!
blargg wrote:
I love how attempts are being made to closely duplicate ca65's behavior with its macros. I guess next you'll need to reimplement its macro package with itself. ca65inception.s.
It would also be funny if somebody wrote a compiler that generates code as ca65 macros. Could even be quite useful, because the ca65 macro language isn't always that nice to use.
thefox wrote:
It would also be funny if somebody wrote a compiler that generates code as ca65 macros.
If it turns out that the SPC700 is as close to the 6502 as I think it is, the second iteration of this macro pack may turn out to be just that: something you can stick at the top of cc65-generated assembly files to turn cc65 into an SPC700 compiler.
Way back Anti-Resonance pointed me to another chip with SPC-700's same CPU core. I had to re-find it. It's the GMS800 series. For example, the GMS81C3004 (page 17 onward) should be familiar. Note how the absolute addressing mode has a ! prefix (pages 84-89). This might help as another reference for the instruction syntax.
tepples wrote:
thefox wrote:
It would also be funny if somebody wrote a compiler that generates code as ca65 macros.
If it turns out that the SPC700 is as close to the 6502 as I think it is, the second iteration of this macro pack may turn out to be just that: something you can stick at the top of cc65-generated assembly files to turn cc65 into an SPC700 compiler.
Not sure if you misunderstood me, or I'm misunderstanding you, but what I meant was a compiler that would take stuff in whatever language and output a program as ca65 macros (to be run at compile time).
thefox wrote:
take stuff in whatever language and output a program as ca65 macros (to be run at compile time).
Macros ultimately output data or code. What would be the benefit of using them as an intermediate step? Example?
Write a language parser for X in preferred language, output as ca65 macros, then program in X using ca65 and these macros.
I think I get it. Sounds like a nice way to implement high level functions, but could be difficult, since macros can't properly loop, just simulate looping with recursion.
Movax12 wrote:
macros can't properly loop, just simulate looping with recursion.
Is this recursion, or is it looping?
Code:
; Language=Scheme
; Multiplies the factorial of x by acc.
(define (factorial* x acc)
  (if (zero? x)
      acc
      (factorial* (- x 1) (* x acc))))
; Calculates the factorial of a nonnegative integer.
(define (factorial x) (factorial* x 1))
If ca65 doesn't correctly optimize macro tail recursion, on the other hand...
I don't know Scheme, but in that case I would guess it is actual looping with a variable stack that allows for recursive algorithms. Macros just keep expanding and executing from top to bottom.
Let me translate it into a language that uses slightly more familiar syntax:
Code:
# Guido likes pretty tracebacks better than tail call optimization,
# but another Python implementation might offer it in a future statement
def factorialTimes(x, acc):
    if x <= 0:
        return acc
    else:
        return factorialTimes(x - 1, x * acc)
def factorial(x):
    return factorialTimes(x, 1)
Or in a fictitious assembly language:
Code:
; Returns X! * A in A
.proc factorialTimes
cpx #0
bne is_nonzero
rts
is_nonzero:
mul x ; A <- A * X
dex
jmp factorialTimes
.endproc
; Returns X! in A
.proc factorial
lda #1
jmp factorialTimes
.endproc
When you return the result of a function call, you can just JMP to the new function instead of doing the equivalent of a JSR immediately followed by RTS. Likewise, macro A calling macro B on the last line of A shouldn't eat up a stack level.
But that's all really beside the point, as these macros aren't very deeply nested, and .addrsize allows the project to go forward.
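For comparison, the JMP-instead-of-JSR/RTS idea described above is what you get when the tail call is rewritten as a loop. A minimal Python sketch follows; the loop function's name is hypothetical, and this illustrates the general technique, not anything ca65 does with macros.

```python
def factorial_times_loop(x, acc):
    """Iterative form of factorialTimes.

    The tail call becomes a jump back to the top with updated
    arguments, so no additional stack level is consumed.
    """
    while x > 0:
        x, acc = x - 1, x * acc
    return acc

def factorial(x):
    return factorial_times_loop(x, 1)

print(factorial(5))
```

Because nothing is pushed per step, this version handles inputs that would exceed Python's default recursion limit in the recursive form.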
tepples wrote:
macro A calling macro B on the last line of A shouldn't eat up a stack level.
I see more of what you are saying. It is beside the point (though I think it is a stretch to call it a tail call in a macro). Regardless, even if the end result of the logic is the same with a goto/loop, the macros are just expanded as they are "called", like an inline function. With ca65 at least, there is a stack for things like local symbols and parameter names. Pretty sure it doesn't check for tail calls, though I guess there would be no harm.
I found another related deficiency that's white-elephantish even for "normal" uses of ca65.
Code:
; ca65 2.14.0 refuses to recognize "high byte of a zero page
; label" as an expression whose constant value is 0.
.importzp a_zp_label
.ifconst >a_zp_label
.out "zp label high byte is const"
.else
.out "zp label high byte is not const"
.endif
Result:
zp label high byte is not const
Does ca65 perhaps support a direct page at a page other than zero, as the 65816 does? That would be nice.
I get the same output even when I explicitly exclude 65816 features.
Code:
.p02 ; exclude relocatable direct page of 65816
.importzp a_zp_label
.ifconst >a_zp_label
.out "zp label high byte is const"
.else
.out "zp label high byte is not const"
.endif
It's worth adding to this thread that .addrsize is now part of the main repository, so full implementation of custom instructions as described in this thread is now possible in ca65.
github:
https://github.com/cc65/cc65
Documentation:
http://cc65.github.io/doc/ca65.html#ss10.1
Movax12 wrote:
I edited the ca65none.s: All labels are now cheap locals. I used .feature ubiquitous_idents due to the fact that ca65 complains about the use of a forced :absolute in my code (says not valid for cpu type).
I filed an issue about this.
Enable all address sizes for CPU none (#939)
(Bumped while revisiting a bunch of my older, smaller projects for little things)
Reading through this topic reminded me of a problem I ran into with a Super NES rom hack and ca65. You probably know 65816 handles zeropage like this:
Code:
lda #$1800
tcd
lda $72 ; loads from $1872
I modified part of the game's controller checking routine to shave off a few bytes and tried writing this to load directly from $72 (so as not to waste bytes modifying the directpage register):
Code:
lda $0072 ; whoops, the assembler helpfully cuts off the zeroes and it still becomes a zeropage instruction.
Writing out literal machine code to force a 16-bit read of that 8-bit address got around the issue. I didn't see anything in the ca65 doc to indicate forcing zeropage on a $00xx address could be turned off. This is probably a rare problem (when writing a normal program you would just use "tcd" at will) but it looks like the assembler taking control away from the programmer. :p
In stock ca65, you'd force absolute mode with lda a:$72 or far mode with lda f:$72.
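The difference between the forced sizes shows up directly in the emitted bytes. Here is a small Python sketch of the three 65816 LDA encodings involved; the function and constant names are hypothetical, and only the opcode values ($A5, $AD, $AF) are taken from the 65816 instruction set.

```python
# LDA opcodes on the 65816 for three operand sizes.
LDA_DIRECT = 0xA5    # like lda z:$72 (direct page / zero page)
LDA_ABSOLUTE = 0xAD  # like lda a:$72
LDA_LONG = 0xAF      # like lda f:$72

def encode_lda(address, size):
    """Encode LDA with an explicit operand size in bytes (1, 2 or 3)."""
    opcodes = {1: LDA_DIRECT, 2: LDA_ABSOLUTE, 3: LDA_LONG}
    if size not in opcodes:
        raise ValueError("size must be 1, 2 or 3")
    if not 0 <= address < 1 << (8 * size):
        raise ValueError("address does not fit in the requested size")
    # Operand bytes are stored low byte first.
    return bytes([opcodes[size]]) + address.to_bytes(size, "little")

print(encode_lda(0x72, 1).hex())  # direct page form
print(encode_lda(0x72, 2).hex())  # forced absolute form
print(encode_lda(0x72, 3).hex())  # forced long form
```

With the direct page register set to $1800, only the first form reads from $1872; the forced absolute form is the one that reaches $0072.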
strat wrote:
Code:
lda $0072 ; whoops, the assembler helpfully cuts off the zeroes and it still becomes a zeropage instruction.
Even if assemblers used leading zeroes as an indication of an operand's size (AFAIK most don't), you're not supposed to be typing hardcoded addresses when coding in the first place, as it makes variables hard to trace and reposition, a problem commonly solved with the use of symbols/labels. For this reason, it makes more sense for assemblers to provide a way to force operand sizes that's compatible with symbols as well as with numbers, like ca65 does.