Yesterday, I was asking about overflow and underflow and how to check whether a value that was just changed is greater than or equal to (>=) 10.
For the last hand-written Assembly function that I need in my game, I want to do a comparison of two values. (Well, actually a comparison of two arrays, but the array aspect isn't important for my question.)
Both values are set in variables, i.e. I don't compare a variable with a constant value, but I compare two variables.
Now what do I have to do to check for "less than" (<) and "greater than" (>)?
I do not need <= and >=, like I did yesterday. Only < and >.
Code:
LDA Value1
CMP Value2
BEQ @equals
XXX @lessThan
YYY @greaterThan
@equals:
; ...
@lessThan:
; ...
@greaterThan:
; ...
What would be XXX and YYY in this case?
CMP puts < vs >= in carry. (Unsigned 8-bit, anyway. Signed or 16+ bit is a little more complicated.)
If you want > vs <=, then reverse the two terms (CMP the 2nd against the 1st one instead of vice versa).
If you want < and >, then you may create a third case to eliminate = by branching on the Z flag (BEQ/BNE). If you get rid of the = case, then the >= case is reduced to just >.
O.k., let's try this:
Code:
LDA Value1
CMP Value2
BEQ @equal
BCS @lessThan
JMP @greaterThan
Correct so far?
You have the carry condition backwards.
Carry is set by CMP if A (Value1) is greater than or equal to its parameter (Value2). If it is less than, carry is cleared by CMP.
BCS = Branch if Carry is Set
BCC = Branch if Carry is Clear
BCS means greater than or equal. Your example already checks for equal, but the BCS should be BCC. (And... it should also go before the BEQ. With BEQ first, every BCC case pays for a BEQ test it doesn't need: if the BCC branches, the values can't be equal, so testing for equality before that range is a bit slower.)
A compare is a subtract. CMP and SBC are identical except CMP does not care about the state of the carry going in, and doesn't affect A with the result.
The carry flag lets you know about overflow/underflow. Typically before starting a subtract with sbc, you set the carry flag with sec. Knowing this, it makes sense that carry flag reverses (clears) on underflow. An underflow occurs when you subtract a value larger than the first. Which means that if the second value is smaller (or equal), there will be no underflow. Thus the carry will stay set.
So... the carry staying set means the value you're subtracting was less than or equal. Which means the first value was greater than or equal to the second value. Because a compare is a subtract.
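To make the "a compare is a subtract" idea concrete, here's a small worked sketch with immediate values. The numbers are just examples picked for illustration, and the comments show the flags each CMP leaves behind for unsigned 8-bit values.

```asm
    lda #5
    cmp #10     ; 5 - 10 borrows:          C = 0, Z = 0  -> A <  operand
    lda #10
    cmp #10     ; 10 - 10 = 0, no borrow:  C = 1, Z = 1  -> A =  operand
    lda #12
    cmp #10     ; 12 - 10 = 2, no borrow:  C = 1, Z = 0  -> A >  operand
```

So BCC alone answers "less than", and "greater than" is carry set with Z clear.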
A little mnemonic that can be helpful in remembering which is which: 0 < 1, so if carry=0 (BCC), it's "less than".
Or, you can make a wrapper macro for it/them, which is what I usually do.
rainwarrior wrote:
Carry is set by CMP if its parameter (Value2) is greater than or equal to register A (Value1). If it is less than, carry is cleared by CMP.
Wait... you got it backwards. CMP is a subtraction. If A holds a big number (Value1) and you subtract a small number (Value2), there's no borrow/underflow, meaning the carry is set. If A holds a small number (Value1) and you subtract a big number (Value2), there's a borrow/underflow, so the carry is cleared.
EDIT: Personally, I don't like the expressions "greater than" and "less than" because I never know which value goes on which end of the expression (e.g. is it "accumulator greater than argument" or "argument greater than accumulator"?), so I much prefer to always think of comparisons as subtractions, and look out for underflows to know which value is larger.
Sorry. I should go to bed.
Yes, my description was wrong, but DRW's example was indeed backwards. I have amended my description.
rainwarrior wrote:
Carry is set by CMP if A (Value1) is greater than or equal to its parameter (Value2). If it is less than, carry is cleared by CMP.
BCS = Branch if Carry is Set
BCC = Branch if Carry is Clear
Many assemblers support mnemonic aliases described by Western Design Center. (Open 65c816.txt and search for "common mnemonic aliases".) If yours does not, you can write a macro, as thefox suggested.
BCS = BGE = Branch if Greater or Equal
BCC = BLT = Branch if Less Than
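If your assembler lacks those aliases, a wrapper macro could look something like this in ca65 syntax. The macro names are made up for this example (chosen so they can't clash with any mnemonic the assembler might already accept), and the lessThan label is a placeholder.

```asm
; Hypothetical macro names. Meaningful after CMP/CPX/CPY on unsigned values.
.macro branchIfLess target
    bcc target          ; C = 0: register < operand
.endmacro

.macro branchIfGreaterOrEqual target
    bcs target          ; C = 1: register >= operand
.endmacro

    lda Value1
    cmp Value2
    branchIfLess lessThan   ; assembles to: bcc lessThan
```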
If you're unsure of the ASM...write it in C code in a blank cc65 file, and see how it compiles it. It takes like 10 seconds to do.
dougeff wrote:
If you're unsure of the ASM...write it in C code in a blank cc65 file, and see how it compiles it. It takes like 10 seconds to do.
I tried, but it created such strange code. I'm not at home at the moment, but I can post it later.
dougeff wrote:
If you're unsure of the ASM...write it in C code in a blank cc65 file, and see how it compiles it. It takes like 10 seconds to do.
O.k., I know now what I did wrong:
My actual build command uses cl65, so the Assembly files are only created in memory, not on the hard disk. That means I always have to call cc65 manually when I want to see the Assembly code of a source file, and I forgot to turn on optimization for this manual call. That's why the code was so strange. But yeah, with the -O switch, everything looks like it should.
This way of setting the carry if it is greater or equal does help in many cases, for example if you are doing CMP #10 in order to do carrying for decimal arithmetic; you might not need a branch in such a case.
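A rough sketch of what that can look like when each byte holds one decimal digit (0-9), assuming decimal mode is off. The variable names are hypothetical. The carry left by CMP #10 (or by the SBC that follows it) is exactly the decimal carry for the next digit, so no extra SEC/CLC is needed between digits.

```asm
    clc
    lda ones_a
    adc ones_b      ; 0..18
    cmp #10         ; C = 1 if the digit went past 9
    bcc storeOnes   ; C = 0: digit is fine, nothing to carry
    sbc #10         ; C is set here, so this subtracts exactly 10 and leaves C = 1
storeOnes:
    sta ones_sum
    lda tens_a
    adc tens_b      ; the carry from above is added in as the decimal carry
    ; (repeat the CMP #10 / SBC #10 step for this digit)
```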
I would possibly structure it like:
Code:
lda value1
cmp value2
bcc lessThan
beq endif
; if value1 > value2
jmp endif ; (or use a branch if flag condition will be known)
lessThan:
; if value1 < value2
endif:
Movax12 wrote:
I would possibly structure it like:
Is there any real difference between checking for < first and then checking for =? Because my current intention was to check for BEQ first, then for BCC. So, is there any actual difference?
Yes. BEQ covers just one CMP result, and after a compare it's mutually exclusive with BCC. So if you do BEQ before BCC, you're cutting one value out of a group that may not need it cut out. With BCC first, you can sieve out a much larger range of values. With BEQ first, you lose two cycles on that much larger range of numbers; with BCC first, you add two cycles to just one case (the equal case, where the difference is zero). Edit2: Well, you also add the two cycles to all the BCS cases, but if you need to perform separate logic on <, >, and =, something's gotta lose, and BCS is not exclusive from BEQ after a compare. If you are doing all three, BCS is usually a better first branch.
You said in the other topic the goal is performance, and it's usually faster. (But it depends on what you're comparing. Sometimes 0 really is the most common case.)
Edited to hopefully be better explained.
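To put numbers on it, here's a rough cycle tally for the two orderings, assuming the usual 6502 branch timing (2 cycles when not taken, 3 when taken, plus 1 more if the branch crosses a page, which is ignored here). The labels are placeholders.

```asm
; BCC first
    bcc lessThan        ; <  : 3 cycles
    beq equal           ; =  : 2 + 3 = 5 cycles
                        ; >  : 2 + 2 = 4 cycles (falls through)

; BEQ first
    beq equal           ; =  : 3 cycles
    bcc lessThan        ; <  : 2 + 3 = 5 cycles
                        ; >  : 2 + 2 = 4 cycles (falls through)
```

Either way the fall-through case costs the same; the ordering only decides whether < or = gets the cheap first exit.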
The difference is about time, not functionality. If you don't care how long it takes, then it doesn't matter.
If you do care, you'll want to arrange your code so that the most common case takes the shortest/fastest branch.
(Edit: missed Kasumi's post so this is redundant.)
One obvious optimization is to start the comparison from the most significant digit and work your way down, because that will allow you to find the result without having to look at all the digits. You keep comparing while the digits in both numbers are the same, but as soon as you find one that's different, you'll have your answer. For example, when comparing 1350 and 1510, comparing 1 and 1 will not give you an answer, but comparing 3 and 5 will, so there's no need to compare the rest.
With this in mind, it may be a good idea to have a BNE after each comparison that branches away to a location that will use BCC/BCS to tell which number is larger (regardless of the position of the digit). Like this:
Code:
lda num1+n
cmp num2+n
bne decide
lda num1+n-1
cmp num2+n-1
bne decide
(...)
lda num1+0
cmp num2+0
bne decide
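;num1 = num2 (handle it here, then jump or return so you don't fall into decide)
decide:
bcs greaterthan
;num1 < num2 (handle it here, then jump or return)
greaterthan:
;num1 > num2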
The loop is unrolled for speed, so you don't have to update X or check for end conditions with such small arrays (surely you don't have more than 8 digits to compare?).
tokumaru wrote:
One obvious optimization is to start the comparison from the most significant digit and move your way down, because that will allow you to find the result without having to look at all the digits.
Yes, I know. That's why I said: The array aspect doesn't really have much to do with my question, only the question about a single comparison.
That's the way I did it now:
Code:
LeftEqualsRight = 0
LeftLessThanRight = 1
LeftGreaterThanRight = 2
.segment "ZEROPAGE"
_CompareArraysPointerLeft: .res 2
.exportzp _CompareArraysPointerLeft
_CompareArraysPointerRight: .res 2
.exportzp _CompareArraysPointerRight
Size: .res 1
.segment "CODE"
_CompareArrays:
.export _CompareArrays
STA Size
LDY #0
@loop:
LDA (_CompareArraysPointerLeft), Y
CMP (_CompareArraysPointerRight), Y
BCC @leftLessThanRight
BEQ @leftEqualsRight
LDA #LeftGreaterThanRight
JMP @end
@leftLessThanRight:
LDA #LeftLessThanRight
JMP @end
@leftEqualsRight:
INY
CPY Size
BNE @loop
LDA #LeftEqualsRight
@end:
LDX #0
RTS
The pointers are set in the code in C before calling the function.
By the way, is there any way to do this without the Size variable?
If you're comparing two multi-byte numbers but not doing a 3-way comparison, i.e. you're fine with just knowing whether a < b or a >= b, you can just do most of the subtraction and ignore the result except for the carry. This method doesn't waste bytes on intermediate branches or cycles on untaken branches.
Code:
lda a_lo
cmp b_lo ; Use CMP instead of SBC to avoid needing to SEC
lda a_mid
sbc b_mid ; Use SBC instead of CMP to respect previous carry
lda a_hi
sbc b_hi
bcc a_is_less
; here: a is greater than or equal to b
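Here's the same idea traced through with concrete 16-bit values, loading a's bytes and subtracting b's. The numbers and the a_is_less label are only for illustration.

```asm
; a = $0150, b = $0210 (example values only)
    lda #$50        ; low byte of a
    cmp #$10        ; low byte of b: $50 >= $10, so C = 1 (no borrow)
    lda #$01        ; high byte of a
    sbc #$02        ; high byte of b: $01 - $02 borrows, so C = 0
    bcc a_is_less   ; taken, because $0150 < $0210
```

Only the final carry matters; the values left in A along the way are thrown away.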
tepples wrote:
If you're comparing two multi-byte numbers but not doing a 3-way comparison
It's a general purpose function. Sometimes I need to check for equal, sometimes for less than, sometimes for greater than. So, the function needs to be able to handle each case.
I thought you were shooting for speed? Your loop looks significantly slower than the unrolled loop I posted... If you need to compare numbers of different sizes, you can just have multiple entry points (load Y with Size - 1 and JSR to the appropriate entry point):
Code:
Compare5Digits:
lda (_CompareArraysPointerLeft), y
cmp (_CompareArraysPointerRight), y
bne Decide
dey
Compare4Digits:
lda (_CompareArraysPointerLeft), y
cmp (_CompareArraysPointerRight), y
bne Decide
dey
Compare3Digits:
lda (_CompareArraysPointerLeft), y
cmp (_CompareArraysPointerRight), y
bne Decide
dey
Compare2Digits:
lda (_CompareArraysPointerLeft), y
cmp (_CompareArraysPointerRight), y
bne Decide
dey
Compare1Digit:
lda (_CompareArraysPointerLeft), y
cmp (_CompareArraysPointerRight), y
bne Decide
lda #LeftEqualsRight
rts
Decide:
bcc +
lda #LeftGreaterThanRight
rts
+ lda #LeftLessThanRight
rts
I know that coming from a high-level language your first instinct is to use a loop, handle different sizes using a parameter and a variable, and have only one return point, but now you have the power of assembly! If you really are going for speed, you should consider these kinds of optimizations.
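For example, calling the 3-digit entry point might look something like this. It assumes the two zero-page pointers were already set up by the caller, as in DRW's version, and the leftIsSmaller label is hypothetical.

```asm
    ldy #2                      ; Size - 1 for a 3-digit number
    jsr Compare3Digits
    cmp #LeftLessThanRight      ; the result comes back in A
    beq leftIsSmaller
```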