This is related to Hamming codes, as well as plain parity. Had a discussion on IRC and we came up with some ideas on how to implement it. Posted here so it doesn't vanish off the Internet anytime soon.
Bogax's;tpw_rules' method (new links: https://web.archive.org/web/20100225124 ... alogue.htm http://reveng.sourceforge.net/crc-catalogue/ )
Kind of a reference implementation.
Note: Incomplete
You can use the associative property for XOR, to get the parity of the group of bytes, by simply finding the parity of the XOR8 hash. Doing a separate XOR8 hash for odd and even bytes is also possible.
Code:
OneByteParity:
;Takes A, counts the number of 1s, and returns LSb in A.
;Divide-and-conquer method (using ZP for work)
;By alphamule
STA $00; Can replace this with any temp value location
LSR 4; Divide A by 16 using 4 Logical Shift Right instructions.
EOR $00; 4-bit pieces
STA $00
LSR 2
EOR $00; 2-bit pieces
STA $00
LSR
EOR $00; Now A contains the result of XORing all 1-bit pieces
RTS
;19 bytes, 3+8+3+3+4+3+3+2+3=32 cycles (not counting entrance/RTS)
;Takes A, counts the number of 1s, and returns LSb in A.
;Divide-and-conquer method (using ZP for work)
;By alphamule
STA $00; Can replace this with any temp value location
LSR 4; Divide A by 16 using 4 Logical Shift Right instructions.
EOR $00; 4-bit pieces
STA $00
LSR 2
EOR $00; 2-bit pieces
STA $00
LSR
EOR $00; Now A contains the result of XORing all 1-bit pieces
RTS
;19 bytes, 3+8+3+3+4+3+3+2+3=32 cycles (not counting entrance/RTS)
Code:
;Kevtris' 1st method:
sta 00h: 7x(lsr 00h : adc 00h) ;15 bytes, 3+7(5+3)=59cycles
sta 00h: 7x(lsr 00h : adc 00h) ;15 bytes, 3+7(5+3)=59cycles
Code:
;Kevtris' 2nd method:
ldx #$0 : lsr a : bcc + inx : + lsr a : bcc : + lsr a : bcc : + lsr a : bcc : + lsr a : bcc : + lsr a
: bcc : + lsr a : bcc : + lsr a : bcc : + lsr a : bcc
ldx #$0 : lsr a : bcc + inx : + lsr a : bcc : + lsr a : bcc : + lsr a : bcc : + lsr a : bcc : + lsr a
: bcc : + lsr a : bcc : + lsr a : bcc : + lsr a : bcc
Code:
;Using 16-bit table
;By alphamule
STA $00
LSR 4
EOR $00
AND #$0F
LDA ParityTable16,X
;13 bytes, 3+8+3+2+(4 to 5)=20 to 21 cycles
;By alphamule
STA $00
LSR 4
EOR $00
AND #$0F
LDA ParityTable16,X
;13 bytes, 3+8+3+2+(4 to 5)=20 to 21 cycles
Code:
;Using 256-bit table
;Too obvious to credit anyone. LOL, but thanks Fys+Kevtris for reminding me (more than once)!
TAX
LDA ParityTable256,X
;4 bytes, 2+(4 to 5)=6 to 7 cycles
;This can be reduced to 3 bytes and 4 cycles if careful. LDA affects Z/N flags.
;Pushing and pulling X would be safe but add 2 bytes and a few cycles.
;If not using a mapper, LSR should be safe on ROM, and not touch A (best to pass X).
;LSR version: X is input, C flag is output, 3 bytes, _7_ cycles
;Too obvious to credit anyone. LOL, but thanks Fys+Kevtris for reminding me (more than once)!
TAX
LDA ParityTable256,X
;4 bytes, 2+(4 to 5)=6 to 7 cycles
;This can be reduced to 3 bytes and 4 cycles if careful. LDA affects Z/N flags.
;Pushing and pulling X would be safe but add 2 bytes and a few cycles.
;If not using a mapper, LSR should be safe on ROM, and not touch A (best to pass X).
;LSR version: X is input, C flag is output, 3 bytes, _7_ cycles
Bogax's
Code:
;I think I am missing part of this
;lsr ; eor $original ; and #$55 ; adc #$33 ; and #$44 ; adc #$3C
STA $00
LSR
EOR $00
AND #$55
ADC #$33
AND #$44
ADC #$3C
;Those ANDs and ADCs look like they're using the relationship of addition with AND+XOR and it _should_
;work, but not sure what went wrong. Will have to look closely at bit flow and logic tables...
;Edit: Thanks tepples - seems that the original was slightly different. When I tried the one above, it
;would not get different results when I changed the input bits. Just a copy of a copy degradation. ;)
Original using AA,66,88,78:
LDA WORD
ASL
EOR WORD
AND #b10101010
ADC #b01100110
AND #b10001000
ADC #b01111000
;lsr ; eor $original ; and #$55 ; adc #$33 ; and #$44 ; adc #$3C
STA $00
LSR
EOR $00
AND #$55
ADC #$33
AND #$44
ADC #$3C
;Those ANDs and ADCs look like they're using the relationship of addition with AND+XOR and it _should_
;work, but not sure what went wrong. Will have to look closely at bit flow and logic tables...
;Edit: Thanks tepples - seems that the original was slightly different. When I tried the one above, it
;would not get different results when I changed the input bits. Just a copy of a copy degradation. ;)
Original using AA,66,88,78:
LDA WORD
ASL
EOR WORD
AND #b10101010
ADC #b01100110
AND #b10001000
ADC #b01111000
Code:
;Using carry/negative flags?
;Never even bothered as this seems useless.
;Never even bothered as this seems useless.
Kind of a reference implementation.
Code:
TwoByteSplitParity:
;This finds the parity bits of branches of a binary tree.
;In other words, even-single bits, then even pairs, even nibbles, and so on upto both bytes.
;This is the naive recursive-parity method, using OneByteParity with A passing data in/out.
;By alphamule
LDA #$00; Clears the parity bits.
STA Parity
LDA EvenByte
JSR OneByteParity
ROL 3; Repeat 3 times for 8's place. This is actually 3 ROL instructions, obviously.
EOR Parity
STA Parity
LDA EvenByte
AND #$F0
JSR OneByteParity
ROL 2; 4's place
EOR Parity
STA Parity
LDA EvenByte
AND #$CC
JSR OneByteParity
ROL; 2's place
EOR Parity
STA Parity
LDA EvenByte
AND #$AA
JSR OneByteParity
EOR Parity; Goes directly into 1's place so no ROL
STA Parity
;Now for the odd byte
LDA OddByte
JSR OneByteParity
TAX; 1's place in X - this'll make sense at the end
LDA OddByte
AND #$F0
JSR OneByteParity
ROL 2; 4's place
EOR Parity
STA Parity
LDA OddByte
AND #$CC
JSR OneByteParity
ROL; 2's place
EOR Parity
STA Parity
LDA OddByte
AND #$AA
JSR OneByteParity
EOR Parity; Goes directly into 1's place
STA Parity
;We now have the first 4 powers of 2. Next is both bytes.
TXA
AND #$01
ROL 3; 8's place
EOR Parity
AND #$08
ROL; 16's place
EOR Parity
STA Parity
RTS
;This finds the parity bits of branches of a binary tree.
;In other words, even-single bits, then even pairs, even nibbles, and so on upto both bytes.
;This is the naive recursive-parity method, using OneByteParity with A passing data in/out.
;By alphamule
LDA #$00; Clears the parity bits.
STA Parity
LDA EvenByte
JSR OneByteParity
ROL 3; Repeat 3 times for 8's place. This is actually 3 ROL instructions, obviously.
EOR Parity
STA Parity
LDA EvenByte
AND #$F0
JSR OneByteParity
ROL 2; 4's place
EOR Parity
STA Parity
LDA EvenByte
AND #$CC
JSR OneByteParity
ROL; 2's place
EOR Parity
STA Parity
LDA EvenByte
AND #$AA
JSR OneByteParity
EOR Parity; Goes directly into 1's place so no ROL
STA Parity
;Now for the odd byte
LDA OddByte
JSR OneByteParity
TAX; 1's place in X - this'll make sense at the end
LDA OddByte
AND #$F0
JSR OneByteParity
ROL 2; 4's place
EOR Parity
STA Parity
LDA OddByte
AND #$CC
JSR OneByteParity
ROL; 2's place
EOR Parity
STA Parity
LDA OddByte
AND #$AA
JSR OneByteParity
EOR Parity; Goes directly into 1's place
STA Parity
;We now have the first 4 powers of 2. Next is both bytes.
TXA
AND #$01
ROL 3; 8's place
EOR Parity
AND #$08
ROL; 16's place
EOR Parity
STA Parity
RTS
Note: Incomplete
Code:
TwoByteSplitParity:
;This finds the parity bits of branches of a binary tree.
;In other words, even-single bits, then even pairs, even nibbles, and so on upto both bytes.
;This is the 256x4-bit-table method, using EOR for the 5th bit(bit 4; 16's place).
;By alphamule
LDX EvenByte
LDA BinaryParityTable,X
STA Parity; Overwrites previous value, so no clearing needed
LDX OddByte
LDA BinaryParityTable,X
EOR Parity
STA Parity
RTS
;22 bytes code+256 bytes data=278 bytes of ROM
;2 bytes of RAM
;The code size can be reduced to 15 bytes if using ZP
;4+4+4+4+4+4+4 = 28 cycles for non-ZP, not counting entrance or RTS (this can be macroed if need be)
;This finds the parity bits of branches of a binary tree.
;In other words, even-single bits, then even pairs, even nibbles, and so on upto both bytes.
;This is the 256x4-bit-table method, using EOR for the 5th bit(bit 4; 16's place).
;By alphamule
LDX EvenByte
LDA BinaryParityTable,X
STA Parity; Overwrites previous value, so no clearing needed
LDX OddByte
LDA BinaryParityTable,X
EOR Parity
STA Parity
RTS
;22 bytes code+256 bytes data=278 bytes of ROM
;2 bytes of RAM
;The code size can be reduced to 15 bytes if using ZP
;4+4+4+4+4+4+4 = 28 cycles for non-ZP, not counting entrance or RTS (this can be macroed if need be)
You can use the associative property for XOR, to get the parity of the group of bytes, by simply finding the parity of the XOR8 hash. Doing a separate XOR8 hash for odd and even bytes is also possible.
Code:
;Post-parity method using XOR8 hash
;By alphamule
;Assuming XOR8 result in A register.
STA XOR8Sum
ROR 4 ; ROR repeated 4 times for brevity
EOR XOR8Sum
AND #$0F
TAX
LDA ParityTable,X ;Use 4-bit reduced value to load parity bit from an array, into A.
RTS
ParityTable:
;Note the symmetry. This could reduced to less values but with increase in code size and cycles
;Removing the table entirely could be done by repeating the EOR with AND #33 then AND #11 or the like.
.byte 00 01 01 00 01 00 00 01
.byte 01 00 00 01 00 01 01 00
;2+4+2+2+1+2+1=14 bytes (ZP variables) with optional 3 bytes (if not ZP variables) of code memory
;16 bytes of unpacked parity table, either in ZP RAM or using 16-bit addresses
;33 bytes total PRG-ROM maximum
;Adds (3+1)+4(2)+(3+1)+2+2+4+6=4+8+4+8+6=30 cycles maximum to end of XOR8 call.
;By alphamule
;Assuming XOR8 result in A register.
STA XOR8Sum
ROR 4 ; ROR repeated 4 times for brevity
EOR XOR8Sum
AND #$0F
TAX
LDA ParityTable,X ;Use 4-bit reduced value to load parity bit from an array, into A.
RTS
ParityTable:
;Note the symmetry. This could reduced to less values but with increase in code size and cycles
;Removing the table entirely could be done by repeating the EOR with AND #33 then AND #11 or the like.
.byte 00 01 01 00 01 00 00 01
.byte 01 00 00 01 00 01 01 00
;2+4+2+2+1+2+1=14 bytes (ZP variables) with optional 3 bytes (if not ZP variables) of code memory
;16 bytes of unpacked parity table, either in ZP RAM or using 16-bit addresses
;33 bytes total PRG-ROM maximum
;Adds (3+1)+4(2)+(3+1)+2+2+4+6=4+8+4+8+6=30 cycles maximum to end of XOR8 call.