If you write past the end of a buffer, you overwrite whatever is stored after it. Unmanaged languages like assembly language and C make it easier to end up with a buffer overflow than managed languages like C# or Java, which automatically check each array access against the array's lower and upper bounds. In fact, assembly language is worse than C in this respect because it makes no type distinction between a scalar variable and the first element of an array or struct.
Ordinarily, you can detect a buffer overflow from the program's behavior when it uses whatever is stored after the overflowed buffer. However, if you write past a buffer into memory that you're not using at the moment, you may not see the effect of the overflow until it's too late. So one technique for detecting buffer overflows is to randomize the order of things in memory, so that each buffer that can overflow is more likely to eventually end up right before something where the effect of the overflow is visible.
So I'm proposing an extension to 6502 assembly language that introduces a new control command called .shuffle. An assembler should permute the lines between .shuffle and .endshuffle when assembling the program, so that variables end up in a different order each time. For example:
Code:
.shuffle
foo: .res 32
bar: .res 4
baz: .res 4
cnut: .res 32
.endshuffle
might become
Code:
cnut: .res 32
baz: .res 4
foo: .res 32
bar: .res 4
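To see why the reordering helps, here is a small simulation using the variable names and sizes from the example above. It is only a sketch: the helpers `layout` and `overflow_victim` are made up for illustration, and it assumes the variables sit back to back with nothing between them.

```python
def layout(order, sizes):
    """Return {name: (start, end)} for variables placed back to back."""
    spans = {}
    address = 0
    for name in order:
        spans[name] = (address, address + sizes[name])
        address += sizes[name]
    return spans

def overflow_victim(order, sizes, buffer, extra=1):
    """Name the variable holding the last of `extra` bytes written past
    the end of `buffer`, or None if that byte lands in unused memory."""
    spans = layout(order, sizes)
    hit = spans[buffer][1] + extra - 1  # address of the last stray byte
    for name, (start, end) in spans.items():
        if start <= hit < end:
            return name
    return None

sizes = {"foo": 32, "bar": 4, "baz": 4, "cnut": 32}
original = ["foo", "bar", "baz", "cnut"]
shuffled = ["cnut", "baz", "foo", "bar"]

print(overflow_victim(original, sizes, "cnut"))  # None
print(overflow_victim(shuffled, sizes, "cnut"))  # baz
```

In the original order, cnut is last, so writing one byte past it touches nothing the program is using and the bug stays hidden. In the shuffled order the same stray write corrupts baz, which is much more likely to show up as visible misbehavior.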
.shuffle is also useful for finding overflows that run off the end of a table in a read-only data segment. For these, you'll want to permute chunks longer than one line, which is why the .shuffle keyword takes an optional delimiter argument.
Code:
.shuffle THE_GAME
title_screen:
.byt ...
.byt ...
THE_GAME
character_menu:
.byt ...
.byt ...
THE_GAME
stage_menu:
.byt ...
.byt ...
.endshuffle
A similar mechanism at the program loader level in operating systems with virtual memory is called address space layout randomization (ASLR).
Another potential application, even after you have found and fixed your buffer overflows, is binary fingerprinting. If you are distributing copies of a program under a nondisclosure agreement, and you want to covertly mark each copy to make it traceable, you can do so by permuting the subroutines and variables differently in each copy.
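As a rough illustration of the fingerprinting idea, the order can be derived from a per-copy number used as the random seed. This is a sketch under my own assumptions: `marked_order` and `identify_copy` are hypothetical helper names, and seeding by copy number is just one way to assign orders.

```python
import random

def marked_order(names, copy_id):
    """Return `names` in the order assigned to copy number `copy_id`."""
    order = list(names)
    random.Random(copy_id).shuffle(order)  # same seed, same order
    return order

def identify_copy(names, observed, candidate_ids):
    """Return every candidate ID whose assigned order matches `observed`."""
    observed = list(observed)
    return [i for i in candidate_ids if marked_order(names, i) == observed]
```

With only a handful of items, several copy numbers can map to the same order (4 items give just 24 orders), so a real scheme would need enough shuffled items, or several shuffled regions, to tell copies apart. It is probably also safer to keep a record of the order shipped to each recipient than to rely on regenerating it later.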
If you like the idea, I plan to implement it as a preprocessor in Python. Because a preprocessor runs before the assembler sees the source, this would work for NESASM, CA65, ASM6, and even C compilers.
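Here is a minimal sketch of what such a preprocessor might look like. It is not a finished tool, and a few things are my own assumptions rather than part of the proposal: the function names `shuffle_blocks` and `permute`, the optional `seed` parameter, the rule that a delimiter must sit on a line by itself, and the choice to reject nested .shuffle regions.

```python
import random

def permute(lines, delimiter, rng):
    """Split a region into chunks, shuffle the chunks, and flatten them."""
    if delimiter is None:
        chunks = [[line] for line in lines]  # one chunk per line
    else:
        chunks = [[]]
        for line in lines:
            if line.strip() == delimiter:
                chunks.append([])            # delimiter starts a new chunk
            else:
                chunks[-1].append(line)
    rng.shuffle(chunks)
    return [line for chunk in chunks for line in chunk]

def shuffle_blocks(source, seed=None):
    """Permute the chunks inside each .shuffle/.endshuffle region.

    Lines outside a region pass through unchanged. The .shuffle,
    .endshuffle, and delimiter lines are dropped from the output.
    """
    rng = random.Random(seed)  # seed=None gives a different order per run
    out = []
    region = None              # lines of the open region, or None
    delimiter = None
    for line in source.splitlines():
        words = line.split()
        if words and words[0] == ".shuffle":
            if region is not None:
                raise ValueError("nested .shuffle is not supported")
            region = []
            delimiter = words[1] if len(words) > 1 else None
        elif words and words[0] == ".endshuffle":
            if region is None:
                raise ValueError(".endshuffle without .shuffle")
            out.extend(permute(region, delimiter, rng))
            region = None
        elif region is not None:
            region.append(line)
        else:
            out.append(line)
    if region is not None:
        raise ValueError(".shuffle without .endshuffle")
    return "\n".join(out) + "\n"
```

You would run the source text through `shuffle_blocks` and hand the result to the assembler. Passing a fixed `seed` reproduces a particular order, which helps when one permutation exposes a bug and you want to assemble that exact layout again.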