There is a plugin for Genesis ROM disassembly, but what about a plugin for NES, FDS and maybe SNES ROMs? That way it could decode any old Nintendo ROM without having to figure out what to disassemble (so code can be split from data automatically!)
If anyone wants to build on this idea: go to GSHI's main download page and look for the mIDA Genesis plugin.
Just an interesting thought. I could pull this off, but I wanted suggestions first.
I think you're better off building a time machine, and an invisibility suit, travel back in time, copy the source code, and return to the present. Now I could do this but I can't be bothered to do so because I'm busy eating cake.
Run FCEUXD and play the hell out of the game, its Code-Data logger will tell you what's code and what's data.
Dwedit wrote:
Run FCEUXD and play the hell out of the game, its Code-Data logger will tell you what's code and what's data.
The Code/Data Logger does not support FDS games. I will use it for a while on an NES game, though.
Or, you know, you could just not disassemble the character rom part of the .NES file, assuming that part exists. As for the other two formats, you're on your own figuring out what's what. Which is kinda the reason why code/data loggers exist in the first place.
Quote:
That way it could decode any old Nintendo ROM without having to figure out what to disassemble (so code can be split from data automatically!)
Nope - IDA is not a miracle. The main (and maybe the only?) obstacle is jump tables, e.g. jmp (oper). So in a typical game you'll get one or two subroutines deep and that's it. The other problem is mappers and bank switching: you'd actually need a separate loader for each mapper type. As for mIDA (which, BTW, is merely a script, not a loader), it can do some... things thanks to Sega's hardware properties.
But, anyhow, CaH4e3 already made a basic loader for .nes format ROMs for a few simple mappers. The source code is included in the archive, but you'll need the IDA SDK to develop the loader further. It would be great if someone continued the project, 'cause CaH4e3 doesn't wanna.
btw, IDA will not support SNES at all since it has no support for the 65816 CPU (so someone would have to write a 65816 disassembly core before it can do SNES)
Hey guys, thought you would like this:
http://www.openrce.org/repositories/use ... ackage.zip
I hope you'll find it useful.
Dennis wrote:
Hey guys, thought you would like this:
http://www.openrce.org/repositories/use ... ackage.zip
I hope you'll find it useful.
Thanks for posting this. I remember you talking about this in IRC quite a while ago but never tried it until now.
MottZilla wrote:
I think you're better off building a time machine, and an invisibility suit, travel back in time, copy the source code, and return to the present. Now I could do this but I can't be bothered to do so because I'm busy eating cake.
IDA Pro cannot be used for 65816 (SNES/SFC) disassembly. A friend of mine Tony Allowatt (a.k.a. Flobby, now professor at VT, and author of a bunch of romhacking tools) tried this many years ago. IDA Pro doesn't support dynamically-sized registers (e.g. m=0 vs. m=1, x=0 vs. x=1).
For 6502 disassembly, yes, it already has support for that. You can use it for projects if you desire, but getting it to "look nice" with NES code is a serious undertaking.
Hi,
thefox wrote:
Thanks for posting this. I remember you talking about this in IRC quite a while ago but never tried it until now.
yes, I remember our chat, quite a while indeed
koitsu wrote:
IDA Pro cannot be used for 65816 (SNES/SFC) disassembly. A friend of mine Tony Allowatt (a.k.a. Flobby, now professor at VT, and author of a bunch of romhacking tools) tried this many years ago. IDA Pro doesn't support dynamically-sized registers (e.g. m=0 vs. m=1, x=0 vs. x=1).
I'm not sure whether this is still the case. With IDA you can define custom data types now (http://www.hexblog.com/?p=117). On the other hand, I haven't looked at 65816 disassembly/datasheets at all, though I'd really, really love to have a complete disassembly of Super Metroid
koitsu wrote:
For 6502 disassembly, yes, it already has support for that. You can use it for projects if you desire, but getting it to "look nice" with NES code is a serious undertaking.
This is what the collection of loaders and plugins that I posted a link to in my previous post aims at: Trying to make the disassembly look nice and readable. Admittedly, the 'bankswitch' plugin is probably not the best solution and might need an overhaul. However, I really enjoyed having the ability to statically enhance a ROM's disassembly, then export its symbols via the 'MadNES' plugin to be able to use them with FCEUXD SP's ability to have symbolic debugging. If you feel there's something missing, please let me know, I'm willing to update the plugins/loader.
Dennis wrote:
If you feel there's something missing, please let me know, I'm willing to update the plugins/loader.
Ability to load FCEUX code/data logs? You probably know better if that's feasible with IDA's plugin interface. Maybe even an IDC script would suffice.
thefox wrote:
Ability to load FCEUX code/data logs? You probably know better if that's feasible with IDA's plugin interface. Maybe even an IDC script would suffice.
I've just looked into fceuxd's code/data logging feature and documentation but I'm not sure whether I'm missing something here. As far as I understand, IDA nicely separates code from data (also, 'verified' by having had a look at a disassembly of a ROM). What extra information would the code/data logger extend the disassembly with? Writing a cdl parser in IDC/IDAPython should be relatively easy, but I'm not sure whether it is really needed. In case I'm missing something here I'd be glad to hear about it!
Cheers,
Dennis
Dennis wrote:
I've just looked into fceuxd's code/data logging feature and documentation but I'm not sure whether I'm missing something here. As far as I understand, IDA nicely separates code from data (also, 'verified' by having had a look at a disassembly of a ROM). What extra information would the code/data logger extend the disassembly with? Writing a cdl parser in IDC/IDAPython should be relatively easy, but I'm not sure whether it is really needed. In case I'm missing something here I'd be glad to hear about it!
As intelligent as IDA is, there are still times when it incorrectly detects code as data or vice versa. Using CDL could fix the remaining 5-10% (?) without manual work.
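To make that concrete, here is a minimal sketch (plain Python, not IDAPython) of how a CDL file could be boiled down into code/data runs that a script might then apply to the disassembly. It relies only on the two low flag bits (C and D) of the CDL byte; the names classify and cdl_runs are made up for illustration, and the sample bytes are invented.

```python
from itertools import groupby

CODE = 0x01  # bit 0: accessed as code
DATA = 0x02  # bit 1: accessed as data

def classify(flag):
    """Collapse one CDL flag byte into 'code', 'data', 'both' or 'unknown'."""
    is_code = bool(flag & CODE)
    is_data = bool(flag & DATA)
    if is_code and is_data:
        return "both"
    if is_code:
        return "code"
    if is_data:
        return "data"
    return "unknown"

def cdl_runs(cdl):
    """Yield (start_offset, length, kind) for each run of equally classified bytes."""
    offset = 0
    for kind, group in groupby(cdl, key=classify):
        length = sum(1 for _ in group)
        yield offset, length, kind
        offset += length

# Invented sample: three code bytes, two data bytes, one untouched, one both.
sample = bytes([0x01, 0x01, 0x01, 0x02, 0x02, 0x00, 0x03])
for start, length, kind in cdl_runs(sample):
    print(f"{start:#06x} +{length}: {kind}")
```

Only the 'unknown' runs would still need manual work or IDA's own analysis; how many of those remain depends on how thoroughly the game was played while logging.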
Another small suggestion, you're using names such as "PPU_CR_1" for the IO regs, it might be better to use the names from the NESDev Wiki instead, I think they're more common:
http://wiki.nesdev.com/w/index.php/PPU_registers
thefox wrote:
As intelligent as IDA is, there are still times when it incorrectly detects code as data or vice versa. Using CDL could fix the remaining 5-10% (?) without manual work.
Another small suggestion, you're using names such as "PPU_CR_1" for the IO regs, it might be better to use the names from the NESDev Wiki instead, I think they're more common:
http://wiki.nesdev.com/w/index.php/PPU_registers
Thanks for your suggestions. Quick question: How do you know from a CDL file which PRG page a given logged byte belongs to?
In a CDL file, each byte corresponds to a byte in the PRG.
Code:
xPdcAADC

C  = Whether it was accessed as code.
D  = Whether it was accessed as data.
AA = Into which ROM bank it was mapped when last accessed:
       00 = $8000-$9FFF    01 = $A000-$BFFF
       10 = $C000-$DFFF    11 = $E000-$FFFF
c  = Whether indirectly accessed as code.
       (e.g. as the destination of a JMP ($nnnn) instruction)
d  = Whether indirectly accessed as data.
       (e.g. as the destination of an LDA ($nn),Y instruction)
P  = If logged as PCM audio data.
x  = unused.
Dennis wrote:
Thanks for your suggestions. Quick question: How do you know from a CDL file which PRG page a given logged byte belongs to?
The format is documented in the FCEUX (.chm) help file.
Code:
xPdcAADC

C  = Whether it was accessed as code.
D  = Whether it was accessed as data.
AA = Into which ROM bank it was mapped when last accessed:
       00 = $8000-$9FFF    01 = $A000-$BFFF
       10 = $C000-$DFFF    11 = $E000-$FFFF
c  = Whether indirectly accessed as code.
       (e.g. as the destination of a JMP ($nnnn) instruction)
d  = Whether indirectly accessed as data.
       (e.g. as the destination of an LDA ($nn),Y instruction)
P  = If logged as PCM audio data.
x  = unused.
O.K., I've got to sort things out a bit, since I haven't really looked at NES stuff for years: an NES ROM may contain several pages that may be mapped to different banks and may be remapped during execution, depending on the mapper used by the ROM, right? So if this is correct (I don't remember exactly how the swapping mechanism works), a CDL file still does not contain any information about a byte's origin, right? Say, if according to a CDL file the byte at 0x8000 has been marked as code, how do you know which of the ROM's pages it corresponds to (the ROM banks may have been switched/swapped in the meantime, right)?
Thanks for clarifying.
Repeating post:
Code:
xPdcAADC

AA = Into which ROM bank it was mapped when last accessed:
       00 = $8000-$9FFF    01 = $A000-$BFFF
       10 = $C000-$DFFF    11 = $E000-$FFFF
To clarify what Dwedit said: There is one CDL byte for each byte in the ROM, not for each byte in CPU address space.
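A small sketch of what that means in practice: the index of a CDL byte is a ROM offset, and only the AA bits say where in CPU address space that byte was last seen. The function name describe and the 16K default bank size are assumptions for illustration (the real bank size depends on the mapper), and the offsets are assumed to count from the start of PRG, i.e. without the 16-byte iNES header.

```python
def describe(rom_offset, flag, bank_size=0x4000):
    """Return (bank, offset_in_bank, last_cpu_address) for one CDL byte.

    rom_offset is the index of the byte in the CDL file (and in PRG ROM);
    flag is the CDL byte itself in the xPdcAADC layout.
    """
    bank = rom_offset // bank_size
    offset_in_bank = rom_offset % bank_size
    # AA (bits 3-2) selects the 8K window the byte was last mapped into.
    window = 0x8000 + ((flag >> 2) & 0x3) * 0x2000
    last_cpu_address = window + (rom_offset % 0x2000)
    return bank, offset_in_bank, last_cpu_address

# Two bytes from different banks that were both last seen at $8123:
print([hex(v) for v in describe(0x0123, 0x01)])
print([hex(v) for v in describe(0x4123, 0x01)])
```

Both bytes keep their own CDL entry, so nothing is overwritten in the log even though they share a CPU address.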
ah, misread that part, thanks. I'll see if a CDL parser can be reasonably integrated. Maybe I'll change the NES loader module and deprecate the bankswitch plugin.
koitsu wrote:
IDA Pro cannot be used for 65816 (SNES/SFC) disassembly. A friend of mine Tony Allowatt (a.k.a. Flobby, now professor at VT, and author of a bunch of romhacking tools) tried this many years ago. IDA Pro doesn't support dynamically-sized registers (e.g. m=0 vs. m=1, x=0 vs. x=1).
You don't need to understand french, just click on "télécharger".
Didn't test it much, but it works for the most part...
here's a screenshot of the (still unfinished) plugin:
http://imageshack.us/f/15/cdlviewer.png/
I am a bit unhappy with the current mapping of both data and disassembly, because for now each byte of a PRG bank is mapped to the address taken from the CDL file. In the end this might lead to conflicts, e.g. if several bytes (from different banks) are mapped to the same address. Any ideas/suggestions?
IMHO, the CDL file format is lacking time stamp information for each byte in a CDL file. Otherwise, how would you be able to tell when an arbitrary byte of a PRG page has been mapped to a particular ROM bank?
I.e. say if both the byte at offset 0 of PRG bank 0 and PRG bank 1 are being mapped to 0x8000 at some point in time (due to a bank swap), the CDL file will save that information on the one hand, on the other hand it's missing any time stamp information. This is why I think you can't really improve a disassembly with the aid of a CDL file.
Dennis wrote:
IMHO, the CDL file format is lacking time stamp information for each byte in a CDL file. Otherwise, how would you be able to tell when an arbitrary byte of a PRG page has been mapped to a particular ROM bank?
I.e. say if both the byte at offset 0 of PRG bank 0 and PRG bank 1 are being mapped to 0x8000 at some point in time (due to a bank swap), the CDL file will save that information on the one hand, on the other hand it's missing any time stamp information. This is why I think you can't really improve a disassembly with the aid of a CDL file.
Yeah I guess you're right. We need some kind of extended CDL for that... for each (4K or 8K) PRG bank it should have info about what addresses the bank has been mapped at.
I will look into adding "extended CDL" generation to Nintendulator later.
EDIT: Did you actually find some games that map the same banks to different addresses at some point in time?
thefox wrote:
EDIT: Did you actually find some games that map the same banks to different addresses at some point in time?
I've just been testing my code on an MMC1 mapper game: Metroid.
As soon as you leave the game-intro and start playing the game, a bank switch seems to occur - I've monitored a change of the bank's memory content at 0x8000. The result is a CDL file which contains information of at least two different bytes / file offsets that are being mapped to 0x8000+.
Later, when (sequentially) parsing the CDL file using an external CDL processor, you are not able to tell which byte to map to 0x8000+ and can only (wrongly) assume that it is the last occurrence recorded in the CDL file.
Each byte in the CDL corresponds to one byte in the ROM, not one byte in the PRG address space. If two different banks are switched into $8000-$BFFF, there will be a byte for the use of one byte switched into $8000 and a byte for the use of the other byte switched into $8000. And each byte of the CDL has bits A14-A13 of the address from which it was last read ($80-$9F, $A0-$BF, $C0-$DF, or $E0-$FF/$60-$7F).
Or are you talking about swapping the same bank into $8000-$9FFF and then $A000-$BFFF and accessing them in different ways each time? Please illustrate your ambiguous case with an example.
tepples wrote:
Each byte in the CDL corresponds to one byte in the ROM, not one byte in the PRG address space. If two different banks are switched into $8000-$BFFF, there will be a byte for the use of one byte switched into $8000 and a byte for the use of the other byte switched into $8000. And each byte of the CDL has bits A14-A13 of the address from which it was last read ($80-$9F, $A0-$BF, $C0-$DF, or $E0-$FF/$60-$7F).
Yes, that's the way I understood it. What I am still trying to do is to rebuild the address space layout from a CDL file.
I'm extracting the ROM bank address (base) of each byte in a CDL file
Code:
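def DecodeFlags(self, b):
    # Bit layout of a CDL flags byte: xPdcAADC (bit 7 on the left).
    isCode = b & 1                                # C: accessed as code
    isData = (b & 2) >> 1                         # D: accessed as data
    base = (((b & 0xC) >> 2) * 0x2000) + 0x8000   # AA: 8K window it was last mapped into
    isIndCode = (b & 0x10) >> 4                   # c: indirectly accessed as code
    isIndData = (b & 0x20) >> 5                   # d: indirectly accessed as data
    isPCMData = (b & 0x40) >> 6                   # P: logged as PCM audio data
    return (isCode, isData, base, isIndCode, isIndData, isPCMData)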
then I'm getting each original byte from the respective PRG bank at each (file) offset and am calculating the destination that the byte is supposed to be mapped to.
basically this is:
Code:
destination = base + (fileoffset % 0x2000)
*destination = original_byte
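As a rough, standalone illustration of where this goes wrong, the following sketch applies the same formula to every logged byte and reports the destinations that more than one ROM offset lands on. It is not the plugin code; map_bytes and conflicts are hypothetical helper names, and the CDL contents are invented.

```python
from collections import defaultdict

def map_bytes(cdl):
    """Map each logged ROM offset to base + (offset % 0x2000)."""
    placed = defaultdict(list)
    for offset, flag in enumerate(cdl):
        if flag & 0x33 == 0:
            continue  # none of C, D, c, d set: byte was never logged
        base = 0x8000 + ((flag >> 2) & 0x3) * 0x2000
        placed[base + offset % 0x2000].append(offset)
    return placed

def conflicts(placed):
    """Keep only the CPU addresses claimed by more than one ROM offset."""
    return {addr: offs for addr, offs in placed.items() if len(offs) > 1}

# Invented log: offset 0 of two different 16K banks, both executed at $8000,
# plus one data byte that was read at $A005.
cdl = bytearray(0x8000)
cdl[0x0000] = 0x01
cdl[0x4000] = 0x01
cdl[0x2005] = 0x06

for addr, offs in conflicts(map_bytes(cdl)).items():
    print(f"${addr:04X} is claimed by ROM offsets {[hex(o) for o in offs]}")
```

Any address that shows up in conflicts cannot be represented in a single flat 64K view.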
Dennis wrote:
Later, when (sequentially) parsing the CDL file using an external CDL processor, you are not able to tell which byte to map to 0x8000+ and can only (wrongly) assume that it is the last occurrence recorded in the CDL file.
It's inevitable that several bytes will be mapped to 8000+, that's what bankswitching does. You need to figure out a way to have the same address for multiple PRG locations in IDA... I remember hearing/trying it should be possible (tinker with the segments or something?).
thefox wrote:
It's inevitable that several bytes will be mapped to 8000+, that's what bankswitching does. You need to figure out a way to have the same address for multiple PRG locations in IDA... I remember hearing/trying it should be possible (tinker with the segments or something?).
Yes, different segments might work, but if I'm not mistaken, you'd still be missing a time stamp then.
One other option I was thinking of was loading the whole banks into one single segment, neglecting segment registers but then a 1:1 cdl:prg mapping would be possible at least. But the m6502 uses absolute addressing, right?
Here's what I'd do:
- Determine the size of the ROM's banks from the mapper.
- For each bank, create a segment, e.g. "ROM0" through "ROM15" for UOROM or SNROM.
- For each bank, find what address it was most likely mapped in, based on the A14-A13 field of the majority of bytes in that bank, and place the bank's segment at that address.
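The steps above could be sketched roughly like this. It is only an outline under assumptions: the bank size is passed in by hand (16K here) instead of being derived from the mapper, guess_bank_bases is a made-up name, and creating the actual IDA segments is left out.

```python
from collections import Counter

def guess_bank_bases(cdl, bank_size=0x4000):
    """Return {bank_number: most likely CPU base address, or None if never logged}."""
    bases = {}
    for bank, start in enumerate(range(0, len(cdl), bank_size)):
        votes = Counter()
        for offset_in_bank, flag in enumerate(cdl[start:start + bank_size]):
            if flag & 0x03 == 0:
                continue  # byte was never logged as code or data
            window = 0x8000 + ((flag >> 2) & 0x3) * 0x2000
            # A bank larger than 8K spans several windows, so step back to
            # the address where the bank itself would have to begin.
            votes[window - (offset_in_bank // 0x2000) * 0x2000] += 1
        bases[bank] = votes.most_common(1)[0][0] if votes else None
    return bases

# Invented log with three 16K banks.
cdl = bytearray(0xC000)
cdl[0x0000:0x000A] = bytes([0x01]) * 10   # bank 0, executed at $8000
cdl[0x2000:0x2005] = bytes([0x05]) * 5    # bank 0, second half seen at $A000
cdl[0x4000:0x4003] = bytes([0x09]) * 3    # bank 1, executed at $C000
cdl[0x6000] = 0x0D                        # bank 1, second half seen at $E000
cdl[0x4010] = 0x01                        # bank 1, one stray access at $8000

for bank, base in guess_bank_bases(cdl).items():
    where = f"${base:04X}" if base is not None else "unknown"
    print(f"ROM{bank} -> {where}")
```

A bank that was mapped at two addresses about equally often would still need a manual decision.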
Are you familiar with the term "overlay"?
Dennis wrote:
Yes, different segments might work, but if I'm not mistaken, you'd still be missing a time stamp then.
One other option I was thinking of was loading the whole banks into one single segment, neglecting segment registers but then a 1:1 cdl:prg mapping would be possible at least. But the m6502 uses absolute addressing, right?
Are you talking about the fact, that in the case of code like this...
Code:
JSR $8123
...IDA wouldn't know which segment 8123 refers to?
Yeah, this can't be handled with CDL alone... we'd need some other kind of log which tells what bank was mapped at 8000-xxxx at the time of the JSR.
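What a plain CDL can do is narrow down the candidates. A hedged sketch: for a target such as $8123, list every ROM offset that sits at the right position inside an 8K window, was logged as code, and was last mapped into the matching window. candidate_offsets is an invented name, and because AA only records the last mapping, the result is a hint rather than proof.

```python
def candidate_offsets(cdl, cpu_address):
    """List ROM offsets that could be the code at cpu_address ($8000-$FFFF)."""
    window = (cpu_address - 0x8000) // 0x2000   # expected AA value
    low = cpu_address % 0x2000                  # position inside the 8K window
    hits = []
    for offset in range(low, len(cdl), 0x2000):
        flag = cdl[offset]
        is_code = bool(flag & 0x01)
        last_window = (flag >> 2) & 0x3
        if is_code and last_window == window:
            hits.append(offset)
    return hits

# Invented log: two banks have code that was executed at $8123.
cdl = bytearray(0x10000)
cdl[0x0123] = 0x01   # code, last at $8000-$9FFF
cdl[0x4123] = 0x01   # code, last at $8000-$9FFF
cdl[0x8123] = 0x02   # data only, so not a JSR target
cdl[0x2123] = 0x05   # code, but last seen at $A123

print([hex(o) for o in candidate_offsets(cdl, 0x8123)])
```

If exactly one offset comes back, the JSR target is probably unambiguous; with several, the caller's own bank and the mapper's fixed banks would have to break the tie.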
tepples wrote:
Here's what I'd do:
- Determine the size of the ROM's banks from the mapper.
- For each bank, create a segment, e.g. "ROM0" through "ROM15" for UOROM or SNROM.
- For each bank, find what address it was most likely mapped in, based on the A14-A13 field of the majority of bytes in that bank, and place the bank's segment at that address.
Are you familiar with the term "overlay"?
Thanks for your suggestions. It still involves too many heuristics for my taste.
thefox wrote:
Are you talking about the fact, that in the case of code like this...
Code:
JSR $8123
...IDA wouldn't know which segment 8123 refers to?
Yeah, this can't be handled with CDL alone... we'd need some other kind of log which tells what bank was mapped at 8000-xxxx at the time of the JSR.
Yes, exactly! I think I'll just change the current IDA script into a CDL file viewer which supports highlighting, allows the user to interactively switch banks and apply CDL information to the disassembly (similar to the "bankswitch" plugin that comes with nespackage.zip).
Thanks a lot for your help and feedback so far, keep it coming
thefox: one more idea for an extended CDL file format:
as a flags byte from a CDL file has got a spare, unused bit: what about using it for determining whether the underlying byte is the beginning of an instruction? if the underlying byte has the "code" or "indirect code" flags set, set the new bit if the byte is the beginning of an instruction, otherwise clear the flag. this will make it easier for subsequent attempts in correctly disassembling the code (i.e. where to start disassembling).
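a quick sketch of how that could look, purely hypothetical: no emulator writes this bit today, and using bit 7 (the "x" in xPdcAADC) as INSN_START is my assumption, as are the function names.

```python
CODE = 0x01        # existing bit 0: accessed as code
INSN_START = 0x80  # hypothetical: reuse the unused bit 7 for "first byte of an instruction"

def log_instruction(cdl, rom_offset, length):
    """Emulator side: mark an executed instruction of `length` bytes."""
    cdl[rom_offset] |= CODE | INSN_START
    for i in range(1, length):
        cdl[rom_offset + i] |= CODE   # operand bytes are code, but not a start

def instruction_starts(cdl):
    """Disassembler side: offsets where disassembly can safely begin."""
    return [off for off, flag in enumerate(cdl) if flag & INSN_START]

cdl = bytearray(8)
log_instruction(cdl, 0, 3)   # e.g. a 3-byte JSR $nnnn
log_instruction(cdl, 3, 2)   # e.g. a 2-byte LDA #$nn
print(instruction_starts(cdl))
```

with this, a disassembler would no longer have to guess whether a "code" byte is an opcode or an operand.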
Dennis wrote:
thefox: one more idea for an extended CDL file format:
as a flags byte from a CDL file has got a spare, unused bit: what about using it for determining whether the underlying byte is the beginning of an instruction? if the underlying byte has the "code" or "indirect code" flags set, set the new bit if the byte is the beginning of an instruction, otherwise clear the flag. this will make it easier for subsequent attempts in correctly disassembling the code (i.e. where to start disassembling).
Yeah sounds like a good idea.
I'm trying to come up with other improvements as well... for example for instructions such as LDA abs,x it might be useful to keep track of the possible values of X (so abs can be marked as a table with known size). If you or anybody has any ideas for that lmk.
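As a starting point, something like the following could live in the emulator's logger. It is only a sketch of the idea, not an existing feature: IndexLog and its methods are invented names, and it assumes the table starts at the abs operand itself.

```python
class IndexLog:
    """Track the smallest and largest X seen for each abs,X base address."""

    def __init__(self):
        self.ranges = {}  # base address -> (lowest X, highest X)

    def record(self, base, x):
        """Call once per executed abs,X access with the current X register."""
        lo, hi = self.ranges.get(base, (x, x))
        self.ranges[base] = (min(lo, x), max(hi, x))

    def table_size(self, base):
        """Observed table length, assuming the table begins at `base`."""
        _, hi = self.ranges[base]
        return hi + 1

log = IndexLog()
for x in (3, 0, 7):          # invented X values seen for LDA $9000,X
    log.record(0x9000, x)
print(hex(0x9000), log.ranges[0x9000], log.table_size(0x9000))
```

The size found this way is only a lower bound, since entries that were never read during the logged play session stay invisible.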
I hope at some point this can be turned into a tool that allows disassembling bankswitching ROMs in IDA with as little manual work as possible.
thefox wrote:
I hope at some point this can be turned into a tool that allows disassembling bankswitching ROMs in IDA with as little manual work as possible.
personally, I'd prefer having to manually improve the disassembly rather than having to play through the whole game in order to achieve a complete coverage, but these new CDL features would help instructing IDA where to begin disassembling and might even help in determining when bank switching happens. I'm trying to think of more features. Looking forward to eCDL