In an attempt to mitigate slowdown, I'm planning on writing a profiling tool to analyze FCEUX trace logs for a ROM hacking project I'm working on. Hopefully, with a little extra effort, it can be made useful for others as well. Before I start coding anything, I figured I'd check if anyone saw any glaring omissions in terms of features or considerations, or if anyone had any other ideas or suggestions.
I looked around and didn't see much along these lines out there. It appears that NESICIDE has a built-in profiler. If anyone has any insight into whether and how it can be used for preexisting games and homebrews, and how its feature set compares to what I've proposed below, I'd certainly be curious to hear.
I've basically outlined my thoughts to get them organized before I start designing anything.
Code:
- Will parse and analyze FCEUX trace logs
    - Design to permit eventual extension to support other formats
- Detect bank swaps to allow CPU address space to be converted to ROM locations
    - Support for NROM/MMC1/MMC3
    - Design to permit eventual extension to support other mappers
    - Possibly infer initial bank configuration by comparing logged instructions with contents of ROM
- Detect vblank via NMI vector (manual entry or from ROM)
- Detect CPU spinning (waiting for vblank)
    - Via manual entry of wait loop location
    - Possibly by a simple heuristic such as matching the pattern:
        Loop: [read] PPU_Status
              [branch] Loop
- Detect lag frames
    - Via re-entrant NMI
    - Via NMI occurring prior to wait spin
    - Option to exclude lagged frames or unlagged frames
        - i.e. one could compare separate profiles for game loop iterations that lag and those that do not
- Generate code/data log, ROM heat map, and RAM heat map
- Identify "routines" and calculate statistics
    - Routines delimited by JSR targets
    - Additionally delimited by jump table targets
        - Requires manual entry of jump table routine location
        - Expected jump table usage: "jsr DoJumpTable" followed by a list of pointers
        - Other jump table implementations to consider?
    - Provide stats for:
        - Total time/average time/longest time in routine, including invoked subroutines
        - Total time/average time/longest time in routine, not including invoked subroutines
        - Total invocations
        - Most invocations per frame / average number of invocations per frame
        - Anything else?
- Support for importing/defining/exporting symbols
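
To make the "including / not including invoked subroutines" stats a bit more concrete, here's a rough Python sketch of the call-stack bookkeeping I have in mind. It is only an illustration under some assumptions: the input is a hypothetical, already-parsed list of (pc, mnemonic, target, cycles) tuples rather than the raw FCEUX log format, the per-instruction cycle counts are assumed to be available from somewhere (the log or an opcode table), and the name profile_routines is made up for this example. It also ignores jump tables, interrupts, and stack tricks.

```python
from collections import defaultdict


def profile_routines(events):
    """Accumulate per-routine cycle statistics from a pre-parsed trace.

    events: iterable of (pc, mnemonic, target, cycles) tuples. This is a
    hypothetical simplified form, NOT the raw FCEUX trace log format;
    target is the JSR destination (None for other instructions).
    """
    stats = defaultdict(
        lambda: {"calls": 0, "inclusive": 0, "exclusive": 0, "longest": 0}
    )
    # Each frame is [routine_addr, inclusive_cycles, exclusive_cycles].
    stack = []

    for pc, mnemonic, target, cycles in events:
        if stack:
            # Charge this instruction to the innermost active routine.
            stack[-1][1] += cycles
            stack[-1][2] += cycles

        if mnemonic == "JSR":
            # The JSR itself was charged to the caller above.
            stack.append([target, 0, 0])
        elif mnemonic == "RTS" and stack:
            addr, inclusive, exclusive = stack.pop()
            entry = stats[addr]
            entry["calls"] += 1
            entry["inclusive"] += inclusive
            entry["exclusive"] += exclusive
            entry["longest"] = max(entry["longest"], inclusive)
            if stack:
                # Time spent in the callee counts toward the caller's
                # inclusive total, but not its exclusive total.
                stack[-1][1] += inclusive

    return dict(stats)


# Tiny made-up trace: code at $8000 calls $9000, which calls $A000.
events = [
    (0x8000, "JSR", 0x9000, 6),
    (0x9000, "LDA", None, 2),
    (0x9002, "JSR", 0xA000, 6),
    (0xA000, "INX", None, 2),
    (0xA001, "RTS", None, 6),
    (0x9005, "RTS", None, 6),
]
result = profile_routines(events)
```

For this trace the routine at $9000 ends up with 22 inclusive cycles and 14 exclusive cycles, and the routine at $A000 with 8 of each. Average time falls out as inclusive divided by calls. A real version would also need to handle routines still on the stack when the log ends, recursion (which would double-count inclusive time here), and whatever is done about NMI entry and jump tables.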