Wow. My engine for my project, which supports music and sound effects but is very compact since I don't use PRG bankswitching, is exactly $358 bytes (856 bytes in decimal). I wonder how you got one that is so small. I was under the impression that mine was about as small as you could get.
However, if yours is smaller but needs more song data for an equivalent song, then you won't gain anything.
I use one byte per note (plus one byte when crossing an octave boundary). I also have "subroutine calls" that take two bytes and a repeat command that takes three bytes, which removes the need to repeat data within a song in most cases. That way I can keep most songs to something like 200 bytes (although that figure is very arbitrary).
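As a rough illustration of how a format like this can be decoded, here is a minimal Python sketch. It is not the actual engine code: the opcode values, the nibble layout of the note byte, the use of byte offsets as call and repeat targets, and the name `expand` are all assumptions made up for the example. Only the byte counts (1 per note, 2 per octave change, 2 per call, 3 per repeat) come from the description above.

```python
# Hypothetical opcodes (assumed for this sketch); any byte below 0xFC is a note.
OCTAVE, CALL, REPEAT, END = 0xFC, 0xFD, 0xFE, 0xFF

def expand(song):
    """Expand compact song bytes into a flat list of (octave, pitch, length)."""
    notes, stack, counters = [], [], {}
    octave, pc = 0, 0
    while True:
        op = song[pc]
        if op == OCTAVE:      # 2 bytes: opcode, new octave
            octave = song[pc + 1]
            pc += 2
        elif op == CALL:      # 2 bytes: opcode, target offset
            stack.append(pc + 2)          # remember where to resume
            pc = song[pc + 1]
        elif op == REPEAT:    # 3 bytes: opcode, extra passes, target offset
            left = counters.get(pc, song[pc + 1])
            if left > 0:
                counters[pc] = left - 1   # one pass used up, jump back
                pc = song[pc + 2]
            else:
                counters.pop(pc, None)    # loop finished, fall through
                pc += 3
        elif op == END:       # 1 byte: return from a call, or end of song
            if not stack:
                return notes
            pc = stack.pop()
        else:                 # 1 byte: assumed layout, length high / pitch low
            notes.append((octave, op & 0x0F, op >> 4))
            pc += 1

song = bytes([
    0xFD, 0x09,        # offset 0: call the phrase at offset 9
    0xFC, 0x03,        # offset 2: switch to octave 3
    0x10,              # offset 4: pitch 0, length 1
    0xFE, 0x01, 0x04,  # offset 5: replay from offset 4 one more time
    0xFF,              # offset 8: end of song
    0x24, 0x27, 0xFF,  # offset 9: shared phrase, then return
])
print(len(song), expand(song))
```

The point of the call and repeat commands is that a phrase stored once can be played from several places, so the stored data grows much more slowly than the number of notes actually played.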
EDIT: I just looked more closely at how much ROM the various things in my project use. WLA-DX is supposed to report this with .block, but it's bugged and doesn't print the correct numbers.
The biggest ROM eater turns out to be the sprite definitions! I have barely done half of the sprites I'd like to see in the game, and they already eat up about 7200 bytes! This is about 12% of the total ROM space I want to use; once the remaining sprites are done, it will be at least 20%. I might have to compress them somehow if I want to use those 32k efficiently.
In comparison, I have a good half of the songs and many sound effects done, and they take "only" 2800 bytes (about 3500 with the engine).