GCC related things

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
GCC related things
by on (#236863)
Hopefully this thread will shed some light for anyone trying to debug "obscure GCC behaviour". Over the past 6 years or so I have tested and coded a ton of things on the Nintendo DS, in an environment that is entirely GNU GCC (I have written linker scripts, makefiles, filesystem drivers, video code, etc.). Some stuff is still missing, but the NintendoDS is well documented.


"The GNU Compiler Collection is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain and the standard compiler for most Unix-like operating systems. The Free Software Foundation distributes GCC under the GNU General Public License."

I am speaking strictly of a closed-source, somewhat limited embedded environment, and of the GCC tools required to build ARM C/C++ code for such a platform.

Having said that, most open source NintendoDS homebrew relies on such tools.

On an embedded development platform like the NintendoDS, the GCC tools (compilers, linkers) and the libraries (libc, libstdc++, libgcc, libsupc++) are used a lot, while some libraries are used less often (libiberty, whose code was absorbed by the POSIX standards and thus went into the libc libraries).


The GCC linker:
The GCC linker (GNU ld, which ships with binutils) works like any other linker: you compile object files with an ABI-compliant compiler, and the linker must support that exact ABI so it understands the object format. The linker then builds a binary by placing these object files together according to your "physical addressed map file" (the linker script). From there you can adjust the output binary in pretty much every way possible by customizing which objects are added.

The GCC linker builds a binary that is segmented into "sections":

Common Sections:
data, rodata, text, bss (required by both C and C++):
.gnu.linkonce.armexidx.*
.gnu.linkonce.b.*
.gnu.linkonce.r.*
.gnu.linkonce.t.*
.gnu.linkonce.d.*

ARM Specific:
.ARM.exidx

There are also arguments you can feed to the linker (or, as the GCC devs put it, pass from GCC on to the linker):

-Wl,--gc-sections,-Map,$(MAPFILE) : discard any sections the linker finds unreferenced, and output a readable map file describing exactly the binary you built


-Wl,-z,defs : catches the problematic cases involving underlinking (a problem where code from externally compiled objects is left out by the linker, causing exceptions because there is no code to execute in that section!)


And my findings:
The toolchain I have built, "ToolchainGenericDS" (TGDS), combines my own code with a small subset of the already available NintendoDS open source code, and generates binaries through ndstool: an ARM7 binary plus an ARM9 binary are packaged together, with a header that tells where the ARM7 binary is located and where it should be loaded, and the same for the ARM9 binary. I also build newlib for NDS, and the output is a library in .a (archive) format. The ToolchainGenericDS layer ends up in the same .a format. The TGDS environment (linker scripts, makefiles, project template) then builds C/C++ code from source (.S, .s, .c, .cpp) into objects. Finally, the linker looks inside these libraries for the TGDS (NDS-specific API) calls and then for the standard POSIX + newlib calls; whichever of these are called end up built into the NDS executable.

Thing is, in the interest of saving binary space, because the NDS has a scarce 4MB of RAM, there's a GCC linker flag, --gc-sections, that helps. But in my experience it will sometimes discard sections incorrectly, thus removing segments of code from the codepath. The "codepath" detection in the GCC linker is somewhat bugged. Let me explain:
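To make the packaging step a bit more concrete, here is a minimal sketch in C of a header that records, for each CPU, where its binary sits in the package and where it should be loaded. The struct layout, the field names, the 0x200 alignment and the function name describe_package are all assumptions made for illustration; this is not the actual header that ndstool writes.

```c
#include <stdint.h>

/* Hypothetical, simplified description of one CPU's binary inside the
   package. The real NDS header carries many more fields. */
typedef struct {
    uint32_t rom_offset; /* where the binary sits inside the package */
    uint32_t load_addr;  /* RAM address the binary should be copied to */
    uint32_t size;       /* size of the binary in bytes */
} cpu_image_t;

typedef struct {
    cpu_image_t arm9;
    cpu_image_t arm7;
} package_header_t;

/* Round value up to the next multiple of alignment (a power of two). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1u) & ~(alignment - 1u);
}

/* Lay out the ARM9 binary right after the header and the ARM7 binary
   after the ARM9 one, both aligned to 0x200 bytes (assumed alignment). */
package_header_t describe_package(uint32_t header_size,
                                  uint32_t arm9_size, uint32_t arm9_load,
                                  uint32_t arm7_size, uint32_t arm7_load)
{
    package_header_t header;

    header.arm9.rom_offset = align_up(header_size, 0x200u);
    header.arm9.load_addr  = arm9_load;
    header.arm9.size       = arm9_size;

    header.arm7.rom_offset = align_up(header.arm9.rom_offset + arm9_size, 0x200u);
    header.arm7.load_addr  = arm7_load;
    header.arm7.size       = arm7_size;

    return header;
}
```

A loader would read such a header, copy each binary from rom_offset to load_addr, and then start each CPU. The load addresses are passed in by the caller because they depend on the linker script used for each binary.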

(https://elinux.org/images/2/2d/ELC2010- ... asenko.pdf refer to page 6 to see a "graphical" description of what I mean)

Once you have told the linker through the above flag to discard dead code, you will find a "Discarded input sections" section inside the readable map file. If you keep the binary small, there's about an 80% chance the NDS binary will work correctly, presumably because little was discarded and the codepath was simpler to deal with. But if you add a lot more source code, the relocatable code sections (which the codepath/Program Counter will traverse later on) must be inspected carefully, because critical sections will be stubbed out! Watch out for that.

The linker will happily generate binaries, but if you have no debugger at hand, your program may show undefined behaviour or behave incorrectly, such as throwing segfaults randomly, or simply not running and getting stuck in some weird place (such as the reset vectors).
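Inspecting the map file by hand gets tedious as the project grows, so a small helper can at least tell you how much was thrown away. The following is a rough sketch in C, assuming the map file text is already loaded into memory, that each discarded entry starts on a line beginning with a space followed by a dot, and that the list ends at the "Memory Configuration" heading. The function name count_discarded_sections is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Counts the entries listed under "Discarded input sections" in the
   text of a linker map file. Assumes each entry begins with " ." at the
   start of a line and that the list ends at "Memory Configuration". */
int count_discarded_sections(const char *map_text)
{
    const char *p = strstr(map_text, "Discarded input sections");
    int count = 0;

    if (p == NULL) {
        return 0; /* heading not present: nothing was discarded */
    }

    p = strchr(p, '\n'); /* move to the end of the heading line */
    while (p != NULL) {
        p++; /* first character of the next line */
        if (strncmp(p, "Memory Configuration", 20) == 0) {
            break; /* next map file section reached */
        }
        if (p[0] == ' ' && p[1] == '.') {
            count++; /* a line naming a discarded input section */
        }
        p = strchr(p, '\n');
    }
    return count;
}
```

Comparing this count between a working build and a broken one is a quick way to spot that something new was dropped, before digging into which section it was.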

a)
So to save space (meaning --gc-sections is passed on to the linker) and at the same time prevent the linker from tossing invaluable section data, you must manually wrap the desired section in the KEEP() command in your linker script.

b)
There's a magic section approach in the PDF I linked earlier that helps get a hold of what's going on. It addresses the b) issue somewhat but not entirely.
The trick to help the linker decide "better" which code stays and which is removed is to turn "non-relocatable code" into "relocatable code" before it goes into the library archive. The linker will then treat the object as relocatable and may work around it if there IS code that was subsequently optimized, shortened or altered.
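For approach a), one way to make a function easy to protect is to place it in its own named section on the C side, so the linker script has something specific to wrap in KEEP(). The sketch below assumes GCC (or a compatible compiler) and an ELF target; the section name .tgds_keep, the macro TGDS_KEEP and the function critical_handler are hypothetical names chosen for this example.

```c
/* The section name ".tgds_keep" is hypothetical. A matching linker
   script entry could look like:
       .text : { KEEP(*(.tgds_keep)) *(.text .text.*) }
   so that --gc-sections never discards what is placed in it. */
#if defined(__ELF__)
/* "used" tells the compiler to emit the function even if nothing in
   this translation unit references it; "section" picks where it goes. */
#define TGDS_KEEP __attribute__((used, section(".tgds_keep")))
#else
/* Non-ELF hosts use a different section naming scheme, so only mark
   the function as used there. */
#define TGDS_KEEP __attribute__((used))
#endif

/* Example of a routine that must survive section garbage collection. */
TGDS_KEEP int critical_handler(int value)
{
    return value + 1;
}
```

Note that the attribute only controls what the compiler emits and where; it is the KEEP() entry in the linker script that stops the linker from discarding the section.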



So:
I had this kind of "bug" for ages, and sometimes it was resolved by hand-picking the sections the linker must NOT discard (the a) approach). But then I worked out why, depending on code density, some stuff could suddenly end up broken or missing, and why it would magically work if the size changed. This is a bug in the way GCC handles files. The linker should really add a warning like

"the library objects are NOT relocatable; you will face weird bugs if you link against them, and if you then use optimized code, your code will break".

So always, always, if you decide to link against .a libraries, make sure the code in there is relocatable/position independent (-fpic flag). That way, if the linker has no idea what to do with an object that may have been altered/optimized/etc., at least it will relocate the code that was adapted to it.
Re: GCC related things
by on (#237701)
Can confirm after rebuilding ToolchainGenericDS (https://coto88.bitbucket.io/) using the exact same toolchain setup (I built TGDS so that I can swap between GCC toolchains of any version with ease). Here's a neat result someone else already mentioned:

TGDS: (source code: https://bitbucket.org/Coto88/newlib-nds-deprecated/src/master/)
- ARM none EABI: GCC 6.2.x , C++ version: 7.2.1

vs

TGDS: (source code: https://bitbucket.org/Coto88/newlib-nds/src/master/)
- ARM EABI: GCC 4.9.2 , C++ version: 4.9.2

ARM none EABI is about 30% slower in everything (SnemulDS included) compared to ARM EABI.


There's absolutely no reason to switch from ARM EABI to ARM none EABI. Perhaps newer processors may benefit, but older processors such as the ARM7/ARM9 absolutely do not.


I can post SnemulDS binaries compiled with each GCC toolchain if someone wants them.