Overall Structure and Layout
When you write a typical C application for a modern operating system, you do not need to pay very much heed to the exact layout of system memory nor spell out a great many details to the linker about how your program needs to be loaded. Decades of abstraction layers insulate you and provide a highly standardized, sanitized environment for your program's running process.
The GameBoy Advance is a bare-metal environment. There is no operating system – or the game is its own operating system, whichever viewpoint one prefers. While the idealized memory abstraction presented to an OS-managed application is a single uniform space from 0x00000000 to 0xffffffff, the real memory space of the hardware is broken down into pieces with different functionalities and there are often large gaps that aren't actually attached to anything. The GBA memory space includes the read-only BIOS (16KB), fast working RAM (32KB), slow working RAM (256KB), memory belonging directly to the screen (98KB), and space reserved for the cartridge's ROM (32MB) and optionally either battery-backed SRAM (64KB) or flash (64KB, bankswitchable) for save storage. The working (that is, "normal") RAM is split into fast and slow regions not because the designers thought you would like slower RAM, but to balance the device's cost against performance. The variables which the game updates constantly should be squeezed into fast RAM, with the rest on the slow boat.
Note
A game cartridge can contain a "mapper" or "bankswitch" capability which connects and disconnects different chips on the cartridge board to the console's memory space dynamically. This allows the game to have more data than will fit by breaking it down into chunks that are never loaded at the same time. However, this was much more common in 8-bit and 16-bit games since 32-bit games have a much larger address space available even on systems (including the GBA) which only wire up part of it. You may be interested in my high-level overview of how mappers are used on the NES.
It is therefore necessary to divide Emerald up into sections – the read-only code and data, the fast variables, the slow variables, and the save file – and explicitly instruct the linker on what goes where in the final ROM. This is accomplished through linker scripts. It is fortunately not necessary to understand every detail of linker script syntax for our purposes. The scripts are found in ld_script.txt and the files it includes by name.
Notice that there are two versions of the linker script in the repository: one which makes sure everything is placed in exactly the same spot it ended up in the original ROM (ld_script.txt), and one for modern compilers which will find any new modules you've added and not be too particular about the exact order (ld_script_modern.txt). The latter is useful if you are making substantial changes to the game. We are examining the exact-reproduction version so that what we see in the script will correlate exactly to what we see in the game's memory in a live debugger. In general, the ordering of different pieces of data within the same linking section is due to the order they were added to the code base or the alphabetical order of the original source filenames, and not because they must be in that specific order. The naming conventions used in linker scripts are extremely archaic and non-obvious: .text means executable code, .data means variables that are given an explicit starting value, and .bss means variables that are not (they simply start out zeroed). .rodata is read-only data which is, at least, actually what it sounds like.
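To make those section names a little less abstract, here is a minimal sketch of where a typical GCC-style toolchain would place a few kinds of C declarations. The identifiers (sLookupTable, gStepCounter, gScratchBuffer, NextStep) are invented for this example and do not come from the Emerald source, and the exact placement can vary with the compiler and its flags.

```c
#include <assert.h>
#include <stdint.h>

// Read-only and known at build time: typically placed in .rodata.
static const uint8_t sLookupTable[4] = { 1, 2, 4, 8 };
static_assert(sizeof(sLookupTable) == 4, "the table size is fixed at build time");

// Has an explicit starting value: typically placed in .data.
// On a typical bare-metal target that value has to live in ROM and be
// copied into RAM by startup code before main runs.
uint32_t gStepCounter = 100;

// No explicit starting value: typically placed in .bss.
// Only the size is recorded; the memory is expected to start out zeroed.
uint8_t gScratchBuffer[16];

// Function bodies are executable code: typically placed in .text.
uint32_t NextStep(uint32_t index)
{
    gStepCounter += sLookupTable[index % 4];
    gScratchBuffer[index % 16]++;
    return gStepCounter;
}
```

The distinction matters more on a console than on a PC because read-only data and code can stay in the cartridge ROM, while everything else competes for the small amount of RAM.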
Memory Map Summary
- Read-Only BIOS (0x00000000)
  - Boot logo routine
  - Function for validating ROM header and starting game
  - Generic utility functions
- Slow RAM (0x02000000)
  - 112KB reserved for the heap (the memory you get from malloc)
  - variables of files manually listed as okay for slow RAM (largely high-level game state that does not rapidly change)
- Fast RAM (0x03000000)
  - variables of files manually listed as needing fast RAM
  - The stack for function-local variables is not explicitly listed in the linker script, but the game will set up the stack to point towards the end of fast RAM before calling main. It grows downwards, and it is the responsibility of the programmer to make sure functions can never chain so deeply that it will begin overwriting memory already in use.
- Graphics (0x05000000)
  - current color palette
  - Video RAM for rendering the screen
  - OAM (Object Attribute Memory, that is, sprites)
- Read-Only Memory (0x08000000)
  - ROM header
  - crt0, which does some housekeeping before main is called
  - main
  - more executable code in no special order
  - Script data for the game event scripting engine
  - more executable code
  - data embedded in executable code (short strings, numerical constants, etc)
  - songs
  - even more code
  - multiboot
  - front sprites of each mon
  - other graphics
- Flash Memory (0x0E000000)
  - Not explicitly listed in linker scripts nor physically present in the ROM file, but two copies of the save game (so that if one is corrupt, the other might still be loadable) are kept here. Most emulators will dump it to a .sav file on your computer.
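If it helps to see the map as data, the sketch below turns the summary into a small lookup table and classifies an address by region. The sizes are the ones quoted earlier. The separate base addresses for VRAM (0x06000000) and OAM (0x07000000) are taken from public hardware documentation rather than from the list above, which lumps all screen memory under one entry. The names MemRegion, sRegions and RegionForAddress are made up for illustration, and the sketch ignores mirroring and the I/O register area entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct MemRegion {
    const char *name;
    uint32_t base;  // first address of the region
    uint32_t size;  // size in bytes
};

// Palette + VRAM + OAM together are the "memory belonging to the screen".
static_assert(1 + 96 + 1 == 98, "palette, VRAM and OAM add up to 98KB");

static const struct MemRegion sRegions[] = {
    { "BIOS",     0x00000000, 16 * 1024 },
    { "Slow RAM", 0x02000000, 256 * 1024 },
    { "Fast RAM", 0x03000000, 32 * 1024 },
    { "Palette",  0x05000000, 1 * 1024 },
    { "VRAM",     0x06000000, 96 * 1024 },
    { "OAM",      0x07000000, 1 * 1024 },
    { "ROM",      0x08000000, 32 * 1024 * 1024 },
    { "Save",     0x0E000000, 64 * 1024 },
};

// Returns the name of the region containing addr, or NULL if the address
// falls into one of the gaps that is not attached to anything.
const char *RegionForAddress(uint32_t addr)
{
    for (size_t i = 0; i < sizeof(sRegions) / sizeof(sRegions[0]); i++) {
        if (addr >= sRegions[i].base && addr - sRegions[i].base < sRegions[i].size)
            return sRegions[i].name;
    }
    return NULL;
}
```

For example, RegionForAddress(0x03007FFF) lands in fast RAM (the last byte of it, near where the stack starts), while RegionForAddress(0x01000000) returns NULL because nothing in this table lives there.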
ROM header
The header must always be at 0x08000000 exactly so the BIOS boot function knows where to find it. It contains the game's first executable instruction (which just jumps to the entry point), its name (POKEMON EMER), its assigned ID code (BPEE) and its assigned publisher code (01, which obviously stands for the most important publisher). It also contains the checksum that the BIOS uses during its validity check to detect a corrupt cartridge. Since the checksum can only be calculated once the data it covers is final, blank space is reserved during the compile process and a command-line utility is used to patch in the correct value at the end of compiling. You can see the reservation of space in the ROM for the header in rom_header.s, though it's not very exciting, and the patching happens in the Makefile.
The header wastes 156 bytes on including the same corporate logo in every single cartridge of every single game for silly legal reasons. The game will not boot if the logo in the game does not exactly match the logo in the BIOS, and (or so the reasoning went) if you include a copy of the logo without buying a license to do so, then you are opening yourself up to being sued off the face of the earth. Everyone has ignored this for over twenty years.
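For the curious, here is a rough C picture of the header. The field layout and the check-byte formula follow publicly available hardware documentation (GBATEK), not rom_header.s, and that documentation describes the check value as a one-byte complement check over the bytes from the title through the version number. The names GbaRomHeader, HeaderComplementCheck and FillExampleHeader are made up for this sketch, and the entry instruction and logo are left zeroed because they do not take part in the check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct GbaRomHeader {
    uint8_t entryPoint[4];     // 0x00: branch to the real entry point
    uint8_t logo[156];         // 0x04: the logo the BIOS compares against
    char    title[12];         // 0xA0: game name, not NUL-terminated
    char    gameCode[4];       // 0xAC: assigned ID code
    char    makerCode[2];      // 0xB0: assigned publisher code
    uint8_t fixedValue;        // 0xB2: documented as always 0x96
    uint8_t unitCode;          // 0xB3
    uint8_t deviceType;        // 0xB4
    uint8_t reserved1[7];      // 0xB5
    uint8_t softwareVersion;   // 0xBC
    uint8_t complementCheck;   // 0xBD: the value patched in after the build
    uint8_t reserved2[2];      // 0xBE
};

static_assert(offsetof(struct GbaRomHeader, title) == 0xA0, "title offset");
static_assert(offsetof(struct GbaRomHeader, complementCheck) == 0xBD, "check offset");
static_assert(sizeof(struct GbaRomHeader) == 0xC0, "header is 192 bytes");

// Check byte as documented: negate the sum of bytes 0xA0..0xBC, minus 0x19.
uint8_t HeaderComplementCheck(const struct GbaRomHeader *header)
{
    const uint8_t *bytes = (const uint8_t *)header;
    uint8_t sum = 0;
    for (size_t i = 0xA0; i <= 0xBC; i++)
        sum += bytes[i];
    return (uint8_t)(-(0x19 + sum));
}

// Fills in the text fields quoted above and patches in the check byte.
void FillExampleHeader(struct GbaRomHeader *header)
{
    memset(header, 0, sizeof(*header));
    memcpy(header->title, "POKEMON EMER", 12);
    memcpy(header->gameCode, "BPEE", 4);
    memcpy(header->makerCode, "01", 2);
    header->fixedValue = 0x96;
    header->complementCheck = HeaderComplementCheck(header);
}
```

Because the check byte itself sits outside the range being summed, a tool can fill in every other field first and patch that one byte last, which is the same order of operations the build uses.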
There is a second header, GFRomHeader, which is particular to this game engine. It enumerates where to find various information from the save file or about this particular version of the game so that a host application (such as Colosseum on GameCube) can safely extract them from the cartridge. It also serves as handy mini-documentation for what keyword to search to find many things you may have been curious about:
rom_header_gf.c
static const struct GFRomHeader sGFRomHeader = {
.version = GAME_VERSION,
.language = GAME_LANGUAGE,
.gameName = "pokemon emerald version",
.monFrontPics = gMonFrontPicTable,
.monBackPics = gMonBackPicTable,
.monNormalPalettes = gMonPaletteTable,
.monShinyPalettes = gMonShinyPaletteTable,
.monIcons = gMonIconTable,
.monIconPaletteIds = gMonIconPaletteIndices,
.monIconPalettes = gMonIconPaletteTable,
.monSpeciesNames = gSpeciesNames,
.moveNames = gMoveNames,
.decorations = gDecorations,
.flagsOffset = offsetof(struct SaveBlock1, flags),
.varsOffset = offsetof(struct SaveBlock1, vars),
.pokedexOffset = offsetof(struct SaveBlock2, pokedex),
.seen1Offset = offsetof(struct SaveBlock1, seen1),
.seen2Offset = offsetof(struct SaveBlock1, seen2),
.pokedexVar = VAR_NATIONAL_DEX - VARS_START,
.pokedexFlag = FLAG_RECEIVED_POKEDEX_FROM_BIRCH,
.mysteryEventFlag = FLAG_SYS_MYSTERY_EVENT_ENABLE,
.pokedexCount = NATIONAL_DEX_COUNT,
.playerNameLength = PLAYER_NAME_LENGTH,
.trainerNameLength = TRAINER_NAME_LENGTH,
.pokemonNameLength1 = POKEMON_NAME_LENGTH,
.pokemonNameLength2 = POKEMON_NAME_LENGTH,
// Two of the below 12s are likely move/ability name length, given their presence in this header
.unk5 = 12,
.unk6 = 12,
.unk7 = 6,
.unk8 = 12,
.unk9 = 6,
.unk10 = 16,
.unk11 = 18,
.unk12 = 12,
.unk13 = 15,
.unk14 = 11,
.unk15 = 1,
.unk16 = 8,
.unk17 = 12,
.saveBlock2Size = sizeof(struct SaveBlock2),
.saveBlock1Size = sizeof(struct SaveBlock1),
.partyCountOffset = offsetof(struct SaveBlock1, playerPartyCount),
.partyOffset = offsetof(struct SaveBlock1, playerParty),
.warpFlagsOffset = offsetof(struct SaveBlock2, specialSaveWarpFlags),
.trainerIdOffset = offsetof(struct SaveBlock2, playerTrainerId),
.playerNameOffset = offsetof(struct SaveBlock2, playerName),
.playerGenderOffset = offsetof(struct SaveBlock2, playerGender),
.frontierStatusOffset = offsetof(struct SaveBlock2, frontier.challengeStatus),
.frontierStatusOffset2 = offsetof(struct SaveBlock2, frontier.challengeStatus),
.externalEventFlagsOffset = offsetof(struct SaveBlock1, externalEventFlags),
.externalEventDataOffset = offsetof(struct SaveBlock1, externalEventData),
.unk18 = 0x00000000,
.speciesInfo = gSpeciesInfo,
.abilityNames = gAbilityNames,
.abilityDescriptions = gAbilityDescriptionPointers,
.items = gItems,
.moves = gBattleMoves,
.ballGfx = gBallSpriteSheets,
.ballPalettes = gBallSpritePalettes,
.gcnLinkFlagsOffset = offsetof(struct SaveBlock2, gcnLinkFlags),
.gameClearFlag = FLAG_SYS_GAME_CLEAR,
.ribbonFlag = FLAG_SYS_RIBBON_GET,
.bagCountItems = BAG_ITEMS_COUNT,
.bagCountKeyItems = BAG_KEYITEMS_COUNT,
.bagCountPokeballs = BAG_POKEBALLS_COUNT,
.bagCountTMHMs = BAG_TMHM_COUNT,
.bagCountBerries = BAG_BERRIES_COUNT,
.pcItemsCount = PC_ITEMS_COUNT,
.pcItemsOffset = offsetof(struct SaveBlock1, pcItems),
.giftRibbonsOffset = offsetof(struct SaveBlock1, giftRibbons),
.enigmaBerryOffset = offsetof(struct SaveBlock1, enigmaBerry),
.enigmaBerrySize = sizeof(struct EnigmaBerry),
.moveDescriptions = NULL,
.unk20 = 0x00000000, // 0xFFFFFFFF in FRLG
};
It is very curious that the game name is in all lower-case.
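The offsetof and sizeof entries are what let an outside program read a save without knowing the C structures. The sketch below shows the idea with a deliberately tiny stand-in. MiniSaveBlock, MiniHeader, ReadGender and ReadTrainerId are all hypothetical and far simpler than the real SaveBlock structures, and I am only assuming a host program works along these lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical, heavily simplified save block.
struct MiniSaveBlock {
    uint8_t playerName[8];
    uint8_t playerGender;
    uint8_t trainerId[4];
};

// Hypothetical header: the only thing the host needs to know in advance.
struct MiniHeader {
    uint32_t saveBlockSize;
    uint32_t playerNameOffset;
    uint32_t playerGenderOffset;
    uint32_t trainerIdOffset;
};

// The game fills this in at compile time, so it can never drift out of
// sync with the structure definition.
static const struct MiniHeader sMiniHeader = {
    .saveBlockSize = sizeof(struct MiniSaveBlock),
    .playerNameOffset = offsetof(struct MiniSaveBlock, playerName),
    .playerGenderOffset = offsetof(struct MiniSaveBlock, playerGender),
    .trainerIdOffset = offsetof(struct MiniSaveBlock, trainerId),
};

// Host side: sees only raw bytes plus the header's offsets.
uint8_t ReadGender(const struct MiniHeader *header, const uint8_t *rawSave)
{
    assert(header->playerGenderOffset < header->saveBlockSize);
    return rawSave[header->playerGenderOffset];
}

void ReadTrainerId(const struct MiniHeader *header, const uint8_t *rawSave, uint8_t out[4])
{
    assert(header->trainerIdOffset + 4 <= header->saveBlockSize);
    memcpy(out, rawSave + header->trainerIdOffset, 4);
}
```

If a later revision of the game inserted a new field before playerGender, the header would report the new offset automatically and a host written this way would keep working without being rebuilt.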
The variables marked as "unknown" in the decomp have the following names in the original code:
// unknowns 5-17
WAZA_NAME_SIZE,
ITEM_NAME_SIZE,
SEED_NAME_SIZE,
SPEABI_NAME_SIZE,
ZOKUSEI_NAME_SIZE,
MAPNAME_WIDTH,
MAPNAME_MAX,
TRTYPE_NAME_SIZE,
GOODS_NAME_SIZE,
ZUKAN_TYPE_SIZE,
EOM_SIZE,
BTL_TR_NAME_SIZE,
KAIWA_WORK_SIZE,
// unknown 18
0
// unknown 20
static const u8 IndexNull[ 0x100 - sizeof(POKEMON_ROM_HEADER)] = {};
Unknown 18 will take its secrets to the end of time, it seems. It's in a set of interoperability flags, but no name or explanation is given. Unknown 20 is not actually part of the structure in the original source code, but simply follows after it. It was intended as padding to reserve a full 256 bytes, but its existence is pointless as the structure is 344 bytes. The ancient compiler the original team was using must have decided that an array with a negative length (256 - 344) actually has a length of one rather than throwing an error, which would have brought it to their attention as no longer needed.
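The arithmetic behind that padding array is easy to check. The sketch below uses the 344-byte figure quoted above. PaddingBytes and UnsignedPadding are names invented for illustration, and the behavior shown is that of a modern C compiler, which says nothing certain about what the original toolchain did.

```c
#include <assert.h>
#include <stddef.h>

#define RESERVED_SIZE 0x100  // 256 bytes, as in the original padding expression
#define HEADER_SIZE   344    // size of the structure as stated above

// A guard like this (with the comparison flipped) is how a modern build
// would refuse to continue once the header outgrew its reservation.
static_assert(HEADER_SIZE > RESERVED_SIZE, "the header is larger than its 256-byte reservation");

// Signed view: a negative result means there is no room left for padding.
long PaddingBytes(size_t reserved, size_t used)
{
    return (long)reserved - (long)used;
}

// Unsigned view: sizeof yields an unsigned type in C, so the same
// subtraction wraps around to an enormous value instead of going negative.
size_t UnsignedPadding(size_t reserved, size_t used)
{
    return reserved - used;
}
```

Either way the expression stopped describing a sensible array length as soon as the structure grew past 256 bytes, which is the point at which a stricter compiler would have flagged it.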
Source Directory Layout
Most of the C source files that make up the bulk of the engine are in src/ but there are a few outside of it (gflib/) and assembly source files (.s) are scattered across several folders. asm/ contains macro definitions for the scripting engine; data/ contains an enormous amount of assorted high-level game data such as event scripts, in-game text, tilesets and maps; graphics/ contains visuals other than tilesets; include/ is C headers; sound/ includes songs and sound effects; tools/ contains utilities used in the build process that are not part of the game itself. Here you will find the scripts that convert the easily-editable forms of images and music stored in the repo to the formats expected by the game, and gbafix, which patches the final checksum into the header.
If you add new data resources to the game, you will likely need to wade into src/data/ and find the right header to patch to actually incorporate it into the ROM by name.
Continue on to Init and Main