# Ghidra headless on SAND's `GameAssembly.dll` (IL2CPP)

How to get a workable Ghidra database for the client, and the **big lesson**: for an IL2CPP binary you **inject the symbol table from Il2CppDumper**; you do *not* sit through full auto-analysis.

## Inputs (all already on disk)

- Binary: `/mnt/d/SteamLibrary/steamapps/common/Sand Playtest/GameAssembly.dll` (~137 MB).
- **Il2CppDumper ("yoten")**: `/mnt/c/Users/downloadpizza/Downloads/yoten/`, which produces, for the current build:
  - `script.json` (~254 MB): the **mapping**, i.e. `ScriptMethod[]` (Address+Name+Signature), `ScriptString[]`, `ScriptMetadata[]`, `Addresses[]` (all function starts).
  - `il2cpp.h` (~124 MB): every struct/type.
  - `dump.cs` (~79 MB): human-readable signatures/RVAs (mirrored to `il2cpp/dump.cs`).
  - Ready-made apply scripts: `ghidra.py` (names only), `ghidra_with_struct.py` (names+types+sigs), plus IDA variants. **These use an interactive `askFile()` dialog → not headless-safe as shipped.**
- Ghidra 11.1.2 install: `ghidra/ghidra_install/` (`support/analyzeHeadless`). Java 17.

## The lesson: symbol-inject, not full analysis

Full auto-analysis of a 137 MB IL2CPP binary is the **wrong default**:

- The **Decompiler Parameter ID** analyzer is essentially single-threaded and runs over hundreds of thousands of functions. Observed: **5h21m wall / ~5h40m CPU, pegged at ~105% (one core), with the log silent for 4.5h and no checkpoint**. Headless saves the project **only at the very end**, so a crash/OOM mid-run loses everything. No progress %/ETA is emitted.
- It largely **rediscovers** what Il2CppDumper already knows exactly (function boundaries, names, signatures). For our targeted-decompile workflow that's wasted time.

Instead: import with `-noanalysis` and inject the dumper's symbol table. You get a named, function-bounded DB in well under an hour. On-demand decompilation (`decomp_targets.py`) does its own per-function local analysis, so the global analyzers aren't needed for reading code.

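
To get a feel for what the symbol injection has to work with, the `script.json` arrays listed under Inputs can be inspected outside Ghidra. The following is only a minimal sketch in plain Python: the top-level keys and the `Address`/`Name` fields are taken from the description above, while the helper names (`load_mapping`, `summarize`, `count_unnamed_starts`) are hypothetical and not part of any shipped script. Note that `json.load` reads the whole ~254 MB file into memory.

```python
import json


def load_mapping(path):
    """Load script.json fully into memory (the real file is ~254 MB)."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def summarize(mapping):
    """Count the entries in each top-level array named in this document."""
    keys = ("ScriptMethod", "ScriptString", "ScriptMetadata", "Addresses")
    return {key: len(mapping.get(key, [])) for key in keys}


def count_unnamed_starts(mapping):
    """Count function starts in Addresses[] that have no ScriptMethod name."""
    named = {method["Address"] for method in mapping.get("ScriptMethod", [])}
    return sum(1 for addr in mapping.get("Addresses", []) if addr not in named)
```

For example, `summarize(load_mapping("script.json"))` gives a quick sanity check that the dump matches the current build before starting a headless run.
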
### Headless-adapted applier: `ghidra/scripts/apply_il2cpp_symbols.py`

Adapted from `yoten/ghidra.py`: replaced `askFile()` with a script-arg / default path. Light path: creates functions from `Addresses[]`, names them from `ScriptMethod[]`, labels string literals and metadata. **No `il2cpp.h` import, no signatures** (those need the type archive; see below).

```bash
cd /home/downloadpizza/sand_tools
# fresh project, import without analysis, inject symbols, save (background + log):
rm -rf ghidra/project; mkdir -p ghidra/project
_JAVA_OPTIONS= nohup ghidra/ghidra_install/support/analyzeHeadless ghidra/project SAND \
  -import "/mnt/d/SteamLibrary/steamapps/common/Sand Playtest/GameAssembly.dll" \
  -noanalysis -overwrite \
  -scriptPath ghidra/scripts -postScript apply_il2cpp_symbols.py \
  > ghidra/headless_symbols.log 2>&1 &
# optional: pass a different script.json path as the postScript arg.
```

### After the DB exists: targeted decompile / disasm (instant, no re-analysis)

Put `rvaname` lines in `ghidra/targets.txt`, then `-process` the saved program:

```bash
_JAVA_OPTIONS= ghidra/ghidra_install/support/analyzeHeadless ghidra/project SAND \
  -process GameAssembly.dll -noanalysis \
  -scriptPath ghidra/scripts -postScript decomp_targets.py \
  > ghidra/headless.log 2>&1
# -> ghidra/decomp.c   (or disasm_targets.py -> ghidra/disasm.txt)
```

`decomp_targets.py`/`disasm_targets.py` already `disassemble()`+`createFunction()` per target, so they work even on a bare `-noanalysis` import; with symbols injected they also resolve names/xrefs.

## Typed decompiles (optional, heavy)

For params shown as real types (`WalkerBlueprintDto *` …) use the `ghidra_with_struct.py` path: it imports `il2cpp.h` (124 MB) into Ghidra's `DataTypeManager` via the C parser **first**, then applies `ScriptMethod` signatures. The header parse is the slow / memory-hungry step (the usual OOM culprit). Usually unnecessary: `il2cpp/dump.cs` already has every signature for reference.
Only do it if you specifically need typed struct fields in the decompiler.

## Address convention (verified)

Il2CppDumper `script.json` `Address` = the Ghidra **offset from image base** directly: `baseAddress.add(Address)` (image base `0x180000000`). **No `-0x1000`.**

(Note: the local `ghidra/methods.tsv` index used by `reverse/resolve_decomp.py` stores `rva = scriptAddress - 0x1000` for its own bookkeeping. That is a different thing; don't conflate the two.)

## Memory / gotchas

- `analyzeHeadless` has `MAXMEM=8G` (already bumped). **But the shell exports `_JAVA_OPTIONS=-Xmx4g`**, which silently caps the heap at 4 GB and causes swap thrash. Always prefix runs with `_JAVA_OPTIONS=` to clear it. Machine has ~11 GiB RAM.
- The run is detached via `nohup` (survives the session); it is **not** in tmux/screen. Watch with `tail -f ghidra/headless_symbols.log`. `REPORT: Save succeeded` = done.
- `ghidra/` is git-ignored (install + project + dumps, all large/regenerable).

## Tooling map (`reverse/`, `ghidra/scripts/`)

> `ghidra/` is git-ignored, so the **tracked masters** live in `reverse/` (`ghidra_*.py`); copy them
> into `ghidra/scripts/` (where `-scriptPath` points) to run, e.g.
> `cp reverse/ghidra_apply_il2cpp_symbols.py ghidra/scripts/apply_il2cpp_symbols.py`.

- `reverse/ghidra_apply_il2cpp_symbols.py` → `ghidra/scripts/apply_il2cpp_symbols.py`: headless symbol injector (this doc).
- `ghidra/scripts/decomp_targets.py`: decompile `targets.txt` → `ghidra/decomp.c`.
- `ghidra/scripts/disasm_targets.py`: disassemble `targets.txt` → `ghidra/disasm.txt` (fast, no analysis).
- `reverse/il2cpp_re.py`: VA↔file-offset, method index from `dump.cs`, xrefs, body disasm + float consts.
- `reverse/resolve_decomp.py`: annotate `ghidra/decomp.c` with symbol names + string literals.
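
When moving between these tools, the two address spaces from the Address convention section are easy to mix up. The sketch below writes the arithmetic out in plain Python. It assumes only the values stated in this document (image base `0x180000000`, and `methods.tsv` storing `scriptAddress - 0x1000`); the function names are hypothetical and do not come from `reverse/il2cpp_re.py` or `reverse/resolve_decomp.py`.

```python
IMAGE_BASE = 0x180000000  # image base quoted in the Address convention section
TSV_BIAS = 0x1000         # amount methods.tsv subtracts for its own bookkeeping


def script_address_to_va(address):
    """script.json Address -> Ghidra virtual address (no -0x1000 adjustment)."""
    return IMAGE_BASE + address


def script_address_to_tsv_rva(address):
    """script.json Address -> the rva value stored in ghidra/methods.tsv."""
    return address - TSV_BIAS


def tsv_rva_to_va(rva):
    """methods.tsv rva -> Ghidra virtual address, undoing the bookkeeping bias."""
    return script_address_to_va(rva + TSV_BIAS)


if __name__ == "__main__":
    # Illustrative Address value only; not taken from the real dump.
    print(hex(script_address_to_va(0x1234560)))
```

Keeping the bias in one named constant makes it harder to apply the `-0x1000` adjustment to a `script.json` address by accident.
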