# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## What this repo is

Reverse-engineering / data-mining toolkit for the game **SAND** (Hologryph, Unity 6000.0.40f1,
IL2CPP). It extracts server-authoritative and static game data from three sources, and reads/writes
the game's local walker save files. There is no application to build — everything is standalone
Python scripts run against game files, network captures, or live servers.

The four data sources (the three extraction sources plus the local saves), and which tools own them:
- **Unity asset bundles** (static config: items, recipes, loot, islands) → `bundle/`
- **Master server** `wss://<region>.hologryph.com/gameclient/` (economy, walker blueprints, research) → `reverse/master_scrape.py`
- **PlayFab** (Azure; **auth-only** for this title — Economy/catalog disabled) → `reverse/playfab_scrape.py`
- **`.wbt` walker save files** (local, on disk) → `walker/`

## Working rules (from operator memory — follow these)

- **Data only, never heuristics.** Do not invent rules or fill gaps with plausible assumptions.
  Derive every value from game files, decompiled code, or captured payloads — or ask. (An invented
  "guns point outward" rule and a guessed rotation→facing mapping both produced wrong results.)
- **Report what the data shows, not inferences as fact.** Don't jump to conclusions.
- **No polling wait-loops.** Use background tasks + wakeup notifications; don't `sleep`-poll for completion.
- **Don't hammer the live server.** It is a real playtest backend. Warn the operator *before* any
  action that makes repeated/abnormal connections. BattlEye is active in the game — all scraping is
  done **outside** the game process (replayed protocol / captures / REST), never via injection.
- **A `/connect` scrape kicks the live player** (single session per account, newest wins — verified
  2026-06-16, see `docs/MASTER_SERVER.md`). Don't open `/connect` while the operator is in-game.

## Environment & how to run

- Use the project venv: **`venv/bin/python <script>`** (has `UnityPy`, `bson`/pymongo, `scapy`, `Pillow`, `websockets`).
- Symlinks (git-ignored, machine-specific — repoint if the machine changes):
  - `bundles/` → game `StreamingAssets/aa/StandaloneWindows64/` (35 bundles, ~6.8 GB)
  - `Walkers/` → `…/LocalLow/Hologryph/SAND/Data/Walkers/` (live `.wbt` saves)
- Game install referenced by `bundle/` tools: `/mnt/d/SteamLibrary/steamapps/common/Sand Playtest`
  (`GameAssembly.dll` + `Sand_Data/il2cpp_data/Metadata/global-metadata.dat`).
- IL2CPP source of truth: `il2cpp/dump.cs` (Il2CppDumper output — signatures/RVAs only, no method
  bodies). `il2cpp/`, `ghidra/`, `snapshots/`, `bundles`, `Walkers`, `reverse/.secrets/` are
  git-ignored (large/regenerable/secret). Live PlayFab token: `reverse/.secrets/playfab_token.json`.
- **Game runtime data dir** (`%USERPROFILE%\AppData\LocalLow\Hologryph\SAND`, here
  `/mnt/c/Users/DownloadPizza/AppData/LocalLow/Hologryph/SAND/`) holds:
  - **`Player.log`** — Unity log; check it to see what the client did (walker file reads:
    `[FS_STANDALONE] ReadAllFilesAsync … Path: …/Data/Walkers`; master-server handshake:
    `[MasterServer] … Login / Connection failed to /connect`). `Player-prev.log` = previous run.
  - **`Data/Walkers/*.wbt`** — the live walker saves (the `Walkers/` symlink points here).
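Checking `Player.log` for the client's walker reads and master-server handshake can be done with a small filter. The sketch below is illustrative only: the function names are hypothetical and not part of the repo, and the marker substrings are simply the ones quoted in this file.

```python
from pathlib import Path

# Substrings quoted in this file; extend the tuple as needed.
MARKERS = ("[FS_STANDALONE]", "[MasterServer]", "CheckValidBlueprint")


def interesting_lines(log_text, markers=MARKERS):
    """Yield (line_number, line) for every log line containing one of the markers."""
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in markers):
            yield number, line


def scan_player_log(path):
    """Read a Player.log (or Player-prev.log) and return its marker lines."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return list(interesting_lines(text))
```

`scan_player_log` takes the path to `Player.log` inside the runtime data dir. It only reports lines that are present in the log and makes no interpretation of them.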

## The `.wbt` walker save format (current focus)

Envelope (RE'd from `XorCryptography.Encrypt`, verified byte-exact on all 5 local walkers):
```
save: Newtonsoft-BSON -> XOR encrypt -> gzip
load: gunzip -> XOR decrypt -> Newtonsoft-BSON parse
```
- XOR key (current build): `70 DD 1F 2A 0B 4A` (6 bytes), applied **per 0xA000-byte chunk** with the
  key index reset to 0 at each chunk boundary: `decoded[i] = raw[i] XOR KEY[(i % 0xA000) % 6]`.
  If a game update changes the key, recover it with no RE via `walker/recover_key.py`.
- `pymongo`'s `bson.encode` reproduces Newtonsoft.Bson byte-for-byte, so decode→encode is identity.
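The envelope described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `walker/sand.py`; the function names are hypothetical. The BSON step is left out to keep the block dependency-free (with the venv's pymongo, `bson.decode` on the unwrapped bytes would be the natural next call).

```python
import gzip

KEY = bytes.fromhex("70DD1F2A0B4A")  # current-build key quoted above
CHUNK = 0xA000                       # key index restarts at every chunk boundary


def xor_chunks(data, key=KEY, chunk=CHUNK):
    """XOR data with key, restarting the key index at each chunk boundary.

    XOR is its own inverse, so the same function encrypts and decrypts.
    """
    return bytes(b ^ key[(i % chunk) % len(key)] for i, b in enumerate(data))


def unwrap(wbt_bytes):
    """Load path: gunzip, then XOR decrypt. Returns the raw BSON document bytes."""
    return xor_chunks(gzip.decompress(wbt_bytes))


def wrap(bson_bytes):
    """Save path: XOR encrypt, then gzip. Input is an already-encoded BSON document."""
    return gzip.compress(xor_chunks(bson_bytes))
```

The gzip header fields and compression level are not specified in this file, so the output of `wrap` may differ at the gzip-container level from a file the game writes even when the payload is identical.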

### The 5 hashes (a `.wbt` is a serialized `WalkerBlueprintDto`)

All = `MD5(UTF8(JsonConvert.SerializeObject(obj))).hexUPPER` — Newtonsoft compact JSON: no whitespace,
PascalCase, **member declaration order** (do NOT sort keys), nulls included, **enums as NAME strings**.

| Hash | Scope | Offline-computable? |
|------|-------|---------------------|
| `CompartmentsHash` | top-level: `MD5(JSON(Compartments list))` | **YES** — `walker/walker_hashes.py` |
| `ConnectionsHash` | top-level: `MD5(JSON(Connections list))` | **YES** — `walker/walker_hashes.py` |
| `CompartmentHash` | per-part: placement from CompartmentsDatabase | **YES** — constant per EpbId+placement |
| `DefinitionsHash` | top-level: `MD5(JSON(Compartments→CompartmentDefinitionDto))` | **NO** — server-sourced |
| `DefinitionHash` | per-part: `MD5(JSON(CompartmentDefinitionDto))` | **NO** — server-sourced |

The two **Definition** hashes hash the rich server-side `CompartmentDefinitionDto`, which is **not**
in the blueprint and **not** equal to the server's `GetCompartmentDefinitions` pricing DTO. They can
only be **harvested** (every part placed in-game writes them into the save) — see
`extracted/definition_hashes_known.json` (~18/126 parts) and `walker/harvest_hashes.py`. When editing
offline, `build_wbt.py pack` recomputes the two Compartment* hashes and **copies/reuses** the
Definition* hashes from the source — correct as long as the *set of part definitions* is unchanged;
it raises if you add a part whose `DefinitionHash` has never been harvested.

**Hash lifecycle (verified live 2026-06-16 — see `docs/TRAMPLER.md`):** the client **recomputes all 5
hashes on save** from its own database (same build ⇒ byte-identical hashes; no per-walker secret —
plain unsalted MD5, they're integrity/version-staleness markers, **not** security). Wiping any/all
hashes is harmless: a walker with blank hashes still **loads, lists, and opens in the editor**, and
**one in-editor save regenerates everything** (and mints a new file UUID + `UniqueId`). The `VERSION`
flag in `Player.log`'s `CheckValidBlueprint: ERRORS {0}, VERSION:{1}`
(=`WalkerBlueprintContainer.ValidateVersion`, which recomputes against the *current* DB/definitions)
tracks only the **structural** hashes (Compartments/Connections); Definition hashes don't affect
client validation. Server-side upsert validation is untested (the master-server `/connect` blocker is
cleared as of 2026-06-16 — server back up — but the live re-test has not been run yet).

Enum tables (from `dump.cs`): `ConnectionSlotType` 0 DOOR,1 HATCH,2 STRUCTURE,3 BALCONY,4 DECK ·
`ConnectionState` 0 DEFAULT,1 DOOR,2 OPEN · `ConnectionsCount` 0 FULL,1 PARTIAL,2 ERROR. Note the
master-server **WS form** serializes these as integers and omits null `EpbId`; the storage/hash form
uses name strings and includes `EpbId:null` — convert before hashing (`reverse/walkerdto_to_blueprint.py`).
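The hash recipe and the WS-to-storage conversion from this section can be sketched together. This is a hedged illustration, not the code in `walker/walker_hashes.py` or `reverse/walkerdto_to_blueprint.py`, which remain the verified implementations. The function names are hypothetical, and the DTO field names and their declaration order are not given in this file, so the caller must supply them from `dump.cs`. Python's `json` module may also differ from Newtonsoft in float formatting and string escaping, so treat the output as unverified until checked against a real save.

```python
import hashlib
import json

# Enum tables copied from the paragraph above (values from dump.cs).
ENUMS = {
    "ConnectionSlotType": ["DOOR", "HATCH", "STRUCTURE", "BALCONY", "DECK"],
    "ConnectionState": ["DEFAULT", "DOOR", "OPEN"],
    "ConnectionsCount": ["FULL", "PARTIAL", "ERROR"],
}


def to_storage_form(ws_obj, field_order, enum_of):
    """Convert one WS-form object to the storage/hash form.

    field_order: the DTO's member declaration order (take it from dump.cs).
    enum_of:     field name -> enum name in ENUMS. The field names are NOT
                 given in this file; whatever is passed here is an assumption.
    """
    out = {}
    for name in field_order:
        # The WS form omits a null EpbId; the storage form spells it out.
        value = ws_obj.get(name) if name == "EpbId" else ws_obj[name]
        enum_name = enum_of.get(name)
        if enum_name is not None and isinstance(value, int):
            value = ENUMS[enum_name][value]  # integer -> enum NAME string
        out[name] = value
    return out


def blueprint_md5(obj):
    """Upper-case hex MD5 of compact JSON; key order is kept exactly as given."""
    text = json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
    return hashlib.md5(text.encode("utf-8")).hexdigest().upper()
```

Rebuilding each object in `field_order` matters because the hash covers the serialized text: the same fields in a different order produce a different MD5.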

## Tools (all scripts, with subcommands)

### `walker/` — `.wbt` save files (offline edit + hashes)
- **`sand.py`** — low-level toolkit. Subcommands: `decode <wbt> [-o]` · `snap (--all | files…)` ·
  `diff <before> <after> [--no-filter]` · `check <wbt> [--no-filter]` · `watch <wbt> [--interval]`.
- **`build_wbt.py`** — high-level edit/build. Subcommands: `repack <wbt>` (identity sanity) ·
  `rename <wbt> <first> <second> [-o]` (name indices 0–31) · `pack <wbt> -o out [--no-strict]`
  (recompute hashes, write fresh) · `get-icon <wbt> [-o png]` · `set-icon <wbt> <png> [-o]`.
- **`harvest_hashes.py`** — scan saves+snapshots, merge `EpbId→{DefinitionHash,CompartmentHash}` into
  the known-hashes table. Usage: `harvest_hashes.py [extra_dir …]`.
- **`recover_key.py`** — recover the XOR key from known-plaintext (the icon background pixel) after a
  game update; no RE needed. Usage: `recover_key.py <wbt> …`.
- **`walker_hashes.py`** — reproduce `CompartmentsHash`/`ConnectionsHash` offline (the verified module).
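The known-plaintext idea behind `recover_key.py` can be illustrated with a minimal sketch. It is not the script's actual code: the function name and parameters are hypothetical, and it assumes the key length (6) and chunk size (0xA000) survive the update. How the real tool locates the icon background pixel is not shown here; the caller supplies the plaintext and its offset.

```python
def recover_key(xored, known, offset, key_len=6, chunk=0xA000):
    """Recover key bytes from a run of known plaintext.

    xored:  gunzipped (still XOR-encrypted) .wbt bytes
    known:  plaintext bytes believed to sit at `offset` in the decrypted data
    Returns a list of key bytes; slots never covered by `known` stay None.
    Raises ValueError if two positions disagree (wrong offset or plaintext).
    """
    key = [None] * key_len
    for j, plain in enumerate(known):
        i = offset + j
        slot = (i % chunk) % key_len       # same indexing as the decode formula
        candidate = xored[i] ^ plain       # cipher XOR plain = key byte
        if key[slot] is not None and key[slot] != candidate:
            raise ValueError(f"inconsistent key byte at slot {slot}")
        key[slot] = candidate
    return key
```

A run of at least `key_len` known bytes inside one chunk covers every key slot. Repeated coverage of the same slot acts as a consistency check, although a uniformly wrong guess (every plaintext byte off by the same XOR mask) would still pass it.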

### `bundle/` — Unity asset-bundle extraction (static data)
All use UnityPy with an IL2CPP TypeTreeGenerator (`GameAssembly.dll` + `global-metadata.dat`).
- **`unitybundle.py`** — minimal UnityFS extractor (LZ4/LZ4HC + uncompressed). `unitybundle.py [needle]`.
- **`odin_read.py`** — Sirenix **Odin** Binary (SerializedFormat=0) reader; used to decode
  `SerializedBytes` blobs. `odin_read.py <file> [out]`.
- **`extract_data.py`** — generic MonoBehaviour extractor via typetrees → JSON in `extracted/`.
- **`extract_loot.py`** — loot/drop tables (Odin) → `extracted/loot_tables.json`.
- **`extract_production_lines.py`** — world conveyor single-recipe production lines → `extracted/production_lines.json`.
- **`extract_conveyor_placements.py`** — map islands→conveyors → `extracted/conveyor_placements.json`.
- **`extract_island_names.py`** — prefab→in-game Toponym (via `LandmarkBehaviour`) → `extracted/island_names.json`.
- **`extract_i2.py`** — I2 Localization English term table (manual parse) → i2 terms JSON.
- **`workbench_bundles.py`** — workbench EntityBlueprint → referenced `CraftingRecipeBundle`s.
- **`discord_recipes.py`** — emit Discord monospace recipe tables (workbench + production lines).
- **`component_census.py`** — tally ECS `$type` components across all 1446 EntityBlueprints. `component_census.py [filter]`.
- **`dump_blueprint.py`** — fully decode named EntityBlueprint(s): components + scalar fields. `dump_blueprint.py <base> …`.
- **`dump_loot_bytes.py`** / **`loot_probe.py`** — raw Odin byte dump / locate loot configs (analysis helpers).

### `reverse/` — network scraping + IL2CPP RE
- **`master_scrape.py`** — **the working master-server client** (2026-06-15 build). Two-socket
  ClientMessage handshake: `/login` (no header) → `/connect` (`Authorization: <server ticket>`).
  Flags: `--region {ger,eus,…}` `--go` (arms network access) `--data` `--user` `--client-version` `--insecure`
  `--selftest`. Does nothing over the network without `--go`. See `docs/MASTER_SERVER.md` for the full
  `ClientAction` enum / `OperationResult<T>` envelope.
- **`playfab_scrape.py`** — PlayFab REST (read-only), runs outside the game. Required `--title-id`;
  auth via `--steam-ticket` or `--entity-token`; modes `--catalog` `--inventory` `--titledata`.
  (Note: catalog/economy is disabled for this title — PlayFab is effectively auth-only.)
- **`capture_hosts.py`** — triage a pcap: DNS/SNI/endpoints, prints the PlayFab TitleId + master region. `capture_hosts.py <pcap>`.
- **`noise_filter.py`** — baseline-subtract a "SAND-off" pcap from a session pcap to isolate game traffic. `noise_filter.py <baseline> [session]`.
- **`ws_scrape.py`** — decode master-server WS frames from a pcap (older cleartext-era decoder; tries JSON/BSON/MessagePack). `ws_scrape.py <pcap> [--port --host --out]`.
- **`trampler_hashes.py`** — generate the blueprint hashes from scratch (Definition hash provisional). Self-test: run directly.
- **`walkerdto_to_blueprint.py`** — convert master-server `WalkerDto` (e.g. `GetExpedition.Trampler`) → loadable `WalkerBlueprintDto` + recompute hashes. Self-verifies via round-trip.
- **`render_trampler.py`** — render a multi-floor PNG map of a trampler (footprints, doors/hatches, guns) → `extracted/host_trampler_*.png`.
- **`il2cpp_re.py`** — IL2CPP helpers: VA↔file-offset, method index from `dump.cs`, xref finder, body disasm + float-constant extraction.
- **`resolve_decomp.py`** — annotate `ghidra/decomp.c` with symbol names + string literals. `resolve_decomp.py [substr]`.
- **`ghidra_decomp_targets.py`** / **`find_damage_writes.py`** — Ghidra headless decompile-target script / scan decomp for damage-write fingerprint.
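The VA↔file-offset mapping listed for `il2cpp_re.py` is standard PE section arithmetic, sketched below. This is not the helper's actual code: the `Section` tuple and the function names are hypothetical, and the section values must be read from the DLL's own section table rather than assumed.

```python
from typing import NamedTuple, Optional


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA of the section start (VirtualAddress)
    virtual_size: int     # VirtualSize
    raw_pointer: int      # PointerToRawData
    raw_size: int         # SizeOfRawData


def rva_to_offset(rva, sections) -> Optional[int]:
    """Map an RVA to a file offset, or None if no file bytes back it."""
    for s in sections:
        delta = rva - s.virtual_address
        if 0 <= delta < min(s.virtual_size, s.raw_size):
            return s.raw_pointer + delta
    return None


def offset_to_rva(offset, sections) -> Optional[int]:
    """Inverse mapping: file offset back to an RVA."""
    for s in sections:
        delta = offset - s.raw_pointer
        if 0 <= delta < min(s.virtual_size, s.raw_size):
            return s.virtual_address + delta
    return None


def va_to_offset(va, image_base, sections) -> Optional[int]:
    """A VA is image base + RVA, so subtract the base first."""
    return rva_to_offset(va - image_base, sections)
```

Since `dump.cs` lists RVAs, `rva_to_offset` is the call that would turn a method's RVA into a position inside `GameAssembly.dll`.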

### `wikigen/` — generate MediaWiki pages from `extracted/`
`make_items_wiki.py` · `make_crafting_wiki.py` · `make_loot_wiki.py` (→ `wiki/*.mediawiki`) ·
`render_wiki.py` (wikitext → standalone HTML in `wiki_site/`, git-ignored).

## Reference docs (`docs/`)

- **`MASTER_SERVER.md`** — master-server WebSocket protocol & scrape (transport, two-socket handshake, ClientAction enum, OperationResult).
- **`BACKEND_PLAYFAB.md`** — PlayFab is auth-only; read the corrections block at top.
- **`TRAMPLER.md`** — walker blueprint structure, the hashes, footprints, rendering.
- **`TASK.md`** — summary of the cracked `.wbt` format (BSON-verified).
- **`PRODUCTION_LINES.md`**, **`SALES_VALUE.md`**, **`WEAPON_DAMAGE.md`** — static-data location maps (track across updates).
- **`SCRAPE_RUNBOOK.md`** — read-only live-scrape steps for when a playtest is online.
- **`GHIDRA.md`** — headless Ghidra on `GameAssembly.dll`: **inject Il2CppDumper symbols, don't full-analyze** (`ghidra/scripts/apply_il2cpp_symbols.py`); targeted decompile/disasm; the `_JAVA_OPTIONS` heap gotcha.
- **`BUNDLES.md`** (repo root) — inventory of the 35 asset bundles.

Operator memory lives in `~/.claude/projects/-home-downloadpizza-sand-tools/memory/` (loaded each session).