Built and unit-tested ahead of a live playtest window:

- reverse/capture_hosts.py: pcap -> DNS/SNI/endpoints in order; extracts PlayFab TitleId, flags hologryph master-server region + config CDN.
- reverse/ws_scrape.py: TCP reassembly + RFC-6455 framing for the cleartext ws://<region>.hologryph.com/gameclient/ stream; decodes JSON/BSON/MessagePack; auto-labels ServerDto, CompartmentDefinitionDto, ResearchNodeJsonDto, OperationResult, etc. No MITM needed.
- reverse/playfab_scrape.py: LoginWithSteam (or captured EntityToken) -> Catalog/SearchItems (+ Inventory/TitleData); prices resolved to item names. Read-only.
- docs/SCRAPE_RUNBOOK.md: turnkey steps for when servers are online.
# Live-scrape runbook: when a playtest is online

Everything below is read-only and runs outside the game process (no BattlEye interaction).
Tooling is built and unit-tested; the only thing that needs a live backend is the data itself.

## 0. Capture (once servers are up)

1. Run `ipconfig /flushdns` so every hostname is resolved again and shows up as a clean DNS query in the capture, incl. the PlayFab host that carries the TitleId.
2. Start a packet capture on the game NIC (Wireshark, or `pktmon`/`dumpcap`). Save as `.pcapng`.
   - Master-server traffic is **cleartext `ws://` on port 80**: Wireshark reads it directly,
     **no MITM/cert needed**.
   - PlayFab is HTTPS/443: to read its bodies you need your MITM (cert already installed) on 443,
     or use the REST scraper (section 3) instead.
3. Launch SAND, **click through past the "no servers"/welcome dialog and let it log in**, then open
   the screens whose data you want (walker editor → compartment defs; research tree; store → prices).
   Keep capturing through it. Stop the capture.
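
If `dumpcap` is the capture tool, the invocation can be scripted. This is only a sketch: `build_dumpcap_cmd` is a hypothetical helper that is not part of the repo, the interface name is a placeholder, and the optional capture filter is an assumption (leave it off if you want every endpoint in the capture).

```python
import subprocess
from typing import List, Optional


def build_dumpcap_cmd(interface: str, out_path: str,
                      capture_filter: Optional[str] = None) -> List[str]:
    """Build a dumpcap command line (hypothetical helper, not from the repo).

    -i selects the interface, -w the output file, -f an optional BPF
    capture filter. dumpcap writes pcapng by default.
    """
    cmd = ["dumpcap", "-i", interface, "-w", out_path]
    if capture_filter:
        cmd += ["-f", capture_filter]
    return cmd


# Example (not executed here): capture everything on a placeholder NIC name
# until interrupted with Ctrl+C.
# subprocess.run(build_dumpcap_cmd("Ethernet", "sand_login.pcapng"), check=False)
```

Capturing unfiltered keeps unknown endpoints visible to the triage step; a filter such as `tcp port 80 or tcp port 443 or udp port 53` would shrink the file but could hide traffic on other ports.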

## 1. Triage the capture → get the TitleId + confirm the master server

```bash
venv/bin/python reverse/capture_hosts.py <capture.pcapng>
```
Prints DNS/SNI/endpoints in order and a **BACKENDS DETECTED** block:
- `PlayFab host=<id>.playfabapi.com ** TitleId = <ID> **` ← the one constant the REST scraper needs
- `Master server host=<region>.hologryph.com (ws://80 cleartext)`
- `Config CDN host=sandconfigstorage…`
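
The TitleId falls out of the PlayFab hostname itself. A minimal sketch of that step, not the actual `capture_hosts.py` code: the function names are hypothetical, and the assumption that a title ID is a short hex label in front of `.playfabapi.com` should be checked against the first real capture.

```python
import re
from typing import Iterable, Optional

# Assumption: API hosts look like "<titleid>.playfabapi.com" and the
# title id is a hex string. A trailing dot (FQDN form in DNS) is tolerated.
_PLAYFAB_HOST = re.compile(r"^([0-9a-f]+)\.playfabapi\.com\.?$", re.IGNORECASE)


def title_id_from_host(host: str) -> Optional[str]:
    """Return the upper-cased TitleId if host is a PlayFab API host, else None."""
    match = _PLAYFAB_HOST.match(host.strip())
    return match.group(1).upper() if match else None


def first_title_id(hostnames: Iterable[str]) -> Optional[str]:
    """Scan DNS/SNI hostnames in capture order and return the first TitleId seen."""
    for host in hostnames:
        title_id = title_id_from_host(host)
        if title_id:
            return title_id
    return None
```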

## 2. Master server (compartments + research tree + server list): cleartext, no auth replay

```bash
venv/bin/python reverse/ws_scrape.py <capture.pcapng> --out extracted/master_ws.json
```
Reassembles the port-80 WebSocket to `*.hologryph.com/gameclient/`, parses RFC-6455 frames, and
decodes each message (tries JSON → BSON → MessagePack; the game's `IDataSerializer` is most likely JSON).
Messages are auto-tagged when their shape matches a known DTO:
`ServerDto`, `RegionInfo`, **`CompartmentDefinitionDto`** (HP/Weight/Properties/prices),
**`ResearchNodeJsonDto`** (connections via `RequiredNodesIds`/`DependentNodesIds`, costs via
`ResearchPrice`), `ItemDto`/`ShopItemDto`/`PriceDto`, `OperationResult`, `IClientEvent`.
If it finds no WS stream, either the capture didn't span the master-server connection (re-capture
through the login) or the defaults don't match the stream; in that case override them with `--port`/`--host`.
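
For reference, the RFC-6455 framing step can be sketched as below. This is an illustration of the wire format and not the code in `ws_scrape.py`; `iter_ws_frames` is a hypothetical name, and it assumes the input is one direction of the reassembled TCP stream with the HTTP upgrade handshake already stripped.

```python
import struct
from typing import Iterator, Tuple


def iter_ws_frames(buf: bytes) -> Iterator[Tuple[bool, int, bytes]]:
    """Yield (fin, opcode, payload) for every complete frame in buf.

    Stops silently at the first truncated frame.
    """
    pos = 0
    while pos + 2 <= len(buf):
        b0, b1 = buf[pos], buf[pos + 1]
        fin = bool(b0 & 0x80)
        opcode = b0 & 0x0F          # 1 = text, 2 = binary, 8 = close, ...
        masked = bool(b1 & 0x80)    # client-to-server frames are masked
        length = b1 & 0x7F
        header = 2
        if length == 126:           # 16-bit extended length
            if pos + 4 > len(buf):
                break
            (length,) = struct.unpack_from("!H", buf, pos + 2)
            header = 4
        elif length == 127:         # 64-bit extended length
            if pos + 10 > len(buf):
                break
            (length,) = struct.unpack_from("!Q", buf, pos + 2)
            header = 10
        mask = b""
        if masked:
            if pos + header + 4 > len(buf):
                break
            mask = buf[pos + header:pos + header + 4]
            header += 4
        end = pos + header + length
        if end > len(buf):
            break                   # payload not fully captured
        payload = buf[pos + header:end]
        if masked:
            payload = bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
        yield fin, opcode, payload
        pos = end
```

Fragmented messages (a frame with `fin` false followed by continuation frames, opcode 0) still need to be concatenated by the caller before decoding.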

> First run, eyeball one frame to confirm the encoding (JSON vs BSON). The decoder already handles
> both; this is just a sanity check.
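
The sanity check can also be done mechanically on a single payload. A rough sketch, assuming only the general properties of the two formats (JSON text starts with `{` or `[`; a BSON document starts with its own total length as a little-endian int32 and ends with a zero byte); `sniff_encoding` is a hypothetical helper and is not part of the decoder.

```python
import json
import struct


def sniff_encoding(payload: bytes) -> str:
    """Return 'json', 'bson', or 'unknown' (which may mean MessagePack)."""
    stripped = payload.lstrip()
    if stripped[:1] in (b"{", b"["):
        try:
            json.loads(stripped.decode("utf-8"))
            return "json"
        except ValueError:  # covers UnicodeDecodeError and JSONDecodeError
            pass
    if len(payload) >= 5:
        # BSON: first 4 bytes are the whole document's length, last byte is 0x00.
        (declared,) = struct.unpack_from("<i", payload, 0)
        if declared == len(payload) and payload[-1] == 0:
            return "bson"
    return "unknown"
```

An `unknown` result on every frame would suggest the MessagePack path or an unexpected wrapper around the payload.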

## 3. PlayFab prices / catalog / inventory

Either read them from the MITM'd 443 capture, **or** pull them directly (cleaner, gets the *full*
catalog, more than the client requests):

```bash
# with a Steam auth ticket (captured, or minted via Steamworks GetAuthSessionTicket):
venv/bin/python reverse/playfab_scrape.py --title-id <ID> --steam-ticket <hex> --catalog --inventory

# or skip login with an EntityToken lifted from your MITM capture:
venv/bin/python reverse/playfab_scrape.py --title-id <ID> --entity-token <tok> --catalog
```
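
For orientation, the two REST calls behind those flags look roughly like this. The sketch only builds the requests and sends nothing; the helper names are hypothetical, and the endpoint paths, body fields and `X-EntityToken` header follow my reading of the public PlayFab REST reference, so verify them against a captured request before relying on them.

```python
import json
from typing import Dict, Optional, Tuple

Request = Tuple[str, Dict[str, str], bytes]  # (url, headers, JSON body)


def playfab_url(title_id: str, path: str) -> str:
    """Per-title API host plus endpoint path."""
    return f"https://{title_id.lower()}.playfabapi.com/{path.lstrip('/')}"


def login_with_steam_request(title_id: str, steam_ticket_hex: str) -> Request:
    """Client/LoginWithSteam; CreateAccount is off so the call stays read-only."""
    body = {
        "TitleId": title_id,
        "SteamTicket": steam_ticket_hex,
        "CreateAccount": False,
    }
    headers = {"Content-Type": "application/json"}
    return playfab_url(title_id, "Client/LoginWithSteam"), headers, json.dumps(body).encode()


def search_items_request(title_id: str, entity_token: str,
                         continuation_token: Optional[str] = None,
                         count: int = 50) -> Request:
    """Catalog/SearchItems, one page; pass the returned ContinuationToken to page on."""
    body = {"Count": count}
    if continuation_token:
        body["ContinuationToken"] = continuation_token
    headers = {"Content-Type": "application/json", "X-EntityToken": entity_token}
    return playfab_url(title_id, "Catalog/SearchItems"), headers, json.dumps(body).encode()
```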
`--catalog` → `extracted/playfab_catalog.json`: every item with `PriceOptions` (→ currency-item +
amount, names resolved via `extracted/item_names.json`) and `DisplayProperties` (check here for any
catalog-authored base stats). `--inventory` → wallet + items + transaction history. `--titledata`
→ `Client/GetTitleData` config blobs. Read-only endpoints only: no write/purchase calls.
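
Resolving prices to names is a lookup over each item's `PriceOptions`. A sketch under two assumptions that need checking against real output: that the catalog uses the `PriceOptions.Prices[].Amounts[]` shape with `ItemId`/`Amount` fields, and that `extracted/item_names.json` is a flat id-to-name map. `resolve_prices` is a hypothetical name.

```python
from typing import Any, Dict, List


def resolve_prices(item: Dict[str, Any],
                   names: Dict[str, str]) -> List[List[Dict[str, Any]]]:
    """Return one list of {currency_id, currency, amount} per price option.

    Unknown currency ids fall back to the raw id so nothing is dropped.
    """
    options = (item.get("PriceOptions") or {}).get("Prices") or []
    resolved = []
    for option in options:
        amounts = []
        for amount in option.get("Amounts") or []:
            currency_id = amount.get("ItemId")
            amounts.append({
                "currency_id": currency_id,
                "currency": names.get(currency_id, currency_id),
                "amount": amount.get("Amount"),
            })
        resolved.append(amounts)
    return resolved
```

Items without `PriceOptions` yield an empty list, which keeps free or unpriced entries distinguishable from lookup failures.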

## Notes / unknowns to confirm live
- **WS payload encoding** (JSON vs BSON): decoder handles both; confirm on first capture.
- **Steam ticket reuse**: tickets are short-lived/single-use; if `--steam-ticket` fails, lift an
  `EntityToken` from the MITM capture and use `--entity-token` instead.
- **Damage**: still server-computed; check `DisplayProperties` (catalog) and
  `CompartmentDefinitionDto.Properties` (master server) for any base values, but don't assume they are present.