Pre-sim training data · VQ2-realistic imagery

DCL gameplay capture.

Until the official AIGP simulator is released, DCL The Game is the closest visual match to VQ2 available: DCL operates AIGP's technical side, and the Steam game uses real 3D-scanned tracks with FPV camera views. While you play, capture_dcl.py runs in the background, recording and auto-labeling frames, so a few hours of gameplay yields thousands of VQ2-style training images in YOLO format.

Source: DCL The Game ($20 Steam) · 3D-scanned real tracks

Output: dataset_gates_dcl/ · YOLO format · auto-labeled

Rate: 10 Hz capture · adaptive · dedup + dark-frame reject

Yield: ~3-5K frames / 2-4 hr · varied gameplay

Why this matters. Your current detector, trained on dataset_gates_mega (2.7K generic drone-racing images), gets 97.9% mAP50 on its own val set, but that score says little about VQ2, which will look very different: realistic 3D-scanned environments, harder lighting, distractors. DCL gameplay is the highest-ROI source of VQ2-style training data available before the AIGP sim releases.

§ 01 Setup

1 · Buy DCL The Game on Steam

store.steampowered.com/app/964570 · ~$20 · Windows native.

2 · Launch in WINDOWED mode

Fullscreen borderless also works but windowed is easiest for our capture tool to locate. In-game graphics settings → Display → Windowed, resolution 1920×1080 recommended.

3 · Dependencies (already installed)

The capture tool uses mss (screen capture) and pygetwindow (locate game window by title). Both are installed in the aigp venv. If you recreate the venv:

./aigp/Scripts/python.exe -m pip install mss pygetwindow

§ 02 Run a capture session

Start DCL The Game, enter a race, then in a second terminal:

# Default: locates a window with "DCL" in title, uses current APEX detector
./aigp/Scripts/python.exe capture_dcl.py

# If auto-detect misses, pass the exact title substring
./aigp/Scripts/python.exe capture_dcl.py --window "DCL The Game"

# Last resort: fixed screen region (left, top, width, height)
./aigp/Scripts/python.exe capture_dcl.py --region 0 0 1920 1080

# Progress counter while you play (run in yet another terminal)
./aigp/Scripts/python.exe capture_dcl.py stats

Ctrl+C in the capture terminal stops the session cleanly. Data is saved incrementally — nothing is lost if you crash out.

Flags

Flag	Default	What it does
`--window`	`DCL`	Substring match on game window title
`--region`	—	Capture a fixed rect instead of a window
`--model`	`models/apex_yolo11n.pt`	Detector used for auto-labeling
`--fps`	`10`	Capture rate in Hz (higher = more frames captured, but more of them rejected as near-duplicates)
`--conf`	`0.40`	Only save frames where detector confidence ≥ this

§ 03 Quality gates (automatic)

The capture script only saves a frame if it survives three filters:

#	Filter	Why
1	Dark-frame reject: mean brightness < 25 discarded	Menu screens, loading screens, and fade-to-black frames are garbage for training.
2	Near-duplicate reject: dHash Hamming distance < 3 discarded	If you hover before a gate for 5 seconds, you'd otherwise get 50 near-identical frames wasting disk and biasing the loss.
3	No-gate reject: detector confidence below `--conf` (default 0.40) discarded	Frames with no visible gate add nothing. Hard negatives are captured separately (different pipeline).
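
The first two filters can be sketched in pure Python as below. This is a simplified illustration, not the code in capture_dcl.py: frames are assumed to be grayscale 2D lists, the dHash uses nearest-neighbour sampling instead of a proper image resize, and all function names are hypothetical. Only the thresholds (25 and 3) come from the table.

```python
from typing import List, Optional

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values

DARK_THRESHOLD = 25  # mean brightness below this is discarded
DUP_THRESHOLD = 3    # Hamming distance below this is discarded


def mean_brightness(img: Gray) -> float:
    """Average pixel value over the whole frame."""
    return sum(sum(row) for row in img) / (len(img) * len(img[0]))


def dhash(img: Gray, size: int = 8) -> int:
    """Difference hash: sample a (size+1) x size grid and record, for each
    horizontal neighbour pair, whether the left pixel is brighter."""
    h, w = len(img), len(img[0])
    bits = 0
    for r in range(size):
        y = r * h // size
        samples = [img[y][c * w // (size + 1)] for c in range(size + 1)]
        for left, right in zip(samples, samples[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def keep_frame(img: Gray, last_hash: Optional[int]) -> bool:
    """Apply the dark-frame and near-duplicate filters to one frame."""
    if mean_brightness(img) < DARK_THRESHOLD:
        return False
    if last_hash is not None and hamming(dhash(img), last_hash) < DUP_THRESHOLD:
        return False
    return True
```

Comparing each frame only against the last saved frame's hash keeps the check cheap at 10 Hz, at the cost of missing duplicates that recur later in a session.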

Everything that passes is split 85% train / 15% val deterministically (hash on frame index) so you can rerun safely without re-splitting.
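
One way to get a deterministic 85/15 split from a frame index is to hash the index and bucket the result. The sketch below assumes SHA-1 as the hash and a hypothetical `split_for` helper; the actual hash used by capture_dcl.py is not documented here.

```python
import hashlib

VAL_FRACTION = 0.15  # 15% val, 85% train


def split_for(frame_index: int) -> str:
    """Map a frame index to 'train' or 'val'; same index, same answer."""
    digest = hashlib.sha1(str(frame_index).encode("ascii")).digest()
    # Interpret the first 4 bytes as a number in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "val" if bucket < VAL_FRACTION else "train"
```

Because the assignment depends only on the index, rerunning the script never moves an existing frame between train and val.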

§ 04 How to play for max dataset diversity

Don't just race fast laps. A detector trained on 3,000 "race-pace, gate-centered" frames generalizes worse than one trained on 1,500 frames that deliberately cover edge cases. Variety > volume.

Dimension	What to do	Why
Tracks	Fly ≥ 3 different DCL tracks per session	Different environments (stadium, industrial, outdoor) = visual diversity
Speed	Mix scenic laps + race-pace	Covers both sharp detail and motion-blurred frames
Angles	Deliberately pitch up / down / sideways	Off-axis gate views that race laps don't capture
Lighting	Play day and night tracks if available	VQ2 spec mentions "dynamic lighting"
Failures	Crash on purpose near gates sometimes	Partial occlusion, extreme angles, post-impact frames
Distance	Approach gates slowly from far away	Small-gate detection (5-30 px) is what fails on a 200 m approach

§ 05 After a session

1 · Check counts

./aigp/Scripts/python.exe capture_dcl.py stats

Expect ~1-2K frames per hour of varied gameplay. Fewer than that means either the detector is rejecting too much (lower --conf) or the capture region is wrong.
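
If you want to sanity-check the numbers yourself, a rough equivalent of the count is a few lines of pathlib. This sketch assumes the images/train and images/val layout and a set of common image extensions; the real `stats` output may report more than this.

```python
from pathlib import Path
from typing import Dict

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed; adjust to what the script writes


def count_frames(root: Path) -> Dict[str, int]:
    """Count image files under <root>/images/train and <root>/images/val."""
    counts = {}
    for split in ("train", "val"):
        folder = root / "images" / split
        if folder.is_dir():
            counts[split] = sum(
                1 for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
            )
        else:
            counts[split] = 0
    return counts


def frames_per_hour(total_frames: int, minutes_played: float) -> float:
    """Yield rate to compare against the ~1-2K frames/hour expectation."""
    return total_frames / (minutes_played / 60.0)


if __name__ == "__main__":
    print(count_frames(Path("dataset_gates_dcl")))
```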

2 · Spot-check a few labels

Open dataset_gates_dcl/images/train/ in File Explorer. Pick 20 random files. For each, open the matching dataset_gates_dcl/labels/train/<name>.txt and check that the box coordinates line up with the gate in the image. YOLO format is class cx cy w h, with the four box values normalized to [0,1].
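
Normalized coordinates are hard to judge by eye, so it can help to convert a label line to pixel corners first. A minimal sketch, with hypothetical helper names, that parses one YOLO line and maps it onto an image of known size:

```python
from typing import Tuple

Label = Tuple[int, float, float, float, float]  # class, cx, cy, w, h


def parse_yolo_line(line: str) -> Label:
    """Parse 'class cx cy w h' and check the box values are in [0, 1]."""
    cls, cx, cy, w, h = line.split()
    values = (float(cx), float(cy), float(w), float(h))
    if not all(0.0 <= v <= 1.0 for v in values):
        raise ValueError(f"box values out of [0, 1]: {line!r}")
    return (int(cls), *values)


def to_pixel_box(label: Label, img_w: int, img_h: int) -> Tuple[int, int, int, int]:
    """Convert a normalized centre/size box to (x1, y1, x2, y2) in pixels."""
    _, cx, cy, w, h = label
    x1 = round((cx - w / 2) * img_w)
    y1 = round((cy - h / 2) * img_h)
    x2 = round((cx + w / 2) * img_w)
    y2 = round((cy + h / 2) * img_h)
    return (x1, y1, x2, y2)


if __name__ == "__main__":
    label = parse_yolo_line("0 0.5 0.5 0.25 0.5")
    print(to_pixel_box(label, 1920, 1080))  # (720, 270, 1200, 810)
```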

3 · Audit auto-labels with `audit_dcl.py`

Auto-labeling gets you ~80% clean data. The last 20% — the obvious detector misses, duplicate hover frames that slipped through dedup, and gates that shouldn't have been labeled — needs a human eye. audit_dcl.py is a keyboard-driven batch reviewer that makes this take 2-3 sec per frame.

# Review everything (resumes from last stop)
./aigp/Scripts/python.exe audit_dcl.py

# Only review frames with ≥ 2 boxes (multi-gate shots are noisier)
./aigp/Scripts/python.exe audit_dcl.py --min-boxes 2

# Print progress
./aigp/Scripts/python.exe audit_dcl.py stats

# Restore a rejected pair if you changed your mind
./aigp/Scripts/python.exe audit_dcl.py --restore dcl_20260424_120000_001234

Key	Action
`Enter` / `→` / `k`	KEEP — advance
`Del` / `←` / `d`	REJECT — move to `rejected/` (recoverable), advance
`u`	UNDO last action
`Space` / `s`	SKIP — no decision, advance
`b`	BACK — previous file
`1..9`	jump N × 10 files ahead
`h`	toggle help overlay
`q` / `Esc`	QUIT (progress auto-saved)

Rejections are moved to dataset_gates_dcl/rejected/ — never deleted — so mistakes are recoverable via --restore. State persists in .audit_state.json so you can quit mid-audit and resume later.
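
The move-instead-of-delete idea can be sketched as follows. The directory layout under rejected/ and both function names are assumptions made for illustration; audit_dcl.py may organize rejected files differently and also updates .audit_state.json, which this sketch ignores.

```python
import shutil
from pathlib import Path
from typing import List


def reject_pair(root: Path, split: str, stem: str) -> List[Path]:
    """Move the image and label for `stem` under <root>/rejected/ (assumed layout)."""
    moved = []
    for kind in ("images", "labels"):
        src_dir = root / kind / split
        dst_dir = root / "rejected" / kind / split
        for src in sorted(src_dir.glob(stem + ".*")):
            dst_dir.mkdir(parents=True, exist_ok=True)
            dst = dst_dir / src.name
            shutil.move(str(src), str(dst))
            moved.append(dst)
    return moved


def restore_pair(root: Path, stem: str) -> List[Path]:
    """Move a rejected pair back to where it came from."""
    rejected = root / "rejected"
    restored = []
    for src in sorted(rejected.rglob(stem + ".*")):
        dst = root / src.relative_to(rejected)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        restored.append(dst)
    return restored
```

Mirroring the original relative path inside rejected/ is what makes the restore step trivial: the destination is recovered from the file's location alone.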

4 · Add to the training mix

The easy way: run the DCL data as a separate fine-tuning phase of train_apex.py detector by changing the --dataset argument:

# Train on mega first, then fine-tune on DCL
./aigp/Scripts/python.exe train_apex.py detector --dataset dataset_gates_mega --epochs 200
./aigp/Scripts/python.exe train_apex.py detector --dataset dataset_gates_dcl --epochs 50

The smarter way: merge both datasets into a single YOLO config, either via symlinks or with a combined data.yaml that points at both roots. This pattern is already used in dataset_gates_domain / dataset_gates_hardneg.
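
A combined data.yaml could be generated like this. The sketch assumes an Ultralytics-style dataset config (which accepts a list of directories for `train` and `val`), that both datasets use the images/train and images/val layout, and a single class named `gate`; none of these are confirmed by this guide, so check them against the existing dataset_gates_domain config before use.

```python
from pathlib import Path
from typing import Sequence


def build_data_yaml(project_root: Path, dataset_roots: Sequence[str],
                    class_names: Sequence[str]) -> str:
    """Render a dataset config whose train/val entries list every root."""
    lines = [f"path: {project_root.resolve().as_posix()}", "train:"]
    lines += [f"  - {root}/images/train" for root in dataset_roots]
    lines.append("val:")
    lines += [f"  - {root}/images/val" for root in dataset_roots]
    lines.append("names:")
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_data_yaml(
        Path("."),
        ["dataset_gates_mega", "dataset_gates_dcl"],
        ["gate"],  # assumed class name
    )
    Path("data_mega_dcl.yaml").write_text(text)  # hypothetical file name
    print(text)
```

Training on the merged config in one run mixes both domains in every batch, which avoids the second phase overwriting what the first one learned.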

§ 06 Privacy & pitfalls

The script captures your full game window. Close chat windows, browsers, and anything sensitive before starting a capture session — those frames end up in your dataset otherwise.

Pitfall	Symptom	Fix
Window not found	"No window matching 'DCL'"	Script prints all open windows. Pass the exact substring with `--window`.
0 frames saved	`stats` shows 0 images after 5 min	Detector conf threshold too high (default 0.40). Drop to 0.25: `--conf 0.25`.
Too many near-identical frames	Dataset full of hover shots	Play more, hover less. Or raise the dHash reject threshold so more near-duplicates are dropped (edit script line `_hamming(h, last_hash) < 3` → `< 5`).
GPU spiking during game	Game stutters	Detector inference uses ~1-2 GB VRAM. On RTX 5080 + DCL this should be fine. If the game still stutters, lower the capture rate with `--fps 5`.
DCL fullscreen-exclusive	mss captures desktop not game	Switch to Windowed or Fullscreen-Borderless in DCL's display settings.

§ 07 Related

Training Runbook — what to do with the data after capture (commands + autotrainer)
Training Plan — where DCL data fits in the broader pre-sim strategy
Submission Guide — the final checklist before uploading
