Pre-sim training data · VQ2-realistic imagery

DCL gameplay capture.

Until the official AIGP simulator drops, DCL The Game is the closest visual match to VQ2 we can get — DCL operates AIGP's technical side, and the Steam game uses real 3D-scanned tracks with FPV camera views. Play the game while capture_dcl.py records and auto-labels frames in the background, and you wake up with thousands of VQ2-style training images in YOLO format.

Source: DCL The Game ($20 on Steam) · 3D-scanned real tracks
Output: dataset_gates_dcl/ · YOLO format · auto-labeled
Rate: 10 Hz capture · adaptive dedup + dark-frame reject
Yield: ~3-5K frames per 2-4 hr of varied gameplay
Why this matters. Your current detector trained on dataset_gates_mega (2.7K generic drone-racing images) gets 97.9% mAP50 on its own val set. VQ2 will be visually very different — realistic 3D-scanned environments, harder lighting, distractors. DCL gameplay is the highest-ROI source of VQ2-style training data available before the AIGP sim releases.

§ 01 · Setup

1 · Buy DCL The Game on Steam

store.steampowered.com/app/964570 · ~$20 · Windows native.

2 · Launch in WINDOWED mode

Fullscreen borderless also works, but windowed is easiest for the capture tool to locate. In-game graphics settings → Display → Windowed, resolution 1920×1080 recommended.

3 · Dependencies (already installed)

The capture tool uses mss (screen capture) and pygetwindow (locate game window by title). Both are installed in the aigp venv. If you recreate the venv:

./aigp/Scripts/python.exe -m pip install mss pygetwindow
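As a sketch of how the two libraries fit together: pygetwindow reports the window's outer frame, while mss wants a plain left/top/width/height dict, so the capture tool needs a small conversion that trims the window chrome. The helper below is illustrative (not capture_dcl.py's actual internals), and the border/titlebar offsets are assumed typical Win32 values:

```python
def monitor_from_window(left, top, width, height, border=8, titlebar=31):
    """Convert a pygetwindow outer frame to an mss-style capture region.

    border/titlebar are assumed Win32 chrome sizes; tune for your theme.
    """
    return {
        "left": left + border,
        "top": top + titlebar,
        "width": width - 2 * border,
        "height": height - titlebar - border,
    }

# Usage sketch (hypothetical):
#   win = pygetwindow.getWindowsWithTitle("DCL")[0]
#   region = monitor_from_window(win.left, win.top, win.width, win.height)
#   frame = mss.mss().grab(region)
```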

§ 02 · Run a capture session

Start DCL The Game, enter a race, then in a second terminal:

# Default: locates a window with "DCL" in title, uses current APEX detector
./aigp/Scripts/python.exe capture_dcl.py

# If auto-detect misses, pass the exact title substring
./aigp/Scripts/python.exe capture_dcl.py --window "DCL The Game"

# Last resort: fixed screen region (left, top, width, height)
./aigp/Scripts/python.exe capture_dcl.py --region 0 0 1920 1080

# Progress counter while you play (run in yet another terminal)
./aigp/Scripts/python.exe capture_dcl.py stats

Ctrl+C in the capture terminal stops the session cleanly. Data is saved incrementally — nothing is lost if you crash out.
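Under the hood, --fps pacing reduces to a fixed-rate loop like the sketch below. This is illustrative only: grab and save stand in for the real screen capture, quality gates, and disk writes, and the names are mine, not the script's:

```python
import time

def capture_loop(grab, save, fps=10, max_frames=None):
    """Grab a frame, save it, then sleep whatever is left of the 1/fps budget."""
    interval = 1.0 / fps
    n = 0
    while max_frames is None or n < max_frames:
        t0 = time.monotonic()
        save(grab())          # in the real script: capture, filter, label, write
        n += 1
        elapsed = time.monotonic() - t0
        if elapsed < interval:
            time.sleep(interval - elapsed)
    return n
```

If grabbing plus labeling takes longer than the budget, the loop simply runs below the requested rate rather than queueing frames.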

Flags

| Flag | Default | What it does |
|---|---|---|
| --window | DCL | Substring match on game window title |
| --region | — | Capture a fixed rect instead of a window |
| --model | models/apex_yolo11n.pt | Detector used for auto-labeling |
| --fps | 10 | Capture rate in Hz (higher = more frames, more dup rejects) |
| --conf | 0.40 | Only save frames where detector confidence ≥ this |

§ 03 · Quality gates (automatic)

The capture script only saves a frame if it survives three filters:

| # | Filter | Threshold | Why |
|---|---|---|---|
| 1 | Dark-frame reject | mean brightness < 25 discarded | Menu screens, loading screens, and fade-to-black are garbage for training. |
| 2 | Near-duplicate reject | dHash Hamming distance < 3 discarded | If you hover before a gate for 5 seconds, you'd otherwise get 50 near-identical frames wasting disk and biasing the loss. |
| 3 | No-gate reject | confidence < 0.40 discarded | Frames with no visible gate add nothing. Hard negatives are captured separately (different pipeline). |

Everything that passes is split 85% train / 15% val deterministically (hash on frame index) so you can rerun safely without re-splitting.
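The gates and the deterministic split are each only a few lines of logic. The sketch below is illustrative — the thresholds match the ones quoted above, but the helper names are mine, and the real script hashes the frame index rather than an id string:

```python
import hashlib

def dhash(gray, size=8):
    """Difference hash of a 2-D grayscale image (list of pixel rows):
    subsample to (size, size+1) and compare horizontal neighbors."""
    h, w = len(gray), len(gray[0])
    ys = [r * h // size for r in range(size)]
    xs = [c * w // (size + 1) for c in range(size + 1)]
    val = 0
    for y in ys:
        row = gray[y]
        for i in range(size):
            val = (val << 1) | (row[xs[i + 1]] > row[xs[i]])
    return val

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def is_dark(gray, thresh=25):
    """Gate 1: menus, loading screens, and fades have very low mean brightness."""
    total = sum(sum(row) for row in gray)
    return total / (len(gray) * len(gray[0])) < thresh

def split_for(frame_id, val_pct=15):
    """Deterministic 85/15 split: hashing the id means reruns never re-shuffle."""
    bucket = int(hashlib.md5(frame_id.encode()).hexdigest(), 16) % 100
    return "val" if bucket < val_pct else "train"
```

Because the split is a pure function of the frame id, appending new frames from a later session never moves old frames between train and val.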

§ 04 · How to play for max dataset diversity

Don't just race fast laps. A detector trained on 3,000 "race-pace, gate-centered" frames generalizes worse than one trained on 1,500 frames that deliberately cover edge cases. Variety > volume.
| Dimension | What to do | Why |
|---|---|---|
| Tracks | Fly ≥ 3 different DCL tracks per session | Different environments (stadium, industrial, outdoor) = visual diversity |
| Speed | Mix scenic laps + race-pace | Covers both sharp detail and motion-blurred frames |
| Angles | Deliberately pitch up / down / sideways | Off-axis gate views that race laps don't capture |
| Lighting | Play day and night tracks if available | VQ2 spec mentions "dynamic lighting" |
| Failures | Crash on purpose near gates sometimes | Partial occlusion, extreme angles, post-impact frames |
| Distance | Approach gates slowly from far away | Small-gate detection (5-30 px) is what fails at a 200 m approach |

§ 05 · After a session

1 · Check counts

./aigp/Scripts/python.exe capture_dcl.py stats

Expect ~1-2K frames per hour of varied gameplay. Less than that means either the detector is rejecting too much (lower --conf) or the capture region is wrong.

2 · Spot-check a few labels

Open dataset_gates_dcl/images/train/ in File Explorer and pick 20 random files. For each, open the matching dataset_gates_dcl/labels/train/<name>.txt and sanity-check that the box coordinates would overlay the visible gates. YOLO format is class cx cy w h, all normalized to [0,1].
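If you'd rather script the check than eyeball normalized numbers, converting a YOLO label line back to pixel coordinates is one line of arithmetic per value. A minimal illustrative helper (not part of the repo):

```python
def yolo_to_pixels(line, img_w, img_h):
    """Parse 'class cx cy w h' (normalized) into (class, x1, y1, w, h) in pixels,
    where (x1, y1) is the box's top-left corner."""
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    return (int(cls),
            (cx - w / 2) * img_w,   # top-left x
            (cy - h / 2) * img_h,   # top-left y
            w * img_w,
            h * img_h)
```

A box that lands outside [0, img_w] × [0, img_h] or has near-zero width/height is a red flag for a bad auto-label.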

3 · Audit auto-labels with audit_dcl.py

Auto-labeling gets you ~80% clean data. The last 20% — obvious detector misses, duplicate hover frames that slipped past dedup, and gates that shouldn't have been labeled — needs a human eye. audit_dcl.py is a keyboard-driven batch reviewer that gets this down to 2-3 seconds per frame.

# Review everything (resumes from last stop)
./aigp/Scripts/python.exe audit_dcl.py

# Only review frames with ≥ 2 boxes (multi-gate shots are noisier)
./aigp/Scripts/python.exe audit_dcl.py --min-boxes 2

# Print progress
./aigp/Scripts/python.exe audit_dcl.py stats

# Restore a rejected pair if you changed your mind
./aigp/Scripts/python.exe audit_dcl.py --restore dcl_20260424_120000_001234
| Key | Action |
|---|---|
| Enter / k | KEEP — advance |
| Del / d | REJECT — move to rejected/ (recoverable), advance |
| u | UNDO last action |
| Space / s | SKIP — no decision, advance |
| b | BACK — previous file |
| 1..9 | Jump N × 10 files ahead |
| h | Toggle help overlay |
| q / Esc | QUIT (progress auto-saved) |

Rejections are moved to dataset_gates_dcl/rejected/ — never deleted — so mistakes are recoverable via --restore. State persists in .audit_state.json so you can quit mid-audit and resume later.

4 · Add to the training mix

The easy way: train in two phases by pointing train_apex.py detector at each dataset in turn via the --dataset arg:

# Train on mega first, then fine-tune on DCL
./aigp/Scripts/python.exe train_apex.py detector --dataset dataset_gates_mega --epochs 200
./aigp/Scripts/python.exe train_apex.py detector --dataset dataset_gates_dcl --epochs 50

The smarter way: merge both datasets into a single YOLO config — symlinks, or a combined data.yaml pointing at both roots. This pattern is already used for dataset_gates_domain / dataset_gates_hardneg.
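As a sketch of the combined-config route: Ultralytics-style data.yaml files accept lists of directories for train/val, so one merged config could look like the fragment below. The class count and name here are assumptions — copy them from your existing data.yaml:

```yaml
# combined_gates.yaml — hypothetical merged config
train:
  - dataset_gates_mega/images/train
  - dataset_gates_dcl/images/train
val:
  - dataset_gates_mega/images/val
  - dataset_gates_dcl/images/val
nc: 1          # assumed single 'gate' class
names: [gate]
```

Training on the merged set in one run avoids the catastrophic-forgetting risk of the fine-tune-second approach, at the cost of the DCL frames being a minority of each batch.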

§ 06 · Privacy & pitfalls

The script captures your full game window. Close chat windows, browsers, and anything sensitive before starting a capture session — those frames end up in your dataset otherwise.
| Pitfall | Symptom | Fix |
|---|---|---|
| Window not found | "No window matching 'DCL'" | The script prints all open windows. Pass the exact substring with --window. |
| 0 frames saved | stats shows 0 images after 5 min | Detector conf threshold too high (default 0.40). Drop it: --conf 0.25. |
| Too many near-identical frames | Dataset full of hover shots | Play more, hover less. Or tighten the dedup by raising the dHash threshold in the script: change _hamming(h, last_hash) < 3 to < 5. |
| GPU spiking during game | Game stutters | Detector inference uses ~1-2 GB VRAM; RTX 5080 + DCL should be fine. If it isn't, lower the rate: --fps 5. |
| DCL fullscreen-exclusive | mss captures the desktop, not the game | Switch to Windowed or Fullscreen-Borderless in DCL's display settings. |

§ 07 · Related