The AIGP tech spec explicitly lists "vertical and horizontal obstacles, boundary elements, terrain and environmental structures" as scene elements. Our current stack only sees gates. This doc lays out the three-tier plan to close that gap — detect, avoid, learn — plus the synthetic-data pipeline using race-r3f.html as a free labelled-data generator.
From the official technical spec (260318_Technical_Spec_0001.pdf, § 3.1):
The race takes place within a high-fidelity real-time physics simulator:
• start gate
• sequential race gates
• finish gate
• vertical and horizontal obstacles
• boundary elements
• terrain and environmental structures
And from Round-1 / Round-2 preview notes (your submission guide):
| Layer | Obstacle handling |
|---|---|
| Detector (apex_yolo11n) | nc=1 · sees only gate |
| Keypoint model | Gate corners only |
| Vision pipeline | Binary: "gate" / "not-gate". Not-gate is discarded. |
| Policy (PPO) | Trained on empty oval courses — never saw an obstacle |
| SimDrone physics | No obstacle collision geometry in the training env |
| Hard-neg mining | Suppresses false-positive detections on obstacles — but does nothing to avoid them. |
Tier 1 · detect. Extend the detector to nc=2: gate and obstacle.
• data source: race-r3f.html (see § 04 below)
• training: dataset_gates_obstacles/, 200 epochs
• output: models/apex_yolo11n.pt

Tier 2 · avoid. Add a proximity-guard layer in the control pipeline that runs before the policy:
• obstacle bbox covers > 30% of the frame → emergency pitch-up (a minimal guard sketch follows below)
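To make the guard concrete, here is a minimal sketch. It assumes detections arrive as normalized YOLO-style tuples and that the control pipeline can accept an override action; the action fields and function name are hypothetical placeholders, not existing stack APIs.

```python
# Minimal proximity-guard sketch. Assumptions (not the real stack API):
# detections are (cls, cx, cy, w, h) in normalized YOLO coords, cls 1 = obstacle.

OBSTACLE_CLS = 1
AREA_THRESHOLD = 0.30   # bbox covering >30% of the frame triggers the guard

def proximity_guard(detections, policy_action):
    """Pre-policy check: override the policy when an obstacle fills the frame."""
    for cls, cx, cy, w, h in detections:
        # w * h in normalized coords is the fraction of the frame covered.
        if cls == OBSTACLE_CLS and w * h > AREA_THRESHOLD:
            # Placeholder evasive command: hard pitch-up, cut forward speed.
            return {"pitch": 1.0, "throttle": 0.2, "override": True}
    return policy_action
```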
Tier 3 · learn. Retrain PPO with obstacles in the training env:
• courses: ApexDroneEnv training courses
• collision: SimDrone (AABB check per step)
• reward shaping: −10 × hit_obstacle, +0.1 × clearance_meters

§ 04 · race-r3f.html + Playwright

The win here: we already built race-r3f.html as a Three.js visualisation of VQ1 and VQ2 scenes. It knows every gate's 3D position and every obstacle's geometry. We can drive it headlessly, render random camera poses, and compute perfect YOLO bboxes by projecting the known scene graph into screen space (a label-projection sketch follows the dataset layout below). Zero manual labeling.
┌──────────────────────────────────┐ ┌───────────────────────────┐
│ Playwright (Python) │ │ Chromium (headless) │
│ render_synthetic_dataset.py │─────▶ │ race-r3f.html │
│ • sample random camera pose │ │ ?mode=dataset&preset=… │
│ • call __AIGP_CAPTURE__(pose) │ │ • tag gates/obstacles │
│ • decode base64 JPEG │ ◀─────│ userData.aigpClass │
│ • write image + .txt label │ │ • project 3D→2D bbox │
│ │ │ • toDataURL('jpeg') │
└──────────────────────────────────┘ └───────────────────────────┘
│
▼
dataset_gates_obstacles/
images/{train,val}/*.jpg
labels/{train,val}/*.txt
data.yaml (nc=2)
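The projection that produces these labels is just the camera's view-projection matrix applied to each tagged object's corner points, followed by taking the axis-aligned box in normalized screen coordinates. A rough numpy equivalent of that math, for reference; the page's actual _projectToScreen is Three.js code and the matrix names here are illustrative:

```python
import numpy as np

def project_bbox(corners_world, view_proj, img_w=640, img_h=640):
    """Project (N, 3) world-space corners to a normalized YOLO (cx, cy, w, h) box.

    view_proj: 4x4 combined projection @ view matrix. Returns None when the
    object is behind the camera or the box would be smaller than 4 px.
    (The real pipeline additionally drops boxes that are mostly off-screen.)
    """
    pts = np.hstack([corners_world, np.ones((len(corners_world), 1))])
    clip = pts @ view_proj.T                  # world -> clip space
    if np.any(clip[:, 3] <= 0):               # any corner behind the camera
        return None
    ndc = clip[:, :3] / clip[:, 3:4]           # perspective divide -> NDC
    xs = np.clip((ndc[:, 0] + 1) / 2, 0, 1)    # NDC x in [-1, 1] -> [0, 1]
    ys = np.clip((1 - ndc[:, 1]) / 2, 0, 1)    # flip y: screen y grows downward
    w, h = xs.max() - xs.min(), ys.max() - ys.min()
    if w * img_w < 4 or h * img_h < 4:         # mirror the <4 px filter
        return None
    return ((xs.min() + xs.max()) / 2, (ys.min() + ys.max()) / 2, w, h)
```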
# Render 1000 VQ2 frames (industrial complex scene with obstacles)
./aigp/Scripts/python.exe render_synthetic_dataset.py --preset vq2 --num 1000
# VQ1 — more gates, no obstacles (mostly true-negative obstacle training)
./aigp/Scripts/python.exe render_synthetic_dataset.py --preset vq1 --num 500
# Check counts + class balance
./aigp/Scripts/python.exe render_synthetic_dataset.py stats
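What stats reports boils down to counting class ids across the label files. A minimal sketch of that counting, assuming the dataset layout above (the real subcommand may print more):

```python
from collections import Counter
from pathlib import Path

def label_stats(root="dataset_gates_obstacles"):
    """Count YOLO boxes per class across all label files under root."""
    counts = Counter()
    for txt in Path(root, "labels").rglob("*.txt"):
        for line in txt.read_text().splitlines():
            if line.strip():
                counts[int(line.split()[0])] += 1   # first field = class id
    total = sum(counts.values()) or 1
    for cls_id, name in enumerate(["gate", "obstacle"]):
        print(f"{name}: {counts[cls_id]} boxes ({counts[cls_id] / total:.1%})")

label_stats()
```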
race-r3f.html exposes the data

When loaded with ?mode=dataset, the page:
• tags every gate with userData.aigpClass = 'gate' and every obstacle with 'obstacle'
• enables preserveDrawingBuffer on the WebGL context (required for toDataURL)
• exposes window.__AIGP_CAPTURE__(pose), which returns {image: dataURL, labels: [{cls, cx, cy, w, h}], width, height}
• sets window.__AIGP_READY__ = true so Playwright can poll

Interactive mode (no ?mode=dataset) is completely unchanged — dataset hooks are additive and only activate via the query param.
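On the Python side, a single capture against this contract looks roughly like the sketch below, using Playwright's sync API. The pose schema, output paths, and filename are illustrative, not render_synthetic_dataset.py's actual interface.

```python
import base64
from pathlib import Path
from playwright.sync_api import sync_playwright

# Illustrative pose; the real schema is whatever __AIGP_CAPTURE__ expects.
pose = {"x": 3.0, "y": 1.5, "z": -8.0, "yaw": 0.4, "pitch": -0.1}

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page(viewport={"width": 640, "height": 640})
    page.goto("file:///C:/Users/pc/Downloads/grandprix-latest/race-r3f.html"
              "?mode=dataset&preset=vq2")
    page.wait_for_function("window.__AIGP_READY__ === true")

    cap = page.evaluate("pose => window.__AIGP_CAPTURE__(pose)", pose)

    # Decode the base64 JPEG and write the matching YOLO label file.
    out = Path("dataset_gates_obstacles")
    (out / "images/train").mkdir(parents=True, exist_ok=True)
    (out / "labels/train").mkdir(parents=True, exist_ok=True)
    jpeg = base64.b64decode(cap["image"].split(",", 1)[1])
    (out / "images/train/000001.jpg").write_bytes(jpeg)
    lines = [f"{b['cls']} {b['cx']} {b['cy']} {b['w']} {b['h']}"
             for b in cap["labels"]]
    (out / "labels/train/000001.txt").write_text("\n".join(lines))
    browser.close()
```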
render_synthetic_dataset.py samples realistic FPV poses (a sampler sketch follows the commands below):
• R × 0.9, height 0.8-5.5 m (typical race altitudes)
• --seed for reproducible datasets

./aigp/Scripts/python.exe render_synthetic_dataset.py --preset vq2 --num 2000
./aigp/Scripts/python.exe render_synthetic_dataset.py --preset vq1 --num 1000
./aigp/Scripts/python.exe render_synthetic_dataset.py stats
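A minimal sketch of the pose sampler under the constraints above: position within 0.9 × R of the scene centre, height in the 0.8-5.5 m band, and a seeded RNG behind --seed. R, the uniform distributions, and the pose schema are all assumptions, not the script's actual code.

```python
import math
import random

def sample_pose(rng, course_radius):
    """Draw one random FPV camera pose (assumed x/z ground plane, y up)."""
    r = course_radius * 0.9 * math.sqrt(rng.random())  # uniform over disc, capped at 0.9 * R
    theta = rng.uniform(0, 2 * math.pi)
    return {
        "x": r * math.cos(theta),
        "z": r * math.sin(theta),
        "y": rng.uniform(0.8, 5.5),            # typical race altitudes
        "yaw": rng.uniform(-math.pi, math.pi),
        "pitch": rng.uniform(-0.5, 0.2),       # mostly level-to-down, FPV style
    }

rng = random.Random(42)   # --seed 42 reproduces the same dataset every run
poses = [sample_pose(rng, course_radius=25.0) for _ in range(2000)]
```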
Target: 20-40% obstacle-box fraction in stats output. If too low, crank the VQ2 preset up or add more synthetic obstacle configurations to race-r3f.html.
Synthetic-only is risky — your detector could overfit to the Three.js rendering style. Mix with real sources:
# After capturing DCL gameplay (see dcl-capture-guide.html),
# create a combined data.yaml pointing at multiple roots, or use YOLO's
# list-of-dirs support. Example data.yaml:
path: C:/Users/pc/Downloads/grandprix-latest
train:
- dataset_gates_mega/images/train
- dataset_gates_dcl/images/train
- dataset_gates_obstacles/images/train
val:
- dataset_gates_obstacles/images/val
nc: 2
names: [gate, obstacle]
./aigp/Scripts/python.exe train_apex.py detector \
--dataset dataset_gates_obstacles --epochs 200
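For reference, the wrapped training call presumably reduces to Ultralytics' standard API, roughly as below; train_apex.py's real flags and defaults may differ.

```python
from ultralytics import YOLO

# Start from COCO-pretrained weights, per the note below.
model = YOLO("yolo11n.pt")
model.train(data="dataset_gates_obstacles/data.yaml", epochs=200, imgsz=640)
```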
Start from COCO-pretrained weights (the default); don't try to fine-tune the existing nc=1 gate model into nc=2, or it will catastrophically forget the gate class.
Extend benchmark_models.py to report per-class metrics, so pass criteria can be set and checked per class (a reporting sketch below):
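A sketch of per-class reporting via Ultralytics' validation API. Recent Ultralytics releases expose per-class mAP50-95 as metrics.box.maps, but verify the exact field names against the installed version.

```python
from ultralytics import YOLO

model = YOLO("models/apex_yolo11n.pt")
metrics = model.val(data="dataset_gates_obstacles/data.yaml")

# metrics.box.maps: per-class mAP50-95, indexed by class id.
for cls_id, name in model.names.items():
    print(f"{name}: mAP50-95 = {metrics.box.maps[cls_id]:.3f}")
```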
| Pitfall | Symptom | Fix |
|---|---|---|
| Synthetic-only overfit | mAP50 drops 30% on real DCL frames | Mix real DCL data at 40%+ of training set |
| Obstacle bboxes too loose | AABB of rotated pipe covers empty space | Switch to tighter oriented bbox (OBB) or filter by visible-pixel count |
| Behind-camera projection | Labels with impossibly-large boxes | Already filtered in _projectToScreen — rejects corners with v.z > 1 |
| Off-screen false labels | Tiny boxes at edges | Already filtered: boxes < 4px or >80% off-screen are dropped |
| Catastrophic forgetting | After nc=2 training, gate recall tanks | Don't fine-tune nc=1 → nc=2. Start from COCO pretrained or train from scratch. |
| Class imbalance | Many more gate boxes than obstacle | Use YOLO's class_weights or oversample obstacle-heavy frames |
| Playwright slow | 5 FPS capture rate | toDataURL is the bottleneck. 1000 frames = ~3 min. Acceptable. |