Spec gap · VQ1 + VQ2 requirement

Obstacle detection.

The AIGP tech spec explicitly lists "vertical and horizontal obstacles, boundary elements, terrain and environmental structures" as scene elements. Our current stack only sees gates. This doc lays out the three-tier plan to close that gap — detect, avoid, learn — plus the synthetic-data pipeline that uses race-r3f.html as a free labeled-data generator.

Current state   No obstacle handling at all · detector is nc=1 (gate only)
Target          nc=2 detector + avoidance layer · gate + obstacle classes
Data source     race-r3f.html → Playwright render · auto-labeled synthetic
Timeline        2-3 days to Tier 1 + 2 · Tier 3 after sim release
The risk if we skip this: VQ2 places the drone in 3D-scanned realistic environments with industrial props, pipes, scaffolding, pillars. A detector that only sees gates will happily fly the drone into a concrete wall while chasing a gate behind it. Every gate-passage reward means nothing if the drone doesn't finish.

§ 01 · What the spec says

From the official technical spec (260318_Technical_Spec_0001.pdf, § 3.1):

The race takes place within a high-fidelity real-time physics simulator:
  • start gate
  • sequential race gates
  • finish gate
  • vertical and horizontal obstacles
  • boundary elements
  • terrain and environmental structures

And from Round-1 / Round-2 preview notes (your submission guide):

§ 02 · What we currently do (nothing)

Layer                     Obstacle handling
Detector (apex_yolo11n)   nc=1 · sees only gate
Keypoint model            Gate corners only
Vision pipeline           Binary: "gate" / "not-gate". Not-gate is discarded.
Policy (PPO)              Trained on empty oval courses — never saw an obstacle
Sim (drone physics)       No obstacle collision geometry in the training env
Hard-neg mining           Suppresses false-positive detections on obstacles — but does nothing to avoid them

§ 03 · Three-tier plan

Tier 1 — "See obstacles" · 2-3 days

Extend the detector to nc=2: gate and obstacle.
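
For reference, the labels the § 04 pipeline writes are plain YOLO detection format: one line per box, columns cls cx cy w h, all normalized to [0, 1]. A frame with one gate (class 0) near center and one obstacle (class 1) at frame right would get a .txt like this (numbers illustrative):

0 0.512 0.430 0.210 0.305
1 0.804 0.655 0.090 0.420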

Tier 2 — "Avoid on sight" · 1-2 days after Tier 1

Add a proximity-guard layer in the control pipeline that runs before the policy.
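
What the guard actually does is still open. A minimal sketch of the intent, assuming normalized detector boxes and a velocity-style action; every name and threshold here is a placeholder, not existing code:

# Hypothetical proximity guard; thresholds and the action convention are assumptions.
AREA_TRIP = 0.08       # obstacle bbox area (fraction of frame) that trips the guard
CENTER_BAND = 0.35     # only react to obstacles near the image center (normalized cx)

def proximity_guard(action, detections):
    """Wraps the PPO output; runs every frame before the command is sent.

    action:     (vx, vy, vz, yaw_rate), vx = forward, +vy = right (assumed)
    detections: dicts with keys cls, cx, cy, w, h, conf, normalized to [0, 1]
    """
    threats = [d for d in detections
               if d["cls"] == 1 and abs(d["cx"] - 0.5) < CENTER_BAND]
    if not threats:
        return action
    nearest = max(threats, key=lambda d: d["w"] * d["h"])  # biggest box ~ closest
    if nearest["w"] * nearest["h"] < AREA_TRIP:
        return action
    vx, vy, vz, yaw_rate = action
    vx *= 0.3                                    # brake: cut forward speed
    vy += -0.5 if nearest["cx"] > 0.5 else 0.5   # sidestep away from the obstacle
    return (vx, vy, vz, yaw_rate)

Because it only wraps the action, a layer like this ships with the Tier-1 detector and needs no policy retraining.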

Tier 3 — "Learn to fly through" · 4-6 hr training

Retrain PPO with obstacles in the training env.
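
The env details were elided above; the gist is obstacle collision geometry plus a collision term in the reward. A sketch of the reward side, assuming hypothetical env hooks (check_collision and min_obstacle_distance do not exist yet in our sim wrapper):

# Hypothetical reward shaping for obstacle-aware PPO retraining.
COLLISION_PENALTY = -10.0   # terminal penalty: hitting a prop ends the episode
CLEARANCE_BONUS = 0.02      # small shaping term for keeping distance to obstacles

def step_reward(env, base_reward):
    """base_reward is the existing gate-progress reward from the current PPO setup."""
    if env.check_collision():                # drone intersects obstacle geometry
        return COLLISION_PENALTY, True       # (reward, episode_done)
    clearance = env.min_obstacle_distance()  # meters to the nearest obstacle mesh
    shaped = base_reward + CLEARANCE_BONUS * min(clearance, 5.0)
    return shaped, False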

For VQ1: Tier 1 + 2 alone should suffice — VQ1 scores completion, not speed (per training plan). Defensive flying is fine.
For VQ2: Tier 3 matters — you need the policy to weave through obstacles at race pace, not just avoid them defensively.

§ 04 · Synthetic data pipeline — race-r3f.html + Playwright

The win here: we already built race-r3f.html as a Three.js visualization of the VQ1 and VQ2 scenes. It knows every gate's 3D position and every obstacle's geometry. We can drive it headlessly, render random camera poses, and compute perfect YOLO bboxes by projecting the known scene graph into screen space. Zero manual labeling.
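
The projection step is the standard camera transform; Three.js does it with Vector3.project(camera), and the numpy equivalent below shows the math (illustrative, not the page's actual code):

import numpy as np

def project_to_screen(world_pts, view_proj, width, height):
    """Project Nx3 world-space corners to pixels, then take the AABB.

    view_proj: 4x4 projection @ view matrix of the sampled camera pose.
    Returns None if every corner fails the depth check.
    """
    pts = np.hstack([world_pts, np.ones((len(world_pts), 1))])  # homogeneous coords
    clip = pts @ view_proj.T
    ndc = clip[:, :3] / clip[:, 3:4]           # perspective divide
    visible = ndc[:, 2] <= 1.0                 # the v.z > 1 reject listed in § 06
    if not visible.any():
        return None
    x = (ndc[visible, 0] + 1) / 2 * width      # NDC [-1, 1] -> pixel x
    y = (1 - ndc[visible, 1]) / 2 * height     # flip y: screen origin is top-left
    return x.min(), y.min(), x.max(), y.max()  # AABB; converted to cx,cy,w,h later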

Architecture

  ┌──────────────────────────────────┐       ┌───────────────────────────┐
  │  Playwright (Python)             │       │  Chromium (headless)      │
  │  render_synthetic_dataset.py     │─────▶ │  race-r3f.html            │
  │    • sample random camera pose   │       │    ?mode=dataset&preset=… │
  │    • call __AIGP_CAPTURE__(pose) │       │    • tag gates/obstacles  │
  │    • decode base64 JPEG          │ ◀─────│      userData.aigpClass   │
  │    • write image + .txt label    │       │    • project 3D→2D bbox   │
  │                                  │       │    • toDataURL('jpeg')    │
  └──────────────────────────────────┘       └───────────────────────────┘
                                                         │
                                                         ▼
                                          dataset_gates_obstacles/
                                            images/{train,val}/*.jpg
                                            labels/{train,val}/*.txt
                                            data.yaml (nc=2)

Usage

# Render 1000 VQ2 frames (industrial complex scene with obstacles)
./aigp/Scripts/python.exe render_synthetic_dataset.py --preset vq2 --num 1000

# VQ1 — more gates, no obstacles (true negatives for the obstacle class)
./aigp/Scripts/python.exe render_synthetic_dataset.py --preset vq1 --num 500

# Check counts + class balance
./aigp/Scripts/python.exe render_synthetic_dataset.py stats

How race-r3f.html exposes the data

When loaded with ?mode=dataset, the page:

  1. Disables the auto-orbit camera and hides the drone
  2. Tags every gate mesh with userData.aigpClass = 'gate' and every obstacle with 'obstacle'
  3. Enables preserveDrawingBuffer on the WebGL context (required for toDataURL)
  4. Registers window.__AIGP_CAPTURE__(pose) which:
    • Moves the camera to the requested pose
    • Forces a synchronous render
    • Walks the scene graph and projects every tagged object to 2D screen-space AABB
    • Returns {image: dataURL, labels: [{cls, cx, cy, w, h}], width, height}
  5. Sets window.__AIGP_READY__ = true so Playwright can poll

Interactive mode (no ?mode=dataset) is completely unchanged — dataset hooks are additive and only activate via the query param.
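
On the Python side, the capture loop amounts to roughly the following (a sketch; the real script's internals differ, the file path is elided, and sample_fpv_pose is defined under Camera-pose sampling below):

import base64, random
from pathlib import Path
from playwright.sync_api import sync_playwright

OUT = Path("dataset_gates_obstacles")
for sub in ("images/train", "images/val", "labels/train", "labels/val"):
    (OUT / sub).mkdir(parents=True, exist_ok=True)

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page(viewport={"width": 960, "height": 720})
    page.goto("file:///.../race-r3f.html?mode=dataset&preset=vq2")  # real path elided
    page.wait_for_function("window.__AIGP_READY__ === true")

    for i in range(1000):
        pose = sample_fpv_pose()
        cap = page.evaluate("p => window.__AIGP_CAPTURE__(p)", pose)
        jpg = base64.b64decode(cap["image"].split(",", 1)[1])  # strip data: prefix
        split = "train" if random.random() < 0.9 else "val"
        (OUT / "images" / split / f"vq2_{i:06d}.jpg").write_bytes(jpg)
        rows = [f"{l['cls']} {l['cx']:.6f} {l['cy']:.6f} {l['w']:.6f} {l['h']:.6f}"
                for l in cap["labels"]]  # cls assumed numeric: 0=gate, 1=obstacle
        (OUT / "labels" / split / f"vq2_{i:06d}.txt").write_text("\n".join(rows))
    browser.close()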

Camera-pose sampling

render_synthetic_dataset.py samples realistic FPV poses.
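
The sampling details were elided here; a plausible shape for it, where every range, the placeholder gate list, and the pose-dict keys are assumptions rather than the script's actual values:

import random

# Placeholders: real gate centers and course bounds come from the same scene
# definition race-r3f.html renders.
GATES = [(0.0, 1.5, 10.0), (8.0, 2.0, 22.0), (-5.0, 1.2, 35.0)]  # (x, y, z), y-up
BOUNDS = ((-20.0, 20.0), (0.0, 45.0))                            # x range, z range

def sample_fpv_pose():
    """Random FPV-ish camera pose; every range is an assumption, not a tuned value."""
    (xmin, xmax), (zmin, zmax) = BOUNDS
    pos = (random.uniform(xmin, xmax),
           random.uniform(0.5, 4.0),         # altitude band, meters
           random.uniform(zmin, zmax))
    gx, gy, gz = random.choice(GATES)         # aim near a random gate...
    target = (gx + random.uniform(-2, 2),     # ...with jitter so gates are not
              gy + random.uniform(-1, 1),     # always dead-center in frame
              gz + random.uniform(-2, 2))
    return {"position": pos, "lookAt": target,
            "roll": random.uniform(-15, 15),  # degrees of FPV camera tilt
            "fov": random.uniform(90, 120)}   # wide FPV lens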

§ 05 · Training workflow

1 · Generate synthetic data

./aigp/Scripts/python.exe render_synthetic_dataset.py --preset vq2 --num 2000
./aigp/Scripts/python.exe render_synthetic_dataset.py --preset vq1 --num 1000
./aigp/Scripts/python.exe render_synthetic_dataset.py stats

Target: 20-40% obstacle-box fraction in the stats output. If it comes in low, render more VQ2 frames or add more synthetic obstacle configurations to race-r3f.html.
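
The stats subcommand reports this; for reference, the fraction is just obstacle label lines over all label lines, i.e. roughly:

from pathlib import Path

def obstacle_fraction(labels_root="dataset_gates_obstacles/labels"):
    """Fraction of all boxes whose class is 1 (obstacle), across every .txt label."""
    counts = {0: 0, 1: 0}
    for txt in Path(labels_root).rglob("*.txt"):
        for line in txt.read_text().splitlines():
            if line.strip():
                counts[int(line.split()[0])] += 1
    total = sum(counts.values())
    return counts[1] / total if total else 0.0  # target: 0.20-0.40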

2 · Mix with positive gate data

Synthetic-only is risky — your detector could overfit to the Three.js rendering style. Mix with real sources:

# After capturing DCL gameplay (see dcl-capture-guide.html),
# create a combined data.yaml pointing at multiple roots, or use YOLO's
# list-of-dirs support. Example data.yaml:
path: C:/Users/pc/Downloads/grandprix-latest
train:
  - dataset_gates_mega/images/train
  - dataset_gates_dcl/images/train
  - dataset_gates_obstacles/images/train
val:
  - dataset_gates_obstacles/images/val
nc: 2
names: [gate, obstacle]

3 · Retrain with nc=2

./aigp/Scripts/python.exe train_apex.py detector \
  --dataset dataset_gates_obstacles --epochs 200

Start from COCO-pretrained weights (the default). Don't try to fine-tune the existing nc=1 gate model into nc=2; it will catastrophically forget the gate class.
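
If train_apex.py wraps Ultralytics (which the yolo11n naming suggests), the equivalent direct call is roughly:

from ultralytics import YOLO

# Start from COCO-pretrained weights; Ultralytics rebuilds the detection head
# to match data.yaml's nc=2, so the backbone transfers and the head starts fresh.
model = YOLO("yolo11n.pt")
model.train(data="dataset_gates_obstacles/data.yaml", epochs=200, imgsz=640)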

4 · Validate the two-class detector

Extend benchmark_models.py to report per-class metrics; pooled mAP can hide a weak obstacle class, so pass criteria should be checked per class.
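
For the per-class numbers, Ultralytics' val() already exposes them; attribute names below follow the Ultralytics metrics API, worth double-checking against the installed version:

from ultralytics import YOLO

# Weights path is a placeholder for wherever train_apex.py writes best.pt.
metrics = YOLO("runs/detect/train/weights/best.pt").val(
    data="dataset_gates_obstacles/data.yaml")
for idx, name in metrics.names.items():        # e.g. {0: 'gate', 1: 'obstacle'}
    print(f"{name}: mAP50-95 = {metrics.box.maps[idx]:.3f}")
print(f"overall mAP50 = {metrics.box.map50:.3f}")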

§ 06 · Pitfalls

Pitfall                     Symptom                                    Fix
Synthetic-only overfit      mAP50 drops 30% on real DCL frames         Mix real DCL data at 40%+ of training set
Obstacle bboxes too loose   AABB of rotated pipe covers empty space    Switch to tighter oriented bboxes (OBB) or filter by visible-pixel count
Behind-camera projection    Labels with impossibly large boxes         Already filtered in _projectToScreen — rejects corners with v.z > 1
Off-screen false labels     Tiny boxes at edges                        Already filtered: boxes < 4 px or > 80% off-screen are dropped
Catastrophic forgetting     After nc=2 training, gate recall tanks     Don't fine-tune nc=1 → nc=2; start from COCO pretrained or from scratch
Class imbalance             Many more gate boxes than obstacle boxes   Use YOLO's class_weights or oversample obstacle-heavy frames
Playwright slow             ~5 FPS capture rate                        toDataURL is the bottleneck; 1000 frames ≈ 3 min. Acceptable.

§ 07 · Related