AIGP COMMAND CENTER
OVERVIEW
PERCEPTION
RL POLICY
TRAIN MODELS
GAME PLAN
Competition Timeline
NOW
Pre-Sim Prep
Train models
Build pipeline
MAY
Simulator
Release
MAY-JUL
VQ1 + VQ2
Virtual Quals
SEP
Physical
So. California
NOV
Grand Prix
Columbus, OH
Pipeline Readiness
RF-DETR Gate Detection: 97.9% mAP@50
YOLO11n Gate Detection: Trained + ONNX
U-Net Gate Segmentation: Complete + ONNX
IGPP EKF State Estimation: 8s blind nav
SAMD Multi-Frame Depth: 3-5x better PnP
PPO RL Training Pipeline: 2M steps trained
Race Pipeline State Machine: 6-phase complete
MPC Controller (CasADi): Attitude + rate
APEX Detector (YOLO11n): train_apex.py Phase 1
APEX Keypoints (Pose): train_apex.py Phase 2 — PnP corners
APEX Policy (PPO 24D): train_apex.py Phase 3 — perception reward
RL Perception-Aware Reward: DONE — cos(camera, gate) in ApexDroneEnv
24D Observation Space: DONE — upgraded from 13D
Gate Corner Keypoints: DONE — YOLO11n-pose, auto-generated labels
IGPP/SAMD Integration: TODO — wire into pipeline
DCL Simulator Adapter: Waiting for May release
Key Metrics
APEX
PIPELINE READY
2,759
TRAINING IMAGES
24D
OBSERVATION DIM
<19ms
E2E LATENCY
3
APEX PHASES
Architecture — APEX Championship Pipeline
FPV Camera (12MP, 120deg FOV, fixed front-facing)
        |
        v
+-------------------+     +------------------------+
|   APEX Detector   |---->|     APEX Keypoints     |  <19ms total
|  YOLO11n (2.6M)   |     |      YOLO11n-pose      |  pipeline latency
|  <7ms, >95% rec   |     |    4 corners -> PnP    |
+-------------------+     +----------+-------------+
                                     | 6DOF Gate Pose
                                     v
+-------------------+     +------------------------+
|        IMU        |---->|        IGPP EKF        |<-- PnP Update (60Hz)
|   Accel + Gyro    |     |        16D State       |
|       120Hz       |     |    Blind nav: 8 sec    |
+-------------------+     +----------+-------------+
                                     | State Estimate
                                     v
                          +------------------------+
                          |    APEX Policy (PPO)   |  Secret sauce:
                          |  24D obs -> 4D action  |  cos(camera, gate)
                          |   [256,256,256] MLP    |  perception reward
                          +----------+-------------+
                                     | thrust + body rates
                                     v
                          +------------------------+
                          |    MAVLink Interface   |
                          |  SET_ATTITUDE_TARGET   |
                          |     120Hz commands     |
                          +------------------------+
Gate Detection Models

RF-DETR

Transformer-based detector. Best accuracy for VQ2's complex scenes.

mAP@50: 97.9%
Latency: ~8ms (needs TensorRT)
ONNX Export: Ready

YOLO11n

Fastest detector. Ideal for VQ1's simple environment.

mAP@50: 95.2%
Latency: <3ms
ONNX + TRT: Ready

APEX Keypoints

YOLO11n-pose: bbox + 4 gate corners in one pass -> PnP 6DOF.

Model: YOLO11n-pose (train_apex.py Phase 2)
Dataset: Auto-generated from bbox labels
PnP Solve: cv2.solvePnP IPPE_SQUARE <1ms
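As a quick sanity check on the PnP output, the gate's range can be approximated from its apparent corner spacing with a pinhole model. A minimal sketch, assuming a square gate of known side length and a known focal length in pixels (both values below are illustrative, not the competition spec); the full cv2.solvePnP IPPE_SQUARE solve is what actually handles tilted views:

```python
import math

def pinhole_range(fx_px: float, gate_side_m: float, corners_px: list) -> float:
    """Approximate camera-to-gate range from apparent gate width.

    corners_px: four (u, v) pixel corners ordered TL, TR, BR, BL.
    Valid only for roughly fronto-parallel gates; similar triangles
    give Z = f * W / w_pixels.
    """
    (u0, v0), (u1, v1) = corners_px[0], corners_px[1]
    width_px = math.hypot(u1 - u0, v1 - v0)  # top edge length in pixels
    return fx_px * gate_side_m / width_px

# Example: 600 px focal length, 1.5 m gate, top edge spans 150 px -> 6.0 m
r = pinhole_range(600.0, 1.5, [(245, 200), (395, 200), (395, 350), (245, 350)])
```

A range wildly inconsistent with |tvec| from solvePnP is a cheap flag for a bad corner detection before the measurement reaches the EKF.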
Perception Pipeline Flow
Frame (640x480)
        |
        v
APEX Detector YOLO11n (<7ms) -> Bounding Box
        |
        v
APEX Keypoints YOLO11n-pose (<10ms) -> Bbox + 4 Corners
        |
        v
cv2.solvePnP (IPPE_SQUARE) -> 6DOF Pose (<1ms)
        |
        v
distance = |tvec| -> IGPP EKF update (60Hz)
        |   If no detection: IGPP blind prediction (IMU, 8s max)
        |   If multiple frames: SAMD depth refinement (3-5x better)
        v
APEX Policy: 24D obs -> [256,256,256] MLP -> 4D action (<2ms)

Total pipeline: <19ms (vs Swift 48ms, MonoRace ~30ms)
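When detection drops out, the EKF falls back to IMU-only prediction. The core of that blind step is dead reckoning; a minimal sketch (not the actual IGPP implementation — gravity compensation and bias states omitted) showing why blind navigation has to be capped at ~8 s:

```python
def blind_predict(pos, vel, accel_world, dt):
    """One IMU-only prediction step: integrate acceleration into
    velocity, velocity into position. Position error grows
    quadratically with blind time, hence the 8 s cap."""
    vel = [v + a * dt for v, a in zip(vel, accel_world)]
    pos = [p + v * dt for p, v in zip(pos, vel)]
    return pos, vel

# 1 s of blind flight at 120 Hz, constant 1 m/s^2 forward acceleration
pos, vel = [0.0, 0.0, 0.0], [5.0, 0.0, 0.0]
for _ in range(120):
    pos, vel = blind_predict(pos, vel, [1.0, 0.0, 0.0], 1.0 / 120.0)
```

An unmodeled accelerometer bias of even 0.1 m/s^2 drifts this estimate by ~3 m over 8 s, which is the practical bound on blind segments.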
Dataset Status
2,759
TOTAL IMAGES
10,000+
TARGET
4
DATASETS
~7,200
IMAGES NEEDED

Expand to 10K+ for VQ2 Robustness

VQ2 uses "real 3D-scanned environments" with complex visuals. Current dataset is VQ1-quality (simple backgrounds, highlighted gates). Need synthetic renders with varied lighting, backgrounds, occlusion, and motion blur.

python generate_training_sets.py    # Hard negatives + domain randomization
python train_models.py rfdetr       # Retrain on expanded dataset
python evaluate_models.py --all     # Compare all models
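The domain randomization behind generate_training_sets.py comes down to sampling scene parameters per render. A sketch of what such a sampler could look like — the parameter names and ranges here are illustrative assumptions, not the script's actual config:

```python
import random

def sample_scene_params(rng: random.Random) -> dict:
    """Draw one randomized scene config for a synthetic render:
    varied lighting, background, occlusion, and motion blur,
    matching the VQ2 robustness goals above."""
    return {
        "light_intensity": rng.uniform(0.3, 1.5),  # dim indoor to harsh sun
        "background_id":   rng.randrange(50),      # pick a scanned backdrop
        "occlusion_frac":  rng.uniform(0.0, 0.4),  # partial gate occlusion
        "motion_blur_px":  rng.uniform(0.0, 8.0),  # simulated camera blur
        "gate_highlight":  rng.random() < 0.5,     # VQ1-style highlight on/off
    }

params = sample_scene_params(random.Random(42))
```

Sampling highlight on/off is what keeps the detector from overfitting to VQ1's highlighted gates.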
Reinforcement Learning — PPO Policy

APEX Policy (NEW)

Algorithm: PPO (stable-baselines3)
Observation: 24D (state + perception + lookahead)
Reward: Perception-aware (Swift's secret sauce)
Network: [256, 256, 256] MLP (MonoRace G&CNet)
Camera FOV Sim: 120° H x 90° V, 45° tilt
Course Randomization: 4 layouts (oval, figure8, zigzag, sprint)
ONNX Export: Ready

Training Status

Perception-Aware Reward: DONE — cos(camera, gate) in ApexDroneEnv
24D Observation Space: DONE — bearing+vis+speed+lookahead+alt
10M Training Steps: Ready — ~4h on RTX 5080
Domain Randomization: TODO — mass, drag, motor dynamics
Asymmetric Critic: TODO — privileged info for value fn
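The 24D observation concatenates state, perception, and lookahead features. The exact layout lives in ApexDroneEnv; the grouping below is a hypothetical sketch consistent with the labels above (bearing + vis + speed + lookahead + alt), not the verified ordering:

```python
def build_observation(state: dict) -> list:
    """Assemble a flat 24D observation from named feature groups.
    Grouping is illustrative: 13D base state + perception + lookahead."""
    obs = (
        state["position"]        # 3: world position
        + state["velocity"]      # 3: world velocity
        + state["attitude"]      # 4: orientation quaternion
        + state["body_rates"]    # 3: gyro rates
        + state["gate_bearing"]  # 3: unit vector toward current gate
        + [state["gate_visible"], state["speed"], state["altitude_err"]]
        + state["lookahead"]     # 5: next-gate cues
    )
    assert len(obs) == 24, f"expected 24D, got {len(obs)}D"
    return obs

obs = build_observation({
    "position": [0.0, 0.0, 2.0], "velocity": [5.0, 0.0, 0.0],
    "attitude": [1.0, 0.0, 0.0, 0.0], "body_rates": [0.0, 0.0, 0.0],
    "gate_bearing": [1.0, 0.0, 0.0], "gate_visible": 1.0,
    "speed": 5.0, "altitude_err": 0.0, "lookahead": [0.0] * 5,
})
```

Keeping the layout in one function like this makes the 13D-to-24D upgrade auditable and keeps the ONNX export's input contract explicit.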
Reward Function — The Key to Winning

APEX implements Swift's #1 insight — DONE in ApexDroneEnv (train_apex.py)

# APEX Reward (implemented in train_apex.py ApexDroneEnv)
r = 2.0 * dist_reduction       # progress toward gate
  + 100 * (1 + speed_bonus)    # gate passage (faster = more reward, up to 3x)
  + 0.3 * cos(cam, gate)       # PERCEPTION: camera aimed at gate (SECRET SAUCE)
  + 0.2 * fov_centering        # bonus for gate near FOV center
  + 0.15 * speed/25            # go fast when gate is visible
  - 0.02 * |action_delta|      # smoothness penalty
  - 0.05 * altitude_error      # stay at gate height
  - 200 * crash                # collision = catastrophic
  - 0.005 * dt                 # time penalty

# Camera FOV simulation: 120° H x 90° V, tilted 45° up
# Gate visibility = in_fov AND detectable (>5px at current range)
# Lost gate penalty grows with time: -min(t_blind * 0.1, 0.5)
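The perception term above rewards keeping the camera's optical axis on the gate. A minimal sketch of the cos(camera, gate) computation with a cone-style FOV visibility check, assuming world-frame vectors — the real ApexDroneEnv version also models the separate 120° H x 90° V limits, the 45° camera tilt, and pixel-size detectability:

```python
import math

def perception_reward(cam_forward, gate_pos, drone_pos,
                      fov_deg=120.0, w=0.3):
    """cos(camera, gate): alignment between the camera axis (unit
    vector) and the drone->gate direction, zeroed when the gate
    falls outside a simple FOV cone."""
    d = [g - p for g, p in zip(gate_pos, drone_pos)]
    norm = math.sqrt(sum(x * x for x in d))
    gate_dir = [x / norm for x in d]
    cos_angle = sum(c * g for c, g in zip(cam_forward, gate_dir))
    in_fov = cos_angle >= math.cos(math.radians(fov_deg / 2.0))
    return w * cos_angle if in_fov else 0.0

# Camera aimed straight at the gate -> full 0.3 bonus
r = perception_reward([1.0, 0.0, 0.0], [10.0, 0.0, 2.0], [0.0, 0.0, 2.0])
```

The zero outside the FOV is what couples perception to action: the policy loses reward the moment an aggressive maneuver swings the gate out of view.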
Training Commands
# APEX — Full championship pipeline (~7.5h overnight)
python train_apex.py                     # all 3 phases

# Individual phases
python train_apex.py detector            # Phase 1: YOLO11n (~2h)
python train_apex.py keypoints           # Phase 2: YOLO11n-pose (~1.5h)
python train_apex.py policy              # Phase 3: PPO 10M steps (~4h)
python train_apex.py policy --steps 5000000   # quick 5M test

# Evaluate + export
python train_apex.py eval                # benchmark all models
python train_apex.py export              # package for submission

# Or launch from dashboard — click APEX button below
Reference Systems

MonoRace (A2RL 2025)

Beat 3 human world champions. PPO G&CNet, 24D->4D at 500Hz. U-Net + PnP + EKF. 28.23 m/s peak.

Swift (Nature 2023)

First AI to beat humans. PPO with perception reward. Sim-to-real via residual models from 50s of real data.

SkyDreamer (ICLR 2025)

End-to-end pixels to motors. DreamerV3 world model. VQ-VAE latent space. 21 m/s, 6g.

Training Jobs
No active jobs — start training below
Live Metrics
mAP@50
Loss
Live Log
Waiting for training output…
Start Training
Model Comparison
No evaluation results yet — train models then click Evaluate All
Pre-Simulator Action Items (Now through May)

1. Register at dcl-project.com

Create team account for simulator access, specs updates, and submission portal. Register Now

2. Perception-Aware RL Reward — DONE

Implemented in ApexDroneEnv (train_apex.py). cos(camera, gate) reward + 24D obs + camera FOV sim + 4 course layouts.

python train_apex.py policy --steps 10000000

3. Gate Corner Keypoints — DONE

YOLO11n-pose: bbox + 4 corners in one forward pass. Auto-generated labels from existing bbox annotations.

python train_apex.py keypoints

4. Upgrade SimDrone Physics

Add motor dynamics (20ms time constant), quadratic prop forces, aero drag, battery voltage sag. Essential for RL transfer.
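The motor-dynamics upgrade in item 4 is a first-order lag on motor speed plus a quadratic prop-thrust model. A minimal sketch, assuming the 20 ms time constant stated above (the thrust coefficient is an illustrative placeholder, not a measured value):

```python
def motor_step(omega, omega_cmd, dt, tau=0.020):
    """First-order motor lag: omega relaxes toward the commanded
    speed with time constant tau (20 ms), so thrust cannot change
    instantly. Forward-Euler step; keep dt well below tau."""
    return omega + (omega_cmd - omega) * (dt / tau)

def prop_thrust(omega, k_thrust=1.5e-6):
    """Quadratic prop force: T = k * omega^2 (k_thrust illustrative)."""
    return k_thrust * omega * omega

# Step response: after 5 time constants the motor is ~99% settled
omega = 0.0
for _ in range(100):                 # 100 steps of 1 ms = 5 * tau
    omega = motor_step(omega, 1000.0, 0.001)
```

Without this lag the simulated drone responds to thrust commands instantaneously, and a PPO policy trained on it learns bang-bang actions that do not transfer to hardware.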

5. Expand Dataset to 10K+

Synthetic renders + augmentation. Need domain randomization for VQ2's complex 3D-scanned environments.

python generate_training_sets.py
python train_models.py rfdetr --data dataset_gates_10k
What Wins vs What Loses

Winners

Robust perception under adversity. <50ms latency. Perception-aware trajectories. Self-calibration. Conservative TWR (3.8x). 8+ seconds blind navigation.

Losers

Rely on GPS. Use angle mode instead of rate mode. No fallback when detection fails. Train in an inaccurate simulator. Ignore perception-action coupling. Don't test on complex scenes.

Key Performance Targets
>99%
DETECT mAP
<5ms
DETECT LATENCY
<5%
PnP DEPTH ERR
<50ms
E2E LATENCY
5M+
RL STEPS