AIGP COMMAND CENTER
OVERVIEW
PERCEPTION
RL POLICY
TRAIN MODELS
GAME PLAN
Competition Timeline
NOW
Pre-Sim Prep
Train models
Build pipeline
MAY
Simulator
Release
MAY-JUL
VQ1 + VQ2
Virtual Quals
SEP
Physical
So. California
NOV
Grand Prix
Columbus, OH
Pipeline Readiness
RF-DETR Gate Detection: 97.9% mAP@50
YOLO11n Gate Detection: Trained + ONNX
U-Net Gate Segmentation: Complete + ONNX
IGPP EKF State Estimation: 8s blind nav
SAMD Multi-Frame Depth: 3-5x better PnP
PPO RL Training Pipeline: 2M steps trained
Race Pipeline State Machine: 6-phase complete
MPC Controller (CasADi): Attitude + rate
APEX Detector (YOLO11n): train_apex.py Phase 1
APEX Keypoints (Pose): train_apex.py Phase 2 — PnP corners
APEX Policy (PPO 24D): train_apex.py Phase 3 — perception reward
RL Perception-Aware Reward: DONE — cos(camera, gate) in ApexDroneEnv
24D Observation Space: DONE — upgraded from 13D
Gate Corner Keypoints: DONE — YOLO11n-pose, auto-generated labels
IGPP/SAMD Integration: TODO — wire into pipeline
DCL Simulator Adapter: Waiting for May release
Key Metrics
APEX
PIPELINE READY
2,759
TRAINING IMAGES
24D
OBSERVATION DIM
<19ms
E2E LATENCY
3
APEX PHASES
Architecture — APEX Championship Pipeline
FPV Camera (12MP, 120deg FOV, fixed front-facing)
        |
        v
+-------------------+     +------------------------+
|   APEX Detector   |---->|     APEX Keypoints     |  <19ms total
|  YOLO11n (2.6M)   |     |      YOLO11n-pose      |  pipeline latency
|  <7ms, >95% rec   |     |    4 corners -> PnP    |
+-------------------+     +----------+-------------+
                                     | 6DOF Gate Pose
                                     v
+-------------------+     +------------------------+
|        IMU        |---->|        IGPP EKF        |<-- PnP Update (60Hz)
|   Accel + Gyro    |     |        16D State       |
|       120Hz       |     |    Blind nav: 8 sec    |
+-------------------+     +----------+-------------+
                                     | State Estimate
                                     v
                          +------------------------+
                          |    APEX Policy (PPO)   |  Secret sauce:
                          |  24D obs -> 4D action  |  cos(camera, gate)
                          |   [256,256,256] MLP    |  perception reward
                          +----------+-------------+
                                     | thrust + body rates
                                     v
                          +------------------------+
                          |    MAVLink Interface   |
                          |  SET_ATTITUDE_TARGET   |
                          |     120Hz commands     |
                          +------------------------+
Gate Detection Models

RF-DETR

Transformer-based detector. Best accuracy for VQ2's complex scenes.

mAP@50: 97.9%
Latency: ~8ms (needs TensorRT)
ONNX Export: Ready

YOLO11n

Fastest detector. Ideal for VQ1's simple environment.

mAP@50: 95.2%
Latency: <3ms
ONNX + TRT: Ready

APEX Keypoints

YOLO11n-pose: bbox + 4 gate corners in one pass -> PnP 6DOF.

Model: YOLO11n-pose (train_apex.py Phase 2)
Dataset: Auto-generated from bbox labels
PnP Solve: cv2.solvePnP IPPE_SQUARE <1ms
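As a quick sanity check on the PnP output, the gate's range can be approximated from its apparent corner spacing with a pinhole model. A minimal sketch, assuming a square gate of known side length and a known focal length in pixels (both values below are illustrative, not the competition spec); the full cv2.solvePnP IPPE_SQUARE solve is what actually handles tilted views:

```python
import math

def pinhole_range(fx_px: float, gate_side_m: float, corners_px: list) -> float:
    """Approximate camera-to-gate range from apparent gate width.

    corners_px: four (u, v) pixel corners ordered TL, TR, BR, BL.
    Valid only for roughly fronto-parallel gates; similar triangles
    give Z = f * W / w_pixels.
    """
    (u0, v0), (u1, v1) = corners_px[0], corners_px[1]
    width_px = math.hypot(u1 - u0, v1 - v0)  # top edge length in pixels
    return fx_px * gate_side_m / width_px

# Example: 600 px focal length, 1.5 m gate, top edge spans 150 px -> 6.0 m
r = pinhole_range(600.0, 1.5, [(245, 200), (395, 200), (395, 350), (245, 350)])
```

A range wildly inconsistent with |tvec| from solvePnP is a cheap flag for a bad corner detection before the measurement reaches the EKF.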
Perception Pipeline Flow
Frame (640x480)
        |
        v
APEX Detector YOLO11n (<7ms) -> Bounding Box
        |
        v
APEX Keypoints YOLO11n-pose (<10ms) -> Bbox + 4 Corners
        |
        v
cv2.solvePnP (IPPE_SQUARE) -> 6DOF Pose (<1ms)
        |
        v
distance = |tvec| -> IGPP EKF update (60Hz)
        |   If no detection: IGPP blind prediction (IMU, 8s max)
        |   If multiple frames: SAMD depth refinement (3-5x better)
        v
APEX Policy: 24D obs -> [256,256,256] MLP -> 4D action (<2ms)

Total pipeline: <19ms (vs Swift 48ms, MonoRace ~30ms)
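When detection drops out, the EKF falls back to IMU-only prediction. The core of that blind step is dead reckoning; a minimal sketch (not the actual IGPP implementation — gravity compensation and bias states omitted) showing why blind navigation has to be capped at ~8 s:

```python
def blind_predict(pos, vel, accel_world, dt):
    """One IMU-only prediction step: integrate acceleration into
    velocity, velocity into position. Position error grows
    quadratically with blind time, hence the 8 s cap."""
    vel = [v + a * dt for v, a in zip(vel, accel_world)]
    pos = [p + v * dt for p, v in zip(pos, vel)]
    return pos, vel

# 1 s of blind flight at 120 Hz, constant 1 m/s^2 forward acceleration
pos, vel = [0.0, 0.0, 0.0], [5.0, 0.0, 0.0]
for _ in range(120):
    pos, vel = blind_predict(pos, vel, [1.0, 0.0, 0.0], 1.0 / 120.0)
```

An unmodeled accelerometer bias of even 0.1 m/s^2 drifts this estimate by ~3 m over 8 s, which is the practical bound on blind segments.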
Dataset Status
2,759
TOTAL IMAGES
10,000+
TARGET
4
DATASETS
~7,200
IMAGES NEEDED

Expand to 10K+ for VQ2 Robustness

VQ2 uses "real 3D-scanned environments" with complex visuals. Current dataset is VQ1-quality (simple backgrounds, highlighted gates). Need synthetic renders with varied lighting, backgrounds, occlusion, and motion blur.

python generate_training_sets.py    # Hard negatives + domain randomization
python train_models.py rfdetr       # Retrain on expanded dataset
python evaluate_models.py --all     # Compare all models
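The domain randomization behind generate_training_sets.py comes down to sampling scene parameters per render. A sketch of what such a sampler could look like — the parameter names and ranges here are illustrative assumptions, not the script's actual config:

```python
import random

def sample_scene_params(rng: random.Random) -> dict:
    """Draw one randomized scene config for a synthetic render:
    varied lighting, background, occlusion, and motion blur,
    matching the VQ2 robustness goals above."""
    return {
        "light_intensity": rng.uniform(0.3, 1.5),  # dim indoor to harsh sun
        "background_id":   rng.randrange(50),      # pick a scanned backdrop
        "occlusion_frac":  rng.uniform(0.0, 0.4),  # partial gate occlusion
        "motion_blur_px":  rng.uniform(0.0, 8.0),  # simulated camera blur
        "gate_highlight":  rng.random() < 0.5,     # VQ1-style highlight on/off
    }

params = sample_scene_params(random.Random(42))
```

Sampling highlight on/off is what keeps the detector from overfitting to VQ1's highlighted gates.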
Reinforcement Learning — PPO Policy

APEX Policy (NEW)

Algorithm: PPO (stable-baselines3)
Observation: 24D (state + perception + lookahead)
Reward: Perception-aware (Swift's secret sauce)
Network: [256, 256, 256] MLP (MonoRace G&CNet)
Camera FOV Sim: 120° H x 90° V, 45° tilt
Course Randomization: 4 layouts (oval, figure8, zigzag, sprint)
ONNX Export: Ready

Training Status

Perception-Aware Reward: DONE — cos(camera, gate) in ApexDroneEnv
24D Observation Space: DONE — bearing+vis+speed+lookahead+alt
10M Training Steps: Ready — ~4h on RTX 5080
Domain Randomization: TODO — mass, drag, motor dynamics
Asymmetric Critic: TODO — privileged info for value fn
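The 24D observation concatenates state, perception, and lookahead features. The exact layout lives in ApexDroneEnv; the grouping below is a hypothetical sketch consistent with the labels above (bearing + vis + speed + lookahead + alt), not the verified ordering:

```python
def build_observation(state: dict) -> list:
    """Assemble a flat 24D observation from named feature groups.
    Grouping is illustrative: 13D base state + perception + lookahead."""
    obs = (
        state["position"]        # 3: world position
        + state["velocity"]      # 3: world velocity
        + state["attitude"]      # 4: orientation quaternion
        + state["body_rates"]    # 3: gyro rates
        + state["gate_bearing"]  # 3: unit vector toward current gate
        + [state["gate_visible"], state["speed"], state["altitude_err"]]
        + state["lookahead"]     # 5: next-gate cues
    )
    assert len(obs) == 24, f"expected 24D, got {len(obs)}D"
    return obs

obs = build_observation({
    "position": [0.0, 0.0, 2.0], "velocity": [5.0, 0.0, 0.0],
    "attitude": [1.0, 0.0, 0.0, 0.0], "body_rates": [0.0, 0.0, 0.0],
    "gate_bearing": [1.0, 0.0, 0.0], "gate_visible": 1.0,
    "speed": 5.0, "altitude_err": 0.0, "lookahead": [0.0] * 5,
})
```

Keeping the layout in one function like this makes the 13D-to-24D upgrade auditable and keeps the ONNX export's input contract explicit.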
Reward Function — The Key to Winning

APEX implements Swift's #1 insight — DONE in ApexDroneEnv (train_apex.py)

# APEX Reward (implemented in train_apex.py ApexDroneEnv)
r = 2.0 * dist_reduction       # progress toward gate
  + 100 * (1 + speed_bonus)    # gate passage (faster = more reward, up to 3x)
  + 0.3 * cos(cam, gate)       # PERCEPTION: camera aimed at gate (SECRET SAUCE)
  + 0.2 * fov_centering        # bonus for gate near FOV center
  + 0.15 * speed/25            # go fast when gate is visible
  - 0.02 * |action_delta|      # smoothness penalty
  - 0.05 * altitude_error      # stay at gate height
  - 200 * crash                # collision = catastrophic
  - 0.005 * dt                 # time penalty

# Camera FOV simulation: 120° H x 90° V, tilted 45° up
# Gate visibility = in_fov AND detectable (>5px at current range)
# Lost gate penalty grows with time: -min(t_blind * 0.1, 0.5)
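The perception term above rewards keeping the camera's optical axis on the gate. A minimal sketch of the cos(camera, gate) computation with a cone-style FOV visibility check, assuming world-frame vectors — the real ApexDroneEnv version also models the separate 120° H x 90° V limits, the 45° camera tilt, and pixel-size detectability:

```python
import math

def perception_reward(cam_forward, gate_pos, drone_pos,
                      fov_deg=120.0, w=0.3):
    """cos(camera, gate): alignment between the camera axis (unit
    vector) and the drone->gate direction, zeroed when the gate
    falls outside a simple FOV cone."""
    d = [g - p for g, p in zip(gate_pos, drone_pos)]
    norm = math.sqrt(sum(x * x for x in d))
    gate_dir = [x / norm for x in d]
    cos_angle = sum(c * g for c, g in zip(cam_forward, gate_dir))
    in_fov = cos_angle >= math.cos(math.radians(fov_deg / 2.0))
    return w * cos_angle if in_fov else 0.0

# Camera aimed straight at the gate -> full 0.3 bonus
r = perception_reward([1.0, 0.0, 0.0], [10.0, 0.0, 2.0], [0.0, 0.0, 2.0])
```

The zero outside the FOV is what couples perception to action: the policy loses reward the moment an aggressive maneuver swings the gate out of view.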
Training Commands
# APEX — Full championship pipeline (~7.5h overnight)
python train_apex.py                     # all 3 phases

# Individual phases
python train_apex.py detector            # Phase 1: YOLO11n (~2h)
python train_apex.py keypoints           # Phase 2: YOLO11n-pose (~1.5h)
python train_apex.py policy              # Phase 3: PPO 10M steps (~4h)
python train_apex.py policy --steps 5000000   # quick 5M test

# Evaluate + export
python train_apex.py eval                # benchmark all models
python train_apex.py export              # package for submission

# Or launch from dashboard — click APEX button below
Reference Systems

MonoRace (A2RL 2025)

Beat 3 human world champions. PPO G&CNet, 24D->4D at 500Hz. U-Net + PnP + EKF. 28.23 m/s peak.

Swift (Nature 2023)

First AI to beat humans. PPO with perception reward. Sim-to-real via residual models from 50s of real data.

SkyDreamer (ICLR 2025)

End-to-end pixels to motors. DreamerV3 world model. VQ-VAE latent space. 21 m/s, 6g.

Training Jobs
No active jobs — start training below
Live Metrics
mAP@50
Loss
Live Log
Waiting for training output…
Start Training
Model Comparison
No evaluation results yet — train models then click Evaluate All
Pre-Simulator Action Items (Now through May)

1. Register at dcl-project.com

Create team account for simulator access, specs updates, and submission portal. Register Now

2. Perception-Aware RL Reward — DONE

Implemented in ApexDroneEnv (train_apex.py). cos(camera, gate) reward + 24D obs + camera FOV sim + 4 course layouts.

python train_apex.py policy --steps 10000000

3. Gate Corner Keypoints — DONE

YOLO11n-pose: bbox + 4 corners in one forward pass. Auto-generated labels from existing bbox annotations.

python train_apex.py keypoints

4. Upgrade SimDrone Physics

Add motor dynamics (20ms time constant), quadratic prop forces, aero drag, battery voltage sag. Essential for RL transfer.
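The motor-dynamics upgrade in item 4 is a first-order lag on motor speed plus a quadratic prop-thrust model. A minimal sketch, assuming the 20 ms time constant stated above (the thrust coefficient is an illustrative placeholder, not a measured value):

```python
def motor_step(omega, omega_cmd, dt, tau=0.020):
    """First-order motor lag: omega relaxes toward the commanded
    speed with time constant tau (20 ms), so thrust cannot change
    instantly. Forward-Euler step; keep dt well below tau."""
    return omega + (omega_cmd - omega) * (dt / tau)

def prop_thrust(omega, k_thrust=1.5e-6):
    """Quadratic prop force: T = k * omega^2 (k_thrust illustrative)."""
    return k_thrust * omega * omega

# Step response: after 5 time constants the motor is ~99% settled
omega = 0.0
for _ in range(100):                 # 100 steps of 1 ms = 5 * tau
    omega = motor_step(omega, 1000.0, 0.001)
```

Without this lag the simulated drone responds to thrust commands instantaneously, and a PPO policy trained on it learns bang-bang actions that do not transfer to hardware.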

5. Expand Dataset to 10K+

Synthetic renders + augmentation. Need domain randomization for VQ2's complex 3D-scanned environments.

python generate_training_sets.py
python train_models.py rfdetr --data dataset_gates_10k
What Wins vs What Loses

Winners

Robust perception under adversity. <50ms latency. Perception-aware trajectories. Self-calibration. Conservative TWR (3.8x). 8+ seconds blind navigation.

Losers

Rely on GPS. Use angle mode instead of rate mode. No fallback when detection fails. Train in an inaccurate simulator. Ignore perception-action coupling. Don't test on complex scenes.

Key Performance Targets
>99%
DETECT mAP
<5ms
DETECT LATENCY
<5%
PnP DEPTH ERR
<50ms
E2E LATENCY
5M+
RL STEPS