Transformer-based detector. Best accuracy on VQ2's complex scenes.
Fastest detector. Ideal for VQ1's simple environment.
YOLO11n-pose: bbox + 4 gate corners in one pass -> PnP 6DOF.
Frame (640x480) -> APEX Detector YOLO11n (<7ms) -> Bounding Box
|
APEX Keypoints YOLO11n-pose (<10ms) -> Bbox + 4 Corners
|
cv2.solvePnP (IPPE_SQUARE) -> 6DOF Pose (<1ms)
|
distance = |tvec| -> IGPP EKF update (60Hz)
|
If no detection: IGPP blind prediction (IMU, 8s max)
If multiple frames: SAMD depth refinement (3-5x better)
|
APEX Policy: 24D obs -> [256,256,256] MLP -> 4D action (<2ms)
|
Total pipeline: <19ms (vs Swift 48ms, MonoRace ~30ms)
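The detect -> PnP -> EKF portion of the pipeline above can be sketched as a minimal 1-D distance filter: predict from IMU-derived closing speed when there is no detection, and fuse `distance = |tvec|` when PnP returns a pose. Everything here is illustrative; the actual IGPP state vector, noise values, and names are not given in this document.

```python
import math

# Minimal sketch of an IGPP-style EKF distance update. Assumed 1-D state
# (distance to gate); noise constants q and r are illustrative placeholders.
class DistanceEKF:
    def __init__(self, d0, var0=1.0):
        self.d = d0          # distance estimate (m)
        self.P = var0        # estimate variance

    def predict(self, closing_speed, dt, q=0.05):
        # Blind prediction from IMU closing speed (used when detection drops out).
        self.d -= closing_speed * dt
        self.P += q * dt

    def update(self, tvec, r=0.04):
        # Vision measurement: distance = |tvec| from solvePnP.
        z = math.sqrt(tvec[0] ** 2 + tvec[1] ** 2 + tvec[2] ** 2)
        K = self.P / (self.P + r)        # Kalman gain
        self.d += K * (z - self.d)
        self.P *= (1.0 - K)
        return self.d

ekf = DistanceEKF(d0=10.0)
ekf.predict(closing_speed=8.0, dt=1 / 60)   # one 60 Hz prediction step
ekf.update(tvec=(0.3, -0.1, 9.5))           # fuse one PnP measurement
```

With a low measurement noise relative to the prior, the estimate snaps toward the PnP distance in a single update, which is the behavior the 60 Hz loop relies on.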
VQ2 uses "real 3D-scanned environments" with complex visuals. The current dataset is VQ1-quality (simple backgrounds, highlighted gates). We need synthetic renders with varied lighting, backgrounds, occlusion, and motion blur.
python generate_training_sets.py  # Hard negatives + domain randomization
python train_models.py rfdetr     # Retrain on expanded dataset
python evaluate_models.py --all   # Compare all models
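One way to structure the domain randomization for those synthetic renders is a table of sampling ranges per nuisance factor. The ranges and factor names below are assumptions for illustration, not values taken from generate_training_sets.py.

```python
import random

# Illustrative domain-randomization ranges for synthetic gate renders.
# All values are assumed, not read from the actual generator.
RANDOMIZATION = {
    "brightness": (0.5, 1.5),   # global exposure multiplier
    "blur_px":    (0.0, 9.0),   # motion-blur kernel length in pixels
    "occlusion":  (0.0, 0.4),   # fraction of the gate hidden
    "bg_texture": ["wood", "concrete", "foliage", "indoor_scan"],
}

def sample_render_params(rng=random):
    # Draw one parameter set: uniform for ranges, choice for categories.
    params = {}
    for key, spec in RANDOMIZATION.items():
        if isinstance(spec, list):
            params[key] = rng.choice(spec)
        else:
            lo, hi = spec
            params[key] = rng.uniform(lo, hi)
    return params

params = sample_render_params()
```

Sampling a fresh parameter set per render keeps the detector from latching onto VQ1-style highlighted gates and clean backgrounds.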
APEX implements Swift's #1 insight — DONE in ApexDroneEnv (train_apex.py)
# APEX Reward (implemented in train_apex.py ApexDroneEnv)
r = 2.0 * dist_reduction       # progress toward gate
  + 100 * (1 + speed_bonus)    # gate passage (faster = more reward, up to 3x)
  + 0.3 * cos(cam, gate)       # PERCEPTION: camera aimed at gate (SECRET SAUCE)
  + 0.2 * fov_centering        # bonus for gate near FOV center
  + 0.15 * speed/25            # go fast when gate is visible
  - 0.02 * |action_delta|      # smoothness penalty
  - 0.05 * altitude_error      # stay at gate height
  - 200 * crash                # collision = catastrophic
  - 0.005 * dt                 # time penalty

# Camera FOV simulation: 120° H x 90° V, tilted 45° up
# Gate visibility = in_fov AND detectable (>5px at current range)
# Lost gate penalty grows with time: -min(t_blind * 0.1, 0.5)
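The reward terms listed above can be written as one runnable function. The weights match the listing; the state-dict field names and the visibility gating on the speed term are illustrative, not taken from ApexDroneEnv.

```python
# Sketch of the APEX reward as a single function over a state dict.
# Field names are assumed; weights follow the listing in this document.
def apex_reward(s):
    r  = 2.0   * s["dist_reduction"]                        # progress toward gate
    r += 100.0 * (1.0 + s["speed_bonus"]) * s["gate_passed"]  # gate passage bonus
    r += 0.3   * s["cos_cam_gate"]                          # camera aimed at gate
    r += 0.2   * s["fov_centering"]                         # gate near FOV center
    r += 0.15  * s["speed"] / 25.0 * s["gate_visible"]      # go fast when visible
    r -= 0.02  * abs(s["action_delta"])                     # smoothness penalty
    r -= 0.05  * abs(s["altitude_error"])                   # stay at gate height
    r -= 200.0 * s["crashed"]                               # collision = catastrophic
    r -= 0.005 * s["dt"]                                    # time penalty
    r -= min(s["t_blind"] * 0.1, 0.5)                       # lost-gate penalty
    return r

example_state = {
    "dist_reduction": 0.5, "speed_bonus": 0.0, "gate_passed": 0,
    "cos_cam_gate": 1.0, "fov_centering": 0.5, "speed": 10.0,
    "gate_visible": 1, "action_delta": 0.1, "altitude_error": 0.2,
    "crashed": 0, "dt": 0.02, "t_blind": 0.0,
}
r = apex_reward(example_state)
```

Note how the perception terms (cos alignment, FOV centering, blind-time penalty) are dense every step, while gate passage and crashes are sparse events; that mix is what couples perception to the control policy.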
# APEX — Full championship pipeline (~7.5h overnight)
python train_apex.py                          # all 3 phases

# Individual phases
python train_apex.py detector                 # Phase 1: YOLO11n (~2h)
python train_apex.py keypoints                # Phase 2: YOLO11n-pose (~1.5h)
python train_apex.py policy                   # Phase 3: PPO 10M steps (~4h)
python train_apex.py policy --steps 5000000   # quick 5M test

# Evaluate + export
python train_apex.py eval                     # benchmark all models
python train_apex.py export                   # package for submission

# Or launch from dashboard — click APEX button below
Beat 3 human world champions. PPO G&CNet, 24D->4D at 500Hz. U-Net + PnP + EKF. 28.23 m/s peak.
First AI to beat humans. PPO with perception reward. Sim-to-real via residual models from 50s of real data.
End-to-end pixels to motors. DreamerV3 world model. VQ-VAE latent space. 21 m/s, 6g.
Create a team account for simulator access, spec updates, and the submission portal.
Implemented in ApexDroneEnv (train_apex.py). cos(camera, gate) reward + 24D obs + camera FOV sim + 4 course layouts.
python train_apex.py policy --steps 10000000
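For concreteness, a 24-D observation can be assembled from a handful of 3-vectors plus a few scalars. The layout below is hypothetical; ApexDroneEnv's actual field order is not given in this document, this only shows how 24 scalars might break down.

```python
# Hypothetical layout of the 24-D observation vector (field order assumed).
def build_obs(state):
    obs = []
    obs += state["position"]        # 3: world xyz
    obs += state["velocity"]        # 3: world-frame velocity
    obs += state["attitude"]        # 3: roll, pitch, yaw
    obs += state["angular_rate"]    # 3: body rates
    obs += state["gate_rel"]        # 3: vector to next gate
    obs += state["next_gate_rel"]   # 3: vector to the gate after next
    obs += state["gate_normal"]     # 3: gate plane normal
    obs += [state["gate_visible"],  # 1: detection flag
            state["t_blind"],       # 1: seconds since last detection
            state["last_thrust"]]   # 1: previous throttle command
    assert len(obs) == 24
    return obs

example = {
    "position": [0.0] * 3, "velocity": [0.0] * 3, "attitude": [0.0] * 3,
    "angular_rate": [0.0] * 3, "gate_rel": [0.0] * 3,
    "next_gate_rel": [0.0] * 3, "gate_normal": [0.0] * 3,
    "gate_visible": 1.0, "t_blind": 0.0, "last_thrust": 0.0,
}
obs = build_obs(example)
```

Including the visibility flag and blind time in the observation is what lets the policy learn distinct behaviors for tracked versus lost gates.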
YOLO11n-pose: bbox + 4 corners in one forward pass. Auto-generated labels from existing bbox annotations.
python train_apex.py keypoints
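Auto-generating pose labels from bbox annotations can be as simple as emitting the four bbox corners as keypoints. This is a sketch under the assumption that the gate roughly fills its box; the real auto-labeler may refine corner positions, and the exact label writer is not shown in this document.

```python
# Sketch: derive a YOLO-pose label (class, bbox, 4 keypoints) from a
# YOLO-format bbox (cx, cy, w, h, all normalized). Corner positions are
# approximated by the bbox corners; visibility flag 2 = labeled and visible.
def bbox_to_pose_label(cls, cx, cy, w, h, visible=2):
    corners = [
        (cx - w / 2, cy - h / 2),   # top-left
        (cx + w / 2, cy - h / 2),   # top-right
        (cx + w / 2, cy + h / 2),   # bottom-right
        (cx - w / 2, cy + h / 2),   # bottom-left
    ]
    fields = [cls, cx, cy, w, h]
    for kx, ky in corners:
        fields += [kx, ky, visible]   # x, y, visibility per keypoint
    return " ".join(f"{v:.6g}" for v in fields)

label = bbox_to_pose_label(0, 0.5, 0.5, 0.4, 0.3)
```

Each output line then has 5 bbox fields plus 3 fields per keypoint (17 total for 4 corners), which matches the YOLO pose-label shape of one row per object.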
Add motor dynamics (20ms time constant), quadratic prop forces, aero drag, battery voltage sag. Essential for RL transfer.
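The motor-lag part of that list can be sketched as a first-order filter with the stated 20 ms time constant, feeding a quadratic prop-thrust model. The thrust and drag coefficients here are illustrative placeholders, not values from the simulator.

```python
import math

# First-order motor lag (20 ms time constant, per the text) plus quadratic
# prop thrust. K_THRUST and DRAG are illustrative constants.
TAU = 0.020          # motor time constant (s)
K_THRUST = 1.2e-6    # thrust coefficient (N per rpm^2), assumed
DRAG = 0.35          # linear aero-drag coefficient, assumed

def motor_step(rpm, rpm_cmd, dt):
    # Exact discrete update of d(rpm)/dt = (rpm_cmd - rpm) / TAU.
    alpha = 1.0 - math.exp(-dt / TAU)
    return rpm + alpha * (rpm_cmd - rpm)

def thrust(rpm):
    return K_THRUST * rpm * rpm    # quadratic prop force

rpm = 0.0
for _ in range(10):                # 10 steps of 4 ms = two time constants
    rpm = motor_step(rpm, 10000.0, dt=0.004)
```

After two time constants the motor reaches about 86% of the commanded rpm, which is exactly the lag an RL policy trained on instant-response motors never sees, and why this matters for transfer.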
Synthetic renders + augmentation. Need domain randomization for VQ2's complex 3D-scanned environments.
python generate_training_sets.py
python train_models.py rfdetr --data dataset_gates_10k
Robust perception under adversity. <50ms latency. Perception-aware trajectories. Self-calibration. Conservative TWR (3.8x). 8+ seconds blind navigation.
Relying on GPS. Using angle mode instead of rate mode. Having no fallback when detection fails. Trusting an inaccurate simulator. Ignoring perception-action coupling. Not testing on complex scenes.