End-to-end autonomous drone racing pipeline • Vision-only navigation
This system pilots a racing drone autonomously through a sequence of gates at high speed. The architecture is based on MonoRace (2025 A2RL Autonomous Racing League champion): a U-Net segmentation network identifies gate pixels, RANSAC fits edge lines to extract sub-pixel corners, Perspective-n-Point (PnP) solves for 3D gate pose relative to the drone, and a state machine sequences approach, transit, and re-acquisition phases. A controller converts desired trajectories into attitude commands sent over MAVLink to the PX4 autopilot.
The pipeline runs as a single async Python process. Every frame triggers the full chain (detect, estimate, decide, command) with no queuing and no frame drops; latency from camera capture to motor command is under 8 ms on the target hardware.
Within each frame the pipeline is fully synchronous: no thread pools, no message queues. The async event loop in race_pipeline.py awaits each stage sequentially, ensuring deterministic ordering. The only concurrent path is the telemetry listener, which runs in a background coroutine and updates shared state atomically.
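The per-frame structure described above can be sketched as follows. All class and method names here are illustrative stand-ins, not the project's actual API; the point is the shape: every stage is awaited in order, and the telemetry listener is the only concurrent coroutine.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SharedState:
    racing: bool = True
    frames: int = 0
    telemetry: dict = field(default_factory=dict)


async def telemetry_listener(state: SharedState) -> None:
    # Background coroutine: the only concurrent path. Because asyncio runs
    # one coroutine at a time, these writes are atomic w.r.t. the race loop.
    while True:
        state.telemetry["alive"] = True
        await asyncio.sleep(0.01)


async def race_loop(state: SharedState, max_frames: int = 3) -> None:
    listener = asyncio.create_task(telemetry_listener(state))
    try:
        while state.racing:
            frame = state.frames                        # stand-in: camera capture
            detection = {"gate": frame}                 # stand-in: detect + PnP
            command = {"yaw_rate": 0.0, **detection}    # stand-in: state machine
            state.telemetry["last_command"] = command   # stand-in: MAVLink send
            await asyncio.sleep(0)  # yield once per frame; the stages above
                                    # ran strictly in order, with no queues
            state.frames += 1
            if state.frames >= max_frames:
                state.racing = False
    finally:
        listener.cancel()
```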
| File | Role | Description |
|---|---|---|
| race_pipeline.py | core | Main orchestrator. Async race loop, state machine (SEEK / APPROACH / TRANSIT), gate sequencing, lap counter. Entry point for the race. |
| vision_pipeline.py | vision | VisionPipeline class. Dispatches to one of 3 detector backends (Color, YOLO, U-Net). Runs PnP estimator on detected corners. Returns gate pose + confidence. |
| gate_segmentation.py | vision | GateSegNet U-Net model definition. RANSAC edge-line fitting for sub-pixel corner extraction. Training loop with augmentation. |
| race_config.py | config | RaceConfig dataclass. Every tunable parameter in one place: gains, thresholds, timeouts, camera intrinsics. YAML serialization for reproducibility. |
| trajectory_optimizer.py | control | Quintic polynomial path planning. Computes racing lines through gates with minimum-snap trajectories. Time-optimal velocity profiling. |
| drone_mpc_foundation.py | control | DroneParams physical model, AttitudeMPC controller, GatePursuitController for proportional-derivative gate tracking. |
| mavsdk_bridge.py | comms | SimBridge class. MAVLink communication layer. Offboard control modes (attitude, velocity, position). Heartbeat management. |
| camera_adapter.py | input | Camera abstraction. Sources: Gazebo pipe (sim), video file (replay), synthetic (testing). Uniform frame interface. |
| race_logger.py | infra | JSONL race telemetry logging. Per-frame state snapshots for post-race analysis and debugging. |
| rl_controller.py | control | Gym environment wrapper, PPO training loop, NeuralController inference. Reinforcement learning alternative to PD controller. |
| dashboard_server.py | infra | Tornado WebSocket server. Streams telemetry to browser dashboard at 30 Hz. |
| dashboard.html | infra | Web dashboard with Three.js 3D scene, FPV camera feed, real-time telemetry gauges, lap timing. |
| yolo-train.py | vision | YOLOv8 training pipeline. Dataset management, hyperparameter config, TensorRT FP16 export for edge deployment. |
| yolo-auto-label.py | vision | Bootstrapping tool. Uses VQ1 color detection on known-color gates to generate YOLO bounding box labels automatically. |
The vision pipeline supports three interchangeable detector backends, selected via race_config.py. Each makes a different trade-off among speed, robustness, and accuracy.
| Mode | ID | Method | Latency | Trade-off |
|---|---|---|---|---|
| Color | VQ1 | HSV threshold + contour | ~0.5 ms | Fastest. Requires highlighted/colored gates. Brittle under varying lighting. Good for sim bootstrapping. |
| YOLO | VQ2 | YOLOv8 neural network | ~12 ms | Handles complex backgrounds and partial occlusion. Needs labeled training data. Bounding box only (no corners). |
| U-Net | Primary | Pixel segmentation + RANSAC | ~5 ms | Best PnP accuracy. Pixel-level gate mask enables RANSAC edge fitting and line intersection for sub-pixel corners. |
The same codebase runs on three distinct platforms. All platform-specific differences are isolated in race_config.py — no conditional logic in the pipeline code.
| | Simulation | Hardware | Competition |
|---|---|---|---|
| Connection | udpin://0.0.0.0:14540 | Serial MAVLink (UART) | TBD at event |
| Camera | Synthetic / Gazebo pipe | RPi Camera 3 Wide | Provided by Neros |
| Compute | Host machine (any GPU) | Jetson Orin Nano 8GB | Onboard (Neros spec) |
| Use case | Development + VQ testing | Real-world testing | Official competition |
Offboard setpoint messages:

SET_ATTITUDE_TARGET — roll, pitch, yaw rate, thrust
SET_POSITION_TARGET_LOCAL_NED — x, y, z, vx, vy, vz

| Parameter | Value |
|---|---|
| Protocol | MAVLink v2 |
| Transport | UDP (sim) / Serial UART (hardware) |
| Heartbeat | 2 Hz minimum for offboard mode (we send 4 Hz) |
| Command rate | 50-120 Hz (matches vision frame rate) |
| Telemetry rate | 120 Hz (position, velocity, attitude, IMU) |
# Offboard control flow
1. Connect to PX4 via UDP/serial
2. Start heartbeat at 4 Hz
3. Stream SET_ATTITUDE_TARGET at frame rate
4. PX4 enters offboard mode after ~0.5s of valid commands
5. Arm → takeoff → race loop → land
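One detail worth noting about step 3: SET_ATTITUDE_TARGET carries attitude as a quaternion in [w, x, y, z] order (zero rotation is [1, 0, 0, 0]), so the controller's roll/pitch/yaw output must be converted before sending. A minimal ZYX Euler-to-quaternion helper (illustrative, not the project's code):

```python
import math


def euler_to_quaternion(roll, pitch, yaw):
    """Convert ZYX Euler angles (radians) to [w, x, y, z].

    This is the quaternion layout SET_ATTITUDE_TARGET expects.
    """
    cr, sr = math.cos(roll / 2), math.sin(roll / 2)
    cp, sp = math.cos(pitch / 2), math.sin(pitch / 2)
    cy, sy = math.cos(yaw / 2), math.sin(yaw / 2)
    return [
        cr * cp * cy + sr * sp * sy,  # w
        sr * cp * cy - cr * sp * sy,  # x
        cr * sp * cy + sr * cp * sy,  # y
        cr * cp * sy - sr * sp * cy,  # z
    ]
```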
No GPS, LiDAR, or depth sensor. All distance estimation comes from PnP with 4 known gate corner positions in world coordinates. Gate dimensions (1.5 m x 1.5 m) are the only world-scale reference. This makes corner accuracy the single most important factor in the system.
Pixel-level segmentation mask enables RANSAC edge fitting along each gate side, then line-line intersection for corner extraction. This yields sub-pixel corner accuracy, which directly improves PnP reprojection error and therefore depth estimation. YOLO's bounding box corners are not precise enough for reliable PnP.
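The corner extraction described above can be sketched as a small RANSAC line fit plus a line-line intersection. Helper names and thresholds are illustrative; the real pipeline fits one line per gate side over the mask's edge pixels and intersects adjacent sides:

```python
import numpy as np


def ransac_line(points, iters=200, tol=1.0, rng=None):
    """Fit a 2D line to noisy edge pixels with RANSAC.

    points: (N, 2) array. Returns (point_on_line, unit_direction); the
    direction is refined by total least squares over the inlier set.
    """
    rng = np.random.default_rng(rng)
    best_inliers = points[:2]
    for _ in range(iters):
        a, b = points[rng.choice(len(points), size=2, replace=False)]
        d = b - a
        norm = np.linalg.norm(d)
        if norm < 1e-9:
            continue
        d = d / norm
        normal = np.array([-d[1], d[0]])        # unit normal to candidate line
        dist = np.abs((points - a) @ normal)    # perpendicular distances
        inliers = points[dist < tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    mean = best_inliers.mean(axis=0)
    _, _, vt = np.linalg.svd(best_inliers - mean)  # principal direction
    return mean, vt[0]


def intersect(p1, d1, p2, d2):
    """Sub-pixel corner: intersection of two point-direction lines."""
    t = np.linalg.solve(np.column_stack([d1, -d2]), p2 - p1)[0]
    return p1 + t * d1
```

Because the intersection is computed from two fitted lines rather than read off the pixel grid, the corner lands between pixels, which is where the sub-pixel accuracy comes from.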
If the gate is lost (no detection for N frames), the state machine does not attempt to fly to a remembered position. Instead, it immediately enters SEEK and spins at 180 deg/s to re-acquire the gate visually. This is faster and more reliable than dead-reckoning to an estimated position without GPS.
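The rule above can be sketched as a small transition function. Names, the N = 10 miss threshold, and the command dict are illustrative; the yaw rate matches the seek_yaw_rate parameter listed below:

```python
from enum import Enum, auto


class Phase(Enum):
    SEEK = auto()
    APPROACH = auto()
    TRANSIT = auto()


LOST_FRAMES_N = 10      # illustrative miss threshold
SEEK_YAW_RATE = 180.0   # deg/s: a full rotation in 2 s


def update_phase(phase, gate_visible, missed):
    """One step of the lost-gate rule: count misses, drop to SEEK at N."""
    missed = 0 if gate_visible else missed + 1
    if missed >= LOST_FRAMES_N:
        return Phase.SEEK, missed
    if gate_visible and phase is Phase.SEEK:
        return Phase.APPROACH, 0   # reacquired visually: resume the approach
    return phase, missed


def seek_command():
    # Yaw-only setpoint while re-acquiring; no translation toward a
    # remembered position, since there is no GPS to dead-reckon against.
    return {"yaw_rate_deg_s": SEEK_YAW_RATE, "pitch_deg": 0.0}
```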
Every gain, threshold, and timeout lives in race_config.py as a YAML-serializable dataclass. No magic numbers in pipeline code. Config can be swapped at runtime for different platforms, gate layouts, or tuning experiments. Every race log includes the full config snapshot for reproducibility.
| Parameter | Value | Context |
|---|---|---|
| kp_yaw | 50 | Yaw proportional gain |
| cruise_pitch | -25 deg | Nose-down cruise angle |
| max_tilt | 70 deg | MonoRace uses 65+ |
| seek_yaw_rate | 180 deg/s | Full rotation in 2 s |
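The config pattern can be sketched as a dataclass round-trip. Field names echo the table above (the real RaceConfig holds many more); the sketch serializes with stdlib json to stay dependency-free, whereas the project uses YAML:

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class RaceConfig:
    """Illustrative subset of the tunables; one flat, serializable record."""
    kp_yaw: float = 50.0
    cruise_pitch_deg: float = -25.0
    max_tilt_deg: float = 70.0
    seek_yaw_rate_deg_s: float = 180.0

    def snapshot(self) -> str:
        # Embedding this string in every race log is what makes a run
        # reproducible: the exact gains travel with the telemetry.
        return json.dumps(asdict(self))

    @classmethod
    def from_snapshot(cls, s: str) -> "RaceConfig":
        return cls(**json.loads(s))
```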
Gate pass-through is not determined by a simple distance threshold, which can false-trigger on approach. Instead, the system tracks the distance derivative over 3+ consecutive frames: TRANSIT fires only when the distance is below threshold and has been monotonically decreasing, confirming closing velocity. This eliminates false transitions from PnP noise at close range.
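A sketch of that gate, assuming a 3-frame window and an illustrative 1.0 m threshold (the class and parameter names are stand-ins):

```python
from collections import deque


class TransitTrigger:
    """Derivative-gated pass-through check: distance must be below the
    threshold AND strictly decreasing across the whole window."""

    def __init__(self, dist_threshold=1.0, window=3):
        self.dist_threshold = dist_threshold
        self.history = deque(maxlen=window)

    def update(self, gate_distance_m: float) -> bool:
        self.history.append(gate_distance_m)
        if len(self.history) < self.history.maxlen:
            return False  # not enough frames yet to confirm a trend
        h = list(self.history)
        closing = all(b < a for a, b in zip(h, h[1:]))
        return closing and gate_distance_m < self.dist_threshold
```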