The async race loop: how one frame turns into one command. Six states, deterministic transitions, sub-50 ms latency budget. Same state machine serves VQ1 (PID controller) and VQ2 (PPO controller).
race_pipeline.py

| State | Enter when | Exit when | Controller behavior |
|---|---|---|---|
| IDLE | Start of run, armed but not flying | First gate detected | Zero throttle, neutral attitude |
| SEEK | No gate detected for N frames | Gate reacquired | Slow forward + gentle yaw sweep |
| APPROACH | Gate detected, >2.5m away | Gate within transit threshold | PID to gate center (heading + alt) |
| TRANSIT | Distance < 2.5m, derivative closing | Distance increasing (passed through) | Committed: low yaw, pitch held |
| RECOVER | Post-crash (collision or altitude excursion) | Level attitude regained, in-frame gate | Level-off maneuver, throttle to hover |
| DONE | Last gate passed (finish line) | Run end | Zero throttle, descend to hover |
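
The table reads naturally as a pure transition function. A minimal sketch follows, not the pipeline's actual `state.transition` (which takes `target`, `gate_cam`, `telemetry`). The `Inputs` fields and `SEEK_AFTER_N` are hypothetical names, and N is not given in the source. The table does not name successor states, so the ones after IDLE, TRANSIT, and RECOVER are assumptions.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class RaceState(Enum):
    IDLE = auto()
    SEEK = auto()
    APPROACH = auto()
    TRANSIT = auto()
    RECOVER = auto()
    DONE = auto()


TRANSIT_DIST_M = 2.5  # transit threshold from the table
SEEK_AFTER_N = 5      # hypothetical value; the table only says "N frames"


@dataclass
class Inputs:
    """Per-tick facts the transition needs (hypothetical field names)."""
    gate_visible: bool
    dist_m: Optional[float] = None  # distance to gate, None when not visible
    closing: bool = False           # distance shrinking frame over frame
    receding: bool = False          # distance growing: gate is behind
    missed_frames: int = 0          # consecutive frames without a detection
    crashed: bool = False           # collision or altitude excursion
    level: bool = True              # attitude back to level
    last_gate_passed: bool = False


def next_state(s: RaceState, x: Inputs) -> RaceState:
    """Deterministic: same (state, inputs) always yields the same next state."""
    if s is RaceState.DONE:
        return s
    if x.last_gate_passed:
        return RaceState.DONE
    if x.crashed:
        return RaceState.RECOVER
    if s is RaceState.RECOVER:
        # Assumption: recovery hands back to APPROACH once a gate is in frame.
        return RaceState.APPROACH if (x.level and x.gate_visible) else s
    if s in (RaceState.IDLE, RaceState.SEEK):
        return RaceState.APPROACH if x.gate_visible else s
    if s is RaceState.APPROACH:
        near = x.dist_m is not None and x.dist_m < TRANSIT_DIST_M
        if x.gate_visible and near and x.closing:
            return RaceState.TRANSIT
        if x.missed_frames >= SEEK_AFTER_N:
            return RaceState.SEEK
        return s
    # TRANSIT: stay committed until the gate is behind.
    if x.receding:
        # Assumption: go straight to APPROACH if the next gate is already visible.
        return RaceState.APPROACH if x.gate_visible else RaceState.SEEK
    return s
```

Keeping the function free of side effects makes each row of the table testable in isolation, for example `next_state(RaceState.APPROACH, Inputs(gate_visible=True, dist_m=2.0, closing=True))` yields `RaceState.TRANSIT`.
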
    async def race_tick(frame_bgr, telemetry, state):
        # detector, keypoints, samd, ppo_policy, pid_pilot, GATE_3D and K are
        # assumed to be module-level objects set up before the loop starts.
        gate_cam = None  # stays None when no gate is selected this frame

        # 1) Perception (8 ms budget)
        detections = await detector.infer(frame_bgr)  # ~5 ms
        target = select_target(detections, state.prev_gate_id)
        if target:
            corners = await keypoints.infer(frame_bgr, target.bbox)  # ~3 ms
            gate_cam = solve_pnp_square(GATE_3D, corners, K)
            # Refine with SAMD if window allows
            if samd.has_estimate():
                gate_cam = samd.get_depth_estimate()
            # Fuse with IGPP
            state.igpp.vision_update(gate_cam, target.confidence)

        # 2) State transition (logic-only)
        state.transition(target, gate_cam, telemetry)

        # 3) Command generation (1 ms)
        if state.is_ppo_mode:  # VQ2
            obs = build_obs(target, gate_cam, telemetry, state.last_action)
            cmd = ppo_policy.predict(obs)
        else:  # VQ1
            cmd = pid_pilot.step(gate_cam, telemetry)

        # 4) Log + return (0.2 ms)
        state.log.write(frame_bgr, detections, gate_cam, cmd)
        return cmd, state

Gate pass-through is not a simple distance threshold, which would false-trigger during the approach. The condition is:

    close_enough = dist < 2.5                  # meters
    closing      = (prev_dist - dist) > 0.1    # meters per frame
    committed    = close_enough AND closing across 3+ consecutive frames
    transit_done = dist increasing after committed (gate behind)

Requiring several consecutive closing frames keeps single-frame PnP noise at close range from triggering a false transition.
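
The condition above can be sketched as a small stateful detector fed one distance per frame. This is an illustration under assumptions, not the pipeline's code: `TransitDetector` and its return labels are hypothetical names, and "dist increasing" is read here as a single frame where the distance grows.

```python
from dataclasses import dataclass
from typing import Optional

CLOSE_DIST_M = 2.5         # "dist < 2.5" from the text
CLOSING_M_PER_FRAME = 0.1  # "(prev_dist - dist) > 0.1" from the text
MIN_CLOSING_FRAMES = 3     # "3+ consecutive frames" from the text


@dataclass
class TransitDetector:
    prev_dist: Optional[float] = None
    closing_streak: int = 0
    committed: bool = False

    def update(self, dist: float) -> str:
        """Feed one gate distance in meters; return 'approach', 'committed' or 'done'."""
        prev, self.prev_dist = self.prev_dist, dist
        if prev is None:
            return "approach"  # need two samples before judging the trend
        if self.committed:
            if dist > prev:
                # Distance grows after commit: the gate is behind us.
                self.committed = False
                self.closing_streak = 0
                return "done"
            return "committed"
        close_enough = dist < CLOSE_DIST_M
        closing = (prev - dist) > CLOSING_M_PER_FRAME
        # Any frame that is not both close and closing resets the streak.
        self.closing_streak = self.closing_streak + 1 if (close_enough and closing) else 0
        if self.closing_streak >= MIN_CLOSING_FRAMES:
            self.committed = True
            return "committed"
        return "approach"
```

With a clean pass such as `[3.0, 2.4, 2.2, 2.0, 1.5, 0.8, 1.1]` the detector commits on the fourth sample and reports `done` on the last. A jittery hover near the threshold such as `[2.6, 2.3, 2.45, 2.3, 2.35]` never commits, because the streak keeps resetting.
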
| Stage | Budget | Actual (RTX 5080 dev) |
|---|---|---|
| Detector (Phase 1) | 5 ms | ~5 ms PyTorch, <3 ms TensorRT |
| Keypoints (Phase 2) | 3 ms | ~3 ms |
| PnP solve | 0.5 ms | ~0.3 ms |
| SAMD refine (when applicable) | 0.6 ms | ~0.6 ms |
| IGPP update | 0.1 ms | ~0.05 ms |
| State transition | 0.1 ms | ~0.01 ms |
| Controller (PID or PPO) | 0.5 ms | ~0.2 ms |
| Logging (async) | 0.2 ms | ~0.1 ms off-path |
| Total | <10 ms | ~9 ms |
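
One way to check a tick against these budgets is to wrap each stage in a timer. A minimal sketch, assuming hypothetical stage keys and a hypothetical `StageTimer` helper; the budget values are copied from the table.

```python
import time
from contextlib import contextmanager

# Per-stage budgets in milliseconds, copied from the table above.
BUDGET_MS = {
    "detector": 5.0,
    "keypoints": 3.0,
    "pnp": 0.5,
    "samd": 0.6,
    "igpp": 0.1,
    "transition": 0.1,
    "controller": 0.5,
    "logging": 0.2,
}


class StageTimer:
    """Collects wall-clock time per named stage for one tick."""

    def __init__(self):
        self.elapsed_ms = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Recorded even if the stage raises, so a failing tick is still timed.
            self.elapsed_ms[name] = (time.perf_counter() - start) * 1e3

    def over_budget(self):
        """Stages that exceeded their budget; stages without a budget are ignored."""
        return {
            name: ms
            for name, ms in self.elapsed_ms.items()
            if ms > BUDGET_MS.get(name, float("inf"))
        }

    def total_ms(self):
        return sum(self.elapsed_ms.values())
```

Usage is `with timer.stage("pnp"): ...` around each step, then `timer.over_budget()` at the end of the tick. GPU kernel launches are asynchronous, so wall-clock numbers for an inference stage are only meaningful if the stage blocks until its result is ready (in PyTorch, `torch.cuda.synchronize()`).
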
Every tick writes one JSONL line (wrapped here for readability) with:

    {"t": 0.142, "frame_id": 14,
     "detections": [{"bbox": [0.32, 0.41, 0.18, 0.22], "conf": 0.91, "kp": [...]}],
     "gate_cam_m": [1.2, 0.0, 5.6],
     "telemetry": {"q": [...], "omega": [...], "accel": [...]},
     "cmd": {"throttle": 0.55, "roll": 0.02, "pitch": 0.01, "yaw": -0.04},
     "state": "APPROACH"}

Frames themselves are saved separately (PNG, lossless) on every tick. The JSONL is the ground truth for offline analysis and dataset growth. See playbook §03 for the data-pipeline story.
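
For offline analysis, a log in this shape can be written and read back with the standard library alone. A minimal sketch; `write_tick`, `read_ticks`, and `state_histogram` are hypothetical helper names, not part of the pipeline.

```python
import json
from typing import Dict, Iterable, Iterator, TextIO


def write_tick(fh: TextIO, record: dict) -> None:
    """Append one tick as a single compact JSON line."""
    fh.write(json.dumps(record, separators=(",", ":")) + "\n")


def read_ticks(fh: TextIO) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSONL stream."""
    for line in fh:
        line = line.strip()
        if line:
            yield json.loads(line)


def state_histogram(ticks: Iterable[dict]) -> Dict[str, int]:
    """Count ticks per state, e.g. to see how long a run spent in SEEK."""
    counts: Dict[str, int] = {}
    for tick in ticks:
        counts[tick["state"]] = counts.get(tick["state"], 0) + 1
    return counts
```

One record per line means a run that ends abruptly loses at most its last line, and the file can be streamed without loading the whole run into memory.
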