The detector has never seen a stone archway, an industrial fan, or a doorway — yet VQ2's 3D-scanned realistic environments are full of them. Unless we explicitly train against those distractors, the detector will produce false positives that send the drone chasing ghosts. mine_hardneg.py feeds the detector exactly the images it's going to fire incorrectly on, tags the high-confidence false positives as high-value training samples, and saves them with empty YOLO labels so the next training run learns "nope, NOT a gate."
| Subcommand | What it does |
|---|---|
harvest | Download ~N images for a keyword via Google + Bing image search (icrawler). Writes to harvested/<slug>/. |
mine | Run current detector on an input dir, save images as YOLO negatives (empty labels). Records FP triggers in false_positives.json. |
stats | Dataset counts + top FP sources + hardest negatives (highest-confidence wrong detections). |
```bash
# Download ~200 images (Google + Bing) for one visual concept
./aigp/Scripts/python.exe mine_hardneg.py harvest \
    --keyword "industrial scaffolding" --num 200
```
Output: harvested/industrial_scaffolding/ with 100-200 jpg/png files. Inspect the folder — if any images actually contain gates (rare but possible with generic queries), delete them manually before mining.
```bash
./aigp/Scripts/python.exe mine_hardneg.py mine \
    --input-dir harvested/industrial_scaffolding
```
The tool runs your current detector (models/apex_yolo11n.pt) at conf=0.25 and saves any image that triggered a detection. These are the "hard" cases — visually similar enough to gates that your detector fires incorrectly. They go to dataset_gates_hardneg_v2/ with empty labels.
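The selection logic behind mine can be sketched roughly as follows. This is a minimal sketch, not the tool's actual code: `collect_hard_negatives` and the `detections` mapping (image path → list of detection confidences on known gate-free images) are hypothetical, and the real internals may differ.

```python
import json
from pathlib import Path

CONF_THRESHOLD = 0.25  # matches the tool's default --conf

def collect_hard_negatives(detections, out_dir="dataset_gates_hardneg_v2"):
    """Keep only images where the detector fired; every hit is a false positive.

    `detections` is a hypothetical mapping of image path -> confidences
    the current detector produced on gate-free images.
    """
    out = Path(out_dir)
    (out / "labels").mkdir(parents=True, exist_ok=True)
    records = []
    for img, confs in detections.items():
        hits = [c for c in confs if c >= CONF_THRESHOLD]
        if not hits:
            continue  # detector stayed silent: easy negative, skip
        # An empty label file means "no objects here" in YOLO format
        (out / "labels" / f"{Path(img).stem}.txt").write_text("")
        records.append({"image": img, "max_conf": max(hits), "n_fp": len(hits)})
    (out / "false_positives.json").write_text(json.dumps(records, indent=2))
    return records
```

The key point is the empty label file: at training time YOLO treats any detection on such an image as a loss-bearing mistake.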
Each keyword adds diversity. Run harvest + mine for each class below (battle-tested to trigger gate FPs):
| Keyword | Why it confuses the detector |
|---|---|
stone archway | Rectangular opening, high contrast edges |
warehouse doorway | Frame-within-frame structure, often metal |
circular exhaust fan | Round opening with internal structure (VQ1 "highlighted" gates) |
picture frame wall | Pure rectangle prior → bbox false positives |
rectangular window frame | Same as above plus reflective glass artefacts |
tunnel entrance | Dark rectangular opening framed by light edges |
clock tower face | Circular high-contrast disk |
industrial scaffolding | Lots of rectangular framing elements |
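Running harvest + mine by hand for every keyword gets tedious; a small driver can batch it. This is a hypothetical sketch (the `run_all` helper and slug convention are assumptions, not part of mine_hardneg.py), and in practice you should still audit each harvested/ folder between harvest and mine:

```python
import subprocess

# The battle-tested distractor keywords from the table above
KEYWORDS = [
    "stone archway", "warehouse doorway", "circular exhaust fan",
    "picture frame wall", "rectangular window frame", "tunnel entrance",
    "clock tower face", "industrial scaffolding",
]

PY = "./aigp/Scripts/python.exe"

def run_all(keywords=KEYWORDS, dry_run=False):
    """Build (and optionally run) harvest + mine commands per keyword."""
    cmds = []
    for kw in keywords:
        slug = kw.replace(" ", "_")  # assumed to match the tool's slug scheme
        cmds.append([PY, "mine_hardneg.py", "harvest",
                     "--keyword", kw, "--num", "200"])
        cmds.append([PY, "mine_hardneg.py", "mine",
                     "--input-dir", f"harvested/{slug}"])
    if not dry_run:
        for cmd in cmds:
            subprocess.run(cmd, check=True)
    return cmds
```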
```bash
./aigp/Scripts/python.exe mine_hardneg.py stats
```
Shows the top-N highest-confidence false positives. If the detector is firing at conf ≥ 0.85 on something, that's exactly what you want in training — the hardest examples produce the biggest gradient signal.
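If you want to rank the hard cases yourself, the sort is simple. A hedged sketch, assuming each record in false_positives.json carries a max_conf field (the real schema may differ):

```python
import json
from pathlib import Path

def load_fp_records(path="dataset_gates_hardneg_v2/false_positives.json"):
    return json.loads(Path(path).read_text())

def hardest_negatives(records, top=10):
    """Highest-confidence false positives first: the biggest gradient signal."""
    return sorted(records, key=lambda r: r["max_conf"], reverse=True)[:top]
```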
```bash
./aigp/Scripts/python.exe train_apex.py detector \
    --dataset dataset_gates_hardneg_v2 --epochs 40
```
Start from your existing models/apex_yolo11n.pt (train_apex.py resumes from prior best by default) and fine-tune for 30-50 epochs. Anything more risks catastrophic forgetting on the positive data.
Better: merge positives + negatives into a combined training run. Edit dataset_gates_mega/data.yaml to include the hardneg dir, or use a mix config (pattern already in dataset_gates_domain/).
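A merged config could look like the sketch below. Ultralytics data.yaml accepts a list of train dirs, but the exact paths here (dataset_gates_hardneg_v2/images and the single gate class) are assumptions about this repo's layout, so adapt to what your dataset_gates_domain/ pattern actually uses:

```yaml
# hypothetical merged data.yaml: positives plus mined hard negatives
train:
  - dataset_gates_mega/images/train
  - dataset_gates_hardneg_v2/images   # empty-label images act as pure negatives
val: dataset_gates_mega/images/val
names:
  0: gate
```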
harvest flags:

| Flag | Default | What it does |
|---|---|---|
--keyword | required | Search term (quote multi-word phrases) |
--num | 200 | Total images across Google + Bing |
mine flags:

| Flag | Default | What it does |
|---|---|---|
--input-dir | required | Dir of non-gate images to process |
--model | models/apex_yolo11n.pt | Detector to use for FP discovery |
--conf | 0.25 | Conf threshold — anything above = false positive |
--imgsz | 640 | Detector input size |
--all-negatives | false | Include images with no detections (plain backgrounds) |
--max-images | 2000 | Cap total saved per run |
| Pitfall | Symptom | Fix |
|---|---|---|
| Query returns actual gates | e.g. "metal gate" → real racing gates in harvest | Audit the harvest dir; delete gate-containing images BEFORE running mine. Empty labels on real gates would hurt recall. |
| Too few FPs | fp_saved = 2 after 200 images | Your detector is already robust to that class. Either move on (good sign) or re-run mine with --conf 0.10 to surface lower-confidence FPs. |
| Thumbnail-sized harvest | Skipped count high | Tool rejects < 320 px min-dim. Try different keywords — some Google results are tiny. |
| Near-dup clustering | Same image saved in 5 colors | dHash dedup already handles most; if still happening, re-mine with a tighter conf or manually delete. |
| Catastrophic forgetting | After fine-tune, positive recall drops | Don't fine-tune on hardneg alone for many epochs. Mix with positives at 80:20 (positive:negative) ratio, or limit to 30 epochs. |
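The dHash dedup mentioned in the table works on perceptual bits, not raw pixels, which is why recolored near-duplicates usually collapse to the same hash. A minimal sketch of the idea (not the tool's actual implementation; real pipelines resize with PIL/OpenCV first, here `gray` is an already-resized 2D grayscale grid):

```python
def dhash_bits(gray):
    """Difference hash: one bit per pixel, set if it is brighter than
    its right-hand neighbour. Robust to global color/brightness shifts."""
    bits = 0
    for row in gray:
        for x in range(len(row) - 1):
            bits = (bits << 1) | (1 if row[x] > row[x + 1] else 0)
    return bits

def hamming(a, b):
    """Number of differing bits; small distance = near-duplicate."""
    return bin(a ^ b).count("1")
```

Two harvested images would then be treated as duplicates when their hash distance falls under a small threshold (commonly single digits for 64-bit hashes).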