
Drone Object Detection for Real-Time Crowd and Traffic Surveillance
Participants
10
End Date
31.01.27
Dataset
docbeb26
Resources
2 CPU (8.59 GB) | 1 GPU (22.49 GB)
Overview

Tracebloc is a tool for benchmarking AI models on private data. This Playbook breaks down how a team used tracebloc to benchmark AI models on their drone footage and discovered which model truly delivered the best results. Find out more on our website or schedule a call with the founder directly.
Every inaccurate decision costs money and safety, but not every model holds up under stress conditions. Using tracebloc, a drone analytics company uncovered which UAV object detection model truly performs under pressure, saving over €3 million a year.
Following several large-scale public events and natural disasters, city authorities are turning to drone crowd monitoring for better eyes in the sky. Juliane Weber, Head of Operations at a UAV image analysis company, is developing an AI-powered drone computer vision platform for drone traffic monitoring and drone crowd monitoring. The goal: detect people, vehicles, and emergency units instantly, even in smoke, low light, or dense crowds.
With a dataset of 7 000 aerial images and 350 000 annotations, her team decided to benchmark external UAV deep learning vendors to find which model actually performed best.
Each vendor submitted drone object detection models, optimized for deployment on embedded drone hardware (e.g. NVIDIA Jetson Orin NX).
Vendors were asked to state overall object detection performance as well as per-class F1 scores for rare object classes (e.g. wheelchair user, police car, fire truck). Robustness under occlusion and crowd density was emphasized, as was edge inference latency under 20 ms.
| VENDOR AND MODEL TYPE | CLAIMED OVERALL RECALL AT 90% PRECISION (IoU ≥ 0,50) | RARE CLASS F1 (avg over 4 rarest object classes) | INFERENCE LATENCY |
| --- | --- | --- | --- |
| A - YOLOv9 | 93,5% | 78,2% | 16 ms |
| B - RT-DETR | 95,1% | 81,9% | 18 ms |
| C - YOLOv8 | 91,4% | 72,5% | 12 ms |
| D - DINOv2 | 94,2% | 79,4% | 19 ms |
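The headline metric in the table, recall at 90% precision with IoU ≥ 0,50, can be illustrated with a minimal Python sketch. It assumes detections have already been matched to ground-truth boxes (a detection counts as a true positive when its IoU with an unmatched ground-truth box is at least 0.5) and ignores ties between confidence scores. The function names `iou` and `recall_at_precision` are illustrative and not part of tracebloc.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_precision(
    scores: Sequence[float],
    is_true_positive: Sequence[bool],
    n_ground_truth: int,
    min_precision: float = 0.90,
) -> float:
    """Highest recall over all confidence cut-offs whose precision is >= min_precision.

    Assumes n_ground_truth > 0 and that is_true_positive[i] already reflects
    the IoU >= 0.5 matching for detection i.
    """
    # Walk through detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives = 0
    best_recall = 0.0
    for rank, idx in enumerate(order, start=1):
        if is_true_positive[idx]:
            true_positives += 1
        precision = true_positives / rank
        if precision >= min_precision:
            best_recall = max(best_recall, true_positives / n_ground_truth)
    return best_recall
```

Fixing precision at 90% and reporting the recall reached at that point makes the vendors' numbers comparable, because each model is evaluated at the same false-alarm level rather than at its own preferred confidence threshold.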
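While all vendors claimed high recall on common object classes (e.g. car, pedestrian, person), Juliane’s team focused their assessment on the criteria emphasized above: rare class F1, robustness under occlusion and crowd density, and edge inference latency.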
Using tracebloc, she set up an evaluation environment on isolated edge AI hardware. Vendors never saw the raw data yet could run their object detection models directly on real aerial footage from the company’s test set of 2 000 annotated images. In the next phase, they fine-tuned their models on an additional 5 000 training images and re-evaluated performance.
After fine-tuning, performance varied significantly, especially on rare object classes. Recall was measured at 90% precision with IoU ≥ 0,50:
| VENDOR | CLAIMED RECALL | BASELINE RECALL | RECALL AFTER FINE-TUNING | RARE CLASS F1 (Post-Tuning) |
| --- | --- | --- | --- | --- |
| A | 93,5% | 88,1% | 91,3% | 75,4% |
| B ✅ | 95,1% | 89,7% | 94,5% | 80,6% |
| C | 91,4% | 86,2% | 89,0% | 71,3% |
| D | 94,2% | 88,9% | 95,1% | 60,5% |
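The rare class F1 column is described as an average over the four rarest object classes. A minimal sketch of that calculation follows, assuming the average is an unweighted (macro) mean of per-class F1 scores. The class name `ambulance` and all counts are hypothetical placeholders chosen for illustration. They are not taken from the case study.

```python
from typing import Dict, Iterable, Tuple

# Per-class counts as (true positives, false positives, false negatives).
Counts = Tuple[int, int, int]


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the class never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts: Dict[str, Counts], classes: Iterable[str]) -> float:
    """Unweighted mean of per-class F1 over the selected classes."""
    selected = list(classes)
    return sum(f1_score(*counts[c]) for c in selected) / len(selected)


# Hypothetical counts for illustration only.
example_counts: Dict[str, Counts] = {
    "wheelchair user": (40, 10, 10),
    "police car": (30, 5, 15),
    "fire truck": (45, 5, 5),
    "ambulance": (20, 10, 10),  # hypothetical fourth rare class
}

rare_f1 = macro_f1(example_counts, example_counts.keys())
print(f"Rare class F1 (macro average): {rare_f1:.1%}")
```

A macro average weights each rare class equally, so a model cannot hide a weak class behind a more frequent one.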
Vendor B's RT-DETR transformer model delivered the most balanced performance across common and rare object classes, with the second-highest overall recall after fine-tuning (94,5%) and a rare class F1 above 80%. The others struggled to close the gap on infrequent objects. Vendor D’s DINOv2 model reached the highest overall recall after fine-tuning (95,1%), but at the expense of rare object classes (F1 of 60,5%), and was therefore not considered further. All vendors met the latency requirement.
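One way to express this selection logic in code is sketched below. The recall and F1 figures come from the results table and the latencies are the vendors' claimed values. The rare class F1 floor of 70% is a hypothetical threshold introduced only for illustration, since the case study does not state a cut-off, and `shortlist` is an illustrative helper rather than a tracebloc function.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VendorResult:
    name: str
    recall_after_ft: float  # recall at 90% precision after fine-tuning, in percent
    rare_f1: float          # rare class F1 after fine-tuning, in percent
    latency_ms: float       # claimed inference latency in milliseconds


RESULTS = [
    VendorResult("A", 91.3, 75.4, 16),
    VendorResult("B", 94.5, 80.6, 18),
    VendorResult("C", 89.0, 71.3, 12),
    VendorResult("D", 95.1, 60.5, 19),
]


def shortlist(
    results: List[VendorResult],
    min_rare_f1: float = 70.0,     # hypothetical floor, not stated in the case study
    max_latency_ms: float = 20.0,  # latency requirement: under 20 ms
) -> List[VendorResult]:
    """Drop vendors that fail the rare-class or latency gate, then rank by recall."""
    eligible = [
        r for r in results
        if r.rare_f1 >= min_rare_f1 and r.latency_ms < max_latency_ms
    ]
    return sorted(eligible, key=lambda r: r.recall_after_ft, reverse=True)


for rank, vendor in enumerate(shortlist(RESULTS), start=1):
    print(rank, vendor.name, vendor.recall_after_ft, vendor.rare_f1)
```

Treating rare class F1 as a gate rather than as one term in a weighted score keeps a model with the top overall recall from winning while missing the objects that matter most in an emergency.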
Every percentage point of improved detection reduces chaos on the ground. Drone reconnaissance helps command units respond faster and avoid costly mistakes.
Estimated annual cost of misallocations based on overall recall at 90% precision and IoU ≥ 0,50:
| VENDOR | RECALL AFTER FINE-TUNING | MISALLOCATION RATE | MISALLOCATIONS / YEAR | ESTIMATED COST |
| --- | --- | --- | --- | --- |
| A | 91,3% | 8,7% | 8.700 | €8,7m |
| B ✅ | 94,5% | 5,5% | 5.500 | €5,5m |
| C | 89,0% | 11,0% | 11.000 | €11,0m |
Vendor B offers the best trade-off between high recall (94,5%) and strong rare object detection (F1 of 80,6%). The saving potential is €3,2m p.a. compared to the next-best model (Vendor A: €8,7m vs. €5,5m), highlighting the importance of strong model performance for drone reconnaissance.
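The arithmetic behind the cost table can be sketched in a few lines of Python. The case study does not state the underlying volumes, so the two constants below are assumptions inferred from the table itself: a misallocation rate of 8,7% corresponding to 8.700 misallocations implies roughly 100.000 relevant events per year, and 8.700 misallocations costing €8,7m implies about €1.000 each.

```python
# Assumptions inferred from the cost table, not stated in the case study.
EVENTS_PER_YEAR = 100_000
COST_PER_MISALLOCATION_EUR = 1_000

# Recall at 90% precision after fine-tuning, in percent (from the results table).
RECALL_AFTER_FT = {"A": 91.3, "B": 94.5, "C": 89.0}


def annual_misallocation_cost(recall_pct: float) -> float:
    """Estimated yearly cost when every missed detection causes one misallocation."""
    misallocation_rate = (100.0 - recall_pct) / 100.0
    misallocations = misallocation_rate * EVENTS_PER_YEAR
    return misallocations * COST_PER_MISALLOCATION_EUR


costs = {vendor: annual_misallocation_cost(r) for vendor, r in RECALL_AFTER_FT.items()}
for vendor, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"Vendor {vendor}: EUR {cost / 1e6:.1f}m per year")

saving_vs_next_best = costs["A"] - costs["B"]
print(f"Saving of B over A: EUR {saving_vs_next_best / 1e6:.1f}m per year")
```

Under these assumptions the misallocation rate is simply 100% minus recall, so each percentage point of recall is worth about €1m per year in this simplified model.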
Disclaimer:
The persona, figures, performance metrics, and financial assumptions in this case study are fictional and simplified to reflect realistic industry logic. This case is designed to illustrate AI benchmarking and does not reflect actual vendor performance or contractual outcomes.
The RT-DETR model delivers the best balance of latency, precision, and cost by combining transformer accuracy with real-time efficiency optimized for embedded drone hardware.
What is tracebloc?
tracebloc is a tool for benchmarking third party AI models on your own proprietary data. Find out more on the website or schedule a call with us directly. Click "Join use case" if you would like to try it yourself and explore the docs for technical details.