
Approach and landing account for a disproportionate share of aviation accidents — the phase where altitude margin is gone, airspeed is flown to tight tolerances, and the pilot's ability to correctly identify the runway determines what happens next. Computer vision has the potential to extend what the flight deck camera sees and classify it reliably: runway or not runway, in conditions where human perception degrades. This Airbus-backed competition challenges participants to build a runway detection deep learning model on real aerial approach imagery — with code running on the data inside the tracebloc workspace, and the data never leaving the infrastructure.
Approach and landing account for approximately 50% of fatal aviation accidents despite representing a small fraction of total flight time. During the final approach phase, pilots must continuously verify the runway environment through varying light, weather, haze, and approach geometries — often under high cognitive load at low altitude. A vision-based system that reliably classifies a runway in the camera feed could augment pilot situational awareness and reduce the decision latency that contributes to approach-and-landing incidents.
Building that system is not straightforward. Runway detection deep learning models must generalise across a wide range of conditions: daytime and dusk approaches, haze and crosswind angle, contaminated surfaces, and variable approach slopes. The classification boundary — runway vs. non-runway — sounds binary, but the visual ambiguity in degraded conditions is precisely where the model has to hold. A false negative, where the system fails to identify a runway that is present, is the more costly error class: it either provides no support at the moment the pilot needs it, or — in an autonomous context — leads to a missed landing opportunity with limited go-around margin.
The technical challenge is therefore not just accuracy on a clean test set. It is recall under distribution shift: variable illumination, approach angle variation, and real-world atmospheric conditions that no laboratory benchmark fully captures.
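The asymmetric cost of false negatives is commonly addressed at training time by up-weighting the runway-present class in the loss. A minimal, framework-free sketch of that idea — the function name and the `fn_weight` value are illustrative assumptions, not part of the competition specification:

```python
import math

def weighted_bce(y_true, p_pred, fn_weight=5.0, eps=1e-7):
    """Binary cross-entropy with a heavier penalty on missed runways.

    y_true holds 0/1 labels (1 = runway present); p_pred holds predicted
    probabilities. fn_weight scales the loss on runway-present frames, so
    a confident "no runway" prediction on a true runway costs more than
    the symmetric false alarm. The weight of 5.0 is illustrative.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)        # clamp for numerical safety
        if y == 1:
            total += -fn_weight * math.log(p)  # penalise missed runways
        else:
            total += -math.log(1.0 - p)        # standard cost for false alarms
    return total / len(y_true)
```

Most deep learning frameworks expose the same lever directly (for example, a positive-class weight on their binary cross-entropy loss), which pushes the optimiser toward recall on the runway-present class at the expense of some false positives.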
Airbus sponsors this challenge to accelerate the development of reliable runway classification models for aviation safety applications. The sponsor's interest spans both pilot decision support AI for commercial operations and autonomous flight computer vision for future aircraft programmes. Airbus contributes domain expertise, curated imagery from the approach and landing phase, and the operational safety case that frames the evaluation criteria.
Participants submit code — not model weights and not predictions — to the tracebloc workspace. The code runs on the runway imagery dataset inside the infrastructure. Training, inference, and evaluation happen within the workspace environment; the imagery is never transferred to participant systems. Scoring is based on classification accuracy on the held-out test set, with false negative rate tracked separately given its asymmetric cost in aviation safety contexts.
The competition is open to individual researchers, academic teams, and industry practitioners working in aerospace ML, computer vision for safety-critical systems, or autonomous flight. Results are published to the live leaderboard as submissions are evaluated.
The dataset consists of aerial imagery captured during the aircraft approach and landing phase, annotated for runway classification. Images represent a range of real-world approach conditions: varied time of day, atmospheric visibility, runway surface states, and approach angles across multiple airport environments. The annotation task is binary at the image level — runway present or absent — with bounding region annotations available for localisation tasks where the model architecture supports them.
The dataset is structured to reflect the conditions where runway identification is most operationally relevant: approaches where visual cues are partial, occluded, or degraded. Clean, high-visibility frames with large runway footprints are included, but so are the edge cases — low sun angle, light haze, oblique approach geometry — that correspond to the scenarios where pilot decision support AI would provide the most value.
Because code runs on the data inside the tracebloc workspace, participants interact with the dataset through the training environment rather than downloading it. This is consistent with how vision-based landing assistance systems are deployed operationally: the model must perform in the environment where the data lives, not on a local copy. Full dataset statistics, class distributions, image condition breakdowns, and sample visualisations are available in the Exploratory Data Analysis tab.
The primary evaluation metric is classification accuracy on the held-out test set — the proportion of approach-phase frames where the model correctly identifies whether a runway is present. Accuracy is reported as the headline leaderboard metric, but the evaluation tracks false negative rate and false positive rate separately.
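The three reported quantities follow directly from the confusion matrix. A minimal sketch of the computation, assuming 0/1 labels with 1 meaning runway present — the function name and return structure are illustrative, not the competition's scoring code:

```python
def classification_report(y_true, y_pred):
    """Accuracy plus the separately tracked error rates.

    FNR is missed runways over all frames where a runway is present;
    FPR is false alarms over all frames where no runway is present.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "fnr": fn / (tp + fn) if (tp + fn) else 0.0,  # missed runways
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,  # false alarms
    }
```

Tracking FNR and FPR separately matters because overall accuracy can mask a model that trades the two asymmetrically — exactly the failure mode the evaluation is designed to expose.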
False negatives — frames where a runway is present but the model fails to classify it as such — are weighted more heavily in the qualitative assessment than false positives. In the pilot decision support context, a missed runway identification provides no augmentation to the pilot at the moment it is needed. In an autonomous flight context, a false negative during final approach is a safety-relevant failure. A model that achieves strong overall accuracy by suppressing false positives while tolerating elevated false negative rates is not a viable approach for this application, and the evaluation commentary reflects that.
A strong result means high recall on the runway-present class across the full range of approach conditions in the test set — including the degraded-visibility and oblique-angle frames that represent the operationally hard cases.
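One way to operationalise that recall requirement at inference time is to pick the decision threshold on held-out validation data under an explicit recall floor, rather than defaulting to 0.5. A minimal sketch; the threshold-scan approach and the `min_recall` value are illustrative assumptions:

```python
def pick_threshold(y_true, scores, min_recall=0.98):
    """Highest threshold that still meets a recall floor on the
    runway-present class.

    Scanning candidate thresholds from high to low and returning the
    first one that satisfies the constraint keeps false positives as
    low as the recall target allows.
    """
    candidates = sorted(set(scores), reverse=True)
    positives = sum(y_true)
    for t in candidates:
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        if positives and tp / positives >= min_recall:
            return t
    return min(candidates)  # fall back to accepting every frame
```

Calibrating the threshold this way makes the FN/FP trade-off an explicit design choice instead of an accident of the training run — which is the posture the evaluation commentary rewards.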
Participants submit code to the tracebloc workspace via the training and fine-tuning tab. The submitted code runs inside the workspace environment on the runway imagery dataset. Training, evaluation, and scoring are executed within the infrastructure; no imagery is transferred outside it. This mirrors how a federated learning application operates: computation moves to where the data is, rather than the data moving to where the computation is.
After evaluation completes, results are published automatically to the leaderboard. Participants can resubmit updated code to improve their ranking as the competition runs. The submission interface, environment specifications, and framework compatibility requirements are documented in the training tab.
The leaderboard is live and updates with each evaluated submission. Current rankings, accuracy scores, false negative rates, and per-condition breakdowns are available in the leaderboard tab. Rankings reflect performance on the held-out test set; no participant has access to test labels during submission.
The aerial runway imagery dataset used in this competition is provided for research and competition purposes. Airbus sponsorship of this challenge reflects the sponsor's interest in advancing runway classification and pilot decision support AI research, and does not constitute endorsement of any specific participant submission or model for operational deployment in certified aircraft systems. Approach and landing scenarios represented in the dataset are illustrative of real-world conditions; they do not correspond to specific incidents, airports, or operational events. Model performance on this dataset should not be interpreted as certification-relevant evidence of airworthiness or operational readiness under applicable aviation authority regulations.