FOSS4G 2022 general tracks

HIECTOR: Hierarchical object detector for cost-efficient detection at scale
2022-08-24, 14:45–15:15 (Europe/Rome), Auditorium

Object detection, classification and semantic segmentation are ubiquitous and fundamental tasks in extracting, interpreting and understanding the information captured in satellite imagery. Applications that locate and classify man-made objects, such as buildings, roads, aeroplanes and cars, typically require Very High Resolution (VHR) imagery, with spatial resolution ranging approximately from 0.3 m to 5 m. However, such VHR imagery is generally proprietary and commercially available only at a high cost. This hinders its uptake by the wider community, in particular when analysis at large scale is desired. HIECTOR (HIErarchical deteCTOR) tackles the problem of efficiently scaling object detection in satellite imagery to large areas by exploiting the sparsity of such objects over the area of interest (AOI). This talk presents a hierarchical method for the detection of man-made objects, using multiple satellite image sources with different Ground Sample Distances (GSD). Detection is carried out in a hierarchical fashion, starting at the lowest resolution and proceeding to the highest. Detections at each stage of the pyramid are used to request imagery and run detection at the next higher resolution, thereby reducing the amount of data that has to be acquired and processed. We evaluate HIECTOR on the task of building detection for a Middle Eastern country, estimating oriented bounding boxes around each object of interest.
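The coarse-to-fine idea described above can be illustrated with a toy sketch. It is not the HIECTOR implementation: the square tile grid, the `Tile` class, the `toy_detect` callback and the subdivision factor are all hypothetical stand-ins, whereas the talk describes requesting imagery from different sensors over polygons derived from the detections. The sketch only shows how descending into positive regions limits the amount of fine-resolution data that is touched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Hypothetical square tile: pyramid level (0 = coarsest) and grid position."""
    level: int
    row: int
    col: int


def children(tile, factor=2):
    """Tiles at the next finer level that cover the same ground as `tile`."""
    return [
        Tile(tile.level + 1, tile.row * factor + dr, tile.col * factor + dc)
        for dr in range(factor)
        for dc in range(factor)
    ]


def hierarchical_detect(root_tiles, detect, n_levels, factor=2):
    """Run `detect` coarse-to-fine, descending only into tiles with detections.

    Returns the number of tiles processed at each level and the positive
    tiles found at the finest level.
    """
    processed = []
    current = list(root_tiles)
    positives = []
    for level in range(n_levels):
        processed.append(len(current))
        positives = [t for t in current if detect(t)]
        if level < n_levels - 1:
            current = [c for t in positives for c in children(t, factor)]
    return processed, positives


# Toy scene (made-up data): three buildings on a 16 x 16 finest-level grid.
N_LEVELS = 3
FACTOR = 2
BUILDINGS = {(0, 0), (0, 1), (13, 7)}  # (row, col) at the finest level


def toy_detect(tile):
    """Stand-in for a detector: True if any building falls inside the tile."""
    span = FACTOR ** (N_LEVELS - 1 - tile.level)  # finest cells per tile side
    return any(r // span == tile.row and c // span == tile.col for r, c in BUILDINGS)


roots = [Tile(0, r, c) for r in range(4) for c in range(4)]
processed, found = hierarchical_detect(roots, toy_detect, N_LEVELS, FACTOR)
finest_total = len(roots) * FACTOR ** (2 * (N_LEVELS - 1))
print("tiles processed per level:", processed)
print("finest-level tiles if processed exhaustively:", finest_total)
```

In this toy scene only 8 of the 256 finest-level tiles are processed, because the three buildings occupy a small part of the grid. The saving shrinks as objects become denser, which is why the approach relies on sparsity over the AOI.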

For the detection of buildings, HIECTOR is demonstrated using the following data sources: Sentinel-2 imagery with 10 m GSD, Airbus SPOT imagery pan-sharpened to 1.5 m pixel size and Airbus Pleiades imagery pan-sharpened to 0.5 m pixel size. Sentinel-2 imagery is openly available, making its use very cost-efficient. The Single-Stage Rotation-Decoupled Detector (SSRDD) algorithm is used for detection. Given that single buildings are not discernible at 10 m GSD, a bounding box at this level does not describe a single building but rather a cluster of buildings. The bounding boxes estimated at 10 m are joined, and the resulting polygon area is used to request SPOT imagery at the pan-sharpened pixel size of 1.5 m. With SPOT imagery, given the higher spatial resolution, one bounding box is estimated for each building. As a final step, predictions are improved in areas with low confidence by requesting Airbus Pleiades imagery at the pan-sharpened 0.5 m pixel size. Ablation studies show that HIECTOR achieves a mean Average Precision (mAP) score of 0.383 with a 20-fold reduction in costs compared to using only VHR imagery at the highest resolution, which achieves a mAP of 0.452.
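The step of joining the 10 m bounding boxes into a request area can be sketched as follows. This is a simplified illustration under stated assumptions, not the method used in HIECTOR: it treats boxes as axis-aligned rectangles `(xmin, ymin, xmax, ymax)` in metres, whereas the talk describes oriented bounding boxes, and the box coordinates and AOI size are made up. The function `union_area` is a hypothetical helper that computes the area covered by at least one box, which gives the share of the AOI that would need higher-resolution imagery.

```python
def union_area(boxes):
    """Area covered by at least one axis-aligned box (xmin, ymin, xmax, ymax)."""
    # Coordinate compression: every box edge becomes a grid line.
    xs = sorted({x for b in boxes for x in (b[0], b[2])})
    ys = sorted({y for b in boxes for y in (b[1], b[3])})
    area = 0.0
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            # Each compressed cell lies fully inside or fully outside every box.
            covered = any(
                b[0] <= x0 and x1 <= b[2] and b[1] <= y0 and y1 <= b[3]
                for b in boxes
            )
            if covered:
                area += (x1 - x0) * (y1 - y0)
    return area


# Made-up cluster boxes (metres) inside a hypothetical 2 km x 2 km AOI.
cluster_boxes = [
    (0, 0, 200, 100),
    (150, 50, 300, 200),       # overlaps the first box
    (1000, 1000, 1100, 1100),  # isolated cluster
]
aoi_area = 2000 * 2000

requested = union_area(cluster_boxes)
print("area to request at the next level (m^2):", requested)
print("fraction of the AOI:", requested / aoi_area)
```

For rotated boxes, a geometry library that supports polygon unions would be the more natural choice. The axis-aligned version is used here only to keep the sketch dependency-free.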

The code will be released under the MIT license. We will also release the models trained on Sentinel-2, SPOT and Pleiades imagery. In addition, manually labelled building footprints over Dakar will be open-sourced to allow users to evaluate how well the models generalise to different geographical areas. HIECTOR uses the Sentinel Hub service to request the commercial imagery for the polygons determined at each level of the pyramid, which makes it possible to request, access and process only specific sub-parts of the AOI.

Data Scientist at Sinergise developing open-source tools and applications leveraging satellite imagery.