This paper addresses the problem of multi-object tracking in Unmanned Aerial Vehicle (UAV) footage. It plays a critical role in various UAV applications, including traffic monitoring systems and real-time suspect tracking by the police. However, this task is highly challenging due to the fast and unpredictable motion of UAVs, as well as the small size of target objects in the videos caused by the high-altitude and wide-angle views of drones. In this study, we introduce a novel method to overcome these challenges. Our approach involves a new tracking strategy, which initiates the tracking of target objects from low-confidence detections commonly encountered in UAV application scenarios. Additionally, we propose revisiting traditional appearance-based matching algorithms to improve the association of low-confidence detections. To evaluate the effectiveness of our method, we conducted benchmark evaluations on two UAV-specific datasets (VisDrone2019, UAVDT) and a general dataset (MOT17). The results demonstrate that our approach surpasses current state-of-the-art methodologies, highlighting its robustness and adaptability in diverse tracking environments. Furthermore, we have improved the annotation of the UAVDT dataset by rectifying several errors and addressing omissions found in the original annotations. We will provide this refined version of the dataset to facilitate better benchmarking in the field.
To address the difficulties in UAV footage, we propose a new approach that starts tracklets from low-confidence detections, which is especially useful in UAV scenarios. To handle the resulting large number of low-confidence detections effectively, we combine traditional appearance-matching algorithms such as color histograms and scaled-image mean squared error (MSE). These traditional algorithms tend to be more reliable than deep learning methods, especially in challenging conditions such as UAV footage. Moreover, we apply UAV motion compensation to account for the rapid and unpredictable motion of the UAV.
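The two traditional appearance cues mentioned above can be sketched as follows. This is an illustrative, numpy-only sketch, not the paper's exact implementation: bin counts, patch sizes, and the MSE-to-similarity mapping are our assumptions, and the nearest-neighbor `_resize` helper is hypothetical (a real system might use a library resize instead).

```python
import numpy as np

def _resize(img, size):
    """Hypothetical helper: nearest-neighbor resize of an HxWxC array."""
    h, w = img.shape[:2]
    rows = np.linspace(0, h - 1, size[0]).astype(int)
    cols = np.linspace(0, w - 1, size[1]).astype(int)
    return img[np.ix_(rows, cols)]

def color_hist_similarity(patch_a, patch_b, bins=16):
    """Histogram-intersection similarity between two RGB patches (HxWx3 uint8).

    Returns a value in [0, 1]; 1.0 means identical per-channel color histograms.
    """
    sims = []
    for c in range(3):
        ha, _ = np.histogram(patch_a[..., c], bins=bins, range=(0, 256))
        hb, _ = np.histogram(patch_b[..., c], bins=bins, range=(0, 256))
        ha = ha / max(ha.sum(), 1)  # normalize so patch size does not matter
        hb = hb / max(hb.sum(), 1)
        sims.append(np.minimum(ha, hb).sum())  # histogram intersection
    return float(np.mean(sims))

def scaled_mse_similarity(patch_a, patch_b, size=(8, 8)):
    """MSE-based similarity after resizing both patches to a common size."""
    a = _resize(patch_a, size).astype(np.float64)
    b = _resize(patch_b, size).astype(np.float64)
    mse = np.mean((a - b) ** 2)
    return 1.0 / (1.0 + mse)  # map MSE in [0, inf) to a similarity in (0, 1]
```

In a tracker, scores like these would be combined with motion cues to build the association cost between low-confidence detections and existing tracklets.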
Our color histogram similarity and our scaled image similarity reached 94.4% and 70.3%, respectively, significantly higher than the DeepSORT Re-ID similarity of 64.1%.
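The UAV motion compensation mentioned earlier can be illustrated with a small sketch: given a frame-to-frame 2x3 affine camera-motion matrix (in practice estimated from feature matches, e.g. with OpenCV's `estimateAffinePartial2D`), each predicted track box is warped into the new frame's coordinates before association. The function name and box format here are our assumptions, not the paper's API.

```python
import numpy as np

def compensate_boxes(boxes, affine):
    """Warp track boxes [x1, y1, x2, y2] by a 2x3 affine camera-motion matrix.

    `affine` maps old-frame pixel coordinates to new-frame coordinates,
    compensating for the UAV's ego-motion between consecutive frames.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    A, t = affine[:, :2], affine[:, 2]
    p1 = boxes[:, :2] @ A.T + t  # warp top-left corner
    p2 = boxes[:, 2:] @ A.T + t  # warp bottom-right corner
    lo = np.minimum(p1, p2)      # re-order corners in case of rotation/flip
    hi = np.maximum(p1, p2)
    return np.hstack([lo, hi])
```

For example, a pure-translation matrix `[[1, 0, 5], [0, 1, -3]]` shifts every box 5 px right and 3 px up, so a stationary object stays aligned with its tracklet despite the camera moving.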
Upon meticulous examination of the UAVDT dataset annotations, we discovered several errors and omissions. Correcting these produced our "Refined UAVDT" dataset, which adds 43,981 annotations, increasing the total object count from 340,906 to 384,887. Moreover, 55 additional tracks have been incorporated into this refined dataset. We believe that this improved dataset will serve as a more reliable foundation for future research in the field. The original annotations are shown on the left, and the refined annotations are shown on the right.
The original annotation (left) contains several errors that are unrelated to the actual objects.
The refined annotation (right) complements the omissions by adding appropriate annotations.
The track results of ByteTrack (left) and SFTrack (right).
This video demonstrates the effective handling of rapid UAV motion by our SFTrack algorithm.
The substantial decrease in ID switches (IDs), coupled with an increase in IDF1, clearly substantiates our claim.
Method | MOTA ↑ | IDF1 ↑ | IDs ↓ | MT ↑ | ML ↓ |
---|---|---|---|---|---|
ByteTrack (ECCV 2022) | 47.9 | 45.1 | 220 | 46 | 41 |
SFTrack (Ours) | 57.9 | 70.9 | 40 | 56 | 33 |
Our SFTrack outperforms ByteTrack in tracking small-scale objects that are far from the UAV. This is evident from the doubled number of objects that are Mostly Tracked (MT) and the nearly halved number of objects that are Mostly Lost (ML).
Method | MOTA ↑ | IDF1 ↑ | IDs ↓ | MT ↑ | ML ↓ |
---|---|---|---|---|---|
ByteTrack (ECCV 2022) | 43.6 | 46.9 | 143 | 16 | 37 |
SFTrack (Ours) | 54.5 | 57.6 | 95 | 38 | 20 |
The track results of our SFTrack.
These videos show that our approach performs well in challenging environments, such as low-light or hazy conditions.
Our SFTrack tracks many objects accurately.
@InProceedings{Song_2024_IROS,
author = {Song, Inpyo and Lee, Jangwon},
title = {SFTrack: A Robust Scale and Motion Adaptive Algorithm for Tracking Small and Fast Moving Objects},
booktitle = {Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
month = {October},
year = {2024}
}