Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

10-2024

Abstract

Drone-based crowd tracking faces difficulties in accurately identifying and monitoring objects from an aerial perspective, largely due to their small size and close proximity to each other, which complicates both localization and tracking. To address these challenges, we present the Density-aware Tracking (DenseTrack) framework. DenseTrack capitalizes on crowd counting to precisely determine object locations, blending visual and motion cues to improve the tracking of small-scale objects. It specifically addresses the problem of cross-frame motion to enhance tracking accuracy and dependability. DenseTrack employs crowd density estimates as anchors for exact object localization within video frames. These estimates are merged with motion and position information from the tracking network, with motion offsets serving as key tracking cues. Moreover, DenseTrack enhances the ability to distinguish small-scale objects using insights from a visual-language model, integrating appearance with motion cues. The framework utilizes the Hungarian algorithm to ensure the accurate matching of individuals across frames. Demonstrated on the DroneCrowd dataset, our approach exhibits superior performance, confirming its effectiveness in drone-captured scenarios.
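
The abstract describes cross-frame association at a high level: density-map peaks anchor detections, motion offsets and appearance embeddings form the matching cues, and the Hungarian algorithm resolves the assignment. The sketch below is a minimal illustration of that idea, not the authors' implementation; the function name, the distance scale, and the equal weighting of the two cues are assumptions for illustration only.

```python
# Illustrative sketch (not the paper's code): matching tracked heads to new
# density-map detections with a cost that blends motion and appearance cues,
# solved with the Hungarian algorithm as the abstract describes.
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_tracks(track_pos, track_offsets, track_feats, det_pos, det_feats,
                 motion_weight=0.5, max_cost=1.0):
    """Assign existing tracks to detections in the next frame.

    track_pos:     (N, 2) last known positions of tracked heads
    track_offsets: (N, 2) per-track motion offsets predicted by the tracker
    track_feats:   (N, D) L2-normalized appearance embeddings of the tracks
    det_pos:       (M, 2) detection locations (e.g., density-map peaks)
    det_feats:     (M, D) L2-normalized appearance embeddings of detections
    """
    # Motion cost: distance between motion-predicted positions and detections,
    # scaled to a rough [0, 1] range (the 50-pixel scale is an assumption).
    predicted = track_pos + track_offsets                       # (N, 2)
    dist = np.linalg.norm(predicted[:, None, :] - det_pos[None, :, :], axis=-1)
    motion_cost = np.clip(dist / 50.0, 0.0, 1.0)

    # Appearance cost: 1 - cosine similarity of the normalized embeddings.
    appearance_cost = 1.0 - track_feats @ det_feats.T           # (N, M)

    # Blend the two cues into a single cost matrix.
    cost = motion_weight * motion_cost + (1.0 - motion_weight) * appearance_cost

    # Hungarian algorithm: globally optimal one-to-one assignment.
    rows, cols = linear_sum_assignment(cost)

    # Keep only sufficiently cheap matches; the rest remain unmatched.
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < max_cost]
```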

Keywords

Multi-object Tracking, Crowd Localization, Vision-language Pre-training, Motion-appearance Fusion

Discipline

Computer Sciences | Graphics and Human Computer Interfaces

Research Areas

Data Science and Engineering; Software and Cyber-Physical Systems

Publication

Proceedings of the ACM International Conference on Multimedia (ACM MM 2024): Melbourne, Australia, October 28 - November 1

First Page

2050

Last Page

2058

Identifier

10.1145/3664647.3680617

Publisher

Association for Computing Machinery

City or Country

Melbourne, Australia
