Publication Type
Journal Article
Version
acceptedVersion
Publication Date
2-2007
Abstract
Due to the prevalence of digital video camcorders, home videos have become an important part of the life-logs of personal experiences. To enable efficient video parsing, a critical step is to automatically extract the objects, events, and scene characteristics present in videos. This paper addresses the problem of extracting objects from home videos. Automatic detection of objects is a classical yet difficult vision problem, particularly for videos with complex scenes and unrestricted domains. Compared with edited and surveillance videos, home videos captured in uncontrolled environments usually exhibit several notable features, such as shaking artifacts, irregular motions, and arbitrary settings. These characteristics have prohibited the effective parsing of semantic video content using conventional vision analysis. In this paper, we propose a new approach to automatically locate multiple objects in home videos by taking into account both how and when to initialize objects. Previous approaches mostly consider the problem of how but not when, due to efficiency or real-time requirements. In home-video indexing, online processing is optional. Considering when alleviates some difficult problems and, most importantly, opens up the possibility of parsing semantic video objects. In our proposed approach, the how part is formulated as an object detection and association problem, while the when part is a saliency measurement that determines the best few locations at which to start multiple object initialization.
Discipline
Computer Sciences | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
IEEE Transactions on Multimedia
Volume
9
Issue
2
First Page
268
Last Page
279
ISSN
1520-9210
Identifier
10.1109/TMM.2006.887992
Publisher
Institute of Electrical and Electronics Engineers
Citation
1
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.