Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
12-2003
Abstract
This paper presents a novel, automatic approach to structuring and indexing lecture videos for distance learning applications. Structuring the video content supports both topic indexing and semantic querying of multimedia documents. Our aim is to link the discussion topics extracted from the electronic slides with their associated video and audio segments. The two major techniques in the proposed approach are video text analysis and speech recognition. First, a video is partitioned into shots based on slide transitions. For each shot, the embedded video text is detected, reconstructed, and segmented into high-resolution foreground text for recognition by a commercial OCR engine. The recognized text is then matched against the associated slides for video indexing. Meanwhile, phrases (from slide titles) and keywords (from slide content) are extracted from the electronic slides and spotted in the speech signal. The spotted phrases and keywords are then used as queries to retrieve the most similar slide for speech indexing.
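The retrieval step described in the abstract, where recognized text or spotted keywords serve as a query to find the most similar slide, could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which similarity measure the authors use, so cosine similarity over bag-of-words term counts is swapped in here, and the names `tokenize`, `cosine`, and `most_similar_slide` as well as the sample slide texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_slide(query_text, slides):
    """Return (index, score) of the slide whose text best matches the query.

    Assumption: similarity is plain bag-of-words cosine; the paper's
    actual matching scheme is not given in the abstract.
    """
    query = Counter(tokenize(query_text))
    scores = [cosine(query, Counter(tokenize(s))) for s in slides]
    best = max(range(len(slides)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical slide texts and a noisy query, as OCR or keyword
# spotting might produce (two of the four words are misrecognized).
slides = [
    "Video shot detection slide transitions",
    "Speech recognition keyword spotting",
    "Optical character recognition of video text",
]
noisy_query = "speech recogniton keyword spoting"
best_index, best_score = most_similar_slide(noisy_query, slides)
```

Even with two misrecognized words, the remaining exact matches are enough to select the second slide in this toy example. A real system would likely need a measure that tolerates character-level recognition errors.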
Discipline
Computer Sciences | Graphics and Human Computer Interfaces
Research Areas
Intelligent Systems and Optimization
Publication
Proceedings of 5th IEEE International Symposium on Multimedia Software Engineering, ISMSE 2003, Taichung, Taiwan, December 10-12
First Page
215
Last Page
222
ISBN
9780769520315
Identifier
10.1109/MMSE.2003.1254444
Publisher
IEEE
City or Country
Taichung
Citation
NGO, Chong-wah; WANG, Feng; and PONG, Ting-Chuen.
Structuring lecture videos for distance learning applications. (2003). Proceedings of 5th IEEE International Symposium on Multimedia Software Engineering, ISMSE 2003, Taichung, Taiwan, December 10-12. 215-222.
Available at: https://ink.library.smu.edu.sg/sis_research/6609
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.