Abstract
For video retrieval, a video is partitioned into a group of shots, which are then represented by either key frames or video shot representations. An optimal representation of a shot should include all the information about the frames concerned. In this paper, we propose an efficient representation scheme for a shot, which considers both the spatial frequency contents and the temporal statistics of the frames for video retrieval. In our scheme, each frame in a video shot is transformed into the frequency domain using the discrete cosine transform (DCT), and a number of values at each frequency are selected based on their probability of occurrence. This representation scheme allows retrieval to be carried out hierarchically, i.e. from low-frequency to high-frequency components. Experimental results show that our proposed scheme outperforms the alpha-trimmed average histogram method in terms of retrieval accuracy.
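The abstract gives the pipeline only in outline, so the following is a minimal sketch of one way it could be realised, not the authors' implementation. It assumes small grayscale frames stored as lists of rows. The names `dct_2d`, `shot_representation` and `hierarchical_distance`, the quantisation step `step`, the number of retained values `top_k`, the low-to-high ordering by `u + v`, and the absolute-difference comparison with an early-rejection threshold are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def dct_2d(frame):
    """Orthonormal 2-D DCT-II of a small grayscale frame (list of rows)."""
    rows, cols = len(frame), len(frame[0])

    def alpha(k, n):
        # Orthonormal scaling factor for index k along an axis of length n.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for x in range(rows):
                for y in range(cols):
                    total += (frame[x][y]
                              * math.cos(math.pi * (2 * x + 1) * u / (2 * rows))
                              * math.cos(math.pi * (2 * y + 1) * v / (2 * cols)))
            out[u][v] = alpha(u, rows) * alpha(v, cols) * total
    return out


def shot_representation(frames, top_k=2, step=8.0):
    """For each frequency, keep the top_k most frequent quantised values.

    Assumption: "probability of occurrence" is estimated by quantising each
    coefficient with a uniform step and counting occurrences across frames.
    Returns {(u, v): [(value, probability), ...]}.
    """
    coeffs = [dct_2d(f) for f in frames]
    rows, cols = len(coeffs[0]), len(coeffs[0][0])
    rep = {}
    for u in range(rows):
        for v in range(cols):
            counts = Counter(round(c[u][v] / step) for c in coeffs)
            rep[(u, v)] = [(q * step, n / len(frames))
                           for q, n in counts.most_common(top_k)]
    return rep


def hierarchical_distance(rep_a, rep_b, reject_above=float("inf")):
    """Compare two shots from low to high frequency.

    Assumption: only the most probable value at each frequency is compared,
    and the comparison stops once the running distance exceeds reject_above.
    """
    total = 0.0
    for key in sorted(rep_a, key=lambda uv: (uv[0] + uv[1], uv)):
        total += abs(rep_a[key][0][0] - rep_b[key][0][0])
        if total > reject_above:
            return float("inf")  # rejected early on low-frequency evidence
    return total
```

In this sketch, a query shot would be passed through `shot_representation` once and then compared against stored representations with `hierarchical_distance`. A finite `reject_above` lets clearly dissimilar shots be discarded after only the low-frequency terms have been examined.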
Original language | English |
---|---|
Pages (from-to) | 230-238 |
Number of pages | 9 |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 5960 |
Issue number | 1 |
Publication status | Published - 1 Dec 2005 |
Event | Visual Communications and Image Processing 2005 - Beijing, China (Duration: 12 Jul 2005 → 15 Jul 2005) |
Keywords
- Content-Based Video Retrieval
- Video Indexing
- Video Shot Representation
ASJC Scopus subject areas
- Electrical and Electronic Engineering
- Condensed Matter Physics