An efficient video shot representation for fast video retrieval

Cheng Cai, Kin Man Lam, Zheng Tan

Research output: Journal article publication; Conference article; Academic research; peer-review

3 Citations (Scopus)


For video retrieval, a video is partitioned into a group of shots, which are then represented by either key frames or video shot representations. An optimal representation of a shot should include all the information about the frames concerned. In this paper, we propose an efficient representation scheme for a shot, which considers both the spatial frequency contents and the temporal statistics of the frames for video retrieval. In our scheme, each frame in a video shot is transformed into the frequency domain using the discrete cosine transform (DCT), and a number of values at each frequency are selected based on their probability of occurrence. This representation scheme allows retrieval to be carried out hierarchically, i.e. from low-frequency to high-frequency components. Experimental results show that our proposed scheme outperforms the alpha-trimmed average histogram method in terms of retrieval accuracy.
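The pipeline outlined in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. Everything below is an assumption made for illustration: the function names (`dct2`, `frequency_order`, `shot_signature`, `hierarchical_distance`), the quantization step, the "keep the k most frequent quantized values" selection rule, the diagonal low-to-high frequency ordering, and the per-frequency distance. The abstract specifies only that frames are transformed with the DCT, that values at each frequency are selected by probability of occurrence, and that retrieval proceeds from low to high frequencies.

```python
import math
from collections import Counter


def dct2(frame):
    """Orthonormal 2-D DCT-II of a square grayscale frame (list of rows)."""
    n = len(frame)
    # basis[u][x] = c(u) * cos((2x + 1) * u * pi / (2n))
    basis = [[(math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n))
              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
              for x in range(n)] for u in range(n)]
    # Transform along each row, then along each column.
    tmp = [[sum(basis[v][x] * row[x] for x in range(n)) for v in range(n)]
           for row in frame]
    return [[sum(basis[u][y] * tmp[y][v] for y in range(n)) for v in range(n)]
            for u in range(n)]


def frequency_order(n):
    """(u, v) index pairs sorted from low to high frequency (assumed ordering)."""
    return sorted(((u, v) for u in range(n) for v in range(n)),
                  key=lambda p: (p[0] + p[1], p[0]))


def shot_signature(frames, n_freqs=6, k=2, step=8.0):
    """For each of the n_freqs lowest frequencies, keep the k quantized
    coefficient values that occur most often across the shot's frames.
    The quantization step and the top-k rule are illustrative assumptions."""
    coeffs = [dct2(f) for f in frames]
    signature = []
    for u, v in frequency_order(len(frames[0]))[:n_freqs]:
        counts = Counter(round(c[u][v] / step) for c in coeffs)
        signature.append([q * step for q, _ in counts.most_common(k)])
    return signature


def hierarchical_distance(sig_a, sig_b, threshold=math.inf):
    """Accumulate a per-frequency distance from low to high frequency and
    reject the candidate (return inf) as soon as the threshold is exceeded."""
    total = 0.0
    for vals_a, vals_b in zip(sig_a, sig_b):
        # Assumed per-frequency distance: closest pair of retained values.
        total += min(abs(a - b) for a in vals_a for b in vals_b)
        if total > threshold:
            return math.inf
    return total
```

Under these assumptions, a query shot would be compared with each database shot by calling `hierarchical_distance(shot_signature(query_frames), shot_signature(candidate_frames), threshold=t)`; candidates that already differ strongly at the low frequencies are rejected without examining the higher ones, which is one way to read the hierarchical retrieval the abstract describes.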
Original language: English
Pages (from-to): 230-238
Number of pages: 9
Journal: Proceedings of SPIE - The International Society for Optical Engineering
Issue number: 1
Publication status: Published - 1 Dec 2005
Event: Visual Communications and Image Processing 2005 - Beijing, China
Duration: 12 Jul 2005 to 15 Jul 2005


  • Content-Based Video Retrieval
  • Video Indexing
  • Video Shot Representation

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Condensed Matter Physics

