Using Content Analysis for Video Compression

Benini, Sergio; Bianchetti, A.; Leonardi, Riccardo

This paper suggests the idea to model video information as a concatenation of different recurring sources. For each source a different tailored compressed representation can be optimally designed so as to best match the intrinsic characteristics of the viewed scene. Since in a video, a shot or scene with similar visual content recurs more than once, even at distant intervals in time, this enables to build a more compact representation of information. In a specific implementation of this idea, we suggest a content-based approach to structure video sequences into hierarchical summaries, and have each such summary represented by a tailored set of dictionaries of codewords. Vector quantization techniques, formerly employed for compression purposes only, have been here used ﬁrst to represent the visual content of video shots and then to exploit visual-content redundancy inside the video. The depth in the hierarchy determines the precision in the representation both from a structural point of view and from a quality level in reproducing the video sequence. The eﬀectiveness of the proposed method is demonstrated by preliminary tests performed on a limited collection of video-data excerpted from a feature movie. Some additional functionalities such as video skimming may remarkably benefit from this type of representation.