Identification of Story Units in Audio-Visual Sequences by Joint Audio and Video Processing

LEONARDI, Riccardo
1998-01-01

Abstract

A novel technique is proposed that uses joint audio-visual analysis to identify and characterize scenes. The paper defines four scene types: dialogues, stories, actions, and generic scenes. It then explains how any audio-visual material can be decomposed into a series of scenes following this classification by analyzing and then combining the underlying audio and visual information. A rule-based procedure is defined for this purpose. Before the rule-based decision can take place, a series of low-level pre-processing tasks is suggested to measure audio and visual correlations. For the visual information, similarities between non-consecutive shots are measured using a learning vector quantization approach. A possible implementation strategy for the overall scene identification task is outlined and validated through a series of experimental simulations on real audio-visual data.
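The abstract does not state the decision rules themselves, so the following is only a minimal sketch of what a rule-based step over per-shot labels could look like. Everything in it is an assumption made for illustration: the function name `classify_scene`, the label vocabularies (`"speech"`, `"music"`, and single-letter visual cluster labels), the 0.5 speech threshold, and the rules. It assumes each shot has already been given a visual cluster label (for example from a shot-similarity step) and a dominant audio class.

```python
from collections import Counter
from typing import List


def classify_scene(visual_labels: List[str], audio_labels: List[str]) -> str:
    """Toy rule-based scene classifier.

    The rules below are illustrative assumptions, not the rules defined
    in the paper. Each list holds one label per shot of a candidate scene.
    """
    if not visual_labels or len(visual_labels) != len(audio_labels):
        raise ValueError("expected one audio label per shot")

    # Audio cue: is the scene dominated by speech? (threshold is assumed)
    speech_ratio = audio_labels.count("speech") / len(audio_labels)
    mostly_speech = speech_ratio >= 0.5

    # Visual cue 1: exactly two clusters that strictly alternate (A B A B ...)
    alternating = (
        len(set(visual_labels)) == 2
        and len(visual_labels) >= 3
        and all(a != b for a, b in zip(visual_labels, visual_labels[1:]))
    )

    # Visual cue 2: at least one cluster recurs somewhere in the scene
    repeated = max(Counter(visual_labels).values()) >= 2

    if mostly_speech and alternating:
        return "dialogue"
    if mostly_speech and repeated:
        return "story"
    if not mostly_speech and not repeated:
        return "action"
    return "generic"


if __name__ == "__main__":
    print(classify_scene(list("ABAB"), ["speech"] * 4))
    print(classify_scene(list("ABCADA"), ["speech"] * 6))
    print(classify_scene(list("ABCD"), ["music"] * 4))
```

The sketch checks the most specific pattern first and falls back to the generic type, which is one simple way to make sure every candidate scene gets exactly one of the four labels.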
Year: 1998
ISBN: 0818688211
Use this identifier to cite or link to this document: https://hdl.handle.net/11379/3821