Recognition of Camera Angle and Camera Level in Movies from Single Frames

Savardi, Mattia; Signoroni, Alberto; Benini, Sergio
2023-01-01

Abstract

The position and orientation of the camera relative to the subject(s) in a movie scene, namely camera “level” and camera “angle”, are essential features in film-making because they influence the viewer's perception of the scene. In this paper, we propose the use of Convolutional Neural Networks (CNNs) for the automatic recognition of camera angle (five classes: Overhead, High, Neutral, Low, and Dutch) and camera level (six classes: Aerial, Eye, Shoulder, Hip, Knee, and Ground) in movie frames. The approach remains effective even when frames do not prominently feature the human figure. The training, validation, and test datasets comprise frames sampled from a wide variety of movie shots, freely available images, and labeled frames from cinematographic websites, for a total of over 24,000 images. For both camera angle and camera level, classification achieves a weighted average precision and recall above 95%. To foster further research in domains such as movie stylistic analysis, video recommendation, and media psychology, we provide the developed models, annotation tool, and frame data through our project page at https://cinescale.github.io/.
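
As a reading aid for the reported metric, the following is a minimal sketch of how a weighted average precision and recall could be computed over the five camera-angle classes. It assumes that "weighted" means weighting each per-class score by that class's share of the true labels (the convention used, for example, by scikit-learn's `average="weighted"` option); the abstract does not spell this out. The function name `weighted_precision_recall` and the toy labels are hypothetical and are not taken from the paper or its released code.

```python
from collections import Counter

# Camera-angle classes as listed in the abstract.
ANGLE_CLASSES = ["Overhead", "High", "Neutral", "Low", "Dutch"]


def weighted_precision_recall(y_true, y_pred, classes):
    """Support-weighted average of per-class precision and recall.

    Assumption: each class is weighted by its share of the true labels.
    """
    support = Counter(y_true)    # true samples per class
    predicted = Counter(y_pred)  # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = recall = 0.0
    for c in classes:
        weight = support[c] / total
        # A per-class score is taken as 0 when its denominator is empty.
        p_c = correct[c] / predicted[c] if predicted[c] else 0.0
        r_c = correct[c] / support[c] if support[c] else 0.0
        precision += weight * p_c
        recall += weight * r_c
    return precision, recall


# Toy labels invented for illustration only (not from the paper's data).
y_true = ["Neutral", "Neutral", "Neutral", "High", "Low", "Dutch"]
y_pred = ["Neutral", "Neutral", "High", "High", "Low", "Dutch"]
precision, recall = weighted_precision_recall(y_true, y_pred, ANGLE_CLASSES)
print(f"weighted precision={precision:.3f}, weighted recall={recall:.3f}")
```

Under this weighting, the weighted recall coincides with overall accuracy, so frequent classes such as Neutral dominate both figures.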
2023
ISBN: 9798400708459
Use this identifier to cite or link to this document: https://hdl.handle.net/11379/588949