Recognition of Camera Angle and Camera Level in Movies from Single Frames
Savardi, Mattia; Signoroni, Alberto; Benini, Sergio
2023-01-01
Abstract
The position and orientation of the camera in relation to the subject(s) in a movie scene, namely camera “level” and camera “angle”, are essential features in the film-making process due to their influence on the viewer's perception of the scene. In this paper, we propose the use of Convolutional Neural Networks (CNNs) for the automatic recognition of camera angles (categorized into five classes: Overhead, High, Neutral, Low, and Dutch) and camera levels (categorized into Aerial, Eye, Shoulder, Hip, Knee, and Ground) in movie frames. Our approach demonstrates remarkable effectiveness even when frames do not prominently feature the human figure. The training, validation, and test datasets are composed of frames sampled from an unprecedented variety of movie shots, freely available images, and labeled frames from cinematographic websites, for a total of over 24,000 images. Classification results for both camera angle and level achieve a weighted average precision and recall above 95%. To foster further research in domains such as movie stylistic analysis, video recommendation, and media psychology, we provide the developed models, annotation tool, and frame data through our project page at https://cinescale.github.io/.