JPEG AI Compressed Domain Face Detection
Alkhateeb, Ayman; Gnutti, Alessandro; Guerrini, Fabrizio; Leonardi, Riccardo
2024-01-01
Abstract
Learning-based image coding has achieved competitive compression efficiency, and it offers a further key advantage: computer vision tasks can be carried out directly in the compressed domain. The latent representation generated by deep learning techniques may natively encapsulate all the visual features needed for processing tasks, eliminating the need to run the expensive synthesis transform at the decoder side. This paper proposes to perform face detection using the latent code of the JPEG AI architecture. First, experiments show that decoded images can be processed efficiently for face detection without retraining, albeit with some performance degradation. Then, a compressed-domain RetinaFace-based detector operating on JPEG AI latent representations is proposed for the first time. Its performance is comparable to that of the original RetinaFace applied to reconstructed JPEG AI images, while computational complexity is reduced because the image decoding process is bypassed. It is expected that this approach could be extended to other vision tasks, since the JPEG AI representation format is not tailored to any specific computer vision task.

File | Access | License | Size | Format |
---|---|---|---|---|
MMSP24__Compressed_domain_face_detector_.pdf | Open access | PUBLIC - Public with Copyright | 1.36 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.