Deep Learning Based Machine Vision: First Steps Towards a Hand Gesture Recognition Set Up for Collaborative Robots

Nuzzi, Cristina; Pasinetti, Simone; Lancini, Matteo; Docchio, Franco; Sansoni, Giovanna
Publication date: 2018-01-01

Abstract

In this paper, we present a smart hand gesture recognition experimental setup for collaborative robots, using a Faster R-CNN object detector to accurately locate the hands in RGB images taken from a Kinect v2 camera. We coded the detector in MATLAB, together with a purpose-built function for the prediction phase, needed to detect static gestures as we have defined them. We performed a number of experiments with different datasets to evaluate the performance of the model in different situations: a basic hand gesture dataset with four gestures performed by combinations of both hands, a dataset where the actors wear skin-colored clothes while performing the gestures, a dataset where the actors wear light-blue gloves, and a dataset similar to the first one but with the camera placed close to the operator. The same tests were conducted in a configuration where the operator's face was also detected by the algorithm, in order to improve the prediction accuracy. Our experiments show that the best model accuracy and F1-score are achieved by the complete model without face detection. We tested the model in real time and achieved good performance that can enable real-time human-robot interaction, the inference time being around 0.2 seconds.
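The abstract states that the hand detector is a Faster R-CNN coded in MATLAB. The paper's own code is not reproduced here, so the following is only a minimal sketch of how such a detector is typically trained and run with MATLAB's Computer Vision Toolbox; the ground-truth table, file names, class setup, and training options are illustrative assumptions, and a recent toolbox release is assumed.

```matlab
% Minimal sketch (not the authors' code): train and run a Faster R-CNN
% hand detector with MATLAB's Computer Vision Toolbox.

% gestureGTruth: a table whose first column holds image file names and
% whose remaining columns hold per-class hand bounding boxes, e.g.
% exported from the Image Labeler app. Hypothetical file name.
load('gestureGTruth.mat');

% Illustrative training options; the paper's actual hyperparameters
% are not given in the abstract.
opts = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs', 10, ...
    'MiniBatchSize', 1);

% Fine-tune a pretrained backbone into a two-stage Faster R-CNN detector.
detector = trainFasterRCNNObjectDetector(gestureGTruth, 'resnet50', opts);

% Inference on one RGB frame grabbed from the Kinect v2 (hypothetical file).
I = imread('kinectFrame.png');
[bboxes, scores, labels] = detect(detector, I);
imshow(insertObjectAnnotation(I, 'rectangle', bboxes, ...
    string(labels) + ": " + string(scores)));
```

The roughly 0.2 s inference time reported in the abstract is consistent with running a detector of this kind once per frame rather than continuously, which is why the prediction phase is handled as a separate step.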
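The abstract also mentions a purpose-built prediction function that turns the per-hand detections into one of four static two-hand gestures. The actual logic is not spelled out on this page, so the sketch below is a hypothetical illustration: the class names, score threshold, and label-pair mapping are all assumptions.

```matlab
% Hypothetical prediction-phase helper: combine the two strongest hand
% detections into a single static two-hand gesture label. The gesture
% vocabulary and pairing rules below are illustrative assumptions.
function gesture = predictGesture(bboxes, scores, labels, minScore)
    keep = scores >= minScore;           % drop weak detections
    bboxes = bboxes(keep, :);
    scores = scores(keep);
    labels = labels(keep);
    if numel(scores) < 2
        gesture = "none";                % both hands must be visible
        return
    end
    [~, top] = maxk(scores, 2);          % two most confident hands
    % Order the pair left-to-right by bounding-box centre.
    cx = bboxes(top, 1) + bboxes(top, 3) / 2;
    [~, order] = sort(cx);
    pair = string(labels(top(order(1)))) + "+" + string(labels(top(order(2))));
    switch pair
        case "open+open", gesture = "start";
        case "fist+fist", gesture = "stop";
        case "open+fist", gesture = "left";
        case "fist+open", gesture = "right";
        otherwise,        gesture = "none";
    end
end
```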
ISBN: 9781538624975
Files in this product:
08439044.pdf (Full Text, Adobe PDF, 2.56 MB) — authorized users only; license: DRM not defined.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11379/510656
Note: the displayed data have not been validated by the university.

Citations
  • PMC: n/a
  • Scopus: 22
  • Web of Science: 11