Deep Learning Based Machine Vision: First Steps Towards a Hand Gesture Recognition Set Up for Collaborative Robots
Nuzzi, Cristina; Pasinetti, Simone; Lancini, Matteo; Docchio, Franco; Sansoni, Giovanna
2018-01-01
Abstract
In this paper, we present an experimental setup for smart hand gesture recognition in collaborative robotics. It uses a Faster R-CNN object detector to locate the hands accurately in RGB images acquired by a Kinect v2 camera. We coded the detector in MATLAB, together with a purpose-built function for the prediction phase, which is needed to detect static gestures as we have defined them. We ran experiments on several datasets to evaluate the performance of the model in different situations: a basic hand gesture dataset with four gestures, each performed by combining both hands; a dataset in which the actors wear skin-colored clothes while performing the gestures; a dataset in which the actors wear light-blue gloves; and a dataset similar to the first one but with the camera placed close to the operator. We repeated the same tests with the algorithm also detecting the operator's face, in an attempt to improve prediction accuracy. Our experiments show that the best model accuracy and F1-score are achieved by the complete model without face detection. We also tested the model in real time and obtained good performance, with an inference time of about 0.2 seconds, which can enable real-time human-robot interaction.