Javier Romero González-Nicolás

Binocular Hand Tracking and Pose Estimation


Hand gestures are important in human communication. It is difficult to convey visual or spatial concepts through speech alone (human-to-human communication) or through regular user interfaces such as the keyboard or mouse (human-to-machine communication).

Hand gesture recognition systems are nowadays mainly oriented towards recognizing predefined sets of gestures. This is useful for applications such as virtual mice guided by the fingertips, sign language recognition, etc.

In the context of Programming by Demonstration, such predefined sets of gestures are not enough, since the system should be able to learn any kind of “regular” gesture.

The goal of this thesis is to recover the hand pose from a stereo vision system. This raises two main problems: which features should be extracted from the vision system, and how the hand configuration can be recovered from those features.

The feature used in this thesis is the 3D location of the fingertips; this information is translated into the hand pose configuration by a closed-form inverse kinematics solution.
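To make the two steps concrete, the following is a minimal sketch and not the method developed in this thesis. It assumes a rectified pinhole stereo pair (so depth follows from horizontal disparity) and models a finger as a planar two-link chain, which is much simpler than a real hand model. All function names, parameters (focal_px, baseline_m, cx, cy, l1, l2) and numeric values are hypothetical and chosen only for illustration.

```python
import math


def triangulate_rectified(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Return (x, y, z) in the left camera frame for a matched fingertip.

    Assumes a rectified stereo pair: the same image row v in both views,
    focal length in pixels, baseline in metres, principal point (cx, cy).
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth from disparity
    x = (u_left - cx) * z / focal_px        # back-project the pixel column
    y = (v - cy) * z / focal_px             # back-project the pixel row
    return x, y, z


def two_link_ik(x, y, l1, l2):
    """Closed-form joint angles (theta1, theta2) of a planar two-link chain.

    (x, y) is the fingertip position in the plane of the finger, relative to
    the base joint. Only one of the two mirror solutions is returned.
    """
    d2 = x * x + y * y
    c2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)  # law of cosines
    if not -1.0 <= c2 <= 1.0:
        raise ValueError("target is out of reach for these link lengths")
    theta2 = math.acos(c2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2),
                                           l1 + l2 * math.cos(theta2))
    return theta1, theta2


def two_link_fk(theta1, theta2, l1, l2):
    """Forward kinematics, used here only to check the inverse solution."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


# Hypothetical measurements: fingertip seen at column 340 (left) and 300 (right).
tip = triangulate_rectified(340.0, 300.0, 240.0, focal_px=800.0,
                            baseline_m=0.1, cx=320.0, cy=240.0)

# Hypothetical finger: 4 cm and 3 cm segments, fingertip at (5 cm, 3 cm).
angles = two_link_ik(0.05, 0.03, l1=0.04, l2=0.03)
reached = two_link_fk(angles[0], angles[1], 0.04, 0.03)
```

Feeding the recovered angles back through the forward kinematics reproduces the target fingertip position, which is a convenient sanity check for any closed-form solution of this kind.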