1 Introduction

There has been considerable recent interest in the use of hand gestures for human-computer interfacing: they are intuitive for the operator, and provide a rich source of information to the machine. This type of interface is particularly appropriate in applications such as virtual reality, multimedia and teleoperation [1, 2, 3].

Most current commercial implementations rely on sensors that are physically attached to the hand, such as the `DataGlove' [4]. More recently, systems have been proposed using vision to observe the hand. Some require special gloves with attachments or markings to facilitate the localization and tracking of hand parts [5, 6], but others operate without intrusive hardware. This is attractive because it is convenient for the user and potentially cheaper to implement.

Here we present an experimental implementation of such a system, concentrating in particular on the case of pointing at a distant object. We have developed a simple vision-based pointing system as an input device for a robot manipulator, to provide a novel and convenient means for the operator to specify points for pick-and-place operations. We use active contour techniques to track a hand in a pointing gesture, with conventional monochrome cameras and fairly modest computing hardware.
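This section does not spell out the tracker's details. As a rough illustration of the kind of active-contour machinery involved, the sketch below implements a generic greedy "snake" update, in which each contour point is pulled towards strong image edges while a smoothness term keeps it near the midpoint of its neighbours. The function name, parameters and the use of a plain gradient-magnitude image are illustrative assumptions, not the implementation described in this paper.

    import numpy as np

    def snake_step(points, edge_strength, alpha=0.2, search=3):
        """One greedy update of a simple discrete active contour ('snake').

        points:        (n, 2) array of (row, col) contour positions.
        edge_strength: 2-D array, e.g. a gradient-magnitude image.
        Each point moves to the pixel in a small window that maximises
        edge strength minus a penalty for straying from the midpoint of
        its two neighbours (a crude smoothness/elasticity term)."""
        new_points = points.copy()
        n = len(points)
        h, w = edge_strength.shape
        for i in range(n):
            prev_pt, next_pt = points[(i - 1) % n], points[(i + 1) % n]
            mid = (prev_pt + next_pt) / 2.0
            best, best_score = points[i], -np.inf
            y0, x0 = int(points[i][0]), int(points[i][1])
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = y0 + dy, x0 + dx
                    if not (0 <= y < h and 0 <= x < w):
                        continue
                    cand = np.array([y, x], dtype=float)
                    score = edge_strength[y, x] - alpha * np.sum((cand - mid) ** 2)
                    if score > best_score:
                        best_score, best = score, cand
            new_points[i] = best
        return new_points

    # Toy usage: a synthetic bright square and a circular initial contour.
    blob = np.zeros((100, 100)); blob[40:60, 40:60] = 1.0
    gy, gx = np.gradient(blob)
    edges = np.hypot(gx, gy)
    pts = np.array([[50.0 + 20 * np.cos(t), 50.0 + 20 * np.sin(t)]
                    for t in np.linspace(0, 2 * np.pi, 8, endpoint=False)])
    pts = snake_step(pts, edges)

In practice such an update would be applied once per video frame, with the previous frame's contour as the starting estimate.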

A single view of a pointing hand is ambiguous: its distance from the camera cannot be determined, and the `slant' of its orientation cannot be measured with any accuracy. Stereo views are therefore used to recover the hand's position and orientation, and yield the line along which the index finger is pointing. In our system, we assume that the user is pointing towards a `target surface,' which is a horizontal plane. We show how a simple result from projective geometry can be applied to this case, allowing the system to be implemented with uncalibrated stereo, which requires no measurements of, or special assumptions about, camera positions and parameters.
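The specific projective result is not stated in this introduction. A standard result that fits the uncalibrated setting with a known target plane is that the images of points lying on a plane are related between two views by a 3x3 planar homography, which can be estimated from four reference correspondences on the plane without knowing either camera's position or internal parameters. The sketch below illustrates that result only; the reference points and pixel coordinates are hypothetical, and this is not necessarily the construction used in the paper.

    import numpy as np

    def fit_homography(src, dst):
        """Estimate the 3x3 planar homography H with dst ~ H @ src from
        four (or more) point correspondences, via the direct linear
        transform (DLT)."""
        A = []
        for (x, y), (u, v) in zip(src, dst):
            A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
            A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
        # H (up to scale) is the singular vector of A with the smallest
        # singular value.
        _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
        return Vt[-1].reshape(3, 3)

    def transfer(H, pt):
        """Map an image point through H, normalising the homogeneous scale."""
        p = H @ np.array([pt[0], pt[1], 1.0])
        return p[:2] / p[2]

    # Four reference marks on the target plane, as seen by each camera
    # (hypothetical pixel coordinates, for illustration only).
    left  = [(102, 310), (540, 295), (520, 120), (130, 140)]
    right = [(88, 318),  (500, 330), (495, 150), (110, 148)]

    H = fit_homography(left, right)
    # Any other point on the plane seen in the left image can now be
    # transferred into the right image without calibrating either camera.
    print(transfer(H, (300, 220)))

Once such a plane-induced relation is available, quantities tied to the target plane can be handled entirely in image coordinates, which is what makes an uncalibrated implementation feasible.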

