Computer Vision Based Human-Computer Interaction
- a collaboration between CVAP and CID
Scope
With the development of information technology in our society, we can expect that computer systems will be embedded into our environment to an increasing extent. These environments will create a need for new types of human-computer interaction, with interfaces that are natural and easy to use. In particular, the ability to interact with computerized equipment without the need for special external equipment is attractive.
Today, the keyboard, the mouse and the remote control are used as the main interfaces for transferring information and commands to computerized equipment. In some applications involving three-dimensional information, such as visualization, computer games and control of robots, other interfaces based on trackballs, joysticks and data gloves are being used. In our daily life, however, we humans use our vision and hearing as the main sources of information about our environment. Therefore, one may ask to what extent it would be possible to develop computerized equipment able to communicate with humans in a similar way, by understanding visual and auditory input.
The purpose of this project is to develop new perceptual interfaces for human-computer interaction based on visual input captured by computer vision systems, and to investigate how such interfaces can complement or replace traditional interfaces based on keyboards, mice, remote controls, data gloves or speech. Examples of applications of gesture recognition include:
The project combines CVAP's expertise in computer vision with CID's experience in designing and evaluating new human-machine interfaces. Initially, the focus is on developing algorithms for recognizing hand gestures and on building prototype systems that make it possible to test perceptual interfaces in practical applications. An important component of the work is to perform continuous user studies in close connection with the development work.
It may be worth emphasizing that the aim is not to recognize the kind of expressive gestures that are tightly coupled to our speech, nor sign languages aimed at inter-human communication. The goal is to explore hand gestures suitable for various control tasks in human-machine interaction. Multi-modal interfaces including hand gesture recognition, face and gaze tracking and speech recognition will also be considered.
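To illustrate the control-oriented style of gesture interface described above, the following minimal sketch maps recognized hand states to device commands. The state names, command vocabulary and the temporal-filtering scheme are illustrative assumptions, not taken from the project's actual system:

```python
# Hypothetical sketch: turning a stream of per-frame hand-state labels
# (as produced by some gesture recognizer) into discrete control commands.
# State names and commands are illustrative assumptions only.

HAND_STATE_COMMANDS = {
    "open_hand": "activate",
    "closed_fist": "deactivate",
    "two_fingers": "next_channel",
    "three_fingers": "previous_channel",
}

def interpret(states, min_repeats=3):
    """Convert per-frame hand-state labels into commands.

    Requiring the same state in `min_repeats` consecutive frames is a
    simple way to suppress spurious single-frame misclassifications.
    """
    commands = []
    run_state, run_length = None, 0
    for state in states:
        if state == run_state:
            run_length += 1
        else:
            run_state, run_length = state, 1
        # Fire exactly once, when the run first reaches the threshold.
        if run_length == min_repeats and run_state in HAND_STATE_COMMANDS:
            commands.append(HAND_STATE_COMMANDS[run_state])
    return commands

frames = ["open_hand"] * 4 + ["unknown"] + ["closed_fist"] * 3
print(interpret(frames))  # -> ['activate', 'deactivate']
```

The point of the consecutive-frame threshold is that a control interface should react to deliberate, held gestures rather than to every frame-level classification, which is one reason control gestures can use a smaller and more robust vocabulary than full sign-language recognition.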
Detailed information (click on an image to see a video clip demo; on a header for technical information)
Press
A prototype hand gesture recognition system constructed within this project was presented at the Swedish IT fair Connect 2001 at Älvsjömässan, Stockholm, Sweden, April 23-25, 2001. View the news reports about our gesture control system at www.idg.se, in Expressen and in Dagens IT.
References (click on a title to fetch the corresponding text)
S. Lenman, L. Bretzner, and B. Thuresson, ``Using marking menus to develop command sets for computer vision based hand gesture interfaces'', in Second Nordic Conference on Human-Computer Interaction, NordiCHI02, (Aarhus, Denmark), pp. --, Oct. 2002. (pdf)
L. Bretzner, I. Laptev and T. Lindeberg, ``Hand gesture recognition using multi-scale colour features, hierarchical models and particle filtering'', in Proc. 5th IEEE International Conference on Automatic Face and Gesture Recognition, (Washington D.C., USA), May 2002. (pdf)
L. Bretzner and T. Lindeberg, ``Qualitative multi-scale feature hierarchies for object tracking'', in Proc. 2nd International Conference on Scale-Space Theories in Computer Vision (M. Nielsen, P. Johansen, O. F. Olsen and J. Weickert, eds.), vol. 1682 of Lecture Notes in Computer Science, (Corfu, Greece), pp. 117--128, Springer Verlag, Sept. 1999. (Extended version available as Tech. Rep. ISRN KTH/NA/P--99/09--SE). (PostScript)
L. Bretzner and T. Lindeberg, ``Use your hand as a 3-D mouse, or, relative orientation from extended sequences of sparse point and line correspondences using the affine trifocal tensor'', in Proc. 5th European Conference on Computer Vision (H. Burkhardt and B. Neumann, eds.), vol. 1406 of Lecture Notes in Computer Science, (Freiburg, Germany), pp. 141--157, Springer Verlag, Berlin, June 1998. (PostScript)
L. Bretzner, I. Laptev, T. Lindeberg, S. Lenman and Y. Sundblad, ``A prototype system for computer vision based human computer interaction'', Technical report ISRN KTH/NA/P--01/09--SE, April 2001.
I. Laptev and T. Lindeberg, ``Tracking of multi-state hand models using particle filtering and a hierarchy of multi-scale image features'', Technical report ISRN KTH/NA/P--00/12--SE, September 2000. Shortened version in IEEE Workshop on Scale-Space and Morphology, Vancouver, Canada, July 2001, Springer Verlag Lecture Notes in Computer Science.
I. Laptev and T. Lindeberg, ``A multi-scale feature likelihood map for direct evaluation of object hypotheses'', Technical report ISRN KTH/NA/P--01/03--SE, March 2001. Shortened version in IEEE Workshop on Scale-Space and Morphology, Vancouver, Canada, July 2001, Springer Verlag Lecture Notes in Computer Science.
T. Lindeberg and L. Bretzner ``Method and arrangement for controlling means for three-dimensional transfer of information by motion detection'', International patent application PCT/SE1999/000402, 1999 (now released).
Acknowledgements
This work has been made possible by support from the Swedish National Board for Industrial and Technical Development, NUTEK, and the Swedish Research Council for Engineering Sciences, TFR.