Recent progress in service robotics has gradually expanded the application domain of robotics from manufacturing settings to domestic environments. Since it is no longer possible to engineer the environment, robust perception is one of the key capabilities of a robotic system. This paper considers the problem of visual perception and its use for the manipulation of objects in everyday settings. The process of object manipulation involves detection/recognition, servoing to the object, alignment, and grasping. Each of these processes has typically been considered independently or in relatively simple environments. Given the task at hand together with its constraints, it is, however, possible to provide a system that exhibits robustness in a realistic setting.
An important skill for manipulation is the estimation of the three-dimensional position and orientation of an object from an image. Due to the large number of topologically distinct aspects of an object, many of the techniques based on computing correspondences between image and model features (shape-based methods) fail to achieve real-time performance. In addition, many everyday objects are highly textured, which makes it difficult to use simple features such as edges and corners to solve the correspondence problem.
A more natural approach in terms of computational efficiency is to use appearance-based methods to provide a rough initial estimate, followed by a refinement step using, for example, model-based tracking. A similar approach is also considered here.
The paper starts with motivation and related work in Section , where similar systems are presented and the basic differences between those systems and our approach, motivated mainly by the task at hand, are outlined. Section gives a more detailed overview of the system, including the initialization and pose estimation steps. Section presents three different applications of the system. Finally, the overall approach and the associated results are discussed in Section .