Figure-Ground Segmentation Using Multiple Cues

Peter Nordlund

This PhD thesis consists of two parts: a summary and six included papers. This electronically available version contains only the summary. The included papers can also be obtained; see the links below. Some of the included papers exist under different titles, but in the links below they are named as they are referred to throughout the thesis.

The full version of the thesis including the papers A-F is available as Technical Report ISRN KTH/NA/P-98/05-SE, from KTH, S-100 44 Stockholm, Sweden.

Abstract

The theme of this thesis is figure-ground segmentation. We address the problem in the context of a visual observer, e.g. a mobile robot, moving around in the world and capable of shifting its gaze to and fixating on objects in its environment. We consider only bottom-up processes, that is, how the system can detect and segment out objects because they stand out from their immediate background in some feature dimension. Since this implies that the distinguishing cues cannot be predicted, but depend on the scene, the system must rely on multiple cues. The integrated use of multiple cues forms a major theme of the thesis. In particular, we note that an observer in our real environment has access to 3-D cues. Inspired by psychophysical findings about human vision, we try to demonstrate the effectiveness of these cues in figure-ground segmentation and grouping in machine vision as well.

An important aspect of the thesis is that the problems are addressed from a systems perspective: it is the performance of the entire system that is important, not that of component algorithms. Hence, we regard the processes as part of perception-action cycles and investigate approaches that can be implemented for real-time performance.

The thesis begins with a general discussion of the problem of figure-ground segmentation, followed by a discussion of attention. Experiments showing some implementations of attentional mechanisms, with emphasis on real-time performance, are presented. We also provide experimental results on closed-loop control of a head-eye system pursuing a moving object. A system integrating motion detection, segmentation based on motion, and segmentation based on stereo is also presented. Maintenance of an already achieved figure-ground segmentation is discussed. We demonstrate how an initially obtained figure-ground segmentation can be maintained by switching to another cue when the initial one disappears. The use of multiple cues is exemplified by a method of segmenting a 2-D histogram using a multi-scale approach. This method is further simplified to suit our real-time performance restrictions. Throughout the thesis we stress the importance of systems that can operate continuously on images coming directly from cameras. We thereby show that our systems consist of a complete processing chain, with no links missing, which is essential when designing working systems.

Thesis, complete with the included papers, PDF, 9.1 MB

Thesis without the included papers, PDF, 1.4 MB

The included papers:

Paper A: P. Nordlund and T. Uhlin, "Closing the loop: Detection and pursuit of a moving object by a moving observer", Image and Vision Computing, vol. 14, pp. 265-275, May 1996. (PDF 1.4M)

Paper B: T. Uhlin, P. Nordlund, A. Maki, and J.-O. Eklundh, "Towards an active visual observer", in Proc. 5th International Conference on Computer Vision, (Cambridge, MA), pp. 679-686, June 1995. (PDF 670k)

Paper C: T. Uhlin, P. Nordlund, A. Maki, and J.-O. Eklundh, "Towards an active visual observer", Tech. Rep. ISRN KTH/NA/P--95/08--SE, Mar. 1995. Shortened version in Proc. 5th International Conference on Computer Vision, pp. 679-686. (PDF 1.5M)

Paper D: A. Maki, P. Nordlund, and J.-O. Eklundh, "Attentional Scene Segmentation: Integrating Depth and Motion from Phase", extended version of Tech. Rep. ISRN KTH/NA/P--96/05--SE, Computer Vision and Image Understanding, vol. 78, pp. 351-373, June 2000. (PDF 1.1M)

Paper E: P. Nordlund and J.-O. Eklundh, "Towards a seeing agent", in First Int. Workshop on Cooperative Distributed Vision, Kyoto, Japan, pp. 93-123. (PDF 3.1M)

Paper F: P. Nordlund, "Real-time maintenance of figure-ground segmentation", Tech. Rep. ISRN KTH/NA/P--98/04--SE, May 1998. (PDF 1.0M)


Due to an error in the PostScript generated by the program "idraw", all of the electronically available articles above were regenerated in August 2003.
The original faulty PostScript articles are available upon request.


Peter Nordlund (petern@nada.kth.se)