We propose two approaches in which voting is used for: i) response fusion, and ii) action fusion. The first approach makes use of the ``raw'' responses of the visual cues employed in the image, which also represents the action space, $\textbf{A}$. Here, each response is either a binary (yes/no) answer or a value in the interval [0,1] (these values are scaled to [0,255] to allow visual monitoring). The second approach uses a different action space, represented by a direction and a speed, see Fig. [*]. Whereas the first approach estimates the position of the tracked region, the second can be viewed as estimating its velocity. Again, each cue votes for different actions from the action space, $\textbf{A}$, which is now the velocity space.

Figure: Action fusion approach: the desired direction is (down and left) with a (slow) speed.
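
The two voting schemes can be illustrated with a minimal sketch. It assumes weighted plurality voting, i.e. the votes of all cues are summed per action and the action with the largest total wins; the text above does not fix the voting rule, so this is only one possible choice. The function names `fuse_responses` and `fuse_actions`, the per-cue weights, and the discrete direction and speed labels are hypothetical and introduced for illustration only.

```python
from collections import defaultdict


def fuse_responses(cue_maps, weights=None):
    """Response fusion (assumed rule: weighted sum of per-pixel responses).

    cue_maps: list of 2-D lists of equal size, one per cue, with values
    that are either binary (0/1) or in the interval [0, 1].
    Returns the (row, col) with the highest summed response and its score.
    """
    if weights is None:
        weights = [1.0] * len(cue_maps)
    rows, cols = len(cue_maps[0]), len(cue_maps[0][0])
    best, best_score = None, float("-inf")
    for r in range(rows):
        for c in range(cols):
            # Each cue votes for this image position with its raw response.
            score = sum(w * m[r][c] for w, m in zip(weights, cue_maps))
            if score > best_score:
                best, best_score = (r, c), score
    return best, best_score


def fuse_actions(cue_votes, weights=None):
    """Action fusion (assumed rule: weighted plurality over velocities).

    cue_votes: list of dicts, one per cue, mapping an action
    (direction, speed) to the strength of that cue's vote.
    Returns the action with the largest summed vote.
    """
    if weights is None:
        weights = [1.0] * len(cue_votes)
    tally = defaultdict(float)
    for w, votes in zip(weights, cue_votes):
        for action, strength in votes.items():
            tally[action] += w * strength
    return max(tally, key=tally.get)


# Response fusion: one graded cue and one binary cue on a 2x2 image.
colour = [[0.0, 0.2], [0.9, 0.1]]
motion = [[0, 0], [1, 1]]
position, score = fuse_responses([colour, motion])

# Action fusion: two cues vote over (direction, speed) pairs.
cue_a = {("down-left", "slow"): 0.8, ("left", "slow"): 0.3}
cue_b = {("down-left", "slow"): 0.6, ("down", "fast"): 0.7}
velocity = fuse_actions([cue_a, cue_b])
```

With these illustrative inputs, response fusion selects the pixel at row 1, column 0, and action fusion selects the (down-left, slow) velocity, which corresponds to the situation shown in the figure. Scaling the fused responses to [0,255] for visual monitoring would be a separate display step and is omitted here.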


Danica Kragic 2002-12-06