Next: Experiment 2. Up: Experimental Evaluation Previous: Experimental Evaluation

Experiment 1.

Here, the effect of weighting was evaluated. Results were obtained for 10 sequences, and for each sequence three sizes of the window of attention were used: $25 \times 25$, $35 \times 35$ and $45 \times 45$ pixels. The purpose was to test the ability of the system to cope with background clutter: if the target is small compared to the window of attention, a large portion of the window belongs to the background, so the content of the window changes and affects the response of each cue. The target undergoes arbitrary 3D motion.

Figure: A comparison between ground truth, voting approaches and individual cues during person tracking.
        RF Voting      AF Voting      Color          Diff           SSD
        x      y       x      y       x      y       x      y       x      y
mean    -0.7   0.1     0.2    2       -1.3   2.5     -2.7   31.1    3      -11
std     2.4    2.5     2.5    2.6     4.8    4.5     17.3   25      13.2   11
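
The mean and std rows above can be read as per-axis statistics of the tracking error. The following is a minimal Python sketch of such a computation. It assumes that the error is the signed per-frame difference between the estimated and the ground-truth image position, and that a population standard deviation is used; neither detail is stated in the text, and the function name `error_stats` is hypothetical.

```python
from statistics import mean, pstdev


def error_stats(estimated, ground_truth):
    """Return ((mean_x, std_x), (mean_y, std_y)) of signed per-frame errors.

    Both arguments are equally long sequences of (x, y) positions in pixels.
    Assumption: error = estimated - ground truth, population std.
    """
    dx = [e[0] - g[0] for e, g in zip(estimated, ground_truth)]
    dy = [e[1] - g[1] for e, g in zip(estimated, ground_truth)]
    return (mean(dx), pstdev(dx)), (mean(dy), pstdev(dy))
```

A mean close to zero with a small std, as for the two voting schemes, indicates an estimate that is both unbiased and stable. A large mean or std, as for some of the individual cues, indicates drift or jitter.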


Accuracy (Table [*]) - Here, the distance measure is used as an error indicator. The overall results for the proposed fusion approaches are presented in Table [*]. They show that the best accuracy is achieved with fixed weights using the texture based weighting and the uniform weighting. The one-step distance weighting rewards a cue each time it performs satisfactorily, but it gives no way of determining the overall performance of the cues during the sequence. The history based weighting was expected to solve this problem; however, its temporal smoothing makes the weights adapt slowly. One solution might be to change the model and, instead of using all frames up to the current one, apply a temporal windowing approach. This would allow the immediate history to be used to evaluate the performance of each cue.
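
The difference between weighting over the full history and over a temporal window can be illustrated with a small Python sketch. It is not the weighting model used in the experiments: the per-frame score in [0, 1], the running-average form of the weight, the window length of 5 frames and the names `history_weight` and `WindowedWeight` are all assumptions made for illustration.

```python
from collections import deque


def history_weight(scores):
    """Weight from all frames up to the current one (running average)."""
    return sum(scores) / len(scores)


class WindowedWeight:
    """Weight from only the last `window` frames (hypothetical helper)."""

    def __init__(self, window):
        # A deque with maxlen discards the oldest score automatically.
        self.scores = deque(maxlen=window)

    def update(self, score):
        """Add the score of the current frame and return the new weight."""
        self.scores.append(score)
        return sum(self.scores) / len(self.scores)


# Illustrative data only: a cue that works for 20 frames, then fails for 5.
scores = [1.0] * 20 + [0.0] * 5

windowed = WindowedWeight(window=5)
for s in scores:
    w_window = windowed.update(s)

w_history = history_weight(scores)
print(w_history, w_window)  # prints "0.8 0.0"
```

With the full history the failing cue still keeps a weight of 0.8 after five bad frames, which is the slow dynamics described above. The windowed weight has already dropped to 0.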

Comparing the two fusion approaches shows that action fusion had a higher standard deviation (14 pixels for texture based weighting, compared with 6 pixels for response fusion). The reason is the choice of the underlying voting space. For example, if the color cue performs stably for a number of frames, its weight will be high compared to the other cues (or it may have been set to a high value from the beginning). In some cases, two colors are used at the same time. When an occlusion occurs, the position of the center of mass of the color blob changes quickly (and sometimes in different directions), which results in abrupt changes in both direction and speed. Response fusion does not suffer from this, which results in a lower standard deviation.

In many cases, however, it is more important to retain the tracking, even at the cost of lower accuracy. For that purpose the reliability measure is important.

Figure: A comparison between ground truth, voting approaches and individual cues in case of occlusions.
        RF Voting      AF Voting      Color          Diff           SSD
        x      y       x      y       x      y       x      y       x      y
mean    1.5    -6.4    1.5    -1.7    -24    28      3.2    2.5     14.9   -2.4
std     2      1.9     4      3       19.4   20.8    8.3    9.5     22     16


Table: Quantitative results (pixels) for 30 sequences and all weighting techniques.
        Uniform        Texture        One-step Dist.   Hist. Based
        Weights        Weighting      Weighting        Weighting
        mse    std     mse    std     mse    std       mse    std
RF      10     9       7      6       17     14        17     14
AF      9      14      8      14      13     14        15     14



Reliability (Table [*]) - Here, the influence of the weight assignment technique on the success rate of the response and action fusion approaches is discussed. As with accuracy, reliability was estimated over 30 test runs, and the percentage of successful runs is presented. Ranking the results shows that the texture based weighting performed most reliably: with response fusion, the target was successfully tracked in 27 test runs (90 %).

Comparing the overall results, the texture based weighting gave both the highest accuracy and the highest reliability. Uniform weighting, although very accurate according to the results in Table [*], performed worst in terms of reliability.

Table: The influence of the weight assignment techniques on the success rate.
        Uniform        Texture        One-step Dist.   Hist. Based
        Weights        Weighting      Weighting        Weighting
RF      76.7 %         90 %           83.3 %           80 %
AF      43.3 %         73.3 %         66.7 %           66.7 %
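
The percentages in the table above can be converted back to numbers of successful runs. The short Python sketch below does this and ranks the weighting techniques. It assumes that every entry refers to the same 30 test runs; only the count of 27 runs for the texture based weighting is stated in the text, so the other counts are derived values. The names `successful_runs` and `best_weighting` are hypothetical.

```python
TOTAL_RUNS = 30  # stated in the text; presumably 10 sequences x 3 window sizes

# Success rates (percent) copied from the table above.
SUCCESS_RATE = {
    "RF": {"Uniform": 76.7, "Texture": 90.0, "One-step": 83.3, "History": 80.0},
    "AF": {"Uniform": 43.3, "Texture": 73.3, "One-step": 66.7, "History": 66.7},
}


def successful_runs(percent, total=TOTAL_RUNS):
    """Convert a rounded percentage back to a whole number of runs."""
    return round(percent * total / 100)


def best_weighting(rates):
    """Return the name of the weighting with the highest success rate."""
    return max(rates, key=rates.get)


for approach, rates in SUCCESS_RATE.items():
    counts = {name: successful_runs(p) for name, p in rates.items()}
    print(approach, counts, "best:", best_weighting(rates))
```

Under this assumption, 90 % corresponds to the 27 successful runs mentioned in the text, and the 43.3 % of action fusion with uniform weights corresponds to 13 of 30 runs.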



Danica Kragic 2002-12-06