Context
Fusing or combining multiple information sources is a central problem in the machine learning community. It is especially important in signal processing and pattern recognition applications, where more than one source of information is available to the classifier. The joint PDF is usually obtained by multiplying the PDFs of the individual streams, even though the independence assumption may not hold and the modeling/estimation error can differ from stream to stream. The practical solution to this problem is to use "stream weights" in order to reduce the total classification error.
Computing those weights is not trivial, since they depend on the application and even on the training and test conditions. One approach computes the weights directly from the streams, through their performance; the parameters of a given model can then be re-estimated as a function of the weights. A second approach adapts the system to a given context: here the parameters are related to the reliability of the streams under given environmental conditions, generally based on the SNR. In a similar manner, the environmental conditions can be estimated from the performance of the system with different models.
Some of the algorithms mentioned in the previous paragraph have problems in real applications. For example, some of them use external information that is not present in the input signals, or need an extra database recorded under the same conditions as the test data to train their systems. In almost all of this work the parameters, weights included, are computed in the training phase using a held-out database in a supervised manner. In real conditions, especially when the training data does not reflect the characteristics of the test data, an unsupervised approach could improve system performance. Therefore, the main goal of our work is to compute the optimal stream weights for the multi-stream classification problem in an unsupervised manner.
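As an illustration of how stream weights enter the decision, the following minimal sketch assumes the usual formulation in which each weight acts as an exponent on the corresponding stream likelihood (equivalently, a weighted sum of log-likelihoods) and class priors are equal. The function names and the numeric likelihoods are hypothetical and chosen only for this example.

```python
import math


def combined_log_score(stream_likelihoods, weights):
    """Weighted sum of log-likelihoods, i.e. the log of prod_s p_s ** w_s."""
    return sum(w * math.log(p) for p, w in zip(stream_likelihoods, weights))


def classify(likelihoods_per_class, weights):
    """Return the class label with the largest weighted log-score.

    likelihoods_per_class maps a class label to the list of per-stream
    likelihoods p(x_s | class); equal class priors are assumed.
    """
    return max(
        likelihoods_per_class,
        key=lambda c: combined_log_score(likelihoods_per_class[c], weights),
    )


# Hypothetical likelihoods for two streams (e.g. audio, video) and two classes.
likelihoods = {"a": [0.6, 0.2], "b": [0.4, 0.7]}

print(classify(likelihoods, [1.0, 1.0]))  # plain product of the stream PDFs
print(classify(likelihoods, [0.9, 0.1]))  # first stream trusted much more
```

With equal weights the decision reduces to the plain product of the PDFs; changing the weights can change the decision when the streams disagree.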
Overview
Assuming that the modeling/estimation error for the feature PDFs is a random variable, the deviation of the decision boundary from the optimal Bayes boundary is also a random variable, which we assume to be zero-mean Gaussian. The classification decision is then a function of that random variable. The classification error function cannot be minimized directly, but an approximation is to compute the weights that minimize the variance of the decision boundary deviation. In fact, it can be noted that stream weights can reduce the estimation error only when the PDF estimation errors of the single-stream classifiers differ and/or the Bayes errors of the single-stream classifiers differ. If the two streams are equally informative (equal Bayes classification error), the weight of each stream is inversely proportional to the sum of the variances of the PDF estimation error over the classes of that stream.
Figure 1: Two-dimensional representation of the two-class classification problem. Each axis represents one stream.
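For two streams and two classes, the inverse-proportionality statement above can be written as a ratio of weights. The notation is an assumption introduced here for illustration: $s_i$ denotes the weight of stream $i$, and $\sigma^2_{i|\omega_j}$ the variance of the PDF estimation error of stream $i$ for class $\omega_j$.

```latex
\[
\frac{s_1}{s_2} \;=\;
\frac{\sigma^2_{2|\omega_1} + \sigma^2_{2|\omega_2}}
     {\sigma^2_{1|\omega_1} + \sigma^2_{1|\omega_2}}
\]
```

The stream whose class PDFs are estimated with the larger total error variance therefore receives the smaller weight.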
- Provide initial centroids, from the actual models, for the k-means,
- Perform k-means using only the test data,
- Compute inter- and intra-class distances,
- Estimate final stream weights.
Figure 2: Practical stream weights estimation process.
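The four steps of the estimation process can be sketched as follows for a two-class problem. This is a minimal illustration under stated assumptions, not the published estimator: the function names are hypothetical, the initial centroids stand in for the means of the trained models, and the final rule (weight proportional to the ratio of inter-class to intra-class distance, normalized to sum to one) is one plausible choice made for this example.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd k-means; `centroids` are the initial centres (e.g. model means)."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of each cluster (an empty cluster keeps its centre).
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def separation(points, initial_centroids):
    """Inter-class distance divided by the mean intra-class distance (two classes)."""
    centroids, labels = kmeans(points, initial_centroids)
    intra = sum(math.dist(p, centroids[l]) for p, l in zip(points, labels)) / len(points)
    inter = math.dist(centroids[0], centroids[1])
    return inter / max(intra, 1e-12)  # guard against a zero intra-class distance


def stream_weights(streams, initial_centroids):
    """Assumed rule: weights proportional to each stream's separation, summing to 1."""
    scores = [separation(x, c) for x, c in zip(streams, initial_centroids)]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical one-dimensional test data for two streams.
stream_1 = [[0.0], [0.2], [9.8], [10.0]]   # classes well separated
stream_2 = [[0.0], [4.0], [6.0], [10.0]]   # classes closer together
init = [[1.0], [9.0]]                      # stands in for the model means

weights = stream_weights([stream_1, stream_2], [init, init])
print(weights)
```

Only the test data and the initial centroids are used, so no labels are needed; the stream whose classes separate more cleanly receives the larger weight.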
In this manner, the proposed method uses only the information contained in the trained models (which can be trained on clean data alone) and requires a single utterance to compute the stream weights. The proposed method achieves performance comparable to that of supervised minimum-error estimation of the weights.
Applications
- Audio-Visual Speech Classification
- Audio-Visual Speech Recognition
Projects
Contributors
Eduardo Sánchez Soto
Khalid Daoudi (contact)
Main Publications
- A. Potamianos, E. Sánchez-Soto and K. Daoudi. Stream Weight Computation for Multi-Stream Classifiers. ICASSP' 06, Toulouse (France), May 2006.
- E. Sánchez-Soto, A. Potamianos and K. Daoudi. Unsupervised Stream Weight Computation Using Antimodels. ICASSP' 07, Hawaii (USA), April 2007.
- E. Sánchez-Soto, K. Daoudi and A. Potamianos. Unsupervised Stream Weight Computation in a Segmentation Task: Application to Audio-Visual Speech Recognition. IEEE International Conference on Signal Processing and Communication, ICSPC 2007, Dubai, United Arab Emirates, November 2007.