Table 1 Recent BYY applications and empirical studies

From: Further advances on Bayesian Ying-Yang harmony learning

Papers Outcomes
Shi et al. (2011a) A comparative investigation of three Bayesian-related approaches, namely variational Bayesian (VB), minimum message length (MML) and BYY harmony learning, on the task of learning a Gaussian mixture model (GMM) with an appropriate number of components determined automatically. Extensive experiments on both simulated GMM data sets and the Berkeley segmentation database of real-world images show that BYY harmony learning considerably outperforms both MML and VB, regardless of whether a Jeffreys prior or a conjugate Dirichlet-Normal-Wishart (DNW) prior is used and whether the hyper-parameters of the DNW prior are further optimised.
Tu and Xu (2011a) A further comparison on factor analysis (FA) with an appropriate number of factors determined automatically. Extensive experiments show not only that BYY and VB outperform AIC, BIC and DNLL, but also that BYY outperforms VB considerably. Moreover, using VB to optimise the hyper-parameters of priors degrades performance, whereas using BYY for this purpose improves it.
Tu and Xu (2011b) Empirical comparisons of factor-selection performance among AIC, BIC, Bozdogan's AIC, the Hannan-Quinn criterion, Minka's (MK) criterion, Kritchman and Nadler's hypothesis tests (KN), Perry and Wolfe's MiniMax rank (MM) and BYY harmony learning, varying the signal-to-noise ratio (SNR) and the training sample size N. AIC and BYY harmony learning, as well as MK, KN and MM, are relatively more robust than the others against decreasing N and SNR, and BYY is superior for small N.
Shi et al. (2014); Tu and Xu (2014) Extension of FA to binary FA with automatic factor selection; again, BYY is empirically shown to outperform VB and BIC. Shi et al. (2014) also extend the studies of Shi et al. (2011a) and the two FA parameterisations of Tu and Xu (2011a) to the mixture of factor analysers (MFA) and local factor analysis (LFA), for the problem of automatically determining the number of components and the number of factors of each FA. On a wide range of synthetic experiments as well as real applications of face recognition, handwritten-digit image clustering and unsupervised image segmentation, BYY reliably outperforms VB on both MFA and LFA.
Chen et al. (2014) Further developments of Shi et al. (2011a) that avoid certain learning instability (see the Remarks at the bottom of this table). An implementation of BYY harmony learning by either a projection-embedded algorithm or the algorithm in Table ?? of this paper needs no priors, yet outperforms not only MML with a Jeffreys prior and VB with a Dirichlet-Normal-Wishart prior, but also BYY with those priors as given in Shi et al. (2011a). On the Berkeley segmentation data set, the semantic image segmentation results show that BYY outperforms not only MML, VB, BYY-Jef and BYY-DNW, but also three leading image segmentation algorithms, namely gPb-owt-ucm, MN-Cut and Mean Shift.
Remarks. For the first three items above, BYY harmony learning is implemented via one of the following two techniques:
(a) Gradient-based local search, which needs a small step size to be pre-specified. If the step size is too small, learning is slow and easily gets stuck at a locally optimal solution; if it is too big, learning becomes unstable.
(b) Ying-Yang nonlocal search, which consists of two expectation-maximisation (EM)-like steps, with no learning step size but a correcting term δ_ℓ in the E step. For a GMM, it follows from Eq. (11) in Xu (2010a) that the E step of the EM algorithm, which allocates x_t to the ℓ-th Gaussian by p(ℓ|x_t, θ^old), is replaced by p(ℓ|x_t, θ^old) + δ_ℓ(θ^old), with an approximation that may cause learning instability; see also Equations 88 and 89 for details.
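The corrected E step in remark (b) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the form δ_ℓ = p(ℓ|x_t, θ)(π_ℓ − Σ_j p(j|x_t, θ) π_j) with π_ℓ = ln[α_ℓ G(x_t|μ_ℓ, Σ_ℓ)], which is one common choice in the BYY literature; the function name `byy_e_step` and the final clipping step are illustrative assumptions.

```python
import numpy as np

def byy_e_step(X, alphas, means, covs):
    """Return the standard EM posterior p(l|x_t) and a BYY-corrected allocation."""
    T, d = X.shape
    K = len(alphas)
    log_pi = np.empty((T, K))  # pi_l = ln[alpha_l * N(x_t | mu_l, Sigma_l)]
    for l in range(K):
        diff = X - means[l]
        inv = np.linalg.inv(covs[l])
        _, logdet = np.linalg.slogdet(covs[l])
        quad = np.einsum('ti,ij,tj->t', diff, inv, diff)
        log_pi[:, l] = np.log(alphas[l]) - 0.5 * (quad + logdet + d * np.log(2 * np.pi))
    # Standard E step: p(l|x_t), computed as a numerically stable softmax.
    p = np.exp(log_pi - log_pi.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    # Assumed correcting term: delta_l = p_l * (pi_l - sum_j p_j * pi_j).
    # The weighted deltas cancel row-wise, so p + delta still sums to 1.
    delta = p * (log_pi - (p * log_pi).sum(axis=1, keepdims=True))
    h = p + delta
    # h can go negative -- the approximation behind the instability noted
    # above -- so clip and renormalise to keep a valid allocation.
    h = np.clip(h, 0.0, None)
    h /= h.sum(axis=1, keepdims=True)
    return p, h
```

Compared with the EM posterior, the corrected allocation sharpens towards the component with the largest π_ℓ, which is what drives the automatic discarding of extra components in BYY harmony learning.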