Further advances on Bayesian Ying-Yang harmony learning

Applied Informatics

Table 2 Related studies: KL-η-HL spectrum

Year	Outcomes
1998	The following convex combination with 0≤η≤1 was heuristically proposed as a criterion for model selection: (1−η)KL(p(Y|X)p(X)∥q(Y|R)q(Y)) − ηH(θ), (A); see, e.g., Eq. (49) in Xu (1998a) and Eq. (22) in Xu (1998b). Equation (A) can be rewritten into a form that is exactly equivalent to H_L(θ)=(1+η)H(θ)+ηE_{Y|X} in Equation 17.
2000	It was further proposed to perform max_θ H_L(θ) with η>0 decreasing monotonically from a large value (i.e., removing the constraint η≤1); see Eq. (23) in Xu (2000a). This was further addressed for learning Gaussian mixtures in Xu (2001a); see, e.g., the paragraphs around its Eq. (42) and Eq. (43).
2003	Equation (A) above was also re-examined from the perspective of the KL-η-HL spectrum; for details, see Eqs. (62–64) in Xu (2003a).

Remarks.
(a) This family was further investigated in 2012 from the perspective of the Yang structure; see Sect. 3.4.2 in Xu (2012a), especially the parts around its Eq. (46) on a family of Yang structures. Each such structure corresponds to an inverse of the Ying machine, in a range from superBayes (η>0) to Bayes (η=0).
(b) What was discussed in Xu (2012a) is actually a range that also includes a subBayes inverse of the Ying machine, arising from η<0; that is, superBayes → Bayes → subBayes.
(c) The symbol η was denoted by λ in the above-mentioned studies.
(d) The concept of superBayes versus subBayes may be understood from Equation 16: the two factors of q(X|Y,θ)q(Y|θ) are mutually linear for Bayes, superlinear for superBayes, and sublinear for subBayes.
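
As a rough numerical illustration of remark (d), the sketch below applies a power to the product q(X|Y,θ)q(Y|θ) over a discrete Y and renormalizes. This is only one possible reading of "linear, superlinear, sublinear": the exponent `gamma`, the function name `tempered_posterior`, and the toy probabilities are all hypothetical, and the actual dependence on η is the one given by Equation 16, which is not reproduced here.

```python
import numpy as np

def tempered_posterior(likelihood, prior, gamma):
    """Return p(y|x) proportional to [q(x|y) q(y)]**gamma over discrete y.

    gamma is a hypothetical exponent used for illustration only;
    it is not the paper's eta.
    """
    joint = np.asarray(likelihood, dtype=float) * np.asarray(prior, dtype=float)
    powered = joint ** gamma
    return powered / powered.sum()

likelihood = [0.6, 0.3, 0.1]  # hypothetical q(x|y) for three values of y
prior = [0.5, 0.3, 0.2]       # hypothetical q(y)

bayes = tempered_posterior(likelihood, prior, gamma=1.0)    # ordinary Bayes inverse
sharper = tempered_posterior(likelihood, prior, gamma=2.0)  # superlinear: more peaked
flatter = tempered_posterior(likelihood, prior, gamma=0.5)  # sublinear: more spread out
```

With `gamma=1.0` the result is the usual Bayes posterior. An exponent above 1 concentrates mass on the most probable value of Y, and an exponent below 1 spreads it out, which is the qualitative contrast the remark draws between superBayes and subBayes.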