Table 5 Semi-supervised testing and validating

From: Bi-linear matrix-variate analyses, integrative hypothesis tests, and case-control studies

Issues Description
Issue-1 Estimate the parameters by semi-supervised learning on the training set, which gives the corresponding p-value \(p\) and a classifier. Applying this classifier to the training set and the testing set gives, by Equation (44), \(\varepsilon _{C}^{tr}\) and \(\varepsilon _{C}^{te}\). This is the traditional procedure.
Issue-2 Lump the training samples and testing samples together and estimate the parameters by semi-supervised learning on the lumped set. This also gives the corresponding \(\tilde {p}\), \(\tilde {\varepsilon }_{C}^{tr}\) and \(\tilde {\varepsilon }_{C}^{te}\).
Issue-3 \(\tilde {p}\) is more reliable than \(p\) because the testing samples are used to regularize the parameter estimation. This \(\tilde {p}\) also differs from the traditional compounded p-value because the label information of the testing samples has not been compounded.
Issue-4 Because the label information of the testing samples is not used, \(\tilde {\varepsilon }_{C}^{te}\) has the same meaning as \(\varepsilon _{C}^{te}\), but is more reliable because of the regularization.
Issue-5 Merging the training set and testing set into one larger training set, and treating the validating set as a new testing set, extends this procedure to improve the validation.
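
The procedure in the table can be sketched in a few lines of Python. This is only an illustration under several assumptions that are not in the source: the semi-supervised learner is replaced by a simple self-training nearest-centroid classifier, the \(\varepsilon\) quantities are taken to be misclassification rates (Equation (44) is not reproduced here), the data are synthetic, and the p-values \(p\) and \(\tilde {p}\) are omitted because they depend on the test defined in the article. All function and variable names are hypothetical.

```python
import numpy as np


def predict(centroids, X):
    """Assign each row of X to the nearest class centroid (class 0 or 1)."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def fit_centroids(X_lab, y_lab, X_unlab=None, n_iter=10):
    """Self-training nearest-centroid fit (a stand-in for the article's learner).

    Labelled samples keep their labels; unlabelled samples receive pseudo-labels
    that are re-estimated on every iteration, so they only regularize the
    parameter estimates and their true labels are never used.
    """
    centroids = np.stack([X_lab[y_lab == k].mean(axis=0) for k in (0, 1)])
    if X_unlab is None or len(X_unlab) == 0:
        return centroids
    X_all = np.vstack([X_lab, X_unlab])
    for _ in range(n_iter):
        pseudo = predict(centroids, X_unlab)
        y_all = np.concatenate([y_lab, pseudo])
        centroids = np.stack([X_all[y_all == k].mean(axis=0) for k in (0, 1)])
    return centroids


def error_rate(centroids, X, y):
    """Fraction of misclassified samples (assumed meaning of epsilon_C)."""
    return float(np.mean(predict(centroids, X) != y))


rng = np.random.default_rng(0)


def sample(n):
    """Synthetic two-class data: class 1 is shifted by 1.5 in each coordinate."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + 1.5 * y[:, None]
    return X, y


X_tr, y_tr = sample(40)   # training set
X_te, y_te = sample(40)   # testing set
X_va, y_va = sample(40)   # validating set

# Issue-1: estimate on the training set only, then evaluate on both sets.
c1 = fit_centroids(X_tr, y_tr)
eps_tr, eps_te = error_rate(c1, X_tr, y_tr), error_rate(c1, X_te, y_te)

# Issue-2: lump training and testing samples; testing labels stay hidden.
c2 = fit_centroids(X_tr, y_tr, X_unlab=X_te)
eps_tr_tilde = error_rate(c2, X_tr, y_tr)
eps_te_tilde = error_rate(c2, X_te, y_te)

# Issue-5: merge training and testing sets (with labels) into a larger
# training set and treat the validating set as the new, unlabelled testing set.
c3 = fit_centroids(np.vstack([X_tr, X_te]), np.concatenate([y_tr, y_te]),
                   X_unlab=X_va)
eps_va_tilde = error_rate(c3, X_va, y_va)

print("Issue-1:", eps_tr, eps_te)
print("Issue-2:", eps_tr_tilde, eps_te_tilde)
print("Issue-5:", eps_va_tilde)
```

In this sketch the testing labels `y_te` enter the Issue-2 computation only when the error is scored, never during fitting, which mirrors the point made in Issue-3 and Issue-4. Whether the lumped estimate is actually more reliable on a given data set is an empirical question that this toy example does not settle.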