From: Bi-linear matrix-variate analyses, integrative hypothesis tests, and case-control studies
Issues | Description |
---|---|
Issue-1 | Estimate the parameters by semi-supervised learning on the training set, which yields the corresponding p-value \(p\) and a classifier. Applying this classifier to the training set and the testing set gives, by Equation (44), \(\varepsilon _{C}^{tr}\) and \(\varepsilon _{C}^{te}\). This is the traditional procedure. |
Issue-2 | Lump the training samples and testing samples together and estimate the parameters by semi-supervised learning on the lumped set, without using the labels of the testing samples. This likewise yields the corresponding \(\tilde {p}\), \(\tilde {\varepsilon }_{C}^{tr}\) and \(\tilde {\varepsilon }_{C}^{te}\). |
Issue-3 | \(\tilde {p}\) is actually more reliable than \(p\) because the testing samples are used to regularise the parameter estimation. This \(\tilde {p}\) also differs from the traditional compounded p-value because the label information of the testing samples has not been compounded. |
Issue-4 | Because the label information of the testing samples is not used, \(\tilde {\varepsilon }_{C}^{te}\) shares the same concept as \(\varepsilon _{C}^{te}\), but is actually more reliable because of the regularisation. |
Issue-5 | Merge the training set and testing set into one larger training set and treat the validating set as a new testing set; this extends the procedure to improve the validation. |
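
The two protocols in Issue-1 and Issue-2 can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the paper's method: the classifier is a hypothetical one-dimensional nearest-centroid rule refined by self-training, the error rate is assumed to be the plain misclassification fraction (Equation (44) is not reproduced here), the data are made up, and the p-values \(p\) and \(\tilde {p}\) are omitted. All function and variable names are invented for this example.

```python
from statistics import mean

def predict(x, centroids):
    """Assign x to the class whose centroid is nearest (ties go to class 0)."""
    c0, c1 = centroids
    return 0 if abs(x - c0) <= abs(x - c1) else 1

def fit_centroids(labeled, unlabeled=(), n_iter=10):
    """Toy semi-supervised fit (an assumption, not the paper's estimator).

    Centroids start from the labeled points and are refined by
    self-training: unlabeled points receive pseudo-labels from the
    current centroids and are pooled with the labeled points.
    True labels of the unlabeled points are never consulted.
    """
    c0 = mean(x for x, y in labeled if y == 0)
    c1 = mean(x for x, y in labeled if y == 1)
    for _ in range(n_iter):
        if not unlabeled:
            break
        pseudo = [(x, predict(x, (c0, c1))) for x in unlabeled]
        pool = list(labeled) + pseudo
        c0 = mean(x for x, y in pool if y == 0)
        c1 = mean(x for x, y in pool if y == 1)
    return c0, c1

def error_rate(samples, centroids):
    """Misclassification fraction, assumed here as the error measure."""
    return sum(predict(x, centroids) != y for x, y in samples) / len(samples)

# Made-up (feature, label) pairs for illustration only.
train = [(0.0, 0), (1.0, 0), (4.0, 1), (5.0, 1)]
test = [(0.5, 0), (2.2, 0), (3.5, 1), (6.0, 1)]

# Issue-1: estimate on the training set alone, then evaluate on both sets.
theta = fit_centroids(train)
eps_tr = error_rate(train, theta)
eps_te = error_rate(test, theta)

# Issue-2: lump training and testing features together; testing labels
# are stripped before fitting and used only afterwards for evaluation.
theta_tilde = fit_centroids(train, unlabeled=[x for x, _ in test])
eps_tr_tilde = error_rate(train, theta_tilde)
eps_te_tilde = error_rate(test, theta_tilde)

print("Issue-1:", theta, eps_tr, eps_te)
print("Issue-2:", theta_tilde, eps_tr_tilde, eps_te_tilde)
```

The point of the sketch is the data flow: in the lumped protocol the testing features shift the estimated parameters, while the testing labels enter only when the error is computed. Extending it along the lines of Issue-5 would mean passing the merged training and testing pairs as `labeled` and evaluating on a separate validating set.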