On the translation-invariance of image distance metric
Applied Informatics volume 2, Article number: 11 (2015)
Abstract
An appropriate choice of the distance metric is a fundamental problem in pattern recognition, machine learning and cluster analysis. Methods that are based on the distance between samples, e.g., the k-means clustering algorithm and the k-nearest neighbor classifier, rely crucially on the performance of the distance metric. In this paper, the property of translation invariance for the distance metric of images is especially emphasized. The consideration is twofold. Firstly, some of the commonly used distance metrics, such as the Euclidean and Minkowski distance, are independent of the training set and/or the domain-specific knowledge. Secondly, translation invariance is a necessary property for any intuitively reasonable image metric. The image Euclidean distance (IMED) and generalized Euclidean distance (GED) are image metrics that take the spatial relationship between pixels into consideration. Sun et al. (IEEE Conference on Computer Vision and Pattern Recognition, pp 1398–1405, 2009) showed that IMED is equivalent to a translation-invariant transform and proposed a metric learning algorithm based on the equivalency. In this paper, we provide a complete treatment of this topic and extend the equivalency to the discrete frequency domain. Based on the connection, we show that GED and IMED can be implemented as low-pass filters, which reduces the space and time complexities significantly. The transform domain metric learning proposed in (Sun et al. 2009) is also shown to resemble a translation-invariant counterpart of LDA. Experimental results demonstrate improvements in algorithm efficiency and performance boosts on small sample size problems.
Background
The distance measure of images plays a central role in computer vision and pattern recognition. It can be either learned from a training set or specified according to a priori domain-specific knowledge. The problem of metric learning has gained considerable interest in recent years (Hastie and Tibshirani 1996; Xing et al. 2003; Hertz and Pavel 2002; Bar-Hillel et al. 2003; Goldberger et al. 2005; Shalev-Shwartz et al. 2004; Chopra et al. 2005; Globerson et al. 2006; Weinberger et al. 2005; Lebanon 2006; Davis et al. 2007; Li et al. 2007). On the other hand, the standard Euclidean distance assumes that pixels are spatially independent, which yields counterintuitive results, e.g., a perceptually large distortion can produce a smaller distance (Jean 1990; Wang et al. 2005). By incorporating the spatial correlation of pixels, two classes of image metrics, namely IMED (Wang et al. 2005) and GED (Jean 1990), were designed to deal with the spatial dependencies for image distances, and have demonstrated consistent performance improvements in many real-world problems (Jean 1990; Wang et al. 2005; Chen et al. 2006; Wang et al. 2006; Zhu et al. 2007).
A key advantage of GED and IMED is that they can be embedded in any classification technique. The calculation of IMED is equivalent to performing a linear transform, called the standardizing transform (ST), followed by the traditional Euclidean distance. Hence, feeding the ST-transformed images to a recognition algorithm automatically embeds IMED (Wang et al. 2005). The analogous transform for GED is referred to as the generalized Euclidean transform (GET) (Jean 1990).
IMED and GED are invariant to image translation, namely, if the same image translation is applied to two images, their IMED remains unchanged. However, the associated transforms (ST and GET) are not translation invariant (TI). This raises the question of whether IMED can be implemented by a TI transform. In (Sun et al. 2009), the authors gave a positive answer to this question and provided a proof for simple cases, yet a few technical problems were left unresolved.
We should emphasize the importance of translation invariance. Intuitively, as the distance between images should depend only on their relative position, translation invariance (TI) should be a fundamental requirement for any reasonable image metric. Yet few metric learning or linear subspace methods take the TI property into account when dealing with images.
In this paper, we extend the theory in (Sun et al. 2009) to the discrete frequency domain to cover the practical cases. Based on the metric-transform connection, we show that both GED and IMED are essentially low-pass filters. The resulting filters lead to fast implementations of GED and IMED, coinciding with the algorithm proposed in (Sun et al. 2008), which reduce the space and time complexities significantly. The transform domain metric learning (TDML) proposed in (Sun et al. 2009) is also shown to resemble a translation-invariant counterpart of LDA. Experimental results demonstrate significant improvements in algorithm efficiency and performance boosts on small sample size problems.
IMED and GED
Given an image X of size \(n_1 \times n_2\), the vectorization of X is the vector \({{\mathbf {x}}}= \mathrm{vec} \left( X \right) \), such that the \(\left( n_2 i_1 + i_2 \right) \)th component of \({{\mathbf {x}}}\) is the intensity at the \(\left( i_1, i_2 \right) \) pixel. This is a common technique to manipulate image data.
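The assumption made in the standard Euclidean distance that the image pixels are spatially independent sometimes leads to counterintuitive results (Jean 1990; Wang et al. 2005). To solve the problem, Wang et al. (2005) proposed the image Euclidean distance (IMED), defined for vectorized images \({\mathbf {x}}\) and \({\mathbf {y}}\) as
$$\begin{aligned} d^2_{\mathrm{IMED}} ({\mathbf {x}}, {\mathbf {y}}) = ({\mathbf {x}} - {\mathbf {y}})^T G ({\mathbf {x}} - {\mathbf {y}}) = \sum _{i, j} g_{i j} (x_i - y_i) (x_j - y_j). \end{aligned}$$
The entries \(g_{i j}\) of the metric matrix G are defined by the Gaussian function (Wang et al. 2005), i.e.,
$$\begin{aligned} g_{i j} = \frac{1}{2 \pi \sigma ^2} \exp \left( - \frac{\left| P_i - P_j \right| ^2}{2 \sigma ^2} \right) , \end{aligned}$$
where \(P_i = \left( i_1, i_2 \right) , P_j = \left( j_1, j_2 \right) \) are the pixel positions and \(\sigma \) is a width parameter. The \(n_1 n_2 \times n_1 n_2\) metric matrix G solely defines the IMED, where the element \(g_{ij}\) represents how the component \(x_i\) affects the component \(x_j\).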
As suggested in (Wang et al. 2005), the calculation of IMED can be simplified by decomposing G into \(A^T A\). The standardizing transform (ST) is the special case when \(A^T = A\), written as \(A = G^{\frac{1}{2}}\). By incorporating the standardizing transform matrix \(G^{\frac{1}{2}}\), IMED can be easily embedded into almost any recognition algorithm. That is, feeding the ST-transformed image \(G^{\frac{1}{2}} {\mathbf {x}}\) to a recognition algorithm automatically embeds IMED. Besides, Wang et al. (2005) observed that ST seems to have a smoothing effect by illustrating a few eigenvectors associated with the largest eigenvalues of \(G^{\frac{1}{2}}\), and then argued that since IMED is equivalent to a transform domain smoothing, it can tolerate small deformations and noise and hence improve recognition performance.
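The embedding property can be checked numerically on a small image. The following is a minimal sketch rather than the authors' implementation: it assumes the Gaussian entries \(g_{ij} = \frac{1}{2\pi\sigma^2} e^{-|P_i - P_j|^2/(2\sigma^2)}\) with \(\sigma = 1\), row-major vectorization, and an eigendecomposition-based square root. The helper names `imed_metric_matrix` and `sym_sqrt` are hypothetical.

```python
import numpy as np

def imed_metric_matrix(n1, n2, sigma=1.0):
    """Gaussian IMED metric matrix G for n1 x n2 images (row-major vectorization)."""
    i1, i2 = np.meshgrid(np.arange(n1), np.arange(n2), indexing="ij")
    P = np.stack([i1.ravel(), i2.ravel()], axis=1).astype(float)   # pixel positions P_i
    sq_dist = ((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=-1)  # |P_i - P_j|^2
    return np.exp(-sq_dist / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)

def sym_sqrt(G):
    """Symmetric square root G^(1/2) via eigendecomposition (assumed numerical route)."""
    w, V = np.linalg.eigh(G)
    w = np.clip(w, 0.0, None)          # guard against tiny negative round-off
    return (V * np.sqrt(w)) @ V.T

rng = np.random.default_rng(0)
X, Y = rng.random((6, 5)), rng.random((6, 5))

G = imed_metric_matrix(6, 5)
A = sym_sqrt(G)                        # standardizing transform matrix
diff = (X - Y).ravel()

d2_imed = diff @ G @ diff              # IMED computed from the metric matrix
d2_st = np.sum((A @ diff) ** 2)        # squared Euclidean distance after ST
print(d2_imed, d2_st)
```

The two printed values agree up to round-off, which is what allows any Euclidean-distance-based algorithm to use IMED simply by being fed ST-transformed inputs.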
Another image metric, called the generalized Euclidean distance (GED) (Jean 1990), is essentially the same as IMED, except for the coefficients that relate \(P_i\) and \(P_j\). Specifically, the generating function for GED is the probability density function of the Laplace distribution
where \(\alpha \) is a scale parameter.
As pointed out in (Wang et al. 2005), translation invariance (TI) is a necessary property for any intuitively reasonable image metric. Formally, for images X and Y, a distance measure \(d \left( \cdot , \cdot \right) \) is translation invariant if and only if
$$\begin{aligned} d \left( X_{\tau }, Y_{\tau } \right) = d \left( X, Y \right) \end{aligned}$$
for every translation \(\tau \), where \(X_{\tau }\) and \(Y_{\tau }\) denote the images X and Y translated by \(\tau \), respectively.
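Both IMED and GED depend only on the relative position between pixels \(P_i\) and \(P_j\), i.e., there exists a discrete function \(g[\cdot ,\cdot ]\) such that
$$\begin{aligned} g_{i j} = g [i_1 - j_1, i_2 - j_2], \end{aligned}$$
where \(P_i = \left( i_1, i_2 \right) \) and \(P_j = \left( j_1, j_2 \right) \).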
This makes \(g_{ij}\) invariant to image translation. However, the associated transforms (ST and GET) are not translation-invariant transforms. This raises the question of whether IMED and GED can be decomposed into translation-invariant transforms. That is, for any IMED or GED metric matrix G, does there exist a translation-invariant transform H such that \(G = H^T H\)?
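The distinction between a translation-invariant metric and a translation-invariant transform can be illustrated in one dimension. The sketch below assumes a 1-D Gaussian metric filter with \(\sigma = 1\) and signals whose non-zero content stays inside the frame after the shift; the signal values and the shift `tau` are illustrative only.

```python
import numpy as np

n = 12
k = np.arange(n)
g0 = np.exp(-k**2 / 2.0) / np.sqrt(2.0 * np.pi)   # 1-D Gaussian metric filter g[i], i >= 0
G = g0[np.abs(k[:, None] - k[None, :])]           # Toeplitz metric: G[i, j] = g[i - j]

# Two signals whose non-zero content stays inside the frame after shifting by tau.
x = np.zeros(n); x[3:6] = [1.0, 2.0, 1.0]
y = np.zeros(n); y[4:7] = [0.5, 1.5, 0.5]
tau = 3
x_tau, y_tau = np.roll(x, tau), np.roll(y, tau)   # no wrap-around for this content

def dist2(a, b):
    """Squared distance (a - b)^T G (a - b)."""
    d = a - b
    return d @ G @ d

# The metric is translation invariant ...
print(dist2(x, y), dist2(x_tau, y_tau))

# ... but the standardizing transform G^(1/2) is not Toeplitz: its diagonal
# entries near the boundary differ from those in the interior.
w, V = np.linalg.eigh(G)
S = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T
corner_vs_centre = abs(S[0, 0] - S[n // 2, n // 2])
print(corner_vs_centre)
```

The distance is unchanged by the shift because G has constant diagonals, whereas the non-constant diagonal of \(G^{\frac{1}{2}}\) shows that ST treats boundary and interior pixels differently.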
The translation invariant transform of a translation invariant metric
In (Sun et al. 2009), the authors give a positive answer to the problem whether a translation invariant metric can be implemented by a translation invariant transform.
Theorem 1
Given a translation-invariant metric matrix G of size \(n\times n\), and thus a finite sequence \(g[i-j] = G(i,j)\) supported on \([-n,n]\), suppose that \(\hat{g}(\omega ) \geqslant 0\), where \(\hat{g}\) is the discrete-time Fourier transform of g[i]. Then there exists a translation-invariant transform matrix H such that
Specifically, define the filter h[i]
which satisfies that
If h[i] is supported on \([-m, m]\), it can be equivalently written as
where H is the \((n + 2 m) \times n\) LTI matrix of h[i] defined by
Each diagonal of H is constant, thus H is a Toeplitz matrix (Gray 2006), also called a diagonal-constant matrix.
A strict requirement of Theorem 1 is \(\hat{g}(\omega ) \geqslant 0\). The condition is satisfied when \(G \geqslant 0\) is an infinite-sized matrix, as a consequence of the positive operator theorem (Rudin 1991) or the generalized Bochner’s theorem on groups (Rudin 1990). In practice, G is a positive-definite matrix of finite size \(n \times n\). Gray (2006) proved that as n approaches infinity, \(\hat{g}(\omega )\) converges to a nonnegative value.
Unlike the case of ST for IMED (Wang et al. 2005) and GET for GED (Jean 1990), the constructed translation-invariant transform matrix H is not a square matrix. Specifically, H is of size \((n+2m) \times n\), where \([-m,m)\) is the support of the sequence g[i].
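To make the shape and structure of H concrete, the following minimal sketch builds the \((n+2m) \times n\) Toeplitz matrix of a real filter with \(2m+1\) taps and checks that \(H^T H\) is the Toeplitz matrix of the autocorrelation of h. The filter values and the helper name `lti_matrix` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def lti_matrix(h, n):
    """(n + 2m) x n Toeplitz matrix of a filter with 2m + 1 taps.

    Multiplying by it performs the full linear convolution: H @ x == np.convolve(h, x).
    """
    taps = len(h)                       # taps = 2m + 1
    H = np.zeros((n + taps - 1, n))
    for col in range(n):
        H[col:col + taps, col] = h      # each column is a shifted copy of h
    return H

h = np.array([0.1, 0.25, 0.5, 0.25, 0.1])   # illustrative symmetric filter, m = 2
n = 8
H = lti_matrix(h, n)
G = H.T @ H                                  # induced metric matrix

x = np.arange(1.0, n + 1)

# Expected metric: G[i, j] is the autocorrelation of h at lag i - j.
g = np.correlate(h, h, mode="full")          # lags -2m .. 2m
m2 = len(h) - 1                              # 2m
lag = np.arange(n)[:, None] - np.arange(n)[None, :]
G_expected = np.where(np.abs(lag) <= m2, g[np.clip(lag + m2, 0, 2 * m2)], 0.0)

print(H.shape, np.allclose(G, G_expected))
```

Because H keeps all \(n + 2m\) output samples of the linear convolution, no boundary terms are lost and \(H^T H\) has exactly constant diagonals.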
Methods
Computational aspects
Unfortunately, Theorem 1 is presented in the continuous frequency domain only (Sun et al. 2009), which makes it hard to apply directly in practical problems, because \(\hat{g}(\omega )\) is a continuous function that has to be discretized. A naive extension of Theorem 1 can be constructed by using the circular convolution (Oppenheim et al. 1999) instead of the regular convolution.
Proposition 2
If \(H_n\) is a circulant matrix, then the \(n \times n\) metric matrix (which is also circulant) defined by \(G_n = H_n^T H_n\) can be determined by
where g[i] is the autocorrelation function of h[i], i.e.,
with \(h^{*} [i] = \overline{h [-i]}\), where \({\circledast }_n\) denotes the n-point circular convolution, or equivalently in frequency domain,
The above extension has problems. The first problem is that, for the same filter h[i], the induced metric filters \(g = h * h\) and \(\tilde{g} = h {\circledast }_n h\) are different, i.e.,
$$\begin{aligned} h * h \ne h {\circledast }_n h, \end{aligned}$$
because linear convolution and circular convolution are not equal in general.
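A small numerical example makes the mismatch explicit. The sketch below uses an arbitrary 4-tap sequence (an illustrative choice) and compares its linear self-convolution with the 4-point circular one computed through the FFT.

```python
import numpy as np

h = np.array([1.0, 2.0, 3.0, 4.0])      # arbitrary illustrative filter
n = len(h)

g_linear = np.convolve(h, h)                           # linear convolution, length 2n - 1
g_circular = np.fft.ifft(np.fft.fft(h, n) ** 2).real   # n-point circular convolution

# The circular result is the linear one wrapped (aliased) modulo n.
g_wrapped = g_linear[:n].copy()
g_wrapped[: n - 1] += g_linear[n:]

print(g_linear)
print(g_circular)
```

The circular result equals the linear result folded back modulo n, so the two agree only if the linear convolution fits within one period.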
The second problem is even worse: to derive a translation-invariant transform in the discrete frequency domain, the matrix representation of the metric \({\mathbf {G}}\) must be a circulant matrix, which is not true in common cases, including both IMED and GED.
We adopt the following approach to overcome these problems: padding the finitely supported sequences to periodic sequences. Given h[i] supported on \([-m,m)\) and x[i] supported on [0, n), define \(\tilde{h}[i]\) and \(\tilde{x}[i]\) of period \(n+2m\) by
and
By the circular convolution theorem (Oppenheim et al. 1999), the two types of convolution coincide:
In other words, the linear convolution of h and x on its support is one period of the circular convolution of their periodic extensions \(\tilde{h}\) and \(\tilde{x}\).
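The effect of the padding can be verified directly. This sketch assumes a filter with \(2m+1\) taps (lags \(-m, \ldots , m\)) stored starting at index 0, which only shifts the output by m samples, and random illustrative values for h and x.

```python
import numpy as np

m, n = 2, 8
rng = np.random.default_rng(0)
h = rng.standard_normal(2 * m + 1)   # filter taps for lags -m .. m (illustrative values)
x = rng.standard_normal(n)           # signal supported on [0, n)

N = n + 2 * m                        # padded period
# np.fft.fft(a, N) zero-pads a to length N before transforming.
circ = np.fft.ifft(np.fft.fft(h, N) * np.fft.fft(x, N)).real   # N-point circular convolution
lin = np.convolve(h, x)                                        # linear convolution, length n + 2m

print(len(lin) == N, np.allclose(circ, lin))
```

With period \(n + 2m\) the wrapped part of the circular convolution falls entirely on zeros, so no aliasing occurs.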
Now consider the two versions of metric filter: \(g[i] = h *h^{*} [i]\) and \(\tilde{g} [i] = \tilde{h} {\circledast }_{n+2m} \tilde{h}^{*} [i]\). Because
hence \(g \left[ i \right] = \tilde{g} \left[ i \right] \) if and only if
On the other hand, by definition the metric filter is conjugate symmetric, i.e.,
$$\begin{aligned} g [-i] = \overline{g [i]}, \end{aligned}$$
so it can be asserted that \(g \left[ i \right] = \tilde{g} \left[ i \right] \) when \(i \in \left( -n, n \right) \).
The above statements assert that, given a finitely supported translation-invariant transform h[i], the induced metric \(\tilde{g}[i]\) constructed from the padded periodic filter \(\tilde{h}[i]\) is also translation invariant.
Hence, the analogous version of Theorem 1 can be given as follows.
Theorem 3
Given the metric filter g[i] supported on \([-m, m)\), there exists a circular filter \(\tilde{h} [i]\) such that g[i] is equal to \(\tilde{h} {\circledast }_{n+2m} \tilde{h} [i]\) on its support.
Proof
Define the period\((n + 2 m)\) sequence \(\tilde{g}\) by
Let \(\tilde{h} [i] =\mathcal {F}^{-1} \left( \sqrt{\widehat{\tilde{g}} \left[ i \right] } \right) \) and the proof is complete. \(\square \)
It is beneficial to derive the matrix representation of Theorem 3. Given the \(n \times n\) metric matrix \(G_n\), by Theorem 1, it determines a filter h[i] supported on \([-m,m)\), and hence the \((n+2m) \times n\) translation-invariant matrix \(H_{m,n}\); by Theorem 3, it determines a filter \(\tilde{h} [i]\) of period \(n+2m\), and hence the \((n+2m) \times (n+2m)\) circulant matrix \(\tilde{H}_{m,n}\). Writing
$$\begin{aligned} \tilde{G}_{n + 2 m} = \tilde{H}_{m,n}^T \tilde{H}_{m,n}, \end{aligned}$$
it can be checked that \(G_n\) is the upper-left \(n \times n\) block of \(\tilde{G}_{n + 2 m}\).
The results in the discrete frequency domain can be easily extended to multidimensional signal spaces, in the same way as in the continuous frequency domain (Sun et al. 2009). A convenient property of the extension is that multidimensional data (e.g., 2-D images) can be processed without vectorization.
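The matrix form of Theorem 3 can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a symmetric metric filter \(g[i] = r^{|i|}\) with \(r = 0.6\) on \([-m, m]\), with \(n = 30\) and \(m = 15\), builds the periodic filter through the DFT, and compares the upper-left block of the circulant product with the Toeplitz metric. The helper `circulant` is hypothetical.

```python
import numpy as np

def circulant(c):
    """Circulant matrix C with C[i, j] = c[(i - j) mod N]."""
    N = len(c)
    idx = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
    return c[idx]

n, m, r = 30, 15, 0.6
N = n + 2 * m                                   # period of the padded sequences
lags = np.arange(-m, m + 1)
g_tilde = np.zeros(N)
g_tilde[lags % N] = r ** np.abs(lags)           # metric filter on [-m, m], zero elsewhere

g_hat = np.fft.fft(g_tilde).real                # real because g_tilde is symmetric
h_tilde = np.fft.ifft(np.sqrt(np.clip(g_hat, 0.0, None))).real   # F^{-1}(sqrt(g_hat))

H_tilde = circulant(h_tilde)
G_tilde = H_tilde.T @ H_tilde                   # circulant metric of size N x N

i = np.arange(n)
lag = i[:, None] - i[None, :]
G_n = np.where(np.abs(lag) <= m, r ** np.abs(lag), 0.0)   # Toeplitz metric truncated at |lag| = m

print(g_hat.min() > 0, np.allclose(G_tilde[:n, :n], G_n))
```

Here the Toeplitz metric uses the same truncated filter as the periodic sequence, so the block identity holds up to round-off rather than only approximately.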
The translationinvariant transforms of IMED and GED
To demonstrate that the proposed method can be applied to multidimensional cases directly, we write the metric matrices of IMED and GED in tensor form.

IMED The metric tensor \(\mathbbm {g}\) for IMED is defined in (Wang et al. 2005) by a Gaussian, i.e.,
$$\begin{aligned} \mathbbm {g}_{j_1 j_2}^{i_1 i_2} = \frac{1}{2 \pi } e^{- \frac{d^2}{2}}, \end{aligned}$$
where
$$\begin{aligned} d = \sqrt{(i_1 - j_1)^2 + (i_2 - j_2)^2}. \end{aligned}$$
The metric filter for IMED is separable, i.e.,
$$\begin{aligned} g[i_1, i_2] = \frac{1}{2 \pi } e^{- \frac{i_1^2 + i_2^2}{2}} = \frac{1}{\sqrt{2 \pi }} e^{- \frac{i_1^2}{2}} \cdot \frac{1}{\sqrt{2 \pi }} e^{- \frac{i_2^2}{2}} = g_0 [i_1] g_0 [i_2]. \end{aligned}$$
We choose the support length \(m_1 = m_2 = 4\) (\(g [4, 4] \approx 1.7911 \times 10^{-8}\)), i.e., \(g [i_1, i_2]\) is supported on \([-4, 4] \times [-4, 4]\). For \(52 \times 52\) signals (\(n_1 = n_2 = 52\)), we build the period \(n_1 + 2 m_1 = 60\) sequence
$$\begin{aligned} \widetilde{g_0}[i]={\left\{ \begin{array}{ll} g_0 [i] = \frac{1}{\sqrt{2 \pi }} e^{-\frac{i^2}{2}}, &{} i \in [-4,4] \\ 0, &{} i \in (4,56). \\ \end{array}\right. } \end{aligned}$$
It is easy to validate that \(\widehat{\widetilde{g_0}} [j] \geqslant 0, \forall j\). Thus the separated periodic filter \(\widetilde{h_0} [i]\) can be constructed by
$$\begin{aligned} \widetilde{h_0} [i] =\mathcal {F}^{-1} \left( \sqrt{\widehat{\widetilde{g_0}} [j]} \right) , \end{aligned}$$
and the overall filter is \(\tilde{h} [i_1, i_2] = \widetilde{h_0} [i_1] \widetilde{h_0} [i_2]\).

GED The metric tensor \(\mathbbm {g}\) for GED is defined in (Jean 1990) by a Laplacian, i.e.,
$$\begin{aligned} \mathbbm {g}_{j_1 j_2}^{i_1 i_2} = r^d = e^{d \log r}, \end{aligned}$$
where \(d = | i_1 - j_1 | + | i_2 - j_2 |\) is the \(l_1\) distance of the two pixels and \(r = 0.6\) is a decay constant. The metric filter for GED is separable, i.e.,
$$\begin{aligned} g [i_1, i_2] = r^{| i_1 | + | i_2 |} = r^{| i_1 |} \cdot r^{| i_2 |} = g_0 [i_1] g_0 [i_2]. \end{aligned}$$
We choose the support length \(m_1 = m_2 = 15\) (\(g [15, 15] \approx 2.2107 \times 10^{-7}\)), i.e., \(g [i_1, i_2]\) is supported on \([-15, 15] \times [-15, 15]\). For \(30 \times 30\) signals (\(n_1 = n_2 = 30\)), we build the period \(n_1 + 2 m_1 = 60\) sequence
$$\begin{aligned} \widetilde{g_0}[i]={\left\{ \begin{array}{ll} g_0[i] = r^{\vert i \vert }, &{} i \in [-15,15] \\ 0, &{} i \in (15,45). \\ \end{array}\right. } \end{aligned}$$
We can validate that \(\widehat{\widetilde{g_0}} [j] \geqslant 0, \forall j\). Thus the separated periodic filter \(\widetilde{h_0} [i]\) can be constructed by
$$\begin{aligned} \widetilde{h_0} [i] =\mathcal {F}^{-1} \left( \sqrt{\widehat{\widetilde{g_0}} [j]} \right) , \end{aligned}$$
and the overall filter is \(\tilde{h} [i_1, i_2] = \widetilde{h_0} [i_1] \widetilde{h_0} [i_2]\).
The translation-invariant transforms of IMED and GED in the space and frequency domains are drawn in Fig. 1. It clearly shows that applying GED or IMED is equivalent to a low-pass filtering process, which is robust to small perturbations of images.
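The IMED construction with the stated parameters (\(n_1 = n_2 = 52\), \(m_1 = m_2 = 4\), period 60) can be sketched as follows. This is an illustrative check, not the authors' code: the function name `ti_transform` is hypothetical, the images are random, and the reference IMED is computed with the same Gaussian truncated to \([-4, 4]\) so that the comparison is exact up to round-off.

```python
import numpy as np

n, m = 52, 4
N = n + 2 * m                                    # period 60
lags = np.arange(-m, m + 1)
g0 = np.zeros(N)
g0[lags % N] = np.exp(-lags**2 / 2.0) / np.sqrt(2.0 * np.pi)   # periodic Gaussian filter

g0_hat = np.fft.fft(g0).real                     # real because g0 is symmetric
h0_hat = np.sqrt(np.clip(g0_hat, 0.0, None))     # spectrum of the separable filter h0

def ti_transform(X):
    """Zero-pad X to N x N and filter rows and columns with h0 via the 2-D FFT."""
    X_hat = np.fft.fft2(X, s=(N, N))
    return np.fft.ifft2(X_hat * np.outer(h0_hat, h0_hat)).real

rng = np.random.default_rng(0)
X, Y = rng.random((n, n)), rng.random((n, n))

# Reference IMED with the same truncated Gaussian: d^2 = sum(D * (G1 D G1^T)).
i = np.arange(n)
lag = i[:, None] - i[None, :]
G1 = np.where(np.abs(lag) <= m, np.exp(-lag**2 / 2.0) / np.sqrt(2.0 * np.pi), 0.0)
D = X - Y
d2_imed = np.sum(D * (G1 @ D @ G1.T))
d2_filtered = np.sum((ti_transform(X) - ti_transform(Y)) ** 2)

print(d2_imed, d2_filtered)
print(h0_hat[0], h0_hat[N // 2])                 # response at DC versus the highest frequency
```

The filter response is largest at zero frequency and smallest at the highest frequency, which is the low-pass behaviour described above.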
The fast implementation of IMED and GED
The advantages of the filtering decomposition over the GET or ST lie not only in the physical interpretation but also in the time and space complexity. In general, the computational complexity of the filtering decomposition is \(O (n \log n)\), owing to the efficiency of the FFT (Oppenheim et al. 1999).
In the case of IMED and GED, since the corresponding filters decay rapidly (Fig. 1), e.g., \(g[4] = \frac{1}{\sqrt{2 \pi }} e^{- \frac{4^2}{2}} \approx 1.34 \times 10^{-4}\) (IMED), the vector \({\mathbf {g}} = (g[0], \ldots , g [m], 0, \ldots , 0, g[m], \ldots , g[1])^T\) can be set to length n. Therefore \(G \approx \tilde{G}\) and the transform can be applied to the original X rather than the zero-padded image \(\tilde{X}\). Finally, the periodic filter \(\tilde{g}\) can be built using only a few significant values. The templates of IMED (\(\sigma =1\)) and GED (\(\alpha =2\)) are
and
respectively.
Since the filter is of fixed size, the fast implementation further reduces the space complexity from \(O(n^2)\) to O(1), and the time complexity from \(O(n^2)\) to O(n).
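A fixed-size template implementation might look like the sketch below. It is only an approximation under stated assumptions: the 5-tap template is obtained here by truncating \(\mathcal {F}^{-1}(\sqrt{\hat{g}_0})\) for the \(\sigma = 1\) Gaussian, the period `N = 64` is an arbitrary choice, and the function names are hypothetical. The template values are computed, not copied from the paper.

```python
import numpy as np

def imed_template(m=4, N=64, taps=5):
    """Central taps of h0 = F^{-1}(sqrt(F(g0))) for the sigma = 1 Gaussian g0 on [-m, m]."""
    lags = np.arange(-m, m + 1)
    g0 = np.zeros(N)
    g0[lags % N] = np.exp(-lags**2 / 2.0) / np.sqrt(2.0 * np.pi)
    h0 = np.fft.ifft(np.sqrt(np.clip(np.fft.fft(g0).real, 0.0, None))).real
    half = taps // 2
    return h0[np.arange(-half, half + 1) % N]

def fast_transform(X, h):
    """Separable filtering: template along each column, then along each row."""
    cols = np.apply_along_axis(np.convolve, 0, X, h)      # full convolution per column
    return np.apply_along_axis(np.convolve, 1, cols, h)   # full convolution per row

rng = np.random.default_rng(0)
X, Y = rng.random((32, 32)), rng.random((32, 32))
h = imed_template()

d2_fast = np.sum((fast_transform(X, h) - fast_transform(Y, h)) ** 2)

# Reference IMED with the Gaussian truncated to |lag| <= 4.
i = np.arange(32)
lag = i[:, None] - i[None, :]
G1 = np.where(np.abs(lag) <= 4, np.exp(-lag**2 / 2.0) / np.sqrt(2.0 * np.pi), 0.0)
D = X - Y
d2_ref = np.sum(D * (G1 @ D @ G1.T))
rel_err = abs(d2_fast - d2_ref) / d2_ref
print(rel_err)
```

The work per pixel depends only on the template length, not on the image size, and the only stored quantity is the template itself. The price is a small approximation error from discarding the filter tails.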
Transform domain metric learning
Generally, in order to learn a metric G, one can optimize with respect to G directly. For images of size \(n_1 \times n_2\), G has \(n_1^2 \times n_2^2\) elements, making the optimization intractable. Another problem is that G must satisfy the positive semidefinite constraint, i.e., \(G \geqslant 0\), and it is not easy to find an efficient algorithm for problems with such a constraint.
Theorem 1 can be equivalently written
Equation (2) greatly simplifies the optimization problem of metric learning. Under the translation-invariant assumption on G, the positive semidefinite constraint \(G \geqslant 0\) is reduced to a bound constraint \(\hat{g} (\varvec{\omega } ) \geqslant 0\). Furthermore, the number of parameters is the number of samples of \(\hat{g}\), which is usually chosen to be the same as the size of the input data. An additional benefit of the translation-invariant approach is that it applies to any dimensionality without modification, so it is unnecessary to stack multidimensional data into vectors.
Suppose we have some data \(\left\{ x_i \right\} \) and are given the data labels \(\left\{ y_i \right\} \). Letting \(f_i\) be the Fourier transform of \(x_i\), we compute the total “similar” and “dissimilar” power spectra:
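For a circulant metric, the reduction of the matrix constraint to a pointwise constraint on \(\hat{g}\) follows from the fact that the DFT diagonalizes circulant matrices. The sketch below illustrates this in 1-D with an arbitrary symmetric filter (an illustrative choice): the eigenvalues of G are the DFT samples of g, and the quadratic form is a weighted sum of the power spectrum of the difference signal.

```python
import numpy as np

N = 16
lags = np.arange(-3, 4)
g = np.zeros(N)
g[lags % N] = 0.6 ** np.abs(lags)            # illustrative symmetric metric filter
g_hat = np.fft.fft(g).real                   # DFT samples of g (real since g is symmetric)

idx = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
G = g[idx]                                   # circulant metric matrix, G[i, j] = g[(i - j) mod N]

rng = np.random.default_rng(0)
x, y = rng.standard_normal(N), rng.standard_normal(N)
d = x - y

d2_matrix = d @ G @ d                                          # spatial-domain quadratic form
d2_freq = np.sum(g_hat * np.abs(np.fft.fft(d)) ** 2) / N       # frequency-domain weighted power

eigs = np.sort(np.linalg.eigvalsh(G))
print(d2_matrix, d2_freq)
print(np.allclose(eigs, np.sort(g_hat)))
```

In this parameterization, requiring \(G \geqslant 0\) amounts to requiring every entry of `g_hat` to be nonnegative, and the number of free parameters equals the signal length.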
The criterion here is that the filtered within-class distance is minimized and the filtered between-class distance is maximized simultaneously. This gives the objective functional
The objective (3) resembles the idea of LDA (Duda et al. 2000). In fact, TDML can be viewed as a translation-invariant solution to LDA.
Results
Experiments on the transform implementations of IMED
In this section, the standardizing transform (ST) and the translation-invariant implementation of IMED are evaluated using the US Postal Service (USPS) and the FERET databases. The USPS database consists of size-normalized \(16 \times 16\) pixel images of handwritten digits, divided into a training set of 7291 prototypes and a test set of 2007 patterns. The FERET database consists of \(384 \times 256\) pixel images of human faces, of which the ’fa’ subset, including 1762 images, is chosen.
The following algorithms are compared, divided into two groups:
1. The ST group
Algorithm 1: \(U = G^{\frac{1}{2}} \mathrm{vec} (X)\), the original ST. It is memory expensive and sometimes infeasible, e.g., for the FERET database, \(G^{\frac{1}{2}}\) is of size \(98304 \times 98304\), requiring 36 GiB of memory (4 bytes per element).
Algorithm 2: Since G is separable (Wang et al. 2005), it can be shown that \(G_1^{\frac{1}{2}} X G^{\frac{1}{2}}_2\) is equivalent to Algorithm 1. This solves the memory problem. For the FERET database, only one \(384 \times 384\) matrix and one \(256 \times 256\) matrix are needed.
2. The CST group (translation-invariant transforms)
Algorithm 3: \(({\mathbf {h}}_1 \otimes {\mathbf {h}}_2^{*}) * X\). Only a precomputed \(5 \times 5\) template is needed.
Algorithm 4: Apply the template \({\mathbf {h}}_1\) to each column of X, then \({\mathbf {h}}_2\) to each row of X. This is the separated equivalent of Algorithm 3, just as Algorithm 2 is the separated equivalent of Algorithm 1. Because \({\mathbf {h}}_1 ={\mathbf {h}}_2\), only one copy is kept in memory.
These algorithms were evaluated over the \(7291 + 2007\) USPS images and the 1762 FERET-fa images using MATLAB on a Dell PowerEdge 1950. The results (Table 1) demonstrate that the CST improves the time efficiency significantly, especially for large images.
We also computed the Euclidean distance of the CST-transformed images, which has an error of \(\sim 1\%\) compared with the IMED of the original images, owing to the approximate nature of the convolution template.
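The equivalence of Algorithms 1 and 2, and the memory figures quoted for FERET, can be reproduced on a small example. The sketch assumes \(\sigma = 1\) Gaussian factors \(G_1\) and \(G_2\), so that \(G = G_1 \otimes G_2\) under row-major vectorization; the helper names are hypothetical and the image is random.

```python
import numpy as np

def gaussian_toeplitz(n, sigma=1.0):
    """1-D Gaussian metric factor with entries g0[i - j]."""
    i = np.arange(n)
    lag = i[:, None] - i[None, :]
    return np.exp(-lag**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def sym_sqrt(G):
    """Symmetric square root via eigendecomposition."""
    w, V = np.linalg.eigh(G)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

n1, n2 = 6, 4
G1, G2 = gaussian_toeplitz(n1), gaussian_toeplitz(n2)
X = np.random.default_rng(0).random((n1, n2))

# Algorithm 1: full (n1 n2) x (n1 n2) standardizing transform on vec(X).
U1 = (sym_sqrt(np.kron(G1, G2)) @ X.ravel()).reshape(n1, n2)
# Algorithm 2: separable form using two small matrices.
U2 = sym_sqrt(G1) @ X @ sym_sqrt(G2)
print(np.allclose(U1, U2))

# Storage in single precision (4 bytes per element) for 384 x 256 images.
full_st_gib = (384 * 256) ** 2 * 4 / 2**30
separable_mib = (384**2 + 256**2) * 4 / 2**20
print(full_st_gib, separable_mib)
```

The separable form stores under one mebibyte for the two factors, against 36 GiB for the full matrix.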
Experiments on the transform domain metric learning
In this section, we conduct several sets of experiments. The experiments are performed on three face data sets (the UMIST, Yale and ORL databases). The images in the UMIST, Yale and ORL data sets are resized to \(28 \times 23\), \(40 \times 30\) and \(28 \times 23\), respectively (see Note 1). We randomly select two images from each class as the training set and use the remaining images for testing. We repeat the process 20 times independently and report the average results.
We first compare TDML with several other metrics, including the standard Euclidean distance (ED), IMED, GED, and the metric learning method XNZ (Xiang et al. 2008). The performance is evaluated in terms of recognition rate using a nearest neighbor classifier. The recognition results are shown in Table 2. TDML significantly outperforms all the other metrics.
Another set of experiments tests whether embedding the learned TI metric in an image recognition technique, e.g., SVM (Vapnik 1998), can improve that algorithm’s accuracy. Embedding a TI metric in an algorithm is simple: first, transform all images by the corresponding TI transform, and then run the algorithm with the transformed images as input data.
Table 3 gives the results of the metrics when embedded in SVM. TDML improves the performance of SVM more than IMED and GED do.
Conclusion
In this paper, we extend the equivalency in (Sun et al. 2009) to the discrete frequency domain. We show that GED and IMED are low-pass filters, resulting in fast implementations which reduce the space and time complexities significantly. The transform domain metric learning (TDML) proposed in (Sun et al. 2009) is also shown to resemble a translation-invariant counterpart of LDA. Experimental results demonstrate significant improvements in algorithm efficiency and performance boosts on small sample size problems.
One possible future direction is the search for more effective metric learning algorithms. TDML is a simple and intuitive attempt, and we expect novel methods that combine the concepts of margins, kernels, locality and nonlinearity.
Notes
 1.
Resizing is necessary for traditional subspace and metric learning methods, since they are vulnerable to the computational cost and the small sample size problem arising from the curse of dimensionality. Our method does not suffer from this.
References
Bar-Hillel A, Hertz T, Shental N, Weinshall D (2003) Learning distance functions using equivalence relations. Proc Int Conf Mach Learn 11–18
Chopra S, Hadsell R, LeCun Y (2005) Learning a similarity metric discriminatively, with application to face verification. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005. CVPR 2005, vol. 1, pp 539–546. doi:10.1109/CVPR.2005.202
Chen J, Wang R, Shan S, Chen X, Gao W (2006) Isomap based on the image euclidean distance. In: 18th International Conference on Pattern Recognition, 2006. ICPR 2006. vol. 2, pp 1110–1113. doi:10.1109/ICPR.2006.729
Davis JV, Kulis B, Jain P, Sra S, Dhillon IS (2007) Information-theoretic metric learning. In: Proceedings of the 24th International Conference on Machine Learning. ICML ’07, ACM, New York, NY, USA, pp. 209–216. doi:10.1145/1273496.1273523. http://doi.acm.org/10.1145/1273496.1273523. Accessed 15 May 2013
Duda RO, Hart PE, Stork DG (2000) Pattern Classification, 2nd edn. Wiley-Interscience
Goldberger J, Roweis S, Hinton G, Salakhutdinov R (2005) Neighbourhood components analysis. In: Saul LK, Weiss Y, Bottou L (eds) Advances in neural information processing systems, vol 17. MIT Press, Cambridge, MA, pp 513–520
Globerson A, Roweis S (2006) Metric learning by collapsing classes. In: Weiss Y, Schölkopf B, Platt J (eds) Advances in neural information processing systems, vol 18. MIT Press, Cambridge, pp 451–458
Gray RM (2006) Toeplitz and circulant matrices: a review. Found Trends Commun Inform Theory 2(3):155–239
Hastie T, Tibshirani R (1996) Discriminant adaptive nearest neighbor classification. IEEE Trans Pat Anal Mach Intel 18(6):607–616. doi:10.1109/34.506411
Jean JSN (1990) A new distance measure for binary images. In: International Conference on Acoustics, Speech, and Signal Processing, 1990. ICASSP-90., pp. 2061–2064. doi:10.1109/ICASSP.1990.115932
Lebanon G (2006) Metric learning for text documents. IEEE Trans Pat Anal Mach Intel 28(4):497–508. doi:10.1109/TPAMI.2006.77
Li F, Yang J, Wang J (2007) A transductive framework of distance metric learning by spectral dimensionality reduction. In: Proceedings of the 24th Annual International Conference on Machine Learning (ICML 2007), pp 513–520
Oppenheim AV, Schafer RW, Buck JR (1999) Discrete-Time Signal Processing, 2nd edn., Prentice Hall Signal Processing Series, Prentice Hall, Englewood Cliffs
Rudin W (1991) Functional Analysis, 2nd edn. McGraw-Hill Book Company, New York
Rudin W (1990) Fourier Analysis on Groups. Wiley, New York
Shalev-Shwartz S, Singer Y, Ng AY (2004) Online and batch learning of pseudo-metrics. In: Proceedings of the Twenty-first International Conference on Machine Learning. ICML ’04, ACM, New York, p 94. doi:10.1145/1015330.1015376. http://doi.acm.org/10.1145/1015330.1015376. Accessed 11 03 2013
Shental N, Hertz T, Weinshall D, Pavel M (2002) Adjustment learning and relevant component analysis. In: ECCV ’02: Proceedings of the 7th European Conference on Computer Vision-Part IV, Springer, London, pp. 776–792
Sun B, Feng J (2008) A fast algorithm for image euclidean distance. In: Chinese Conference on Pattern Recognition, 2008. CCPR ’08, pp 1–5. doi:10.1109/CCPR.2008.32
Sun B, Feng J, Wang L (2009) Learning IMED via shift-invariant transformation. In: IEEE Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009, pp 1398–1405. doi:10.1109/CVPR.2009.5206720
Vapnik VN (1998) Statistical Learning Theory. Wiley-Interscience
Weinberger KQ, Blitzer J, Saul LK (2005) Distance metric learning for large margin nearest neighbor classification. In: Advances in Neural Information Processing Systems, vol. 18, pp 1473–1480. http://books.nips.cc/papers/files/nips18/NIPS20050265.pdf
Wang L, Zhang Y, Feng J (2005) On the euclidean distance of images. IEEE Trans Pat Anal Mach Intel 27(8):1334–1339. doi:10.1109/TPAMI.2005.165
Wang R, Chen J, Shan S, Chen X, Gao W (2006) Enhancing training set for face detection. In: 18th International Conference on Pattern Recognition, 2006. ICPR 2006. vol. 3, pp 477–480. IEEE Computer Society, Washington, DC. doi:10.1109/ICPR.2006.493
Xing EP, Ng AY, Jordan MI, Russell S (2003) Distance metric learning, with application to clustering with side-information. In: Advances in Neural Information Processing Systems 15, vol. 15, pp 505–512. http://citeseerx.ist.psu.edu/viewdoc/summary?. doi:10.1.1.58.3667
Xiang S, Nie F, Zhang C (2008) Learning a mahalanobis distance metric for data clustering and classification. Pat Recogn 41(12):3600–3612. doi:10.1016/j.patcog.2008.05.018
Zhu S, Song Z, Feng J (2007) Face recognition using local binary patterns with image euclidean distance. In: SPIE, vol. 6790. doi:10.1117/12.750642
Authors’ contributions
BS proposed the idea of the translation-invariant metric and proved the main theoretical results. JFF and GPW participated in its design and coordination and helped to revise the manuscript presentation of this method. All authors read and approved the final manuscript.
Acknowledgements
This work was supported by NSFC(61333015) and NBRPC(2010CB328002, 2011CB302400).
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Sun, B., Feng, J. & Wang, G. On the translation-invariance of image distance metric. Appl Inform 2, 11 (2015). https://doi.org/10.1186/s4053501500146
Keywords
 Image Euclidean distance
 Translation invariant
 Linear discriminant analysis
 Transform domain metric learning