# Kernel fractional affine projection algorithm

- Bilal Shoaib†^{1} (Email author)
- Ijaz Mansoor Qureshi†^{2}
- Shafqat Ullah Khan^{3}
- Sharjeel Abid Butt^{1}
- Ihsan ul haq^{1}

**Received:** 20 April 2015

**Accepted:** 19 November 2015

**Published:** 14 December 2015

## Abstract

This paper extends the kernel affine projection algorithm to the rich and flexible framework of fractional signal processing. The algorithm is formulated by including the Riemann–Liouville fractional derivative in the gradient-based stochastic Newton recursive method that minimizes the cost function of the kernel affine projection algorithm. This approach extends the idea of fractional signal processing to reproducing kernel Hilbert space. The proposed algorithm is applied to the prediction of the chaotic Lorenz time series and to nonlinear channel equalization. Its performance is validated by comparison with the least mean square algorithm, kernel least mean square algorithm, affine projection algorithm and kernel affine projection algorithm.

### Keywords

- Kernel affine projection algorithm
- Riemann–Liouville derivative
- Lorenz time series
- Fractional signal processing approach

## Background

Kernel-based learning algorithms have gained interest over the last few years. These algorithms use Mercer’s theorem to map the input data, through a nonlinear kernel function, to a higher dimensional feature space known as a reproducing kernel Hilbert space (RKHS), where linear operations are easily performed on the input data. Kernel methods stem originally from support vector machines (Vapnik and Vapnik 1998; Hearst et al. 1998), a powerful tool for handling classification problems in the neural network architecture. Kernel principal component analysis (KPCA) and kernel regression (Scholkopf et al. 1997; Takeda et al. 2007; Hardle and Vieu 1992) also show desirable classification performance in the complicated environment of statistical signal processing. However, these are batch mode methods and carry the burden of high computational cost and memory usage. These issues are addressed by online kernel methods, such as the kernel least mean square (KLMS) (Liu et al. 2008), kernel affine projection (KAPA) (Liu and Principe 2008), kernel recursive least squares (KRLS) (Engel et al. 2004; Liu et al. 2015) and extended kernel recursive least squares (Ex-KRLS) (Liu et al. 2009) algorithms. These online kernel algorithms are now commonly used for system identification, weather forecasting, nonlinear channel equalization and the prediction of stationary as well as nonstationary time series. The KLMS algorithm uses the stochastic gradient method to minimize a mean square error-based cost function on the transformed input data. In KAPA, the gradient noise of the KLMS algorithm is reduced by minimizing the cost function using the smoothed nonlinear Newton recursion method.
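As a rough illustration of the KLMS idea described above, the following Python sketch grows a kernel expansion one sample at a time. The Gaussian kernel, its width, the step size and the toy sine target are assumptions chosen for illustration; they are not taken from the experiments in this paper.

```python
import numpy as np


def gaussian_kernel(x, centers, width=5.0):
    """Gaussian kernel between one input x and every stored center."""
    diff = centers - x
    return np.exp(-width * np.sum(diff * diff, axis=1))


def klms(inputs, targets, step_size=0.2, width=5.0):
    """Run KLMS once over the data; return centers, coefficients, prior errors."""
    centers = np.empty((0, inputs.shape[1]))
    coeffs = np.empty(0)
    errors = np.empty(len(inputs))
    for i, (x, d) in enumerate(zip(inputs, targets)):
        # The prediction is a kernel expansion over all stored centers.
        y = coeffs @ gaussian_kernel(x, centers, width) if len(coeffs) else 0.0
        errors[i] = d - y
        # Stochastic-gradient step in feature space: store the input as a new
        # center whose coefficient is the scaled prediction error.
        centers = np.vstack([centers, x])
        coeffs = np.append(coeffs, step_size * errors[i])
    return centers, coeffs, errors


rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=(300, 1))
d = np.sin(3.0 * u[:, 0])  # toy nonlinear target (illustrative assumption)
centers, coeffs, errors = klms(u, d)
early_mse = np.mean(errors[:50] ** 2)
late_mse = np.mean(errors[-50:] ** 2)
print(f"MSE over first 50 samples: {early_mse:.4f}, last 50: {late_mse:.4f}")
```

KAPA differs from this sketch in that each update reuses several of the most recent inputs rather than only the current one.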

On the other hand, online learning algorithms based on fractional signal processing have been introduced by using fractional order calculus in the formulation of the algorithm. Ortigueira’s work (Ortigueira et al. 2002; Ortigueira and Machado 2006; Ortigueira 2011) is mainly considered pioneering in the field of fractional signal processing. Tseng et al. designed one- and two-dimensional finite impulse response filters using fractional derivative constraints (Tseng and Lee 2012, 2013, 2014). Wang et al. introduced fractional zero phase filtering based on Riemann–Liouville integrals (Wang et al. 2014). Raja and Qureshi introduced the fractional least mean square (FLMS) algorithm (Zahoor and Qureshi 2009) in their work on system identification. In the recent past, the FLMS algorithm has been applied to various multidimensional signal processing problems, including parameter identification of nonlinear controlled autoregressive systems, parameter estimation of CARMA systems (Zahoor and Chaudhary 2015), identification of Box–Jenkins systems, dual channel speech enhancement, Brownian motion modeling, performance analysis of the Bessel beamformer and acoustic echo cancelation (Masoud and Osgouei 2011; Dubey and Rout 2012; Akhtar and Yasin 2012; Chaudhary et al. 2013).

Recently, a modified fractional least mean square (MFLMS) algorithm (Shoaib and Qureshi 2014a) was developed for stationary and nonstationary time series prediction, more specifically for the Mackey–Glass series. The convergence of the MFLMS algorithm was also tested on the prediction of chaotic series under different noise variances. To remove the guesswork in tuning the step size parameter of the MFLMS algorithm, a stochastic gradient-based method was introduced (Shoaib and Qureshi 2014b) that adapts the step sizes according to the mean square error; it was then applied to the prediction of the Mackey–Glass and Lorenz time series.

Kernel functions are widely used in obtaining the solution of fractional order nonlinear differential equations, as discussed in Shoaib and Qureshi (2014a). Such approaches use different kernel functions to model the fractional order nonlinear differential equation and then use heuristic computing techniques such as genetic algorithms, particle swarm optimization (PSO) and differential evolution (DE) to minimize the error function. In this paper we introduce a mechanism that combines adaptive fractional learning algorithms with online kernel-based filtering algorithms. This combination helps to improve performance on nonlinear problems.

The main aim of this research work is the development of a kernel fractional affine projection algorithm (KFAPA). A method is introduced that incorporates the Riemann–Liouville fractional derivative into the formulation of KAPA, minimizing the mean square error-based cost function with the gradient-based smoothed nonlinear recursive method. The proposed algorithm is then applied to the prediction of the X-component of the three-dimensional chaotic Lorenz time series and to nonlinear channel equalization.

The paper is organized as follows. The “Affine projection and kernel affine projection algorithms” section briefly introduces the affine projection and kernel affine projection algorithms. The “Fractional signal processing approach” section introduces fractional signal processing and the proposed kernel fractional affine projection algorithm. The experimental results are discussed in the “Experimental results” section, and the “Conclusion and future work” section presents the conclusion along with future directions.

## Affine projection and kernel affine projection algorithms

### Affine projection algorithm

[…] [**x**(i), *d*(i)] as […] *w* is […]

### Kernel affine projection algorithm

[…] **x** and **d** is highly nonlinear. A nonlinear mapping \(\varphi (\mathbf{x}(i))\) is introduced in Liu et al. (2011), so that \(\mathbf{w}^\mathrm{T}\varphi (\mathbf{x}(i))\) is a more powerful model than \(\mathbf{w}^\mathrm{T}\mathbf{x}\). Using this model and finding **w** through the smoothed stochastic Newton method may therefore prove as efficient for nonlinear filtering as APA is for linear problems. Using the sequence \([\mathbf{\varphi }(i),d(i)]\) to estimate the weight vector \(\mathbf{w}\) as

## Fractional signal processing approach

### Introduction to fractional derivative

### Proposed kernel fractional affine projection algorithm

## Experimental results

This section presents experimental results that demonstrate the performance of the proposed algorithm. The performance of KFAPA is validated by the prediction of the X-component of the Lorenz time series and the equalization of a nonlinear channel.

### Time series prediction

*M* is the order of the filter. The first input pattern is delivered to the predictor to estimate the future value. The weight vector is then updated by a law based on the mean square error, as given below.

### Lorenz time series

The initial conditions are *x*(0) = 1, *y*(0) = 1 and *z*(0) = 1. The sampling period is fixed at 0.01 s, and the sample data are obtained using a first-order approximation method. The state trajectory of the Lorenz system is shown in Fig. 2. The experiment uses sample points 500–1000 of the X-component of the Lorenz series for training and sample points 1000–1200 for testing the proposed algorithm. The time embedding length, or filter order, *M* is 5 for this experiment. To validate the performance of the proposed algorithm, learning curves with mean square error (MSE) as the figure of merit are plotted in Fig. 3.
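The data preparation described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the Lorenz parameters (`sigma` = 10, `rho` = 28, `beta` = 8/3) are the commonly used chaotic values and are not confirmed by the text here, the first-order approximation is taken to be a forward Euler step, and the helper names `lorenz_euler` and `embed` are hypothetical.

```python
import numpy as np


def lorenz_euler(n_samples, dt=0.01, state=(1.0, 1.0, 1.0),
                 sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Integrate the Lorenz system with a first-order (forward Euler) step.

    sigma, rho and beta are assumed standard values, not taken from the paper.
    """
    out = np.empty((n_samples, 3))
    x, y, z = state
    for i in range(n_samples):
        out[i] = (x, y, z)
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
    return out


def embed(series, order):
    """Pair each window of `order` past samples with the sample that follows."""
    inputs = np.array([series[i:i + order] for i in range(len(series) - order)])
    targets = series[order:]
    return inputs, targets


M = 5  # time embedding length (filter order) used in the experiment
x_component = lorenz_euler(1200)[:, 0]
train_in, train_out = embed(x_component[500:1000], M)
test_in, test_out = embed(x_component[1000:1200], M)
print(train_in.shape, train_out.shape, test_in.shape, test_out.shape)
```

Each row of `train_in` holds five consecutive samples of the X-component and the matching entry of `train_out` is the next sample, which is the one-step prediction target.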

**Table 1** Performance comparison of LMS, APA, KAPA and KFAPA for X-component of Lorenz series prediction with different noise levels

Algorithm | LMS | APA | KAPA | KFAPA |
---|---|---|---|---|
Training MSE ( | 0.02250 ± 1.75e−005 | 0.01454 ± 5.26e−004 | 0.041827 ± 1.93e−065 | 0.02156 ± 1.39e−005 |
Training MSE ( | 0.01583 ± 0.22e−005 | 0.07820 ± 0.00306 | 0.017738 ± 2.28e−003 | 0.091538 ± 1.367e−004 |
Training MSE ( | 0.01903 ± 0.003149 | 0.025175 ± 0.000198 | 0.001399 ± 0.000189 | 0.0020052 ± 6.50e−005 |
Training MSE ( | 0.002970 ± 0.00170 | 0.005556 ± 0.0004324 | 0.001356 ± 0.000131 | 0.0027892 ± 9.49e−004 |
Training MSE ( | 0.004349 ± 0.0003927 | 0.00680 ± 0.0007169 | 0.004117 ± 0.000251 | 0.0048219 ± 0.0001680 |
Training MSE ( | 0.0049979 ± 0.0004128 | 0.007162 ± 0.0009767 | 0.005117 ± 0.000382 | 0.005822 ± 0.0003391 |
Training MSE ( | 0.015863 ± 0.0009678 | 0.010932 ± 0.0025963 | 0.026628 ± 0.00094904 | 0.045301 ± 0.00086739 |
Training MSE ( | 0.016166 ± 0.007296 | 0.019066 ± 0.0044241 | 0.035729 ± 0.00128194 | 0.0555606 ± 0.0023143 |
Training MSE ( | 0.42356 ± 0.10011 | 0.50001 ± 0.010242 | 0.82530 ± 0.2115 | 0.4209 ± 0.030332 |
Training MSE ( | 0.51752 ± 0.22074 | 0.69218 ± 0.028178 | 0.91918 ± 0.32231 | 0.52013 ± 0.047689 |

### Nonlinear channel equalization

The sequence [*b*(1), *b*(2), *b*(3), …, *b*(*k*)] is fed into a nonlinear channel; after the static nonlinearity and additive white Gaussian noise, the signal is observed as [*r*(1), *r*(2), *r*(3), …, *r*(*k*)]. The channel model is defined as \(h(i)=b(i)+0.5b(i-1)\) and the output is \(r(i)=h(i)-0.9h(i)^{2}+n(i)\), where *n*(*i*) is additive white Gaussian noise with a variance of 0.01. The aim of this experiment is to reproduce the original signal with a low error rate. The time embedding length, or filter order, is 5. Training uses 5000 symbols, and the mean square error during training is displayed in Fig. 4. Figure 5 shows that during training the MSE curve of the proposed algorithm is slightly better than those of its counterparts; the results are also listed in Table 2. The performance of the proposed algorithm is also tested in Fig. 6 by inserting an abrupt change at iteration 500. The proposed algorithm recovers efficiently in comparison with its counterparts, and an improvement of 0.1 dB is achieved.
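The channel model above can be simulated with a few lines of Python. This is a sketch under stated assumptions: the transmitted symbols are taken to be binary (±1), the sample before the first symbol is taken to be zero, and the equalizer target is aligned with the newest received sample (no equalization delay). None of these choices is stated in the text, and the function name `nonlinear_channel` is hypothetical.

```python
import numpy as np


def nonlinear_channel(symbols, noise_var=0.01, rng=None):
    """Apply h(i) = b(i) + 0.5 b(i-1), then r(i) = h(i) - 0.9 h(i)^2 + n(i)."""
    rng = np.random.default_rng() if rng is None else rng
    # b(i-1), assuming the sample before the first symbol is zero.
    delayed = np.concatenate(([0.0], symbols[:-1]))
    h = symbols + 0.5 * delayed
    noise = rng.normal(0.0, np.sqrt(noise_var), size=len(symbols))
    return h - 0.9 * h ** 2 + noise


rng = np.random.default_rng(0)
b = rng.choice([-1.0, 1.0], size=5000)  # binary symbols: an assumption
r = nonlinear_channel(b, rng=rng)

M = 5  # time embedding length (filter order)
# Each equalizer input holds M consecutive received samples; the target is the
# symbol aligned with the newest sample of the window (assumed zero delay).
windows = np.array([r[i:i + M] for i in range(len(r) - M + 1)])
targets = b[M - 1:]
print(windows.shape, targets.shape)
```

The pairs (`windows`, `targets`) can then be passed to any of the compared adaptive filters to produce a training MSE curve.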

**Table 2** Performance comparison of APA, KAPA and KFAPA in nonlinear channel equalization

Algorithm | MSE (dB) |
---|---|
APA | 0.6 ± 0.2 |
KAPA | 0.55 ± 0.05 |
KFAPA | 0.4 ± 0.1 |

### Atmospheric CO₂ concentration forecasting

The data set consists of atmospheric CO₂ concentrations (in parts per million by volume, ppmv) collected at the Mauna Loa observatory, Hawaii, between 1958 and 2008, with 600 observations in total. The first 400 points are used for training and the remaining 200 for testing. The kernel function for this specific problem handles the long-term rise, the seasonal effect, periodicity and some irregularities. The kernel function is

**Table 3** Kernel function parameter values

\(a_{1}\) | \(a_{2}\) | \(a_{3}\) | \(a_{4}\) | \(a_{5}\) | \(a_{6}\) | \(a_{7}\) | \(a_{8}\) | \(a_{9}\) | \(a_{10}\) | \(a_{11}\) |
---|---|---|---|---|---|---|---|---|---|---|
66 | 0.075 | 0.40 | 0.0576 | 1.0878 | 0.6600 | 0.4167 | 0.78 | 0.18 | 3.7509 | 0.1900 |

### Static function approximation

*v*(*i*) is zero-mean Gaussian noise with variance \(\sigma _{v}^{2}\). In this experiment, *N* = 2000 samples are generated with \(\sigma _{v}^{2}=0.01,\; \omega =2\) and \(\tau =1.0\). 500 samples are used for training and another 200 for testing. The test pattern is shown in Fig. 10. Figure 11 illustrates the convergence curves for APA, KAPA and KFAPA, where MSE denotes the mean square error. The simulation results indicate that the proposed algorithm achieves the lowest mean training and testing MSE of the three algorithms, as listed in Table 4.

**Table 4** Training and testing MSE

Algorithm | Training MSE | Testing MSE |
---|---|---|
APA | 0.35 ± 0.05 | 0.3 ± 0.02 |
KAPA | 0.2 ± 0.04 | 0.22 ± 0.05 |
KFAPA | 0.11 ± 0.1 | 0.13 ± 0.3 |

## Conclusion and future work

In this paper, a new kernel fractional affine projection algorithm is presented. The affine projection and kernel affine projection algorithms have also been discussed. An application to the prediction of the chaotic three-dimensional Lorenz system demonstrates the performance of the proposed algorithm in comparison with LMS, APA and KAPA, with mean square error as the figure of merit. The proposed algorithm is also tested on nonlinear channel equalization. This new formulation is a further contribution to the field of nonlinear signal processing.

## Declarations

### Authors’ contributions

BS proposed and implemented the idea. Dr IMQ and Dr IU are the supervisor and co-supervisor, respectively. SAB and SUK drafted and wrote the paper. All authors read and approved the final manuscript.

### Competing interests

The authors declare that they have no competing interests.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## References

- Akhtar P, Yasin M (2012) Performance analysis of bessel beamformer and LMS algorithm for smart antenna array in mobile communication system. In: Emerging Trends and Applications in Information Communication Technologies, vol 281. Springer Berlin Heidelberg, pp 52–61
- Chaudhary NI, Raja MAZ, Khan JA, Aslam MS (2013) Identification of input nonlinear control autoregressive systems using fractional signal processing approach. Sci World J 2013:1–13 (ID 467276)
- Dubey SK, Rout NK (2012) FLMS algorithm for acoustic echo cancellation and its comparison with LMS. In: Proceedings of the 1st international conference on IEEE recent advances in information technology (RAIT)
- Engel Y, Mannor S, Meir R (2004) The kernel recursive least-squares algorithm. IEEE Trans Signal Process 52(8):2275–2285
- Hardle W, Vieu P (1992) Kernel regression smoothing of time series. J Time Ser Anal 13(3):209–232
- Haykin S (2013) Adaptive filter theory, 5th edn. Pearson Education, Limited, India (revised)
- Hearst MA et al (1998) Support vector machines. Intell Syst Appl IEEE 13(4):18–28
- Liu W, Pokharel PP, Principe JC (2008) The kernel least-mean-square algorithm. IEEE Trans Signal Process 56(2):543–554
- Liu W et al (2009) Extended kernel recursive least squares algorithm. IEEE Trans Signal Process 57(10):3801–3814
- Liu W, Principe JC, Haykin S (2011) Kernel adaptive filtering: a comprehensive introduction, vol 57. John Wiley & Sons
- Liu W, Principe JC (2008) Kernel affine projection algorithms. EURASIP J Adv Signal Process 1(2008):784292
- Liu W, Principe JC, Haykin S (2010) Kernel recursive least-squares algorithm. In: Kernel Adaptive Filtering: A Comprehensive Introduction, pp 94–123
- Masoud G, Osgouei SG (2011) Dual-channel speech enhancement using normalized fractional least-mean-squares algorithm. In: Proceedings of the 19th Iranian conference on electrical engineering (ICEE)
- Ortigueira MD, Machado JT, de Almeida R (2002) Special issue on fractional signal processing and applications. Signal Proc 82:1515
- Ortigueira MD, Machado JAT (2006) Fractional calculus applications in signals and systems. Signal Proc 86(10):2503–2504
- Ortigueira MD (2011) Fractional calculus for scientists and engineers, vol 84. Springer Science and Business Media
- Raja MAZ, Chaudhary NI (2015) Two-stage fractional least mean square identification algorithm for parameter estimation of CARMA systems. Signal Process 107:327–339
- Scholkopf B, Smola A, Muller KR (1997) Kernel principal component analysis. Artificial Neural Networks ICANN 97. Springer, Berlin Heidelberg, pp 583–588
- Shoaib B, Qureshi IM (2014) A modified fractional least mean square algorithm for chaotic and nonstationary time series prediction. Chin Phys B 23(3):030502
- Shoaib B, Qureshi IM (2014) Adaptive step-size modified fractional least mean square algorithm for chaotic time series prediction. Chin Phys B 23(5):050503
- Takeda H, Farsiu S, Milanfar P (2007) Kernel regression for image processing and reconstruction. IEEE Trans Image Process 16(2):349–366
- Tseng CC, Lee SL (2012) Design of linear phase FIR filters using fractional derivative constraints. Signal Process 92(5):1317–1327
- Tseng CC, Lee SL (2013) Designs of two dimensional linear phase FIR filters using fractional derivative constraints. Signal Proc 93(5):1141–1151
- Tseng CC, Lee SL (2014) Designs of fractional derivative constrained 1D and 2D FIR filters in the complex domain. Signal Proc 95:111–125
- Vapnik VN, Vapnik V (1998) Statistical learning theory, vol 1. Wiley, New York
- Wang J et al (2014) Fractional zero phase filtering based on the Riemann Liouville integral. Signal Proc 98:150–157
- Zahoor RMA, Qureshi IM (2009) A modified least mean square algorithm using fractional derivative and its application to system identification. Eur J Sci Res 35(1):14–21