Regression, in statistics, refers to modeling the relationship between one or more independent variables and a single target dependent variable.
The regression equation can be thought of as a function of variables X (the independent variables) and β (the parameters of the regression):
Y = f(X,β)
where Y is the dependent variable.
The subject of regression analysis is then finding the parameters β such that the sum of squared residuals

SSR = Σi (Yi − f(Xi, β))²

is minimized. This procedure is sometimes referred to as least squares estimation.
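To make the objective concrete, here is a minimal Python sketch of the residual sum of squares for a simple linear model f(X, β) = β0 + β1·X. The model, data, and function name are illustrative and not taken from Ipredict's library.

```python
import numpy as np

# A minimal sketch of the least-squares objective for a simple linear
# model f(X, beta) = beta[0] + beta[1]*X. The model and data here are
# illustrative, not taken from Ipredict's library.
def sum_of_squared_residuals(beta, X, Y):
    predictions = beta[0] + beta[1] * X  # f(X, beta)
    residuals = Y - predictions
    return np.sum(residuals ** 2)

# Synthetic data: Y is roughly 2 + 3*X plus a little noise.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 50)
Y = 2.0 + 3.0 * X + rng.normal(scale=0.1, size=X.size)

print(sum_of_squared_residuals(np.array([2.0, 3.0]), X, Y))
```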
Linear Multiple Regression
Linear regression assumes that the model is linear in the parameters β, that is, all the β coefficients appear with power one.
The following equations:

Y = β0 + β1 log(X1) + β2 sin(X2)

Y = β0 + β1 X + β2 X² + β3 X³

are both linear regressions: even though the independent variables are combined using non-linear functions or powers, both expressions are linear in the parameters β of the regression.
The general linear data model

Yi = β0 + β1 Xi1 + β2 Xi2 + … + βp Xip + εi

can be solved with respect to the parameters β by solving the normal matrix equation:

β = (XᵀX)⁻¹ XᵀY
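As an illustration, the following sketch solves the normal equation with numpy for a hypothetical design matrix that includes non-linear features, echoing the point above that the model stays linear in β:

```python
import numpy as np

# Minimal sketch: ordinary least squares via the normal equation
# beta = (X^T X)^{-1} X^T Y. The feature columns are hypothetical and
# deliberately include non-linear transforms (log, square) of the raw
# input -- the model is still linear in beta.
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 10.0, size=100)

# Design matrix: intercept, log(x), and x^2 as example features.
X = np.column_stack([np.ones_like(x), np.log(x), x ** 2])
true_beta = np.array([1.0, 2.0, 0.5])
Y = X @ true_beta + rng.normal(scale=0.1, size=x.size)

# Solve X^T X beta = X^T Y; np.linalg.solve is numerically preferable
# to forming the explicit inverse.
beta = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta)  # close to true_beta
```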
You can download the following example of Linear Regression
built using the methods available in Ipredict's library.
The example builds a very simple one-week-ahead predictor of the Forex EUR/USD.
Logistic Regression
Logistic regression is used when the dependent variable assumes only two values, in effect partitioning the output into two classes, as in life/death, male/female, or buy/sell decisions. Then:

log(odds) = logit(Pi) = log(Pi / (1 − Pi)) = β0 + β1 Xi1 + … + βp Xip

is the regression equation whose residuals are minimized using a least squares approach, as in the linear regression case.
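As an illustration only (not Ipredict's implementation), the sketch below fits the logit model by gradient descent on the squared error, mirroring the least-squares framing above; statistics packages more commonly fit logistic regression by maximum likelihood. Data and learning-rate settings are made up:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Minimal sketch: fit the logit model P = sigmoid(X @ beta) by gradient
# descent on the squared error, mirroring the least-squares framing in
# the text (maximum likelihood is the more common choice in practice).
rng = np.random.default_rng(2)
x = rng.normal(size=200)
X = np.column_stack([np.ones_like(x), x])   # intercept + one feature
y = rng.binomial(1, sigmoid(X @ np.array([-0.5, 2.0])))  # binary labels

beta = np.zeros(2)
lr = 0.5
for _ in range(5000):
    p = sigmoid(X @ beta)
    grad = 2.0 * X.T @ ((p - y) * p * (1.0 - p)) / len(y)
    beta -= lr * grad
print(beta)  # an approximate fit, roughly [-0.5, 2.0]
```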
You can download the following example of Logistic Regression.
The example builds a simple one-week-ahead predictor of the Forex EUR/USD using a Logistic Regression.
Ridge Regression
Tikhonov regularization, also known as Ridge Regression, is a common method used to regularize ill-posed problems. The linear matrix equation

β = (XᵀX)⁻¹ XᵀY

requires the inversion of the matrix XᵀX, which can be ill-conditioned or singular (i.e., have no inverse). In this case a solution can be found by solving an alternative problem:

β = (XᵀX + δI)⁻¹ XᵀY

where I is the identity matrix. Of course, for δ = 0 this reduces to the standard unregularized least squares regression.
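A minimal sketch of the regularized solve, with made-up near-collinear data and a hand-picked δ (choosing δ is discussed next):

```python
import numpy as np

# Minimal sketch of ridge/Tikhonov regularization: solve
# (X^T X + delta*I) beta = X^T Y instead of inverting X^T X directly.
# The nearly collinear columns below make X^T X ill-conditioned;
# delta is picked by hand. All values are illustrative.
rng = np.random.default_rng(3)
x = rng.normal(size=50)
X = np.column_stack([x, x + 1e-6 * rng.normal(size=50)])  # near-collinear
Y = X @ np.array([1.0, 1.0]) + rng.normal(scale=0.1, size=50)

delta = 1e-3
I = np.eye(X.shape[1])
beta_ridge = np.linalg.solve(X.T @ X + delta * I, X.T @ Y)
print(beta_ridge)
```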
The optimal determination of the parameter δ is itself a very complex problem, and in practice this parameter is normally determined using manual or ad hoc methods.
Optimal Linear Predictor
The Optimal Linear Predictor is a digital filter that extrapolates linearly from past values the future values of a time-series.
This is related to the Autoregressive (AR) Model, which is assumed to underlie the time-series:

Yt = ϕ1 Yt−1 + ϕ2 Yt−2 + … + ϕp Yt−p + εt

where Yt is the time-series, the ϕi are the autoregressive parameters, and εt is white noise.
The Optimal Linear Predictor computes all the parameters ϕi and then extrapolates the time-series to future values.
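The sketch below shows the idea under simple assumptions: the AR parameters are estimated by least squares on lagged values (one common approach; Ipredict's exact method is not specified here), then used for a one-step extrapolation. The helper names fit_ar and forecast_next are made up for the example:

```python
import numpy as np

# Minimal sketch of an order-p linear predictor: estimate the AR
# coefficients phi by least squares on lagged values, then extrapolate
# one step ahead. The helper names, series, and order p are
# illustrative; Ipredict's actual implementation may differ.
def fit_ar(y, p):
    # Column k holds y[t-1-k] for each target index t = p..len(y)-1.
    X = np.column_stack([y[p - 1 - k : len(y) - 1 - k] for k in range(p)])
    return np.linalg.lstsq(X, y[p:], rcond=None)[0]

def forecast_next(y, phi):
    p = len(phi)
    lags = y[-1 : -p - 1 : -1]  # last p values, most recent first
    return phi @ lags

# Example: a synthetic AR(2) series with known parameters.
rng = np.random.default_rng(4)
y = np.zeros(300)
for t in range(2, 300):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal(scale=0.1)

phi = fit_ar(y, 2)
print(phi)                    # roughly [0.6, -0.3]
print(forecast_next(y, phi))  # one-step-ahead extrapolation
```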
You can download the following example of Linear Predictor.
The example shows how effective the Linear Predictor is on several representative time-series.
Optimal Detrended Linear Predictor
The Optimal Detrended Linear Predictor is a digital filter that builds on the Optimal Linear Predictor. It is specifically designed to forecast data with a linear trend: the method detrends the data before applying
the Optimal Linear Predictor operator, and the forecast is then re-trended using the original trend parameters.
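A minimal sketch of the detrend/predict/re-trend procedure, reusing the illustrative fit_ar and forecast_next helpers from the previous sketch (the function name and data are again made up):

```python
import numpy as np

# Minimal sketch of the detrend -> predict -> re-trend procedure,
# reusing the illustrative fit_ar and forecast_next helpers from the
# Optimal Linear Predictor sketch above. Data and order p are made up.
def detrended_forecast(y, p=2):
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)        # fit the linear trend
    residual = y - (slope * t + intercept)        # detrend
    phi = fit_ar(residual, p)                     # AR fit on the residual
    y_next = forecast_next(residual, phi)         # one-step extrapolation
    return y_next + slope * len(y) + intercept    # re-trend at t = len(y)

# Example: an AR(2) signal riding on a linear trend.
rng = np.random.default_rng(5)
n = 300
r = np.zeros(n)
for t in range(2, n):
    r[t] = 0.6 * r[t - 1] - 0.3 * r[t - 2] + rng.normal(scale=0.1)
y = 1.0 + 0.05 * np.arange(n) + r
print(detrended_forecast(y, p=2))
```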