Multiple Regression
In statistics, regression models the relationship between one or more independent variables and a single dependent (target) variable.
The regression equation can be thought of as a function of variables X (the independent variables) and β (the parameters of the regression):
Y = f(X,β)
where Y is the dependent variable.
The goal of regression analysis is then to find the parameters β that minimize the sum of squared residuals
$$\sum_i \varepsilon_i^2, \quad \text{where } \varepsilon_i = y_i - \hat{y}_i$$
and ŷ is the value predicted by the model. This procedure is referred to as least squares estimation.
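As a small illustration of the quantity being minimized, the sketch below evaluates the sum of squared residuals for two candidate parameter sets of a straight-line model. The data values, the helper name `sum_squared_residuals` and the candidate parameters are made up for this example and do not come from Ipredict's library.

```python
import numpy as np

def sum_squared_residuals(y, y_hat):
    """Return sum_i (y_i - y_hat_i)^2 for observed y and predicted y_hat."""
    residuals = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return float(np.sum(residuals ** 2))

# Illustrative observations (assumed), roughly following y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

# Two candidate parameter sets (beta0, beta1) for f(X, beta) = beta0 + beta1 * x
sse_close = sum_squared_residuals(y, 1.0 + 2.0 * x)
sse_far = sum_squared_residuals(y, 0.0 + 1.0 * x)

print(sse_close, sse_far)
```

Least squares estimation searches for the parameters that make this quantity as small as possible; here the first candidate is clearly the better of the two.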
Linear Multiple Regression
Linear regression assumes that the model is linear in its parameters, that is, every β coefficient appears with power one.
The following equations:
$$y_i = \beta_0 + \beta_1 x_1 + \varepsilon_i$$
$$y_i = \beta_0 + \beta_1 \log(x_1) + \beta_2 \exp(x_2) + \beta_3 x_3^2 + \varepsilon_i$$
are both linear regressions, even though the independent variables enter through non-linear functions or powers, because both expressions are linear in the parameters β of the regression.
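The point can be sketched in code: once each non-linear function of the inputs is placed in its own column, the fit is an ordinary linear least squares problem. The example below is a minimal sketch using NumPy; the synthetic data, the noise level and the values in `beta_true` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic inputs (assumed); x1 is kept positive so that log(x1) is defined
x1 = rng.uniform(1.0, 5.0, n)
x2 = rng.uniform(-1.0, 1.0, n)
x3 = rng.uniform(-2.0, 2.0, n)

# Hypothetical "true" parameters used to generate the data
beta_true = np.array([0.5, 2.0, -1.0, 3.0])

# Design matrix with columns 1, log(x1), exp(x2), x3^2:
# non-linear in the inputs, but linear in the parameters
X = np.column_stack([np.ones(n), np.log(x1), np.exp(x2), x3 ** 2])
y = X @ beta_true + rng.normal(0.0, 0.01, n)

# Ordinary linear least squares on the transformed columns
beta_est, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_est)
```

The estimated coefficients should land close to `beta_true`, since the only approximation is the small added noise.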
The general linear data model:
$$y_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \varepsilon_i$$
can be solved with respect to parameters β by solving the normal matrix equation:
$$\beta_{est} = (X^T X)^{-1} X^T y$$
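A minimal NumPy sketch of the normal equation follows. The function name `fit_normal_equation` and the data are illustrative assumptions, not part of Ipredict's library. The sketch solves the linear system (X^T X) β = X^T y rather than forming the inverse explicitly, which gives the same result when the inverse exists and is usually better behaved numerically.

```python
import numpy as np

def fit_normal_equation(X, y):
    """Estimate beta from (X^T X) beta = X^T y.

    X must already contain a column of ones if an intercept beta_0 is wanted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Solve the linear system instead of computing the explicit inverse
    return np.linalg.solve(X.T @ X, X.T @ y)

# Noise-free illustrative data generated from y = 1 + 2*x1 - 3*x2
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
X = np.column_stack([np.ones_like(x1), x1, x2])
y = 1.0 + 2.0 * x1 - 3.0 * x2

beta_est = fit_normal_equation(X, y)
print(beta_est)
```

Because the data contain no noise, the estimate recovers the generating coefficients (1, 2, -3) up to rounding error.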
You can download the following example of
Linear Regression, built using the methods available in Ipredict's library.
The example builds a very simple one-week-ahead predictor of the EUR/USD Forex rate.
Logistic Regression
Logistic regression is used when the dependent variable takes only two values, in effect partitioning the output into two classes, as in life/death, male/female or buy/sell decisions. Then:
$$\log(\text{odds}) = \operatorname{logit}(p_i) = \log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots$$
is the regression equation, where p is the probability of one of the two outcomes. Its parameters β are estimated using a least squares approach, as in the linear regression case.
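The text does not say how the library carries out the fit, so the sketch below uses a technique that is named here plainly as a substitute: iteratively reweighted least squares (IRLS), the usual Newton-type procedure for maximum likelihood logistic regression, in which every iteration solves a weighted least squares system. The function names, the synthetic data and the values in `beta_true` are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    """Inverse of the logit: maps log-odds to a probability."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_irls(X, y, n_iter=25, tol=1e-10):
    """Fit logit(p) = X @ beta by iteratively reweighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        w = p * (1.0 - p)                      # per-observation weights
        # Newton step: solve (X^T W X) step = X^T (y - p)
        hessian = X.T @ (w[:, None] * X)
        step = np.linalg.solve(hessian, X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Synthetic two-class data (assumed) drawn from a known logistic model
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])           # column of ones for beta_0
beta_true = np.array([-0.5, 1.5])
y = (rng.uniform(size=n) < sigmoid(X @ beta_true)).astype(float)

beta_est = fit_logistic_irls(X, y)
print(beta_est)
```

A predicted probability for a new observation is then `sigmoid(x_new @ beta_est)`, which can be thresholded (for example at 0.5) to obtain a two-valued decision such as buy/sell.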
You can download the following example of
Logistic Regression.
The example builds a simple one-week-ahead predictor of the EUR/USD Forex rate using a Logistic Regression.
Tikhonov Regularization
Tikhonov regularization, also known as Ridge Regression, is a common method used to regularize ill-posed problems.
The linear matrix equation
$$\beta_{est} = (X^T X)^{-1} X^T y$$
requires the inversion of the matrix $X^T X$, which can be ill-conditioned or singular (that is, it has no inverse). In this case a solution can be found by solving an alternative problem:
$$\beta^{\delta}_{est} = (X^T X + \delta I)^{-1} X^T y$$
where I is the identity matrix and δ ≥ 0 is the regularization parameter. For δ = 0 this reduces to the standard unregularized least squares regression, while any δ > 0 makes the regularized matrix invertible.
Determining the optimal value of δ is a difficult problem, and in practice the parameter is usually chosen manually or with ad hoc methods.
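The regularized equation can be sketched directly in NumPy. In the example below the two columns of X are identical, so X^T X is singular and the unregularized normal equation has no unique solution, while a small δ still yields one. The function name `fit_ridge`, the data and the value δ = 1e-3 are illustrative assumptions.

```python
import numpy as np

def fit_ridge(X, y, delta):
    """Solve (X^T X + delta * I) beta = X^T y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    regularized = X.T @ X + delta * np.eye(X.shape[1])
    return np.linalg.solve(regularized, X.T @ y)

# Two identical columns make X^T X singular (rank 1 instead of 2)
x = np.array([1.0, 2.0, 3.0, 4.0])
X = np.column_stack([x, x])
y = 2.0 * x

rank_unregularized = np.linalg.matrix_rank(X.T @ X)

# A small delta (assumed value) makes the system solvable
beta_ridge = fit_ridge(X, y, delta=1e-3)
print(rank_unregularized, beta_ridge)
```

The regularized solution splits the total weight of 2 almost evenly between the two duplicated columns, giving coefficients very close to (1, 1).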
Optimal Linear Predictor
The Optimal Linear Predictor is a digital filter that linearly extrapolates the future values of a time-series from its past values.
It is related to the Autoregressive Model (or AR Model), which is the model assumed to underlie the time-series:
$$Y_t = \sum_i \phi_i Y_{t-i} + \varepsilon_t$$
where Y is the time-series, the $\phi_i$ are the autoregressive parameters and $\varepsilon_t$ is white noise.
The Optimal Linear Predictor computes all the parameters $\phi_i$ and then extrapolates the time-series to future values.
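The two steps (estimate the autoregressive parameters, then extrapolate) can be sketched as follows. The text does not state which estimator the library uses, so this sketch assumes a plain least squares fit on lagged values; the function names `fit_ar` and `forecast_ar`, the model order and the test signal are all illustrative choices.

```python
import numpy as np

def fit_ar(series, order):
    """Least squares estimate of phi_1..phi_order in Y_t = sum_i phi_i * Y_{t-i}."""
    y = np.asarray(series, dtype=float)
    # Column i-1 holds Y_{t-i} for every t from `order` to the end of the series
    lagged = np.column_stack(
        [y[order - i : len(y) - i] for i in range(1, order + 1)]
    )
    phi, *_ = np.linalg.lstsq(lagged, y[order:], rcond=None)
    return phi

def forecast_ar(series, phi, steps):
    """Extrapolate `steps` values, feeding each prediction back as history."""
    history = list(np.asarray(series, dtype=float))
    order = len(phi)
    predictions = []
    for _ in range(steps):
        recent = history[-1 : -order - 1 : -1]   # Y_{t-1}, ..., Y_{t-order}
        next_value = float(np.dot(phi, recent))
        predictions.append(next_value)
        history.append(next_value)
    return np.array(predictions)

# A pure sinusoid obeys Y_t = 2*cos(w)*Y_{t-1} - Y_{t-2}, so order 2 suffices
w = 0.3
t = np.arange(60)
series = np.sin(w * t)

phi = fit_ar(series, order=2)
forecast = forecast_ar(series, phi, steps=5)
print(phi, forecast)
```

For this noise-free signal the fitted parameters match the exact recurrence and the forecast continues the sinusoid; on real data the residual ε is not zero and the forecast error grows with the horizon.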
You can download the following example of
Linear Predictor.
The example shows how effective the Linear Predictor is on several useful time-series.
Optimal Detrended Linear Predictor
The Optimal Detrended Linear Predictor is a digital filter that builds on the basics of the Optimal Linear Predictor.
It is designed specifically to forecast data that has a linear trend. The method detrends the data before applying the Optimal Linear Predictor operator; the forecast is then re-trended using the original trend parameters.
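The detrend, predict and re-trend sequence can be sketched as below. This is a minimal sketch under stated assumptions: the trend is fitted with a straight line by least squares (`numpy.polyfit`), the predictor is a least squares autoregressive fit, and the function name `detrended_forecast`, the model order and the test signal are illustrative rather than taken from the library.

```python
import numpy as np

def detrended_forecast(series, order, steps):
    """Remove a linear trend, extrapolate with an AR fit, then add the trend back."""
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y))

    # 1. Fit and remove the linear trend
    slope, intercept = np.polyfit(t, y, 1)
    detrended = y - (slope * t + intercept)

    # 2. Least squares AR fit on the detrended data (columns are Y_{t-1}..Y_{t-order})
    lagged = np.column_stack(
        [detrended[order - i : len(y) - i] for i in range(1, order + 1)]
    )
    phi, *_ = np.linalg.lstsq(lagged, detrended[order:], rcond=None)

    # 3. Extrapolate the detrended series step by step
    history = list(detrended)
    for _ in range(steps):
        recent = history[-1 : -order - 1 : -1]
        history.append(float(np.dot(phi, recent)))

    # 4. Re-trend the forecast with the original trend parameters
    t_future = np.arange(len(y), len(y) + steps)
    return np.array(history[len(y):]) + slope * t_future + intercept

# Illustrative signal: linear trend plus a sinusoid covering 10 whole periods,
# centered so that the straight-line fit recovers the trend exactly
t = np.arange(200)
w = 2.0 * np.pi / 20.0
series = 0.5 * t + 2.0 + np.cos(w * (t - 99.5))

forecast = detrended_forecast(series, order=2, steps=5)
print(forecast)
```

The test signal is deliberately idealized so the result is easy to check; with noisy data or a trend that is not exactly linear, the fitted trend and the forecast are only approximations.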