Multiple Regression

In statistics, regression models the relationship between one or more independent variables and a single dependent (target) variable. The regression equation can be thought of as a function of the variables X (the independent variables) and β (the parameters of the regression):

$Y = f(X,β)$

where Y is the dependent variable.

The goal of regression analysis is then to find the parameters β such that the sum of squared residuals

$\text{Residuals} = \sum_i ε_i^2 \quad (\text{where } ε_i = y_i - \hat y_i)$

is minimized, where $\hat y_i$ is the value predicted by the model for observation i. This procedure is known as least squares estimation.

Linear Multiple Regression

Linear regression assumes that the dependent variable is a linear function of the parameters, that is, every β coefficient appears only to the first power. The following equations:

$\begin{aligned} y_i &= β_0 + β_1 x_1 + ε_i \\ y_i &= β_0 + β_1 \log(x_1) + β_2 e^{x_2} + β_3 x_3^2 + ε_i \end{aligned}$

are both linear regressions, even though the second one transforms the independent variables with non-linear functions and powers, because both expressions are linear in the parameters β of the regression.
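To make the point concrete, here is a minimal NumPy sketch of the second equation. The data, the seed and the values in `true_beta` are made up for illustration. Each column of the design matrix is a non-linear function of an input, yet the fit is still an ordinary linear least squares problem.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 200
x1 = rng.uniform(1.0, 5.0, n)   # kept positive so that log(x1) is defined
x2 = rng.uniform(0.0, 1.0, n)
x3 = rng.uniform(-2.0, 2.0, n)

# Hypothetical "true" parameters, chosen only for this illustration.
true_beta = np.array([1.0, 2.0, -0.5, 0.3])

# Columns: 1, log(x1), exp(x2), x3^2. Non-linear in x, linear in beta.
X = np.column_stack([np.ones(n), np.log(x1), np.exp(x2), x3**2])
y = X @ true_beta + rng.normal(0.0, 0.01, n)  # small additive noise

# Ordinary least squares on the transformed columns.
beta_est, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_est)
```

The transformation of the inputs happens once, when the design matrix is built. After that, the solver never needs to know that the columns came from `log`, `exp` or a square.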

The general linear data model:

$y_i = β_0 + β_1x_1 + β_2x_2 + β_3x_3 + \ldots + ε_i$

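can be solved for the parameters β using the normal equation in matrix form, where X is the matrix of independent variables (one row per observation, with a leading column of ones for $β_0$) and y is the vector of observed values: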

$β_{est} = (X^T X)^{-1} X^T y$
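The normal equation can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the implementation used by any particular library: the helper name `fit_normal_equation` and the toy data are hypothetical, and the system $X^T X β = X^T y$ is solved directly instead of forming the inverse, which is usually the numerically safer choice.

```python
import numpy as np

def fit_normal_equation(X, y):
    """Least squares estimate of beta from the system X^T X beta = X^T y."""
    # Solving the linear system avoids computing (X^T X)^{-1} explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Toy data generated exactly by y = 1 + 2*x1 + 3*x2 (no noise).
x = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0], [3.0, 5.0]])
X = np.column_stack([np.ones(len(x)), x])  # leading column of ones for beta_0
y = 1.0 + 2.0 * x[:, 0] + 3.0 * x[:, 1]

beta_est = fit_normal_equation(X, y)
residual_ss = np.sum((y - X @ beta_est) ** 2)  # sum of squared residuals
print(beta_est, residual_ss)
```

Because the toy data contain no noise, the estimate recovers the generating coefficients and the sum of squared residuals is zero up to rounding.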

You can download the following example of Linear Regression built using the methods available in iPredict’s library. The example builds a very simple one-week-ahead predictor of the EUR/USD Forex rate.

Logistic Regression

Logistic regression is used when the dependent variable takes only two values, in effect partitioning the output into two classes, as in life/death, male/female or buy/sell decisions. Then:

$\log(\text{odds}) = \operatorname{logit}(p_i) = \log\left({p_i \over {1 - p_i}}\right) = β_0 + β_1x_1 + β_2x_2 + β_3x_3 + \cdots$

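is the regression equation, where $p_i$ is the probability that observation i belongs to the class coded as 1. Because the observed outcomes are exactly 0 or 1, their logit is not finite, so the parameters β are usually estimated by maximum likelihood, commonly implemented as iteratively reweighted least squares, rather than with the closed-form normal equation of the linear case.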
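The following is a minimal sketch of fitting the logit model above. It assumes maximum likelihood estimation via Newton's method, which is equivalent to iteratively reweighted least squares. The function name `fit_logistic`, the synthetic data and the fixed iteration count are illustrative choices, and the outcomes are assumed to be coded as 0 and 1.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logit(p) = X @ beta by Newton's method on the log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # current fitted probabilities
        weights = p * (1.0 - p)                 # variance of each Bernoulli outcome
        gradient = X.T @ (y - p)                # gradient of the log-likelihood
        hessian = X.T @ (X * weights[:, None])  # negative Hessian, X^T W X
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Synthetic two-class data drawn from a known (made-up) model.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.5])
p_true = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
y = (rng.uniform(size=n) < p_true).astype(float)  # outcomes coded 0/1

beta_est = fit_logistic(X, y)
print(beta_est)
```

Each Newton step solves a weighted least squares system, which is where the link to the linear case comes from. This sketch has no convergence check and will not converge if the two classes are perfectly separable.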

You can download the following example of Logistic Regression. The example builds a simple one-week-ahead predictor of the EUR/USD Forex rate using a Logistic Regression.

Tikhonov Regularization

Tikhonov regularization, also known as Ridge Regression, is a common method used to regularize ill-posed problems.

The linear matrix equation

$β_{est} = (X^T X)^{-1} X^T y$

requires the inversion of the matrix

$X^T X$

which can be ill-conditioned or singular (that is, it has no inverse). In this case, a solution can be found by solving an alternative problem:

$β_{δest} = (X^T X + \delta I)^{-1} X^T y$

where I is the identity matrix and $\delta \ge 0$ is the regularization parameter. For $\delta=0$ this reduces to the standard unregularized least squares regression. Choosing $\delta$ optimally is a difficult problem, and in practice the parameter is usually set manually or with ad-hoc methods.
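A small sketch of the regularized solution, using a deliberately singular example. The helper name `fit_ridge`, the toy data and the value of `delta` are assumptions made for illustration. Two identical columns make $X^T X$ singular, so the unregularized normal equation has no unique solution, while the regularized system does.

```python
import numpy as np

def fit_ridge(X, y, delta):
    """Solve (X^T X + delta * I) beta = X^T y for beta."""
    n_params = X.shape[1]
    return np.linalg.solve(X.T @ X + delta * np.eye(n_params), X.T @ y)

# Two identical columns: X^T X has rank 1 and cannot be inverted.
x = np.array([1.0, 2.0, 3.0, 4.0])
X = np.column_stack([x, x])
y = 2.0 * x

print(np.linalg.matrix_rank(X.T @ X))  # rank is 1, not 2

beta_ridge = fit_ridge(X, y, delta=1e-3)  # delta picked by hand for this toy case
print(beta_ridge)
```

With a small positive `delta` the weight is split evenly between the two duplicated columns, so each coefficient comes out close to 1 and their sum is close to the slope of 2 used to generate `y`.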

Optimal Linear Predictor

The Optimal Linear Predictor is a digital filter that linearly extrapolates the future values of a time-series from its past values. It is related to the Autoregressive Model (or AR Model), which is assumed to underlie the time-series:

$Y_t = \sum_{i=1}^{p} \phi_i Y_{t-i} + \epsilon_t$

where Y is the time-series, p is the order of the model, the $\phi_i$ are the autoregressive parameters and $\epsilon_t$ is white noise. The Optimal Linear Predictor computes all the parameters $\phi_i$ and then extrapolates the time-series to future values.
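The two steps described above, estimating the $\phi_i$ and then extrapolating, can be sketched as follows. This sketch assumes the parameters are estimated by ordinary least squares on lagged values. Other estimators exist, and the source does not say which one the Optimal Linear Predictor uses. The function names `fit_ar` and `forecast_ar` are hypothetical.

```python
import numpy as np

def fit_ar(series, order):
    """Estimate phi_1..phi_order by least squares on lagged values."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    # The row for time t holds [Y_{t-1}, Y_{t-2}, ..., Y_{t-order}].
    lagged = np.column_stack(
        [series[order - i : n - i] for i in range(1, order + 1)]
    )
    phi, *_ = np.linalg.lstsq(lagged, series[order:], rcond=None)
    return phi

def forecast_ar(series, phi, steps):
    """Extrapolate by feeding each prediction back in as a past value."""
    history = list(np.asarray(series, dtype=float))
    order = len(phi)
    for _ in range(steps):
        recent = history[-1 : -order - 1 : -1]  # most recent value first
        history.append(float(np.dot(phi, recent)))
    return np.array(history[-steps:])

# A noise-free sinusoid obeys Y_t = 2*cos(w)*Y_{t-1} - Y_{t-2} exactly.
w = 0.3
series = np.sin(w * np.arange(60))
phi = fit_ar(series, order=2)
forecast = forecast_ar(series, phi, steps=5)
print(phi, forecast)
```

The sinusoid is used only because its AR(2) coefficients are known in closed form, which makes the result easy to check. On real data the noise term $\epsilon_t$ means the forecast error grows with the horizon.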

You can download the following example of Linear Predictor. The example shows how effective the Linear Predictor is on several time-series.

Optimal Detrended Linear Predictor

The Optimal Detrended Linear Predictor is a digital filter built on the Optimal Linear Predictor. It is designed specifically to forecast data that has a linear trend. The method detrends the data, applies the Optimal Linear Predictor operator to the detrended series, and then re-trends the forecast using the original trend parameters.
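The detrend, predict and re-trend sequence can be sketched as below. This is an illustration under assumptions, not the library's implementation: the trend is fitted with a straight-line least squares fit (`np.polyfit`), the autoregressive parameters are estimated by least squares on lagged values, and the function name `detrended_forecast` is hypothetical.

```python
import numpy as np

def detrended_forecast(series, order, steps):
    """Remove a linear trend, extrapolate with an AR model, add the trend back."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    t = np.arange(n)

    # 1. Fit and remove the linear trend.
    slope, intercept = np.polyfit(t, series, 1)
    detrended = series - (slope * t + intercept)

    # 2. Estimate the AR parameters on the detrended data by least squares.
    lagged = np.column_stack(
        [detrended[order - i : n - i] for i in range(1, order + 1)]
    )
    phi, *_ = np.linalg.lstsq(lagged, detrended[order:], rcond=None)

    # 3. Extrapolate the detrended series step by step.
    history = list(detrended)
    for _ in range(steps):
        recent = history[-1 : -order - 1 : -1]  # most recent value first
        history.append(float(np.dot(phi, recent)))

    # 4. Re-trend the forecast with the original trend parameters.
    future_t = np.arange(n, n + steps)
    return np.array(history[n:]) + slope * future_t + intercept

# Made-up example: a linear trend plus a noise-free sinusoid.
t = np.arange(200)
series = 3.0 + 0.5 * t + np.sin(0.3 * t)
forecast = detrended_forecast(series, order=4, steps=5)
print(forecast)
```

The example uses order 4 because the fitted line does not remove the trend perfectly when a sinusoid is present, and the two extra parameters let the autoregressive model absorb the small leftover linear component.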
