Regression analysis
is used to model the relationship between a response
variable and one or more predictor variables. STATGRAPHICS Centurion
provides a large number of procedures for fitting different
types of regression models:
1.
Simple Regression  fits
linear and nonlinear models with one predictor.
Includes both least squares and resistant methods.
2.
BoxCox Transformations 
fits a linear model with one predictor, where the Y
variable is transformed to achieve approximate
normality.
3.
Polynomial Regression
 fits a polynomial model with one predictor.
4.
Calibration Models 
fits a linear model with one predictor and then
solves for X given Y.
4.
Multiple Regression 
fits linear models with two or more predictors.
Includes an option for forward or backward stepwise
regression and a BoxCox or CochraneOrcutt
transformation.
5.
Comparison of Regression Lines
 fits regression lines for one predictor at each
level of a second predictor. Tests for significant
differences between the intercepts and slopes.
6.
Regression Model Selection
 fits all possible regression models for multiple
predictor variables and ranks the models by the
adjusted Rsquared or Mallows' Cp statistic.
7.
Ridge Regression  fits a
multiple regression model using a method designed to
handle correlated predictor variables.
8.
Nonlinear Regression 
fits a userspecified model involving one or more
predictors.
9.
Partial Least Squares  fits
a multiple regression model using a method that
allows more predictors than observations.
10.
General Linear Models  fits
linear models involving both quantitative and
categorical predictors.
11.
Life Data Regression 
fits regression models for response variables that
represent failure times. Allows for censoring and
nonnormal error distributions.
12.
Regression Analysis for
Proportions  fits logistic and probit
models for binary response data.
13.
Regression Analysis for Counts
 fits Poisson and negative binomial regression
models.
Simple Regression
The simplest
regression models involve a single response variable Y
and a single predictor variable X. STATGRAPHICS will fit
a variety of functional forms, listing the models in
decreasing order of Rsquared. If outliers are
suspected, resistant methods can be used to fit the
models instead of least squares.
Comparison of
Alternative Models
Model 
RSquared 
SquaredY
reciprocalX 
87.75% 
ReciprocalX 
87.11% 
Square rootY
reciprocalX 
86.71% 
Scurve model 
86.27% 
Double reciprocal 
85.25% 
ReciprocalY
logarithmicX 
84.99% 
Multiplicative 
84.98% 
LogarithmicX 
84.77% 
SquaredY
logarithmicX 
84.36% 
ReciprocalY
square rootX 
81.69% 
LogarithmicY
square rootX 
81.21% 
Square rootX 
80.54% 
SquaredY square
rootX 
79.68% 
ReciprocalY 
76.73% 
Exponential 
75.87% 
Square rootY 
75.37% 
Logistic 
75.08% 
Log probit 
75.03% 
Linear 
74.83% 
SquaredY 
73.63% 
ReciprocalY
squaredX 
64.37% 
LogarithmicY
squaredX 
63.05% 
Square rootY
squaredX 
62.34% 
SquaredX 
61.60% 
Double squared 
60.04% 
BoxCox Transformations
When the response
variable does not follow a normal distribution, it is
sometimes possible to use the methods of Box and Cox to
find a transformation that improves the fit. Their
transformations are based on powers of Y. STATGRAPHICS
will automatically determine the optimal power and fit
an appropriate model.
Polynomial Regression
Another approach to
fitting a nonlinear equation is to consider polynomial
functions of X. For interpolative purposes, polynomials
have the attractive property of being able to
approximate many kinds of functions.
Calibration Models
In a typical
calibration problem, a number of known samples are
measured and an equation is fit relating the
measurements to the reference values. The fitted
equation is then used to predict the value of an unknown
sample by generating an inverse prediction (predicting X
from Y) after measuring the sample.
Multiple Regression
The Multiple
Regression procedure fits a model relating a
response variable Y to multiple predictor variables X1,
X2, ... . The user may include all predictor variables
in the fit or ask the program to use a stepwise
regression to select a subset containing only
significant predictors. At the same time, the BoxCox
method can be used to deal with nonnormality and the
CochraneOrcutt procedure to deal with autocorrelated
residuals.
Comparison of Regression Lines
In some situations,
it is necessary to compare several regression lines.
STATGRAPHICS will fit parallel or nonparallel linear
regressions for each level of a "BY" variable and
perform statistical tests to determine whether the
intercepts and/or slopes of the lines are significantly
different.
Regression Model Selection
If the number of
predictors is not excessive, it is possible to fit
regression models involving all combinations of 1
predictor, 2 predictors, 3 predictors, etc, and sort the
models according to a goodnessof fit statistic. In
STATGRAPHICS, the Regression Model Selection
procedure implements such a scheme, selecting the models
which give the best values of the adjusted RSquared or
of Mallows' Cp statistic.
Ridge Regression
When the predictor
variables are highly correlated amongst themselves, the
coefficients of the resulting least squares fit may be
very imprecise. By allowing a small amount of bias in
the estimates, more reasonable coefficients may often be
obtained. Ridge regression is one method to address
these issues. Often, small amounts of bias lead to
dramatic reductions in the variance of the estimated
model coefficients.
Nonlinear Regression
Most least squares
regression programs are designed to fit models that are
linear in the coefficients. When the analyst wishes to
fit an intrinsically nonlinear model, a numerical
procedure must be used. The STATGRAPHICS Nonlinear Least
Squares procedure used an algorithm due to Marquardt to
fit any function entered by the user.
Partial Least Squares
Partial Least
Squares is designed to construct a statistical model
relating multiple independent variables X to multiple
dependent variables Y. The procedure is most helpful
when there are many predictors and the primary goal of
the analysis is prediction of the response variables.
Unlike other regression procedures, estimates can be
derived even in the case where the number of predictor
variables outnumbers the observations. PLS is widely
used by chemical engineers and chemometricians for
spectrometric calibration.
General Linear Models
The GLM procedure is
useful when the predictors include both quantitative and
categorical factors. When fitting a regression model, it
provides the ability to create surface and contour plots
easily.
Life Data Regression
To describe the impact of
external variables on failure times, regression models may
be fit. Unfortunately, standard least squares techniques do
not work well for two reasons: the data are often censored,
and the failure time distribution is rarely Gaussian. For
this reason, STATGRAPHICS provides a special procedure that
will fit life data regression models with censoring,
assuming either an exponential, extreme value, logistic,
loglogistic, lognormal, normal or Weibull distribution.
Regression Analysis for
Proportions
When the response
variable is a proportion or a binary value (0 or 1),
standard regression techniques must be modified.
STATGRAPHICS provides two important procedures for this
situation: Logistic Regression and Probit
Analysis. Both methods yield a prediction equation
that is constrained to lie between 0 and 1.
Regression Analysis for Counts
For response
variables that are counts, STATGRAPHICS provides two
procedures: a Poisson Regression and a
Negative Binomial Regression. Each fits a loglinear
model involving both quantitative and categorical
predictors.
