Generalized Linear Models

Introduction

ASReml includes facilities for fitting the family of Generalised Linear Models (GLMs) of Nelder and McCullagh. GLMs are specified by qualifiers after the name of the dependent variable but before the ~ character.

A second dependent variable may be specified if a bivariate analysis is required but it will always be treated as a normal variate (no syntax is provided for specifying GLM attributes for it). The !ASUV qualifier is required in this situation for the GLM weights to be utilized.

The model is fitted by iteratively reweighted least squares. By default, the effects in the linear model are re-estimated twice for each iteration that updates the variance parameters. The number of inner iterations can be increased with the !GLMM qualifier.
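
As an illustration of iteratively reweighted least squares, the following Python sketch fits a Poisson model with a log link and a single covariate. It is not ASReml code and involves no variance parameters, so it shows only the repeated estimation of effects. The function name irls_poisson and the data are assumptions made for the example.

```python
import math

def irls_poisson(x, y, iterations=25):
    """Fit log(mu) = b0 + b1*x by iteratively reweighted least squares."""
    # Start from an intercept-only fit: the log of the mean response.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # Working variable: eta + (y - mu) / (dmu/deta); dmu/deta = mu for the log link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Working weights: (dmu/deta)^2 / variance = mu for the Poisson log link.
        w = mu
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        # Solve the 2x2 weighted normal equations by Cramer's rule.
        det = sw * swxx - swx * swx
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
    return b0, b1

# Hypothetical data: counts y observed at covariate values x.
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 2, 6, 9, 14]
b0, b1 = irls_poisson(x, y)
print("intercept:", round(b0, 4), "slope:", round(b1, 4))
```

Each pass recomputes the working variable and weights from the current fit and then solves a weighted least squares problem, which is the step that is repeated within each update of the variance parameters.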

Please refer to the ASReml User Guide for algebraic details of the link functions, inverse link functions, variances and deviances for the various distributions.

Distribution and link qualifiers

The default link is listed first.

!NORMAL [ !IDENTITY | !LOGARITHM | !INVERSE ]
With the !LOGARITHM or !INVERSE link, the model is fitted on the log/inverse scale but the residuals are on the natural scale. !NORMAL !IDENTITY is the default.

!BINOMIAL [ !LOGIT | !IDENTITY | !PROBIT | !COMPLOGLOG ] [ !TOTAL n ]
A binary variate [0, 1] is indicated if !TOTAL is unspecified. Proportions or counts ( r ) are indicated if !TOTAL specifies the variate containing the binomial totals. Proportions are assumed if no response value exceeds 1. The logit is the default link function. For the logit link, the variance on the underlying scale is π²/3 ~ 3.29 (underlying logistic distribution).
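
To make the link functions concrete, the following Python sketch evaluates the logit, probit and complementary log-log transforms of a single proportion. It is an illustration only, not ASReml code; the function names and the values of r and n are assumptions.

```python
import math
from statistics import NormalDist

def logit(p):
    # Log odds of p.
    return math.log(p / (1.0 - p))

def probit(p):
    # Inverse of the standard normal cumulative distribution function.
    return NormalDist().inv_cdf(p)

def cloglog(p):
    # Complementary log-log transform.
    return math.log(-math.log(1.0 - p))

# Hypothetical binomial observation: r successes out of a total of n.
r, n = 3, 12
p = r / n
for name, link in (("logit", logit), ("probit", probit), ("cloglog", cloglog)):
    print(name, round(link(p), 4))
```

All three links map a proportion in (0, 1) onto the whole real line, which is the scale on which the linear model is fitted.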

!MULTINOMIAL k [ !CUMULATIVE ] [ !LOGIT | !IDENTITY | !PROBIT | !COMPLOGLOG ] [ !TOTAL n ]
fits a multiple threshold model with t=k-1 thresholds to polytomous ordinal data with k classes assuming a multinomial distribution. Typically, the response variable is a single variable containing the ordinal score (1:k) or a set of k variables containing counts ( r_i ) in the k classes. The response may also be a series of t binary variables or a series of t variables containing counts. If t counts are supplied, the total (including the kth class) must be given in another variable indicated by the !TOTAL qualifier.

The threshold model is fitted as a cumulative probability model. The proportions ( y_i = r_i/n ) in the ordered classes are summed to form the cumulative proportions ( Y_i ) which are modelled as logit ( !LOGIT), probit ( !PROBIT) or Complementary LogLog ( !CLOG) variables. The implicit residual variance on the underlying scale is π²/3 ~ 3.3 (underlying logistic distribution) for the logit link and 1 for the probit link. The distribution underlying the Complementary LogLog link is the Gumbel distribution, with implicit residual variance on the underlying scale of π²/6 ~ 1.64.
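
The formation of cumulative proportions can be sketched in Python as follows. The counts are hypothetical and the code is an illustration of the arithmetic, not of ASReml's implementation.

```python
import math

# Hypothetical counts r_i in k = 4 ordered classes.
counts = [12, 30, 41, 17]
n = sum(counts)

# Proportions y_i = r_i / n in each class.
proportions = [r / n for r in counts]

# Cumulative proportions Y_i for the first t = k - 1 classes;
# the kth cumulative proportion is always 1 and carries no information.
cumulative = []
running = 0
for r in counts[:-1]:
    running += r
    cumulative.append(running / n)

# The cumulative proportions on the logit scale (the default link).
logits = [math.log(c / (1.0 - c)) for c in cumulative]

# Implicit residual variance on the underlying scale for each link.
underlying_variance = {
    "logit": math.pi ** 2 / 3,    # logistic distribution
    "probit": 1.0,                # standard normal distribution
    "cloglog": math.pi ** 2 / 6,  # Gumbel distribution
}

print(cumulative)
print([round(v, 4) for v in logits])
```

Because the cumulative proportions are non-decreasing, their transforms are ordered as well, which is what allows them to be read as t ordered thresholds.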

Predicted values are reported for the cumulative proportions. For example,

 Lodging !MULT 4 !CUM ~ Trait Variety !r block
 predict Variety

where Lodging is a factor with 4 ordered classes.

!POISSON [ !LOGARITHM | !IDENTITY | !SQRT ]
Natural logarithms are the default link function. ASReml assumes the Poisson variable is not negative.

!GAMMA [ !INVERSE | !IDENTITY | !LOGARITHM ] [ !PHI phi ]
The inverse is the default link function. The default value of phi is 1.

!NEGBIN [ !LOGARITHM | !IDENTITY | !INVERSE ] [ !PHI phi ]
Natural logarithms are the default link function. The default value of phi is 1.

General qualifiers

!AOD
requests that an Analysis of Deviance table be generated. This is formed by fitting a series of submodels for terms in the DENSE part, building up to the full model, and comparing the deviances. It is not available in association with PREDICT. For example,
LS !BIN !TOT COUNT !AOD ~ mu SEX GROUP

!DISP [h ]
includes an overdispersion scaling parameter (h) in the weights. If !DISP is specified with no argument, ASReml estimates it as the residual variance of the working variable. Traditionally it is estimated from the deviance residuals, reported by ASReml as Variance heterogeneity. For example,
count !POIS !DISP ~ mu group
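
The traditional deviance-based estimate mentioned above can be sketched in Python for a Poisson model as the total deviance divided by the residual degrees of freedom. This is an illustration only and is not the estimate ASReml computes when !DISP has no argument; the function names, data, fitted values and parameter count are all assumptions.

```python
import math

def poisson_unit_deviance(y, mu):
    # y * log(y / mu) is taken as 0 when y = 0.
    term = y * math.log(y / mu) if y > 0 else 0.0
    return 2.0 * (term - (y - mu))

def dispersion_from_deviance(y, mu, n_params):
    # Mean deviance: total deviance over residual degrees of freedom.
    deviance = sum(poisson_unit_deviance(yi, mi) for yi, mi in zip(y, mu))
    return deviance / (len(y) - n_params)

# Hypothetical counts and fitted means from a model with 2 estimated effects.
y = [0, 3, 5, 12, 2, 9]
mu = [1.5, 2.5, 6.0, 8.0, 3.0, 10.0]
print(round(dispersion_from_deviance(y, mu, n_params=2), 4))
```

A value well above 1 would suggest more variation than the Poisson distribution allows for.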

!OFFSET [ o ]
is used especially with binomial data to include an offset in the model, where o is the number or name of a variable in the data. The offset is only included in binomial and Poisson models (for Normal models, just subtract the offset variable from the response variable). For example,
count !POIS !OFFSET base !disp ~ mu group
The offset will often be something like ln(n).
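
To see why ln(n) is a natural offset, consider a Poisson model with a log link, an intercept only, and counts y_i arising from exposures n_i. The Python sketch below uses hypothetical data and the closed-form maximum likelihood solution for this special case; it is not ASReml code.

```python
import math

# Hypothetical counts and the exposures they were observed over.
y = [4, 10, 7, 21]
n = [20, 50, 40, 100]

# The offset enters the linear predictor with a fixed coefficient of 1:
#   log(mu_i) = b0 + ln(n_i)
offset = [math.log(ni) for ni in n]

# For an intercept-only model the maximum likelihood estimate is
# b0 = ln(sum(y) / sum(n)), the log of the pooled rate.
b0 = math.log(sum(y) / sum(n))

# Fitted means scale with exposure: mu_i = n_i * exp(b0).
mu = [math.exp(b0 + o) for o in offset]
rate = math.exp(b0)

print("rate per unit exposure:", round(rate, 4))
print("fitted means:", [round(m, 2) for m in mu])
```

With the offset in place, the model effects describe the rate per unit of exposure rather than the raw count.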

!TOTAL [v]
is used especially with binomial data where v is the field containing the total counts for each sample. If omitted, the total is taken as 1.

Residual qualifiers

These control the form of the residuals returned in the .yht file. The predicted values returned in the .yht file will be on the linear predictor scale if the !WORK or !PVW qualifiers are used. They will be on the observation scale if the !DEVIANCE, !PEARSON, !RESPONSE or !PVR qualifiers are used.

!DEVIANCE
produces deviance residuals, the signed square root of d/h where d is the deviance and h is the dispersion parameter controlled by the !DISP qualifier. This is the default.

!PEARSON
produces Pearson residuals, (y-mu)/sqrt(v)

!RESPONSE
produces simple residuals, y-mu

!WORK
produces residuals on the linear predictor scale, (y-mu)/(dmu/deta), where eta is the linear predictor.
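
The four residual forms can be compared side by side for a single binary observation under the logit link. The Python sketch below is an illustration under stated assumptions, not ASReml's implementation: the function name is hypothetical, v is taken as the binomial variance mu(1-mu), and the derivative of the mean with respect to the linear predictor is mu(1-mu) for the logit link.

```python
import math

def binary_logit_residuals(y, mu, h=1.0):
    """Residuals for a binary response y (0 or 1) with fitted probability mu."""
    # Unit deviance for a Bernoulli observation.
    d = -2.0 * math.log(mu) if y == 1 else -2.0 * math.log(1.0 - mu)
    v = mu * (1.0 - mu)         # variance function
    dmu_deta = mu * (1.0 - mu)  # derivative of mu for the logit link
    sign = math.copysign(1.0, y - mu)
    return {
        "deviance": sign * math.sqrt(d / h),  # signed square root of d/h
        "pearson": (y - mu) / math.sqrt(v),
        "response": y - mu,
        "working": (y - mu) / dmu_deta,       # linear predictor scale
    }

# Hypothetical observation: a success with fitted probability 0.8.
for name, value in binary_logit_residuals(1, 0.8).items():
    print(name, round(value, 4))
```

The response residual is on the observation scale, while the working residual is the same difference rescaled to the linear predictor scale.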
