Balanced repeated measures - Grass

The data for this example is taken from the GenStat manual. It consists of a total of 5 measurements of height (cm) taken on 14 plants. The 14 plants were either diseased or healthy and were arranged in a glasshouse in a completely random design. The heights were measured 1, 3, 5, 7 and 10 weeks after the plants were placed in the glasshouse. There were 7 plants in each treatment. The data are depicted in the Figure obtained by qualifier line
!Y y1 !G tmt !JOIN in the following multivariate ASReml job.


Figure 1. Trellis plot of the height for each of 14 plants

In the following we illustrate how various repeated measures analyses can be conducted in ASReml. For these analyses it is convenient to arrange the data in a multivariate form, with 7 fields representing the plant number, treatment identification and the 5 heights. The ASReml input file, up to the specification of the R structure is
 This is plant data multivariate
  tmt   !A  # Diseased Healthy
  plant 14
  y1 y3 y5 y7 y10
 grass.asd !skip 1 !ASUV
The focus is modelling of the error variance for the data. Specifically we fit the multivariate regression model given by
Y = DT + E
where Y is a 14 by 5 matrix of heights, D is a 14 by 2 design matrix, T is a 2 by 5 matrix of fixed effects and E is a 14 by 5 matrix of errors. The heights taken on the same plants will be correlated and so we assume that
var(vec(E)) = I14 cross Σ
where Σ is a 5 by 5 symmetric positive definite matrix. The variance models used for Σ are given in the Table. These represent some commonly used models for the analysis of repeated measures data (see Wolfinger, 1986). The variance models are fitted by changing the last four lines of the input file. The model lines for the first model are
 y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt !r units
 1 2 0
 14
 Trait
Table Summary of variance models fitted to the plant data
number of REML
model parameters log-likelihood BIC
Uniform 2 -196.88 401.95
Power 2 -182.98 374.15
Heterogeneous Power 6 -171.50 367.57
Antedependence (order 1) 9 -160.37 357.51
Unstructured 15 -158.04 377.50
The split plot in time model can be fitted in two ways, either by fitting a units term plus an independent residual as above, or by specifying a CORU variance model for the R-structure as follows
 y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt
 1 2 0
 14
 Trait 0 CORU .5
The two forms for Σ are given by
units: Σ = σ21J + σ22I,
CORU: Σ = σ2eI + σ2eρ(J - I),

It follows that
σ2e = σ21+ σ22 and ρ = σ21/ (σ21+ σ22)

Portions of the two outputs are given below. The REML log-likelihoods for the two models are the same and it is easy to verify that the REML estimates of the variance parameters satisfy Eqn4, viz. σ2e= 286.310 which is about 159.858 + 126.528 = 286.386 and 159.858/286.386 = 0.558191.

 #
 # !r units
 #
  LogL=-204.593     S2=  224.61         60 df   0.1000      1.000
  LogL=-201.233     S2=  186.52         60 df   0.2339      1.000
  LogL=-198.453     S2=  155.09         60 df   0.4870      1.000
  LogL=-197.041     S2=  133.85         60 df   0.9339      1.000
  LogL=-196.881     S2=  127.56         60 df    1.204      1.000
  LogL=-196.877     S2=  126.53         60 df    1.261      1.000
  Final parameter values                        1.2634     1.0000

  Source         Model  terms     Gamma     Component    Comp/SE   % C
  units             14     14   1.26342       159.858       2.11   0 P
  Variance          70     60   1.00000       126.528       4.90   0 P
 #
 # CORU
 #
  LogL=-196.975     S2=  264.10         60 df    1.000     0.5000
  LogL=-196.924     S2=  270.14         60 df    1.000     0.5178
  LogL=-196.886     S2=  278.58         60 df    1.000     0.5400
  LogL=-196.877     S2=  286.23         60 df    1.000     0.5580
  LogL=-196.877     S2=  286.31         60 df    1.000     0.5582
  Final parameter values                        1.0000    0.55819

  Source        Model  terms     Gamma     Component    Comp/SE   % C
  Variance         70     60   1.00000       286.310       3.65   0 P
  Residual    CORRelat     5  0.558191      0.558191       4.28   0 U
A more realistic model for repeated measures data would allow the correlations to decrease as the lag increases such as occurs with the first order autoregressive model. However, since the heights are not measured at equally spaced time points we use the EXP model. The correlation function is given by ρu= φu where u is the time lag is weeks. The coding for this is
 y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt
 1 2 0          # One error structure in two dimensions
 14             # Outer dimension: 14 plants
 Tr 0 EXP .5
 1 3 5 7 10     # Time coordinates
A portion of the output is
  LogL=-183.734     S2=  435.58         60 df    1.000     0.9500
  LogL=-183.255     S2=  370.40         60 df    1.000     0.9388
  LogL=-183.010     S2=  321.50         60 df    1.000     0.9260
  LogL=-182.980     S2=  298.84         60 df    1.000     0.9179
  LogL=-182.979     S2=  302.02         60 df    1.000     0.9192
  Final parameter values                        1.0000    0.91897

  Source      Model  terms     Gamma     Component    Comp/SE   % C
  Variance       70     60   1.00000       302.021       3.11   0 P
  Residual  POW-EXP      5  0.918971      0.918971      29.53   0 U
When fitting power models be careful to ensure the scale of the defining variate, here time, does not result in an estimate of φ too close to 1. For example, use of days in this example would result in an estimate for φ of about .993.

The residual plot from this analysis is presented in in the figure.


{Figure 2. Residual plots for the EXP variance model for the plant data.

This suggests increasing variance over time. This can be modelled by using the EXPH model, which models Σ by Σ = D0.5CD0.5 where D is a diagonal matrix of variances and C is a correlation matrix with elements given by cij = φ|ti- tj|. The coding for this is
 y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt
 1 2 0
 14 !S2==1
 Tr 0 EXPH .5 100 200 300 300 300
 1 3 5 7 10
Note that it is necessary to fix the scale parameter to 1 ( !S2==1) to ensure that the elements of D are identifiable. Abbreviated output from this analysis is
    1 LogL=-195.598     S2=  1.0000         60 df    :   1 components constrained
    2 LogL=-179.036     S2=  1.0000         60 df
    3 LogL=-175.483     S2=  1.0000         60 df
    4 LogL=-173.128     S2=  1.0000         60 df
    5 LogL=-171.980     S2=  1.0000         60 df
    6 LogL=-171.615     S2=  1.0000         60 df
    7 LogL=-171.527     S2=  1.0000         60 df
    8 LogL=-171.504     S2=  1.0000         60 df
    9 LogL=-171.498     S2=  1.0000         60 df
   10 LogL=-171.496     S2=  1.0000         60 df

  Source                Model  terms     Gamma     Component    Comp/SE   % C
  Residual            POW-EXP      5  0.906917      0.906917      21.89   0 U
  Residual            POW-EXP      5   60.9599       60.9599       2.12   0 U
  Residual            POW-EXP      5   72.9904       72.9904       1.99   0 U
  Residual            POW-EXP      5   309.259       309.259       2.22   0 U
  Residual            POW-EXP      5   436.380       436.380       2.52   0 U
  Residual            POW-EXP      5   382.369       382.369       2.74   0 U
  Covariance/Variance/Correlation Matrix POWER
   61.11     0.8227     0.6769     0.5569     0.4156
   54.88      72.80     0.8227     0.6769     0.5051
   93.12      123.5      309.7     0.8227     0.6140
   91.02      120.7      302.7      437.1     0.7462
   63.57      84.34      211.4      305.3      382.9

                                    Wald F statistics
      Source of Variation                DF      Fic
    8 Trait                               5     127.95
    1 tmt                                 1       0.00
    9 Tr.tmt                              4       4.75
The last two models we fit are the antedependence model of order 1 and the unstructured model. These require, as starting values the lower triangle of the full variance matrix. We use the REML estimate of Σ from the heterogeneous power model shown in the previous output. The antedependence model models Σ by the inverse cholesky decomposition Σ-1 = UDU' where D is a diagonal matrix and U is a unit upper triangular matrix. For an antedependence model of order q, then uij = 0 for j>i+q-1. The antedependence model of order 1 has 9 parameters for these data, 5 in D and 4 in U. The input is given by
 y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt
 1 2 0
 14 !S2==1
 Tr 0 ANTE
   60.16
   54.65      73.65
   91.50      123.3      306.4
   89.17      120.2      298.6      431.8
   62.21      83.85      208.3      301.2      379.8
The abbreviated output file is
  1 LogL=-171.501     S2=  1.0000         60 df
  2 LogL=-170.097     S2=  1.0000         60 df
  3 LogL=-166.085     S2=  1.0000         60 df
  4 LogL=-161.335     S2=  1.0000         60 df
  5 LogL=-160.407     S2=  1.0000         60 df
  6 LogL=-160.370     S2=  1.0000         60 df
  7 LogL=-160.369     S2=  1.0000         60 df

  Source      Model  terms     Gamma     Component    Comp/SE   % C
  Residual  ANTE=UDU     1  0.268657E-01  0.268657E-01   2.44   0 U
  Residual  ANTE=UDU     1 -0.628413     -0.628413      -2.55   0 U
  Residual  ANTE=UDU     2  0.372801E-01  0.372801E-01   2.41   0 U
  Residual  ANTE=UDU     2  -1.49108      -1.49108      -2.54   0 U
  Residual  ANTE=UDU     3  0.599632E-02  0.599632E-02   2.43   0 U
  Residual  ANTE=UDU     3  -1.28041      -1.28041      -6.19   0 U
  Residual  ANTE=UDU     4  0.789713E-02  0.789713E-02   2.44   0 U
  Residual  ANTE=UDU     4 -0.967815     -0.967815     -15.40   0 U
  Residual  ANTE=UDU     5  0.390635E-01  0.390635E-01   2.45   0 U
  Covariance/Variance/Correlation Matrix ANTE=UDU'
   37.20     0.5946     0.3549     0.3114     0.3040
   23.38      41.55     0.5968     0.5237     0.5112
   34.83      61.89      258.9     0.8775     0.8565
   44.58      79.22      331.4      550.8     0.9761
   43.14      76.67      320.7      533.0      541.4

                                    Wald F statistics
      Source of Variation                DF      Fic
    8 Trait                               5     188.84
    1 tmt                                 1       4.14
    9 Tr.tmt                              4       3.91
The iterative sequence converged and the antedependence parameter estimates are printed columnwise by time, the column of U and the element of D. I.e.
D = diag( 0.0269 0.0373 0.0060 0.0079 0.0391 ),
U =
1.0 -0.6284 0 0 0
0 1 -1.4911 0 0
0 0 1 -1.2804 0
0 0 0 1 -0.9678
0 0 0 0 1 .
.
Finally the input and output files for the unstructured model are presented below. The REML estimate of Σ from the ANTE model is used to provide starting values.
 y1 y3 y5 y7 y10 ~ Trait tmt Tr.tmt
 1 2 0
 14 !S2==1
 Tr 0 US
   37.20
   23.38      41.55
   34.83      61.89      258.9
   44.58      79.22      331.4      550.8
   43.14      76.67      320.7      533.0      541.4



 1 LogL=-160.368     S2=  1.0000         60 df
 2 LogL=-159.027     S2=  1.0000         60 df
 3 LogL=-158.247     S2=  1.0000         60 df
 4 LogL=-158.040     S2=  1.0000         60 df
 5 LogL=-158.036     S2=  1.0000         60 df

  Source      Model  terms     Gamma     Component    Comp/SE   % C
  Residual US=UnStr     1   37.2262       37.2262       2.45   0 U
  Residual US=UnStr     1   23.3935       23.3935       1.77   0 U
  Residual US=UnStr     2   41.5195       41.5195       2.45   0 U
  Residual US=UnStr     1   51.6524       51.6524       1.61   0 U
  Residual US=UnStr     2   61.9169       61.9169       1.78   0 U
  Residual US=UnStr     3   259.121       259.121       2.45   0 U
  Residual US=UnStr     1   70.8113       70.8113       1.54   0 U
  Residual US=UnStr     2   57.6146       57.6146       1.23   0 U
  Residual US=UnStr     3   331.807       331.807       2.29   0 U
  Residual US=UnStr     4   551.507       551.507       2.45   0 U
  Residual US=UnStr     1   73.7857       73.7857       1.60   0 U
  Residual US=UnStr     2   62.5691       62.5691       1.33   0 U
  Residual US=UnStr     3   330.851       330.851       2.29   0 U
  Residual US=UnStr     4   533.756       533.756       2.42   0 U
  Residual US=UnStr     5   542.175       542.175       2.45   0 U
  Covariance/Variance/Correlation Matrix US=UnStructu
   37.23     0.5950     0.5259     0.4942     0.5194
   23.39      41.52     0.5969     0.3807     0.4170
   51.65      61.92      259.1     0.8777     0.8827
   70.81      57.61      331.8      551.5     0.9761
   73.79      62.57      330.9      533.8      542.2
The antedependence model of order 1 is clearly more parsimonious than the unstructured model. The next Table presents the incremental Wald tests for each of the variance models. There is a surprising level of discrepancy between models for the Wald tests. The main effect of treatment is significant for the uniform, power and antedependence models.

Table Summary of Wald test for fixed effects for variance models fitted to the plant data
treatment treatment.time
model (df=1) (df=4)
Uniform 9.41 5.10
Power 6.86 6.13
Heterogeneous power 0.00 4.81
Antedependence (order 1) 4.14 3.96
Unstructured 1.71 4.46
  • Back

    Return to start