Spatial analysis of a field experiment - Barley
Context: Formal Examples
Introduction
In this section we illustrate the ASReml syntax for performing
spatial and incomplete block analysis of a field experiment. There has
been a large amount of interest in developing techniques for the
analysis of spatial data both in the context of field experiments and
geostatistical data (see for example, Cullis and Gleeson,
1991; Cressie, 1991; Gilmour et al., 1997). This example
illustrates the analysis of 'so-called' regular spatial data, in which
the data is observed on a lattice or regular grid. This is typical of
most small plot designed field experiments. Spatial data is often
irregularly spaced, either by design or because of the observational
nature of the study. The techniques we present in the following can be
extended for the analysis of irregularly spaced spatial data, though,
larger spatial data sets may be computationally challenging, depending
on the degree of irregularity or models fitted.
The data we consider is taken from Gilmour et al. (1995) and
involves a field experiment designed to compare the performance of 25
varieties of barley. The experiment was conducted at Slate Hall Farm,
UK in 1976, and was designed as a balanced lattice square with
replicates laid out as shown in the Table. The data fields
were Rep, RowBlk, ColBlk, row, column and yield. Lattice row
and column numbering is typically within
replicates and so the terms specified in the linear model to account
for the lattice row and lattice column effects would be
Rep.latticerow Rep.latticecolumn. However, in this example lattice rows
and columns are both numbered from 1 to 30 across replicates
(see Table). The terms in the
linear model are therefore simply RowBlk ColBlk. Additional
fields row and column indicate the
spatial layout of the plots.
The ASReml input file is presented below. Three models have been
fitted to these data. The lattice analysis is included for comparison
in PATH 3. In PATH 1 we use the separable first order
autoregressive model to model the variance structure of the plot
errors. Gilmour et al. (1997) suggest this is often a useful
model to commence the spatial modelling process. The form of the
variance matrix for the plot errors (R structure) is given by
σ2Σ = σ2(Σccross Σr)
where Σc and Σr are 15 by 15 and
10 by 10 matrix functions of the column (φc) and row
(φr) autoregressive parameters respectively.
Gilmour et al. (1997) recommend revision of the current spatial
model based on the use of diagnostics such as the sample variogram of
the residuals (from the current model). This diagnostic and a
summary of row and column residual trends are
produced by default with graphical versions of ASReml when a spatial model has been
fitted to the errors. It can be suppressed, by the use of the -n
option on the command line. We have produced the following plots by
use of the -g22 option.
Field layout of Slate Hall Farm experiment
Column - Replicate levels
Row | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
|
|
1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3
|
2 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3
|
3 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3
|
4 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3
|
5 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3
|
6 | 4 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 6 | 6 | 6 | 6 | 6
|
7 | 4 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 6 | 6 | 6 | 6 | 6
|
8 | 4 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 6 | 6 | 6 | 6 | 6
|
9 | 4 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 6 | 6 | 6 | 6 | 6
|
10 | 4 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 6 | 6 | 6 | 6 | 6
|
|
Row | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
|
|
1 | 1 | 1 | 1 | 1 | 1 | 11 | 11 | 11 | 11 | 11 | 21 | 21 | 21 | 21 | 21
|
2 | 2 | 2 | 2 | 2 | 2 | 12 | 12 | 12 | 12 | 12 | 22 | 22 | 22 | 22 | 22
|
3 | 3 | 3 | 3 | 3 | 3 | 13 | 13 | 13 | 13 | 13 | 23 | 23 | 23 | 23 | 23
|
4 | 4 | 4 | 4 | 4 | 4 | 14 | 14 | 14 | 14 | 14 | 24 | 24 | 24 | 24 | 24
|
5 | 5 | 5 | 5 | 5 | 5 | 15 | 15 | 15 | 15 | 15 | 25 | 25 | 25 | 25 | 25
|
6 | 6 | 6 | 6 | 6 | 6 | 16 | 16 | 16 | 16 | 16 | 26 | 26 | 26 | 26 | 26
|
7 | 7 | 7 | 7 | 7 | 7 | 17 | 17 | 17 | 17 | 17 | 27 | 27 | 27 | 27 | 27
|
8 | 8 | 8 | 8 | 8 | 8 | 18 | 18 | 18 | 18 | 18 | 28 | 28 | 28 | 28 | 28
|
9 | 9 | 9 | 9 | 9 | 9 | 19 | 19 | 19 | 19 | 19 | 29 | 29 | 29 | 29 | 29
|
10 | 10 | 10 | 10 | 10 | 10 | 20 | 20 | 20 | 20 | 20 | 30 | 30 | 30 | 30 | 30
|
|
Row | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
|
|
1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
|
2 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
|
3 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
|
4 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
|
5 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
|
6 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
|
7 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
|
8 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
|
9 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
|
10 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30
|
Slate Hall example
Rep 6 # Six replicates of 5x5 plots in 2x3 arrangement
RowBlk 30 # Rows within replicates numbered across replicates
ColBlk 30 # Columns within replicates numbered across replicates
row 10 # Field row
column 15 # Field column
variety 25
yield
barley.asd !skip 1 !DOPATH 1
!PATH 1 # AR1 x AR1
y ~ mu var
1 2
15 column AR1 0.1 # Second field is specified so ASReml can sort
10 row AR1 0.1 # records properly into field order
!PATH 2 # AR1 x AR1 + units
y ~ mu var !r units
1 2
15 column AR1 0.1
10 row AR1 0.1
!PATH 3 # incomplete blocks
y ~ mu var !r Rep Rowblk Colblk
!PATH 0
predict variety !TWOSTAGEWEIGHTS
Abbreviated ASReml output file is presented
below. The iterative sequence has converged to column and row
correlation parameters of (.68377,.45859) respectively. The plot size
and orientation is not known and so it is not possible to ascertain
whether these values are spatially sensible. It is generally found
that the closer the plot centroids, the higher the spatial
correlation. This is not always the case and if the highest between
plot correlation relates to the larger spatial distance then this may
suggest the presence of extraneous variation (see Gilmour
et al., 1997), for example. Figure presents a plot of the
sample variogram of the residuals from this model. The plot appears in
reasonable agreement with the model.
The next model includes a measurement error or nugget effect
component. That is the variance model for the plot errors is now given
by
σ2Σ = σ2(Σccross Σr +
ψI150)
where ψ is the ratio of nugget variance to error variance
(σ2). The abbreviated output for this model is given
below. There is a significant improvement in the REML Loglikelihood with the
inclusion of the nugget effect (see Table).

Figure 1 Sample variogram of the residuals from the AR1 cross AR1 model for the Slate Hall data
# AR1 x AR1
#
1 LogL=-739.681 S2= 36034. 125 df 1.000 0.1000 0.1000
2 LogL=-714.340 S2= 28109. 125 df 1.000 0.4049 0.1870
3 LogL=-703.338 S2= 29914. 125 df 1.000 0.5737 0.3122
4 LogL=-700.371 S2= 37464. 125 df 1.000 0.6789 0.4320
5 LogL=-700.324 S2= 38602. 125 df 1.000 0.6838 0.4542
6 LogL=-700.322 S2= 38735. 125 df 1.000 0.6838 0.4579
7 LogL=-700.322 S2= 38754. 125 df 1.000 0.6838 0.4585
8 LogL=-700.322 S2= 38757. 125 df 1.000 0.6838 0.4586
Final parameter values 1.0000 0.68377 0.45861
Source Model terms Gamma Component Comp/SE % C
Variance 150 125 1.00000 38756.6 5.00 0 P
Residual AR=AutoR 15 0.683767 0.683767 10.80 0 U
Residual AR=AutoR 10 0.458607 0.458607 5.55 0 U
Wald F statistics
Source of Variation NumDF DenDF F-inc Prob
8 mu 1 12.8 850.88 <.001
6 variety 24 80.0 13.04 <.001
# AR1 x AR1 + units
1 LogL=-740.735 S2= 33225. 125 df : 2 components constrained
2 LogL=-723.595 S2= 11661. 125 df : 1 components constrained
3 LogL=-698.498 S2= 46239. 125 df
4 LogL=-696.847 S2= 44725. 125 df
5 LogL=-696.823 S2= 45563. 125 df
6 LogL=-696.823 S2= 45753. 125 df
7 LogL=-696.823 S2= 45796. 125 df
Source Model terms Gamma Component Comp/SE % C
units 150 150 0.106154 4861.48 2.72 0 P
Variance 150 125 1.00000 45796.3 2.74 0 P
Residual AR=AutoR 15 0.843795 0.843795 12.33 0 U
Residual AR=AutoR 10 0.682686 0.682686 6.68 0 U
Wald F statistics
Source of Variation NumDF DenDF F-inc Prob
8 mu 1 3.5 259.81 <.001
6 variety 24 75.7 10.21 <.001
The lattice analysis (with recovery of between block information) is
presented below. This variance model is not competitive with the
preceding spatial models. The models can be formally compared using
the BIC values for example.
# IB analysis
1 LogL=-734.184 S2= 26778. 125 df
2 LogL=-720.060 S2= 16591. 125 df
3 LogL=-711.119 S2= 11173. 125 df
4 LogL=-707.937 S2= 8562.4 125 df
5 LogL=-707.786 S2= 8091.2 125 df
6 LogL=-707.786 S2= 8061.8 125 df
7 LogL=-707.786 S2= 8061.8 125 df
- - - Results from analysis of yield - - -
Approximate stratum variance decomposition
Stratum Degrees-Freedom Variance Component Coefficients
Rep 5.00 266657. 25.0 5.0 5.0 1.0
RowBlk 24.00 74887.8 0.0 4.3 0.0 1.0
ColBlk 23.66 71353.5 0.0 0.0 4.3 1.0
Residual Variance 72.34 8061.81 0.0 0.0 0.0 1.0
Source Model terms Gamma Component Comp/SE % C
Rep 6 6 0.528714 4262.39 0.62 0 P
RowBlk 30 30 1.93444 15595.1 3.06 0 P
ColBlk 30 30 1.83725 14811.6 3.04 0 P
Variance 150 125 1.00000 8061.81 6.01 0 P
Wald F statistics
Source of Variation NumDF DenDF F-inc Prob
8 mu 1 5.0 1216.29 <.001
6 variety 24 79.3 8.84 <.001
Finally, we present portions of the .pvs files to illustrate the
prediction facility of ASReml . The first five and last three
variety means are presented for illustration.
The overall SED printed is the square root
of the average variance of difference between the variety means.
The two spatial analyses
have a range of SEDs which are available if the !SED
qualifier is used.
All variety comparisons have the same SED
from the third analysis as the design is a balanced lattice square. The F-statistic
statistics
for the spatial models are greater than for the lattice analysis. We note
the F-statistic for the AR1 cross AR1 + units model is smaller than the
F-statistic for the AR1 cross AR1.
Predicted values of yield
#AR1 x AR1
variety PredictedVlue StandardEror Ecode
1.0000 1257.9763 64.6146 E
2.0000 1501.4483 64.9783 E
3.0000 1404.9874 64.6260 E
4.0000 1412.5674 64.9027 E
5.0000 1514.4764 65.5889 E
. . .
23.0000 1311.4888 64.0767 E
24.0000 1586.7840 64.7043 E
25.0000 1592.0204 63.5939 E
SED: Overall Standard Error of Difference 59.05
#AR1 x AR1 + units
variety PredictedVlue StandardEror Ecode
1.0000 1245.5843 97.8591 E
2.0000 1516.2331 97.8473 E
3.0000 1403.9863 98.2398 E
4.0000 1404.9202 97.9875 E
5.0000 1471.6197 98.3607 E
. . .
23.0000 1316.8726 98.0402 E
24.0000 1557.5278 98.1272 E
25.0000 1573.8920 97.9803 E
SED: Overall Standard Error of Difference 60.51
# IB
Rep is ignored in the prediction
RowBlk is ignored in the prediction
ColBlk is ignored in the prediction
variety PredictedVlue StandardEror Ecode
1.0000 1283.5870 60.1994 E
2.0000 1549.0133 60.1994 E
3.0000 1420.9307 60.1994 E
4.0000 1451.8554 60.1994 E
5.0000 1533.2749 60.1994 E
. . .
23.0000 1329.1088 60.1994 E
24.0000 1546.4699 60.1994 E
25.0000 1630.6285 60.1994 E
SED: Overall Standard Error of Difference 62.02
Notice the differences in SE and SED associated with the
various models. Choosing a model on the basis of smallest SE or
SED is not recommended because the model is not necessarily
fitting the variability present in the data.
Summary of models for the Slate Hall data
| REML | number of | |
|
model | log-likelihood | parameters | F-statistic | SED
|
|
AR1 cross AR1 | -700.32 | 3 | 13.04 | 59.0
|
AR1 cross AR1 + units | -696.82 | 4 | 10.22 | 60.5
|
IB | -707.79 | 4 | 8.84 | 62.0
|
The predict statement included the qualifier
!TWOSTAGEWEIGHTS.
This generates an extra table in the .pvs file which we
now display for each model.
Predicted values with Effective Replication assuming
Variance= 38754.26
Heron: 1 1257.98 22.1504
Heron: 2 1501.45 20.6831
Heron: 3 1404.99 22.5286
Heron: 4 1412.57 22.7623
Heron: 5 1514.48 21.1830
. . . .
Heron: 25 1592.02 26.0990
Predicted values with Effective Replication assuming
Variance= 45796.58
Heron: 1 1245.58 23.8842
Heron: 2 1516.24 22.4423
Heron: 3 1403.99 24.1931
Heron: 4 1404.92 24.0811
Heron: 5 1471.61 23.2995
. . . .
Heron: 25 1573.89 26.0505
Predicted values with Effective Replication assuming
Variance= 8061.808
Heron: 1 1283.59 4.03145
Heron: 2 1549.01 4.03145
Heron: 3 1420.93 4.03145
Heron: 4 1451.86 4.03145
Heron: 5 1533.27 4.03145
. . . .
Heron: 25 1630.63 4.03145
The value of 4 for the IB analysis is clearly reasonable given
there are 6 actual replicates but this analysis has used up
48 degrees of freedom for the rowblk and colblk
effects. The values from the spatial analyses are similar
but much higher reflecting the gain in accuracy from the
spatial analysis.
Back
Return to start