G-structures

Introduction

Each random term in the linear model has a default variance structure. In most cases, this is a scaled identity. The exceptions are:
  • the giv() model term, which associates a GIV matrix with the term,
  • !P pedigree factors, which have an A-inverse matrix associated by default unless the factor is expressed with the ide() model term.

A G-structure definition is required to assign some other variance structure to a model term. Usually a G-structure is defined as a direct product of variance structures. Each G-structure definition consists of
  • a G-structure header, which identifies a model term and the number of components in the direct product structure,
  • a variance structure definition for each component.

The number of G-structure definitions must be given as the third field on the Variance Header line so that ASReml knows how many definitions follow.
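The direct product mentioned above can be illustrated numerically. The following is a minimal sketch in plain Python, not ASReml code: the function name kronecker and the 2x2 trait matrix are hypothetical, and it assumes that the components combine as a Kronecker product in the order they are listed (first component on the left).

```python
def kronecker(a, b):
    """Return the direct (Kronecker) product of two matrices stored as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    result = []
    for row_a in a:
        for i in range(rows_b):
            # Each entry of row_a scales row i of b.
            result.append([x * b[i][j] for x in row_a for j in range(cols_b)])
    return result

# Hypothetical 2x2 covariance matrix for the first component (for example, two traits).
trait_cov = [[4.0, 1.0],
             [1.0, 2.0]]

# Identity matrix for the second component (for example, three sires).
sire_id = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

g = kronecker(trait_cov, sire_id)
for row in g:
    print(row)
```

The result is a 6x6 matrix in which each element of the first matrix is expanded into a 3x3 block, which is why the order of the structure definitions has to match the order of the factors in the model term.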

    A variance structure definition consists of
    Size Sortkey VCODE [ qualifiers ] initialvalues
    Sortkey is usually 0 in G structures (but for spatial correlation models it points to the spatial coordinates).

    Sample G-structure definitions

    Multivariate sire model


    A typical variance structure, assuming three traits, can be written as
     Trait.sire 2       # G header: Model term, number of components
     Trait 0 US  !GP    # First structure definition
     6*0                # Uses internal estimates for initial values
     sire               # Second structure definition: sire 0 ID
    
    where
  • Trait.sire is the model term to which the structure applies and
  • 2 is the number of components in the direct product, that is, the number of variance structure definitions that follow.

    The order of the structure definitions must agree with the order of the effects, Trait then sire.
  • Trait specifies that the Size of this matrix is the number of levels in the factor Trait. The size may be given explicitly but using the factor name means that this bit of code does not need to be changed if the number of traits is changed, and makes it clear that the following variance structure pertains to the Trait dimension of Trait.sire.
  • 0 specifies a value for Sortkey. It is usually 0 in G structures.
  • US is the VCODE for an unstructured variance matrix.
  • !GP is a qualifier specifying that the estimated unstructured variance matrix must be kept positive definite.
  • 6*0 supplies the initial values. Assuming 3 traits, the US structure requires six initial values: 3 variances and 3 covariances in the order
    V11
    C21 V22
    C31 C32 V33
    (lower triangle rowwise). However, it is generally difficult to guess suitable values, so we have supplied initial values of zero (6*0 is six zeros) and ASReml will obtain initial values as a proportion of the simple variances and covariances of the residual.
  • sire specifies the levels in the second variance structure. If the second and third fields of a structure definition are omitted, the structure is taken as an Identity. ID is the VCODE for an Identity matrix.
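The "lower triangle rowwise" ordering can be made concrete with a small sketch. This is plain Python for illustration only, not part of ASReml; the function name us_from_lower_triangle and the six numeric values are hypothetical.

```python
def us_from_lower_triangle(values, size):
    """Build a symmetric matrix from values given lower triangle rowwise."""
    expected = size * (size + 1) // 2  # 3 traits -> 6 values
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    matrix = [[0.0] * size for _ in range(size)]
    k = 0
    for i in range(size):
        for j in range(i + 1):
            # Fill the lower triangle and mirror it into the upper triangle.
            matrix[i][j] = values[k]
            matrix[j][i] = values[k]
            k += 1
    return matrix

# Hypothetical values in the order V11, C21, V22, C31, C32, V33.
us = us_from_lower_triangle([1.0, 0.2, 2.0, 0.3, 0.4, 3.0], 3)
for row in us:
    print(row)
```

The count size*(size+1)/2 is where the six initial values for three traits come from; with four traits the same structure would need ten.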

    Multivariate animal model


    A typical variance structure, assuming three traits, can be written as
     Trait.animal 2       # Model term, number of components
     Trait 0 US  !GP
     6*0
     animal 0 AINV        # AINV is the fixed A inverse formed using the pedigree
    
    where Trait.animal is the model term to which the structure applies and 2 is the number of components in the direct product.

    The order in which the components are defined must agree with the order of the effects, Trait then animal.
  • US is the VCODE for an unstructured variance matrix. It requires six initial values: 3 variances and 3 covariances in the order V11 C21 V22 C31 C32 V33 (lower triangle rowwise). However, it is generally difficult to guess suitable values, so we have supplied initial values of zero (6*0 is six zeros) and ASReml will obtain initial values as a proportion of the simple variances and covariances of the residual.
  • AINV is the VCODE for the inverse numerator relationship matrix generated from the pedigree and associated with animal by the !P factor definition qualifier.

    Genetic correlation across sites


    As a more complicated example, consider the analysis of, say, 50 variety trials where most varieties occur at most sites and all varieties occur at two or more sites.

    Ultimately, we want to fit a factor analytic model but to get starting values for that, we first fit a uniform covariance model.
     site.variety 2              # Model term, number of components
     site 0 CORUV .1 1
     variety                     # variety 0 ID
    
    where site.variety is the model term to which the structure applies and 2 is the number of components in the direct product.

    The order in which the components are defined must agree with the order of the effects, site then variety.
  • CORUV is the VCODE for a Uniform CORrelation matrix scaled by a single Variance. This is a simple model, although it may take a while to run. Equivalently, fit the two model terms variety and site.variety with no explicit G-structure definition.
  • If the second and third fields of a structure definition are omitted, the structure is taken as an Identity. ID is the VCODE for an Identity matrix.
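The uniform correlation structure and its equivalence to the two-term model can be checked numerically. The sketch below is plain Python, not ASReml code; the function name coruv_matrix is hypothetical, only 4 sites are used to keep the output small, and it assumes that the initial values .1 and 1 are read as the correlation followed by the variance.

```python
def coruv_matrix(size, correlation, variance):
    """Covariance matrix with a common variance and a common correlation."""
    return [[variance if i == j else variance * correlation
             for j in range(size)] for i in range(size)]

sites = 4            # small illustrative size; the example in the text has 50 sites
correlation = 0.1    # assumed to be the first initial value
variance = 1.0       # assumed to be the second initial value

total = coruv_matrix(sites, correlation, variance)

# Equivalent two-term form: a variety component shared across sites
# plus an independent site.variety component.
variety_var = correlation * variance
interaction_var = (1.0 - correlation) * variance
decomposed = [[variety_var + (interaction_var if i == j else 0.0)
               for j in range(sites)] for i in range(sites)]

for row in total:
    print(row)
```

Under these assumptions the shared variety component is 0.1 and the site.variety component is 0.9, which reproduces the same between-site covariance matrix. The decomposition only applies when the correlation is not negative.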

    An extended factor analytic variance structure requires first that the xfa() term be used in the model. Assuming 1 factor, it can be written as
     xfa(site,1).variety 2       # Model term, number of components
     xfa(site,1) 0 XFA1  !GP
     50*3                        # Initial specific variances
     50*.5                       # Initial loadings
     variety                     # variety 0 ID
    
    where xfa(site,1).variety is the model term to which the structure applies and 2 is the number of components in the direct product.

    The order in which the components are defined must agree with the order of the effects, xfa(site,1) then variety.
  • The extended factor analytic model requires an extra column in the design for the factor and this is what the xfa(.,1) model function achieves.
  • XFA1 is the VCODE for an extended factor analytic variance matrix. It requires 100 initial values: 50 specific variances and 50 loadings. ASReml cannot guess suitable values so a simpler model would normally be fitted first. The values of 3 for the specific variance and 0.5 for the loading correspond to a variety variance of 3.25 and a correlation between sites of 0.25/3.25 (=.08).
  • If the second and third fields of a structure definition are omitted, the structure is taken as an Identity. ID is the VCODE for an Identity matrix.
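The arithmetic behind the initial values can be verified with a short sketch. This is plain Python for illustration, not ASReml code; the function name xfa1_covariance is hypothetical, and it assumes the usual one-factor form in which the between-site covariance is the product of the loadings with the specific variance added on the diagonal.

```python
def xfa1_covariance(loadings, specific):
    """One-factor covariance: loading_i * loading_j, plus specific variance on the diagonal."""
    n = len(loadings)
    return [[loadings[i] * loadings[j] + (specific[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]

n_sites = 50
specific = [3.0] * n_sites   # initial specific variances (50*3)
loadings = [0.5] * n_sites   # initial loadings (50*.5)

cov = xfa1_covariance(loadings, specific)

site_variance = cov[0][0]
site_correlation = cov[0][1] / (cov[0][0] * cov[1][1]) ** 0.5
n_initial_values = len(specific) + len(loadings)

print(site_variance, round(site_correlation, 4), n_initial_values)
```

With these starting values every site has variance 0.5 squared plus 3, that is 3.25, and every pair of sites has correlation 0.25/3.25, matching the figures quoted above.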
