Variance Models

Factor Analysis models

FAk, FACVk and XFAk are different parameterizations of the factor analytic model, in which the variance-covariance matrix S is modelled as S = GG' + P, where G is a matrix of k loadings on the covariance scale and P is a diagonal matrix of specific variances (labelled Psi in the examples and output below). See Smith et al. (2001) and Thompson et al. (2003) for examples of factor analytic models in multi-environment trials.

The general limitations are that
  • P may not include zeros except in the XFAk formulation,
  • constraints are required in G for k > 1 for identifiability; typically, one zero is placed in the second column, two zeros in the third column, and so on,
  • the total number of parameters fitted, kw + w - k(k-1)/2, may not exceed w(w+1)/2, where w is the order of S.
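
The parameter-count limit can be checked before fitting. The following is a minimal sketch in Python (not ASReml code); the function names are hypothetical, and the count is read as kw loadings plus w specific variances less the k(k-1)/2 loadings constrained to zero.

```python
def fa_param_count(w: int, k: int) -> int:
    """Parameters fitted by a k-factor model for a matrix of order w:
    kw loadings + w specific variances - k(k-1)/2 constrained loadings."""
    return k * w + w - k * (k - 1) // 2


def max_factors(w: int) -> int:
    """Largest k (at most w) whose parameter count does not exceed w(w+1)/2."""
    limit = w * (w + 1) // 2
    k = 0
    # The count increases by w - k when moving from k to k + 1 factors,
    # so it is non-decreasing for k <= w and a linear search is enough.
    while k + 1 <= w and fa_param_count(w, k + 1) <= limit:
        k += 1
    return k


if __name__ == "__main__":
    for w in (3, 5, 10):
        k = max_factors(w)
        print(w, k, fa_param_count(w, k), w * (w + 1) // 2)
```

For example, with w = 10 a two-factor model fits 29 parameters against a limit of 55, and the limit is first exceeded at k = 7.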

    Correlation form


    FAk models the variance-covariance matrix S on the correlation scale as S = DCD, where
  • D is diagonal such that DD = diag(S),
  • C is a correlation matrix of the form FF' + E, where F is a matrix of k loading vectors on the correlation scale and E is diagonal and is defined by difference,
  • the parameters are specified in the order: loadings for each factor (F) followed by the variances (diag(S)); when k is greater than 1, constraints on the elements of F are required.

    Covariance scale

    FACVk models (CV for covariance) are an alternative formulation of FA models in which S is modelled as S = GG' + P, where G is a matrix of k loadings on the covariance scale and P is diagonal. The parameters in FACV
  • are specified in the order: loadings (G) followed by specific variances P; when k is greater than 1, constraints on the elements of G are required,
  • are related to those in FA by G = DF and P = DED.
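
The relationship between the two parameterizations can be verified numerically. The sketch below is plain Python (not ASReml code) with hypothetical function names and made-up parameter values. It assumes that "defined by difference" means E = I - diag(FF'), so that C has a unit diagonal as a correlation matrix must.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def diag(v):
    """Diagonal matrix with the entries of v on the diagonal."""
    return [[v[i] if i == j else 0.0 for j in range(len(v))] for i in range(len(v))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fa_to_facv(F, variances):
    """Convert FA parameters (F, diag(S)) to FACV parameters (G, diag(P))."""
    d = [math.sqrt(v) for v in variances]              # D, so that DD = diag(S)
    e = [1.0 - sum(f * f for f in row) for row in F]   # assumed: E = I - diag(FF')
    G = [[d[i] * f for f in row] for i, row in enumerate(F)]  # G = DF
    p = [d[i] * e[i] * d[i] for i in range(len(d))]    # diagonal of P = DED
    return G, p


def sigma_from_fa(F, variances):
    """S = DCD with C = FF' + E."""
    d = [math.sqrt(v) for v in variances]
    e = [1.0 - sum(f * f for f in row) for row in F]
    C = add(matmul(F, transpose(F)), diag(e))
    return matmul(matmul(diag(d), C), diag(d))


def sigma_from_facv(G, p):
    """S = GG' + P."""
    return add(matmul(G, transpose(G)), diag(p))


if __name__ == "__main__":
    F = [[0.8], [0.6], [0.5]]        # illustrative loadings, correlation scale
    variances = [4.0, 1.0, 2.25]     # illustrative diag(S)
    G, p = fa_to_facv(F, variances)
    print(G, p)
    print(sigma_from_fa(F, variances))
    print(sigma_from_facv(G, p))
```

Both constructions give the same S, which is why FA and FACV are described as alternative formulations of one model.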

    Extended form


    XFAk (X for extended) is the third form of the factor analytic model and has the same parameterisation as FACV, that is, S = GG' + P. However, XFA models
  • have parameters specified in the order diag(P) and vec(G); when k is greater than 1, constraints on the elements of G are required,
  • may not be used in R structures,
  • are used in G structures in combination with the xfa(f,k) model term,
  • return the factors as well as the effects,
  • permit some elements of P to be fixed to zero,
  • are computationally faster than the FACV formulation for large problems when k is much smaller than w.

    Special consideration is required when using the XFAk model. The SSP must be expanded to have room to hold the k factors. This is achieved by using the xfa(f,k) model term in place of f in the model. For example,

      y ~ site !r geno.xfa(site,2)
     0 0 1
     geno.xfa(site,2) 2
     geno
     xfa(site,2) 0 XFA2  !GP
     10*0.1   # Psi (Specific variances, assuming 10 sites)
     10*0.3   # First loadings
     10*0.1   # Second loadings
    

    In ASReml 3, if no loadings are fixed (i.e. !GP), ASReml will rotate the loadings to orthogonality and hold the leading loadings of lower factors fixed. They are, however, updated in the orthogonalization process, which occurs at the beginning of each iteration (so the final returned values have not been formally rotated).
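
Rotation is permissible because GG' is unchanged when G is post-multiplied by any orthogonal matrix, which is also why constraints are needed for identifiability when k is greater than 1. The sketch below is plain Python with hypothetical function names and made-up loadings. It illustrates this for k = 2 by rotating G so that the first loading of the second factor becomes zero; it does not reproduce ASReml's own orthogonalization step.

```python
import math


def gg_transpose(G):
    """Return GG' for a loadings matrix G stored as a list of rows."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in G] for ri in G]


def rotate(G, theta):
    """Post-multiply a two-column G by the rotation [[c, s], [-s, c]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[g1 * c - g2 * s, g1 * s + g2 * c] for g1, g2 in G]


def rotate_to_constraint(G):
    """Pick the rotation that sets the first loading of factor 2 to zero."""
    g1, g2 = G[0]
    return rotate(G, -math.atan2(g2, g1))


if __name__ == "__main__":
    G = [[0.5, 0.1], [0.4, 0.3], [0.3, -0.2]]   # illustrative loadings
    R = rotate_to_constraint(G)
    print(R[0])               # second entry is zero up to rounding error
    print(gg_transpose(G))
    print(gg_transpose(R))    # same matrix as for G, up to rounding error
```

Since both sets of loadings give the same S, the likelihood cannot distinguish between them without a constraint or a rotation convention.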

    Finding the REML solutions for multifactor factor analytic models can be difficult. The first problem is specifying initial values. When using !CONTINUE and progressing from XFA(k) to XFA(k+1), ASReml 3 initialises the next factor at SQRT(P*0.4) and changes the sign of the (relatively) largest loading to negative.

    One strategy which sometimes works in this context is to hold the previously estimated factor loadings fixed for one round of iterations, so that the next factor aims at explaining variation previously incorporated in Psi, and then to allow all loadings to be updated in the next round. A second problem, at present unresolved, is that sometimes the LogL rises to a relatively high value and then drifts away. In an attempt to make the process easier, these two processes have been linked as an additional meaning for the !AILOADING qualifier. For the first !AILOADING iterations, the loading coefficients for all but the last factor are held fixed. After that, loadings are rotated to orthogonal and updated. If !AILOADING is not set by the user and the model is an upgrade from a lower order XFA, !AILOADING is set to 4.

    The following is the coding for a large job trying to estimate factors.
     !WORK 1 !NOGRAPH !continue
     Title: ALBUS2tage.
     #trial,year,region,variety,yield,rep,weight,ems
     #KFA02BURU,2002,NSW,KIEV-MUTANT,0.873,3,2136.562,0.0010000
      trial   !A
      year    !I
      region  !A
      variety !A
      yield
      rep     *
      weight  !*0.025
      ems
     !CYCLE 11 1 2 3 4
     !DOPART $I
     ALBUS2tage.csv  !SKIP 1   !MAXIT 40 !AILOAD 20
    
     !PART 11
      !MAXIT 25
      yield !wt=weight ~ mu trial !r  trial.variety
     1 1 1
     0 !S2==0.025
     trial.variety 2
     trial 0 CORUH .1
     87*.1
     variety
    
     !PART 1 2 3 4
      yield !wt=weight ~ mu trial !r xfa(trial,$I).var
     1 1 1
     0 !S2==0.025
     xfa(trial,$I).var 2
     xfa(trial,$I) 0 XFA$I     !GP
     87*.01
     87*.07  87*.07   87*.07  87*.07
     variety
    
    A previous set of analyses using these five models gave LogL values for the models CORUH, XFA1, XFA2, XFA3 and XFA4 respectively of 2782, 2910, 3021, 3109 and 3200 using the strategies listed above in separate runs. Running this job using the integrated strategy produced LogL values of 2783, 2911, 3048, 3153 and 3206. However, for models XFA3 and XFA4, the LogL drifted away again.

    The XFA display reported in the .res file has been revised. The current output from a small example with 9 environments and 2 factors is
     DISPLAY of variance partitioning for XFA structure in xfa(Env,2).Geno
     Lvl |----+----+----+----+----+----+----+----+----+----| TotalVar %expl PsiVar Loadings
       1 |                                       1         |   0.3339  79.7 0.0679 0.5147 0.0335
       2 |                                               1 2   0.1666 100.0 0.0000 0.4003 0.0797
       3 |                            1    2               |   0.2475  67.8 0.0798 0.3805 0.1514
       4 |                                            1    2   0.1475 100.0 0.0000 0.3625 0.1269
       5 |                                        1        2   0.4496 100.0 0.0000 0.6104 -0.278
       6 |                     1                           2   0.1210 100.0 0.0000 0.2287 0.2622
       7 |                    1     2                      |   0.4106  54.4 0.1872 0.4152 -0.226
       8 |    1                                            2   0.0901 100.0 0.0000 0.0922 0.2857
       9 |                           1                     2   0.1422 100.0 0.0000 0.2819 0.2506
       0 |----+----+----+----+----+----+----+----+-- Average   0.2343  89.1 0.0372 0.3651 0.0763
    
    In the figure, 1 indicates the proportion of TotalVar explained by the first loading, and 2 indicates the proportion explained by the first and second loadings together (provided it plots to the right of 1). Consequently, the distance from 2 to the right margin represents PsiVar. %expl reports the percentage of TotalVar explained by all loadings. The last row contains column averages.
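
The columns of the display appear to be linked through S = GG' + P: TotalVar is PsiVar plus the sum of the squared loadings, and %expl is the share of TotalVar due to the loadings. This reading is inferred from the model and agrees with the rows shown above. The following Python sketch, with a hypothetical function name, recomputes rows 1 and 3 from their PsiVar and Loadings entries.

```python
def xfa_partition(psi, loadings):
    """Return (TotalVar, %expl, cumulative proportions) for one level.

    psi      -- specific variance (PsiVar column)
    loadings -- loadings for that level, one per factor
    """
    explained = [value * value for value in loadings]  # variance due to each factor
    total = psi + sum(explained)                       # TotalVar
    cumulative = []
    running = 0.0
    for part in explained:
        running += part
        cumulative.append(running / total)             # plot positions of 1, 2, ...
    pct_explained = 100.0 * sum(explained) / total     # %expl
    return total, pct_explained, cumulative


if __name__ == "__main__":
    total, pct, cumulative = xfa_partition(0.0679, [0.5147, 0.0335])  # level 1
    print(f"{total:.4f} {pct:.1f}")   # 0.3339 79.7
    total, pct, cumulative = xfa_partition(0.0798, [0.3805, 0.1514])  # level 3
    print(f"{total:.4f} {pct:.1f}")   # 0.2475 67.8
```

The cumulative proportions correspond to the positions of the symbols 1 and 2 along the bar. Small discrepancies are possible for other rows because the printed loadings are rounded or truncated.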
