Derived factor model terms

!FAMILY

!FAMILY is similar to !GROUP in that it generates a model term t by grouping the levels of another term. It is intended for the case where the number of levels is large: t has the form fam(a[,c]), where a is an existing factor, and the recoding is read from column (field) c of file f. The default for c is 1. For example
 !FAMILY fam(Clone) Family.txt

!GROUP

The !GROUP qualifier, like !SUBSET, must appear on a line by itself after the data line and before the model line. Its purpose is to define a factor t by merging levels of an existing factor v. The syntax is
  !GROUP Group_factor Exist_factor new_codes
For example
  !GROUP Year YearLoc 1 1 1 2 2 3 3 3 4 4
forms a new factor Year with 4 levels from the existing factor YearLoc with 10 levels.
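
The recoding can be pictured as a positional lookup: the i-th code in the list gives the new level for the i-th level of the existing factor. The following is a minimal Python sketch of that idea, not ASReml code; the function name group_levels and the sample records are hypothetical.

```python
def group_levels(codes, records):
    """Map each record's level of the existing factor (1-based) to its new code."""
    return [codes[level - 1] for level in records]

# Codes from the !GROUP example: one entry per YearLoc level (10 levels).
codes = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4]

# Hypothetical YearLoc levels for five data records.
yearloc = [1, 4, 5, 8, 10]
year = group_levels(codes, yearloc)

print(year)             # [1, 2, 2, 3, 4]
print(len(set(codes)))  # 4 distinct codes, so Year has 4 levels
```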

Note that the new term t cannot be specified in a predict statement. It is the original term v that must be either predicted or averaged, even if it does not formally appear in the model. For default averaging in prediction, the weights for the levels of the grouped factor (Year) are derived from the weights for the base factor (YearLoc); in this example they are 0.3 0.2 0.3 0.2. Use !AVE YearLoc { 2 2 2 3 3 2 2 2 3 3 }/24 to produce equal weighting of Year effects. mapping of one to the other will usually lead to prediction problems.
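
The arithmetic behind these weights can be checked with a short Python sketch. It assumes that default averaging gives each of the 10 YearLoc levels an equal weight of 1/10, which is consistent with the Year weights quoted above; the function name grouped_weights is hypothetical and this is not ASReml code.

```python
from collections import Counter
from fractions import Fraction

# Codes from the !GROUP example: the Year level for each of the 10 YearLoc levels.
codes = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4]

def grouped_weights(codes, base_weights):
    """Sum the base-factor weights within each level of the grouped factor."""
    totals = {}
    for code, weight in zip(codes, base_weights):
        totals[code] = totals.get(code, 0) + weight
    return [totals[k] for k in sorted(totals)]

# Assumed default: equal weights over the base factor YearLoc.
n = len(codes)
default = [Fraction(1, n)] * n
default_year = grouped_weights(codes, default)
print([float(w) for w in default_year])  # [0.3, 0.2, 0.3, 0.2]

# Equal weighting of Year: each Year gets 1/4, shared equally among its YearLoc levels.
size = Counter(codes)
equal = [Fraction(1, len(size) * size[c]) for c in codes]
ave_numerators = [int(w * 24) for w in equal]
print(ave_numerators)                    # [2, 2, 2, 3, 3, 2, 2, 2, 3, 3]
equal_year = grouped_weights(codes, equal)
print(equal_year)                        # each Year weight is 1/4
```

A Year made up of three YearLoc levels gives each of them 1/12 = 2/24, and a Year made up of two gives each 1/8 = 3/24, which is where the list in the !AVE qualifier comes from.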

!SUBSET

This qualifier provides a convenient way to define a new factor that contains only a subset of the levels of an existing factor.
     !SUBSET name factor subset
!SUBSET definitions occur as separate lines between the datafile line and the model line, where
     name is the name of the model term being defined.
     factor is the name of an existing factor.
     subset is the list of factor levels to include in the new factor.

Example


 !SUBSET  EnvC  Env  3 5 8 9 :15 21 33
defines the model term EnvC, a factor with 12 levels (the list has 12 elements, because 9 :15 stands for the levels 9 to 15). EnvC is a reduced form of the factor Env that selects only the environments listed. It might be used in an interaction in the model to fit, say, column effects for the nominated environments. The intention is to simplify the model specification in MET (Multi Environment Trials). Missing values are transmitted as missing, and records whose level is zero are transmitted as zero.
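
The level count and the treatment of missing and zero records can be illustrated with a minimal Python sketch (not ASReml code). Two points are assumptions rather than statements from the text above: the selected levels are renumbered 1 to 12 in list order, and records whose Env level is not in the list are coded zero. The function names and the sample records are hypothetical.

```python
def expand_levels(spec):
    """Expand a level list in which ':j' after a level i means i, i+1, ..., j."""
    levels = []
    for token in spec.split():
        if ":" in token:
            left, right = token.split(":")
            if left:  # accepts the unspaced form '9:15' as well as '9 :15'
                levels.append(int(left))
            levels.extend(range(levels[-1] + 1, int(right) + 1))
        else:
            levels.append(int(token))
    return levels

def subset_factor(levels, records):
    """Recode records of the existing factor onto the subset factor."""
    # Assumption: selected levels are renumbered 1..n in list order.
    new_code = {old: i + 1 for i, old in enumerate(levels)}
    recoded = []
    for level in records:
        if level is None:
            recoded.append(None)  # missing stays missing
        else:
            # Zero stays zero; unlisted levels are assumed to become zero.
            recoded.append(new_code.get(level, 0))
    return recoded

levels = expand_levels("3 5 8 9 :15 21 33")
print(len(levels))  # 12

# Hypothetical Env values for six records (None marks a missing value).
recoded = subset_factor(levels, [3, 10, 0, None, 4, 33])
print(recoded)      # [1, 5, 0, None, 0, 12]
```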
