Linkage groups and 2-point estimations

In CarthaGene, all two-point information (LODs, distances, recombination ratios...) is computed at load time, so we can directly try to group the markers into linkage groups. This is done with the group command, which takes a distance threshold and a LOD threshold: two markers whose two-point distance (Haldane/Ray) is below the distance threshold and whose two-point LOD is above the LOD threshold are put in the same linkage group. Users who want to ignore one of the two thresholds can set it to an extreme value that every pair of markers satisfies (an arbitrarily large distance threshold, or a LOD threshold of zero).
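The grouping rule described above can be sketched in a few lines of Python. This is only an illustration, not CarthaGene's actual code: it assumes that groups are the connected components of the "linked" relation (so two markers can share a group through intermediate markers), and the function name linkage_groups and the dictionary layout are made up for the example. Distances are in cM; the LOD and distance values are copied from the two-point matrices of group 10 shown later in this section.

```python
from itertools import combinations

def linkage_groups(markers, dist, lod, dist_threshold, lod_threshold):
    """Group markers by single linkage on two-point statistics.

    dist and lod map frozenset({a, b}) to the two-point distance (cM)
    and the two-point LOD of the pair (hypothetical data layout).
    """
    parent = {m: m for m in markers}

    def find(m):
        # Follow parent links up to the group representative,
        # shortening the path along the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in combinations(markers, 2):
        pair = frozenset((a, b))
        # A pair is linked only if it passes BOTH thresholds.
        if dist[pair] < dist_threshold and lod[pair] > lod_threshold:
            parent[find(a)] = find(b)

    groups = {}
    for m in markers:
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())

# Two-point values taken from the mrklod2p / mrkdist2p output of this section.
pairs = {
    ("L029", "D022"): (6.3, 10.0),
    ("L029", "A036"): (1.1, 61.8),
    ("D022", "A036"): (2.6, 29.8),
    ("L029", "M232"): (4.4, 28.2),
    ("D022", "M232"): (4.8, 15.9),
    ("A036", "M232"): (9.0, 13.0),
}
lod = {frozenset(p): v[0] for p, v in pairs.items()}
dist = {frozenset(p): v[1] for p, v in pairs.items()}

three = linkage_groups(["L029", "D022", "A036"], dist, lod, 30.0, 3.0)
four = linkage_groups(["L029", "D022", "A036", "M232"], dist, lod, 30.0, 3.0)
print(three)  # [['A036'], ['L029', 'D022']]
print(four)   # [['L029', 'D022', 'A036', 'M232']]
```

With only three markers, A036 stays alone because none of its pairs passes both thresholds. Adding M232, which is linked to all three, merges everything into one group.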

Let's try the group command with a 30 cM distance threshold (entered as 0.3) and a LOD threshold of 3.0:

CG> group  0.3 3.0

Linkage Groups :
---------------:
LOD threshold=3.00
Distance threshold=30.00:

 Group ID : Marker ID List ...
        1 : 275
        2 : 197
        3 : 139
        4 : 128 281 259 178 146
        5 : 86 265 243 192 116
        6 : 54 307 304 295 289 201 177 150 102 96 67
        7 : 47 236 48
        8 : 30 249 248 234 223 210 173 80 172 287 167 120
        9 : 21 303 282 279 242 202 200 185 61 170 26 228 226 224 196 183 16...
       10 : 20 284 277 255 220 239 186 132 99 94 75 85 62 42 38
       11 : 18 306 302 305 272 166 131 187 269 271 250 237 144 176 133 90 1...
       12 : 17 266 260 257 254 231 247 114 207 225 212 199 188 122 55 110 1...
       13 : 15 298 103 27 273 211 208 159 157 115 45 205 164 112
       14 : 13 296 276 292 267 263 230 209 153 141 113 100 22 182 299 175 2...
       15 : 10 204 245 191 190 184 179 240 221 154 127 123 155 92 91 89 57 ...
       16 : 8 41
       17 : 7 138 258 219 34 286 280 203 181 149 107 109 74 37 28 29 63
       18 : 6 268 308 222 291 235 156 105 39 52 32 104 246 194 158 151 126 ...
       19 : 5 262 216 147 117 97 88 12 218 293 300 274 232 171 270 229 206 ...
       20 : 4 283 142 56 129 189 134 87 23 244 238 68
       21 : 3 72 290 251 193 121 84 40 162 137 71 16 50 233 174
       22 : 2 294 297 256 241 143 140 217 252 227 119 136 70 14 11 64
       23 : 1 301 264 165 125 124 35 278 214 213 9 111 66 59 198
23
In this case, we get 23 groups. In what follows we will work on group 10. To focus on this group, we get the list of its markers with the groupget command and then select them with the mrkselset command. This can be done easily in the graphical interface with a few clicks. If you use the shell, proceed as follows:
CG> groupget 10
20 284 277 255 220 239 186 132 99 94 75 85 62 42 38
To select the markers of the group automatically, one can use square brackets: whatever appears between [ and ] is replaced with the result of the command it contains (this is called a macro-expansion). We can therefore select group 10 with the following command:
CG> mrkselset [groupget 10]
Note that the current marker selection is not only a set of markers but also a default marker ordering. We can look at the matrix of LODs between each pair of markers using the mrklod2p command.
CG> mrklod2p

             20   284   277   255   220   239   186   132    99    94    75...
           L029  A079  A059  A036  M232  D022  M237  M030  M076  M034  T018...
          -----------------------------------------------------------------...
    L029 |------  4.4   5.5   1.1   4.4   6.3   2.0   4.0   2.0   1.1   3.6...
    A079 |  4.4 ------ 18.4   7.4  16.5   5.7  10.5  21.7  10.5   7.4  14.0...
    A059 |  5.5  18.4 ------  6.2  14.2   6.4   9.0  17.8   9.0   6.2  11.9...
    A036 |  1.1   7.4   6.2 ------  9.0   2.6  13.0   7.7  13.0  21.4   8.6...
    M232 |  4.4  16.5  14.2   9.0 ------  4.8  13.0  16.0  13.0   9.0  17.8...
    D022 |  6.3   5.7   6.4   2.6   4.8 ------  3.2   5.2   3.2   2.6   4.5...
    M237 |  2.0  10.5   9.0  13.0  13.0   3.2 ------ 11.0  19.9  13.0  12.8...
    M030 |  4.0  21.7  17.8   7.7  16.0   5.2  11.0 ------ 11.0   7.7  13.5...
    M076 |  2.0  10.5   9.0  13.0  13.0   3.2  19.9  11.0 ------ 13.0  12.8...
    M034 |  1.1   7.4   6.2  21.4   9.0   2.6  13.0   7.7  13.0 ------  8.6...
    T018 |  3.6  14.0  11.9   8.6  17.8   4.5  12.8  13.5  12.8   8.6 -----...
    T035 | 13.6   7.2   8.8   2.6   7.4   9.6   4.0   6.8   4.0   2.6   6.5...
    L078 | 13.9   6.9   8.4   2.5   7.1   9.8   3.9   6.5   3.9   2.5   6.2...
    L001 |  5.9  16.8  19.9   6.6  15.1   6.4   9.6  16.3   9.6   6.6  12.8...
    L010 | 18.1   4.3   5.5   1.3   4.3   5.4   2.2   3.9   2.2   1.3   3.5...
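As a reminder of what a two-point LOD measures, here is a minimal Python sketch for the simplest possible case: a backcross with fully informative, complete data, where the LOD is the log10 likelihood ratio between the estimated recombination fraction and free recombination (0.5). This is an assumption made for illustration only. The population type of the data set used here is not stated in this section, CarthaGene handles other designs and missing data, and the function name two_point_lod is hypothetical.

```python
import math

def two_point_lod(n_recombinant, n_total):
    """Two-point LOD for n_recombinant recombinants among n_total meioses.

    Assumes a backcross with complete, fully informative genotypes.
    """
    # Maximum likelihood estimate of the recombination fraction, capped at 0.5.
    r = min(n_recombinant / n_total, 0.5)
    lod = 0.0
    if n_recombinant > 0:
        lod += n_recombinant * math.log10(2 * r)
    if n_total > n_recombinant:
        lod += (n_total - n_recombinant) * math.log10(2 * (1 - r))
    return lod

print(two_point_lod(50, 100))  # 0.0: no evidence of linkage
print(two_point_lod(10, 100))  # about 16: strong evidence of linkage
```

The fewer recombinants are observed between two markers, the higher the LOD, which is why pairs at distance 0.0 show the largest values in the matrix above.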
Two-point LODs are also used by the mrkdouble command, which detects pairs of markers that have compatible genotypes on every individual. Such pairs of markers should be merged into one marker to simplify the search for an optimal map (see section 1.8 on this topic). In practice, however, two markers can be compatible on all individuals and nevertheless be unlinked; in that case they should not be merged.
CG> mrkdouble

Possible double markers:

               L029 = L010            [18.1]
               A079 = M030            [21.7]
               A036 = M034            [21.4]
               M237 = M076            [19.9]
               T035 = L078            [21.4]
You can see that 5 pairs of markers are "double markers". We will ignore this issue for now and work on the whole group.
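The notion of compatible genotypes can be illustrated with a short Python sketch. It assumes that two markers are compatible when, on every individual, their genotypes are either identical or at least one of them is missing. The exact rule used by mrkdouble is not spelled out here, so treat this as an approximation. The genotype strings, the "-" code for missing data and the function names are all hypothetical.

```python
from itertools import combinations

MISSING = "-"  # hypothetical code for an unobserved genotype

def compatible(genos_a, genos_b):
    """True if no individual has two observed, different genotypes."""
    return all(a == b or MISSING in (a, b) for a, b in zip(genos_a, genos_b))

def double_markers(genotypes):
    """Return all pairs of markers whose genotype vectors are compatible."""
    return [(m1, m2) for m1, m2 in combinations(genotypes, 2)
            if compatible(genotypes[m1], genotypes[m2])]

# Toy data: one character per individual (made up for the example).
genotypes = {
    "m1": "AHAH-A",
    "m2": "AHA-HA",
    "m3": "AHHHHA",
    "m4": "------",
}
print(double_markers(genotypes))
# [('m1', 'm2'), ('m1', 'm4'), ('m2', 'm4'), ('m3', 'm4')]
```

The marker m4, with no observed genotype, is compatible with every other marker although the data say nothing about linkage. This is the kind of situation where the two-point LOD reported next to each pair helps decide whether merging is justified.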

We can also look at the two-point distance matrix using the mrkdist2p command. Either the Haldane (h) or the Kosambi (k) mapping function can be used. The Ray distance is automatically selected for radiation hybrid data (whether you specify h or k).

CG> mrkdist2p h

Print two points distance matrices of the loci selection :
---------------------------------------------------------:

Data Set Number  1 :
                  L029  A079  A059  A036  M232  D022  M237  M030  M076  M03...
                 ----------------------------------------------------------...
           L029 |------ 29.9  25.3  61.8  28.2  10.0  43.4  31.1  43.4  61....
           A079 | 29.9 ------  2.2  17.9   3.4  13.2  10.0   0.0  10.0  17....
           A059 | 25.3   2.2 ------ 21.6   5.9  10.6  13.0   2.3  13.0  21....
           A036 | 61.8  17.9  21.6 ------ 13.0  29.8   5.9  16.6   5.9   0....
           M232 | 28.2   3.4   5.9  13.0 ------ 15.9   5.9   3.5   5.9  13....
           D022 | 10.0  13.2  10.6  29.8  15.9 ------ 22.5  13.9  22.5  29....
           M237 | 43.4  10.0  13.0   5.9   5.9  22.5 ------  8.8   0.0   5....
           M030 | 31.1   0.0   2.3  16.6   3.5  13.9   8.8 ------  8.8  16....
           M076 | 43.4  10.0  13.0   5.9   5.9  22.5   0.0   8.8 ------  5....
           M034 | 61.8  17.9  21.6   0.0  13.0  29.8   5.9  16.6   5.9 ----...
           T018 | 30.5   4.9   7.6  12.2   1.2  16.0   4.9   5.0   4.9  12....
           T035 |  6.0  18.4  14.8  40.0  16.7   2.4  27.4  18.9  27.4  40....
           L078 |  5.9  19.6  16.1  41.5  18.0   2.4  28.9  20.3  28.9  41....
           L001 | 23.4   3.4   1.1  19.8   4.6  10.4  11.5   3.5  11.5  19....
           L010 |  0.0  26.3  21.4  51.1  24.3   6.5  36.8  27.4  36.8  51....
You can also look at the two-point recombination ratios (breakage ratios for RH data) using mrkfr2p:
CG> mrkfr2p

Print two points recombination fractions  matrices of the loci selection :
---------------------------------------------------------------------------:

        L029  A079  A059  A036  M232  D022  M237  M030  M076  M034  T018  T...
       --------------------------------------------------------------------...
 L029 |------  0.2   0.2   0.4   0.2   0.1   0.3   0.2   0.3   0.4   0.2   ...
 A079 |  0.2 ------  0.0   0.2   0.0   0.1   0.1   0.0   0.1   0.2   0.0   ...
 A059 |  0.2   0.0 ------  0.2   0.1   0.1   0.1   0.0   0.1   0.2   0.1   ...
 A036 |  0.4   0.2   0.2 ------  0.1   0.2   0.1   0.1   0.1   0.0   0.1   ...
 M232 |  0.2   0.0   0.1   0.1 ------  0.1   0.1   0.0   0.1   0.1   0.0   ...
 D022 |  0.1   0.1   0.1   0.2   0.1 ------  0.2   0.1   0.2   0.2   0.1   ...
 M237 |  0.3   0.1   0.1   0.1   0.1   0.2 ------  0.1   0.0   0.1   0.0   ...
 M030 |  0.2   0.0   0.0   0.1   0.0   0.1   0.1 ------  0.1   0.1   0.0   ...
 M076 |  0.3   0.1   0.1   0.1   0.1   0.2   0.0   0.1 ------  0.1   0.0   ...
 M034 |  0.4   0.2   0.2   0.0   0.1   0.2   0.1   0.1   0.1 ------  0.1   ...
 T018 |  0.2   0.0   0.1   0.1   0.0   0.1   0.0   0.0   0.0   0.1 ------  ...
 T035 |  0.1   0.2   0.1   0.3   0.1   0.0   0.2   0.2   0.2   0.3   0.2 --...
 L078 |  0.1   0.2   0.1   0.3   0.2   0.0   0.2   0.2   0.2   0.3   0.2   ...
 L001 |  0.2   0.0   0.0   0.2   0.0   0.1   0.1   0.0   0.1   0.2   0.1   ...
 L010 |  0.0   0.2   0.2   0.3   0.2   0.1   0.3   0.2   0.3   0.3   0.2   ...
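The distance and recombination matrices are two views of the same estimates, related by a mapping function. The Python sketch below implements the standard Haldane and Kosambi formulas, with distances in cM. The function names are chosen for the example and are not CarthaGene commands, and the Ray distance used for RH data is not covered.

```python
import math

def haldane_cm(r):
    """Haldane distance in cM for a recombination fraction 0 <= r < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cm(r):
    """Kosambi distance in cM for a recombination fraction 0 <= r < 0.5."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def haldane_inverse(d_cm):
    """Recombination fraction corresponding to a Haldane distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))

# Haldane distances printed by mrkdist2p h for L029-A079 and L029-A036.
print(round(haldane_inverse(29.9), 1))  # 0.2, as printed by mrkfr2p
print(round(haldane_inverse(61.8), 1))  # 0.4, as printed by mrkfr2p
```

For the same recombination fraction, the Kosambi distance is never larger than the Haldane distance, because Kosambi's function assumes some crossover interference.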
Thomas Schiex 2009-10-27