Multiple traits SBLUP (MT-SBLUP)

The multi-trait SBLUP (MT-SBLUP) extends the single-trait SBLUP model and is reported to give higher prediction accuracy than single-trait SBLUP for all the traits in the analysis (Robert, Zhihong, et al. 2018). An MT-SBLUP model can be fitted in two ways, using either individual-level genotype data or an LD correlation matrix; see the chapter on the SBLUP model for details. Here, we only show how to fit MT-SBLUP using a pre-computed LD correlation matrix:

./hiblup --sblup
         --sumstat demo1.ma demo2.ma demo3.ma demo4.ma   #the summary data, use space as separator
         --ldm demo_ldm      #the pre-computed LD correlation matrix
         --h2 0.3234 0.1256 0.6345 0.3536
         --rg 0.1336 0.5567 0.2345 0.8454 0.3446 0.4633
       # --pcg               #use PCG for fast computing
         --threads 10
         --out demo

Note that the number of traits (summary data files) in the analysis is not limited. The summary data should be prepared in COJO format, as described here, and the LD correlation matrix is stored in a binary file, which can be output by HIBLUP (see here). The heritability of each trait and the genetic correlations among traits must be specified; these genetic parameters can be estimated by REML if individual-level data are available, or by LD score regression using the summary data. The genetic correlations are entered in the order of the lower triangle of the correlation matrix, read column by column. For the above command with four traits, the order is: 1-2 1-3 1-4 2-3 2-4 3-4, so k traits require k(k-1)/2 values.
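The input order of the genetic correlations can be checked with a short script before running the command. The following is a minimal Python sketch, not part of HIBLUP; the function name build_rg_matrix is hypothetical, and the sketch only arranges the values given to --rg into a symmetric matrix following the pair order described above.

```python
from itertools import combinations


def build_rg_matrix(n_traits, rg_values):
    """Arrange a flat list of genetic correlations into a symmetric matrix.

    rg_values must follow the pair order 1-2, 1-3, ..., 2-3, ... described
    in the text (hypothetical helper, not a HIBLUP function).
    """
    # Trait pairs in input order, using 0-based indices: (0,1), (0,2), ...
    pairs = list(combinations(range(n_traits), 2))
    if len(rg_values) != len(pairs):
        raise ValueError(
            f"{n_traits} traits need {len(pairs)} correlations, "
            f"got {len(rg_values)}"
        )
    # Start from an identity matrix: each trait correlates 1.0 with itself.
    matrix = [[1.0 if i == j else 0.0 for j in range(n_traits)]
              for i in range(n_traits)]
    for (i, j), r in zip(pairs, rg_values):
        matrix[i][j] = r
        matrix[j][i] = r
    return matrix


# Values taken from the --rg option of the example command.
rg = build_rg_matrix(4, [0.1336, 0.5567, 0.2345, 0.8454, 0.3446, 0.4633])
for row in rg:
    print(" ".join(f"{value:.4f}" for value in row))
```

Printing the matrix makes it easy to confirm that each value sits at the intended pair of traits, for example that 0.8454 is the correlation between traits 2 and 3.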

After a successful run, a file named “demo.snpeff” will be generated in the working directory, as follows:

id a1 a2 freq_a1 demo1 demo2 demo3 demo4
M1 A G 0.1285 -0.000963937 -0.000577569 -0.000792698 0.000175215
M2 T C 0.1285 -0.00108931 -0.000597102 -0.000825137 0.000177501
M3 A G 0.1062 0.00588629 0.00155157 0.00270818 0.000154987
M4 G A 0.1285 -0.00164344 -0.000557257 -0.000874613 0.000155528
M5 A C 0.2459 -0.00100206 -0.000456737 -0.000855748 -0.000422206

As shown above, the estimated SNP effects are listed in the last columns, one column per trait. To obtain the predicted GEBV or GPRS of individuals, we recommend using HIBLUP (see here), since in our tests it was several times faster than the ‘--score’ function in PLINK.
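To illustrate what such a score is, the following minimal Python sketch reads the layout of “demo.snpeff” shown above and sums allele dosage times SNP effect for each trait. It is only a sketch under stated assumptions and not HIBLUP's implementation: the function names and the example individual are hypothetical, and the dosage is assumed to count copies of allele a1 without centering or scaling, which may differ from what HIBLUP or PLINK do.

```python
def read_snpeff(lines):
    """Parse whitespace-separated lines laid out like demo.snpeff."""
    header = lines[0].split()
    traits = header[4:]  # columns after id, a1, a2, freq_a1
    effects = {}
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        effects[fields[0]] = [float(x) for x in fields[4:]]
    return traits, effects


def score_individual(dosages, effects, n_traits):
    """Sum dosage * effect over SNPs, one total per trait.

    Assumption: dosages count copies of allele a1 (0, 1 or 2).
    """
    totals = [0.0] * n_traits
    for snp, dose in dosages.items():
        if snp not in effects:
            continue  # SNP has no estimated effect; ignore it
        for t, beta in enumerate(effects[snp]):
            totals[t] += dose * beta
    return totals


# First two SNPs of the example output above.
snpeff_lines = [
    "id a1 a2 freq_a1 demo1 demo2 demo3 demo4",
    "M1 A G 0.1285 -0.000963937 -0.000577569 -0.000792698 0.000175215",
    "M2 T C 0.1285 -0.00108931 -0.000597102 -0.000825137 0.000177501",
]
traits, effects = read_snpeff(snpeff_lines)

# Hypothetical individual: two copies of a1 at M1, one copy at M2.
scores = score_individual({"M1": 2, "M2": 1}, effects, len(traits))
for trait, score in zip(traits, scores):
    print(trait, score)
```

For real data, the scoring function in HIBLUP remains the recommended route; the sketch is only meant to show how the per-trait effect columns are used.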