Reproduce hl.pc_project output using Plink

Hi!

I am trying to reproduce Hail’s pc_project output using plink --score, with the variance-standardize option and the default mean imputation. Plink notes that “PCs will be scaled a bit differently from ref_data.eigenvec; you need to multiply or divide the PCs by a multiple of sqrt(eigenvalue) to put them on the same scale”. How does Hail scale the PCs? Is there a specific scaling equation that I can use to make my plink output match Hail’s?

Thank you

Hi @Hatoon,

Apologies for the very late reply. Is this still something you need help with?

Hi,

I found the code published by Zhou et al. (doi: 10.1016/j.xgen.2022.100192) very helpful.
I was able to project my dataset with PLINK 2 --score, using loadings generated by Hail and the following options:

$plink2 \
--bfile $test \
--read-freq $freq \
--score $loadings \
variance-standardize \
cols=-scoreavgs,+scoresums \
list-variants \
header-read \
--score-col-nums 3-${pc_col} \
--out ${output_test}

Then I scaled the projected scores in the .sscore output by dividing them by the square root of the number of variants listed in .sscore.vars. I was wondering whether this scaling step is performed internally in Hail’s pc_project.
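For anyone following along, here is a minimal sketch of the rescaling step described above. It assumes the divisor is the square root of the number of variant IDs in the `.sscore.vars` file written by `list-variants`, that the `.sscore` file is tab-delimited, and that the score-sum columns end in `_SUM` (for example `PC1_SUM`). The function name `rescale_sscore` and the file names are hypothetical, and this thread does not confirm that the result matches Hail's internal normalization.

```python
import csv
import math


def rescale_sscore(sscore_path, vars_path, out_path):
    """Divide each score-sum column by sqrt(number of scored variants)."""
    # Count the variant IDs listed in the .sscore.vars file (one per line).
    with open(vars_path) as fh:
        n_variants = sum(1 for line in fh if line.strip())
    scale = math.sqrt(n_variants)

    with open(sscore_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        header = next(reader)
        # Assumption: score sums are the columns ending in "_SUM";
        # NAMED_ALLELE_DOSAGE_SUM is a dosage count, not a score.
        score_cols = [
            i for i, name in enumerate(header)
            if name.endswith("_SUM") and name != "NAMED_ALLELE_DOSAGE_SUM"
        ]
        writer.writerow(header)
        for row in reader:
            for i in score_cols:
                row[i] = repr(float(row[i]) / scale)
            writer.writerow(row)
    return n_variants
```

A call such as `rescale_sscore("out.sscore", "out.sscore.vars", "out.scaled.tsv")` writes a copy of the score file with the rescaled columns and returns the variant count it used. Check the column names in your own `.sscore` header before relying on the `_SUM` suffix.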
Detailed documentation on how to replicate Hail’s outputs using PLINK would be helpful for the future, as many users cannot install Hail on their local servers.

Thank you,