Reproduce hl.pc_project output using Plink

Hatoon · August 8, 2025, 1:00am

Hi!

I am trying to reproduce Hail’s pc_project output using plink --score. I am adding the variance-standardize option and keeping mean imputation. Plink notes that “PCs will be scaled a bit differently from ref_data.eigenvec; you need to multiply or divide the PCs by a multiple of sqrt(eigenvalue) to put them on the same scale”. How does Hail scale the PCs? Is there a specific scaling equation I can use to make my plink output match Hail’s?

Thank you

patrick-schultz · October 29, 2025, 7:50pm

Hi @Hatoon,

Apologies for the very late reply. Is this still something you need help with?

Hatoon · October 31, 2025, 7:49pm

Hi,

I found the code published by Zhou et al. (doi: 10.1016/j.xgen.2022.100192) very helpful.
I was able to project my dataset using PLINK --score on loadings generated by Hail, with the following options:

$plink2 \
--bfile $test \
--read-freq $freq \
--score $loadings \
variance-standardize \
cols=-scoreavgs,+scoresums \
list-variants \
header-read \
--score-col-nums 3-${pc_col} \
--out ${output_test}

Then I scaled the projected score sums in the .sscore output by dividing them by the square root of the number of variants listed in the .sscore.vars file. I was wondering whether this scaling step is performed internally in Hail’s pc_project.
Detailed documentation on how to replicate Hail’s outputs using PLINK would be helpful for the future, as many users cannot install Hail on their local servers.
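
The scaling step described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the helper names (count_variants, rescale_sscore), the score column names (PC1_SUM, PC2_SUM) and the tiny inline data are made up for the example, and real input would be read from the .sscore and .sscore.vars files written by the PLINK command. Whether the result matches Hail exactly depends on how pc_project normalizes internally, which this thread does not confirm.

```python
import csv
import io
import math


def count_variants(vars_text: str) -> int:
    """Count the non-empty lines (one variant ID per line) of a .sscore.vars file."""
    return sum(1 for line in vars_text.splitlines() if line.strip())


def rescale_sscore(sscore_text: str, score_columns: list[str], n_variants: int) -> list[dict]:
    """Divide the named score-sum columns of a tab-delimited table by sqrt(n_variants)."""
    scale = math.sqrt(n_variants)
    reader = csv.DictReader(io.StringIO(sscore_text), delimiter="\t")
    rows = []
    for row in reader:
        for name in score_columns:
            # Only the listed columns are rescaled; ID columns stay untouched.
            row[name] = float(row[name]) / scale
        rows.append(row)
    return rows


# Made-up stand-ins for the contents of the two PLINK output files.
# The column names here are assumptions; check the header of the real .sscore file.
demo_sscore = "#IID\tPC1_SUM\tPC2_SUM\ns1\t8.0\t-4.0\ns2\t2.0\t6.0\n"
demo_vars = "rs1\nrs2\nrs3\nrs4\n"

n_variants = count_variants(demo_vars)
scaled = rescale_sscore(demo_sscore, ["PC1_SUM", "PC2_SUM"], n_variants)

for row in scaled:
    print(row["#IID"], row["PC1_SUM"], row["PC2_SUM"])
```

The score columns are passed explicitly rather than matched by suffix, so that other summed columns PLINK may write are not rescaled by accident.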

Thank you,
