How does Hail generate the sample call rate?

Hi there!

I was wondering how Hail generates the sample call rate. Are there specific criteria for defining missing data for the SNPs?

Thank you!
Maria

Assuming you’re using sample_qc, the call rate for a sample is the number of its non-missing genotypes (those where hl.is_defined(mt.GT) is true) divided by the number of rows, i.e. variants, in the dataset (mt.count_rows()). A genotype counts as missing only if its value is actually missing. No special logic is applied, and nothing about allele frequencies etc. is taken into account. If you load data from a file, the missing genotypes are exactly the ones explicitly marked as missing in the file.
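
To make the arithmetic concrete, here is a minimal plain-Python sketch of that definition. It does not use Hail. The function name sample_call_rate and the use of None to stand for a missing genotype are assumptions for illustration only, as is returning NaN for an empty input.

```python
from typing import Optional, Sequence


def sample_call_rate(genotypes: Sequence[Optional[str]]) -> float:
    """Fraction of one sample's genotypes that are not missing.

    `genotypes` holds one entry per variant (row) for a single sample;
    None stands in for a missing genotype call (illustrative convention).
    """
    n_rows = len(genotypes)
    if n_rows == 0:
        # No variants at all: the ratio is undefined. Returning NaN is a
        # choice made for this sketch, not a statement about Hail.
        return float("nan")
    # Count only the entries that are actually present.
    n_called = sum(gt is not None for gt in genotypes)
    return n_called / n_rows


# Five variants, two of them missing for this sample: 3 called / 5 rows.
calls = ["0/0", "0/1", None, "1/1", None]
print(sample_call_rate(calls))  # 0.6
```

In Hail itself you would not compute this by hand: after mt = hl.sample_qc(mt), the value should be available per sample as mt.sample_qc.call_rate.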

Good to know! Thank you very much for your prompt response.