What is the fastest way to filter variants in a gVCF with Hail?
I am currently interested in one row per sample, so I use the entries method to convert to a Table with one sample per row. But I need to filter out the reference positions and the genotypes that are not variants.
I’m slightly confused by your question. gVCFs normally only have a single sample, so I’m not sure what you mean by “I am interested in one sample per row”.
That said, I think you want
In my case I have gVCFs with thousands of samples; they are the result of merging individual gVCFs.
Yes, in the end I filter the table using GT.is_hom_ref.
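For anyone following along, here is a minimal sketch of that kind of filter. It is not necessarily the code suggested above. It assumes the MatrixTable has a `GT` call entry field; `non_ref_entries`, `load_and_filter`, and `path` are hypothetical names used only for illustration. Filtering entries on the MatrixTable before calling `entries()` may avoid turning hom-ref calls into table rows first, though I have not benchmarked it against filtering the Table afterwards.

```python
def non_ref_entries(mt):
    """Return a Table with one row per (variant, sample) non-reference call.

    Assumes `mt` is a Hail MatrixTable with a `GT` call entry field.
    """
    # Keep only entries whose genotype carries at least one non-reference
    # allele, so hom-ref calls are dropped before flattening.
    mt = mt.filter_entries(mt.GT.is_non_ref())
    # Flatten to a Table keyed by row and column keys (one row per entry).
    return mt.entries()


def load_and_filter(path):
    # Hypothetical helper: `path` is a placeholder for your merged VCF.
    import hail as hl

    mt = hl.import_vcf(path)
    return non_ref_entries(mt)
```

The equivalent on an already flattened Table would be `ht.filter(~ht.GT.is_hom_ref())`.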
We usually call these “project VCFs”. Can you describe in a little more detail what you want the final product to look like?