Filter variants in a gVCF

Hi,

What is the fastest way to filter variants in a gVCF with Hail?

I want one row per sample, so I use the entries method to convert the MatrixTable to a Table with one sample per row. But I also need to filter out the reference positions and the genotypes that are not variants.

I’m slightly confused by your question. gVCFs normally only have a single sample, so I’m not sure what you mean by “I am interested in one sample per row”.

That said, I think you want Table.filter: https://hail.is/docs/0.2/hail.Table.html#hail.Table.filter.

In my case I have gVCFs with thousands of samples; they are the result of merging individual gVCFs.

Yes, in the end I used the table's filter with GT.is_hom_ref.
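
For anyone landing here later, a minimal sketch of that approach, assuming the merged file loads as a MatrixTable with a `GT` call entry field. The helper names `non_ref_entries` and `load_and_filter` and the `path` argument are hypothetical, and this has not been benchmarked, so it is not a claim about the fastest option.

```python
def non_ref_entries(mt):
    """Return a Table with one row per (variant, sample) pair,
    keeping only genotypes that carry a non-reference allele.

    `mt` is assumed to be a Hail MatrixTable with a `GT` call entry field.
    """
    # Keep entries whose call is not homozygous reference.
    # Entries where GT is missing make the predicate missing,
    # and filter_entries drops those as well.
    mt = mt.filter_entries(mt.GT.is_non_ref())

    # Flatten the matrix: each remaining (row, column) entry becomes a Table row.
    return mt.entries()


def load_and_filter(path):
    """Hypothetical wrapper: import a multi-sample VCF and flatten it."""
    import hail as hl  # imported lazily; only needed when actually loading data

    mt = hl.import_vcf(path)
    return non_ref_entries(mt)
```

The same result can be reached by calling `entries()` first and then `Table.filter(~ht.GT.is_hom_ref())`. Filtering on the MatrixTable simply keeps the intent (drop reference genotypes) next to the data it applies to.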

We usually call these “project VCFs”. Can you describe in a little more detail what you want as the final product?