RVD error! Keys found out of order


I am running Hail locally and get the following error after using .union_cols() to merge two files:

HailException: RVD error! Keys found out of order:
Current key: [1:49861205,[A,AGTGT]]
Previous key: [1:49861205,[A,AGTGTGTGT]]
This error can occur after a split_multi if the dataset
contains both multiallelic variants and duplicated loci.

I ran hl.split_multi_hts() on both files, but I now get this error when I try to run PCA, and also when I run the following, which worked fine before the merge:
# mt.count() returns (n_rows, n_cols), i.e. (variants, samples)
n = mt.count()
print('n samples:', n[1])
print('n variants:', n[0])

I have also tried filtering out the split alleles afterwards using "mt.was_split == True, keep = False" to see if that helps, but no luck so far. I'm a new user of Hail, so any help is much appreciated. Thanks in advance!

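The problem is that your data contained duplicate loci before you ran split_multi_hts, so splitting the multiallelic variants at those loci can produce rows whose keys are no longer in sorted order. Passing permit_shuffle=True to split_multi_hts fixes this error, at the cost of additional compute time.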
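To make the cause concrete, here is a minimal pure-Python sketch (not Hail code) of how splitting can break the key order. It assumes rows are ordered by (locus, alleles) with the alleles compared lexicographically, and it ignores the allele trimming that the real split does. The two input records are hypothetical: the locus and the AGTGT / AGTGTGTGT alleles come from your error message, but the AGT allele and the idea that the first record is the multiallelic one are made up for illustration.

```python
# Rows are keyed by (locus, alleles) and are expected to be sorted by that key.
# Hypothetical input: two records at the same locus, one of them multiallelic.
rows = [
    (("1", 49861205), ["A", "AGT", "AGTGTGTGT"]),  # multiallelic (AGT is made up)
    (("1", 49861205), ["A", "AGTGT"]),             # duplicate locus
]


def split_multi(rows):
    """Emit one biallelic row per alternate allele, keeping the input order."""
    return [
        (locus, [alleles[0], alt])
        for locus, alleles in rows
        for alt in alleles[1:]
    ]


def is_sorted(rows):
    """True if every row's key is >= the key of the row before it."""
    return all(a <= b for a, b in zip(rows, rows[1:]))


split_rows = split_multi(rows)

print(is_sorted(rows))                # input keys are in order
print(is_sorted(split_rows))          # split keys are not
print(is_sorted(sorted(split_rows)))  # re-sorting restores the order
```

In this sketch the row [A, AGTGT] ends up after [A, AGTGTGTGT], which is the same pair of keys your error reports. The final re-sort plays the role that the shuffle plays in Hail; the corresponding call there would be hl.split_multi_hts(mt, permit_shuffle=True).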

I will try that. Thanks!