Is there any method to convert a MatrixTable to Parquet? Looking at the API, I only see the methods to convert to a Table (rows, cols, and entries), then to_spark, and then write to Parquet. But is there a method to convert everything to Parquet in one step?
No, going through Spark is the only way to write Parquet.
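
For illustration, here is a minimal sketch of that Spark route. The helper name export_matrix_table_to_parquet and the out_dir layout are hypothetical; it assumes mt is a Hail MatrixTable and relies on Table.to_spark() returning a Spark DataFrame whose writer handles the Parquet output.

```python
def export_matrix_table_to_parquet(mt, out_dir, mode="overwrite"):
    """Write the rows, cols, and entries views of a MatrixTable as Parquet.

    `mt` is expected to be a hail.MatrixTable. `out_dir` is a hypothetical
    output prefix; one Parquet dataset is written per view.
    """
    # Each of these returns a hail.Table.
    views = {
        "rows": mt.rows(),
        "cols": mt.cols(),
        "entries": mt.entries(),
    }
    written = {}
    for name, table in views.items():
        path = f"{out_dir.rstrip('/')}/{name}.parquet"
        # Table.to_spark() gives a Spark DataFrame; its writer does the rest.
        table.to_spark().write.mode(mode).parquet(path)
        written[name] = path
    return written
```

You would call it on something like hl.read_matrix_table(...) after import hail as hl. Note that the entries view has one record per (row, column) pair, so it can be very large for a big dataset.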
And what exactly does the localize_entries method do?
localize_entries converts a matrix table to a table where the entries are represented as an array of structs per row, and the column values are an array of structs in the table globals.
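
A small sketch of the call, in case it helps. The wrapper name localize and the field names "entries" and "cols" are only illustrative choices; it assumes mt is a Hail MatrixTable.

```python
def localize(mt, entries_field="entries", cols_field="cols"):
    """Return a Table with one array of entry structs per row.

    `mt` is expected to be a hail.MatrixTable. The field names are
    illustrative; they should not clash with existing fields.
    """
    ht = mt.localize_entries(
        # New row field holding the entries of that row as an array of structs.
        entries_array_field_name=entries_field,
        # New global field holding the column values as an array of structs.
        columns_array_field_name=cols_field,
    )
    return ht
```

Calling describe() on the result should show the new array field on the rows and the column array in the globals.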
OK, thank you. For storing big files (more than 150 GB) with good performance, what is your recommendation? I want to try VDS, Avro, and Parquet.
Hail has its own format. It is created by the write method on MatrixTable. In Hail 0.1 we used to call these things Variant Datasets / VDS. Now they're more general purpose, so they're just called MatrixTable (usually with the extension .mt). These internal Hail formats should give the best performance.
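
As a rough sketch of writing the native format: the helper name write_native is hypothetical, and it assumes mt is a Hail MatrixTable. Appending .mt is only the naming convention, not something Hail requires.

```python
def write_native(mt, path, overwrite=False):
    """Write `mt` (expected: hail.MatrixTable) in Hail's native format.

    Returns the path that was actually written.
    """
    # The ".mt" suffix is a convention; add it if the caller left it off.
    if not path.endswith(".mt"):
        path = path + ".mt"
    mt.write(path, overwrite=overwrite)
    return path
```

The result can be loaded again with hl.read_matrix_table on the returned path, after import hail as hl.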