First of all, thank you very much for your work on Hail. I’ve only started using it recently and I can already see how much potential it has!
I also recently started using GenomicsDB and saw that there’s an integration in Hail’s master branch. I tried testing it with your sample data (sample2loader.json, sample2callsets.json, etc.), and if I use the workspace you provide (tdbworkspace), everything works fine. However, if I create a workspace myself and load your sample data with my local GenomicsDB installation, it doesn’t work. The commands I use are:
./create_tiledb_workspace <workspace_path>
./vcf2tiledb <loader_path>
I have already tried running them with both version 0.6.4 and version 0.8.1. The only things I edit in your loader, vid and callsets files are the paths. The errors I get are:
- In version 0.6.4:
java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:156)
at is.hail.utils.ArrayStack$mcI$sp.top$mcI$sp(ArrayStack.scala:23)
at is.hail.annotations.RegionValueBuilder.setMissing(RegionValueBuilder.scala:172)
at is.hail.io.vcf.HtsjdkRecordReader.readVariantInfo(HtsjdkRecordReader.scala:35)
at is.hail.io.vcf.HtsjdkRecordReader.readRecord(HtsjdkRecordReader.scala:67)
at is.hail.io.vcf.LoadGDB$$anonfun$3.apply(LoadGDB.scala:182)
at is.hail.io.vcf.LoadGDB$$anonfun$3.apply(LoadGDB.scala:178)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
at scala.collection.AbstractIterator.to(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
at is.hail.io.vcf.LoadGDB$.apply(LoadGDB.scala:184)
… 71 elided
- In version 0.8.1:
terminate called after throwing an instance of 'std::length_error'
what(): basic_string::resize
When I replace the file tdbworkspace/sample2Array/__array_schema.tdb generated by my local GenomicsDB installation with your copy of the same file, the errors disappear. Am I missing something?
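In case it helps narrow this down, here is a minimal sketch of how the two __array_schema.tdb files could be compared byte by byte. The function names and the paths in the usage note are placeholders, not part of Hail or GenomicsDB, and the sketch treats the file as opaque bytes because it assumes nothing about its internal format.

```python
from pathlib import Path


def first_difference(path_a, path_b):
    """Return the offset of the first differing byte, or None if identical."""
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One file is a prefix of the other: they differ where the shorter one ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def hexdump_around(path, offset, width=16):
    """Return up to `width` bytes starting at `offset` as a hex string."""
    data = Path(path).read_bytes()
    return data[offset:offset + width].hex(" ")
```

It could be called as first_difference("<my_workspace>/sample2Array/__array_schema.tdb", "<your_workspace>/sample2Array/__array_schema.tdb"), followed by hexdump_around on each file at the returned offset to see where the locally generated schema diverges from the one shipped with the sample data.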
Thanks,
Cristina.