Problem loading multiple csv files for annotation on DNAnexus

Thank you @danking! Everything is working today. I don’t know why, but the error from hl.import_table() no longer reproduces when I pass quote='"'. I could have sworn it did yesterday… FWIW I am using a new cluster now, which sometimes fixes things.

I was able to import all the files very quickly with the glob pattern you suggested, too!

For pandas import, it actually works now too!

[image: Table output]

[image: Annotated matrixtable output]

And a checkpoint() command works too.

No idea what the reason is… new cluster, perhaps.

One additional question, though: I’m not sure how to annotate the matrixtable after importing all the csv files into a single ht with hl.import_table(). I considered ht.group_by(), but I don’t think I want to use an aggregator. A series of `ht.filter()` commands could work, but that seems like it would require ~1500 passes through the table (one per phenotype). Any other suggestions?
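
To make the question concrete, here is a rough sketch of the single-pass alternative on the pandas side, rather than ~1500 filters. Everything about the file layout is an assumption for illustration: one phenotype per file, files named <phenotype>.csv, and hypothetical columns `sample_id` and `value`. The function name `load_phenotypes_wide` is made up too.

```python
import glob
import os

import pandas as pd


def load_phenotypes_wide(pattern, sample_col="sample_id", value_col="value"):
    """Read every CSV matching `pattern` and return one row per sample.

    Assumed (hypothetical) layout: each file holds one phenotype, is named
    <phenotype>.csv, and has the columns `sample_col` and `value_col`.
    """
    frames = []
    for path in sorted(glob.glob(pattern)):
        df = pd.read_csv(path, quotechar='"', dtype={sample_col: str})
        # Record which phenotype each row came from, using the file name.
        df["phenotype"] = os.path.splitext(os.path.basename(path))[0]
        frames.append(df[[sample_col, "phenotype", value_col]])
    if not frames:
        raise FileNotFoundError(f"no files match {pattern!r}")

    long_df = pd.concat(frames, ignore_index=True)
    # A single pivot replaces the per-phenotype filters: rows are samples,
    # columns are phenotypes, and missing combinations become NaN.
    # pivot() raises ValueError if a (sample, phenotype) pair is duplicated.
    wide = long_df.pivot(index=sample_col, columns="phenotype", values=value_col)
    wide.columns.name = None
    return wide.reset_index()
```

If that shape is right, the wide frame could presumably go through hl.Table.from_pandas(wide, key='sample_id') and then be joined with something like mt.annotate_cols(pheno=ht[mt.s]), assuming the matrixtable is keyed by a column field `s` that matches `sample_id`. That last step is untested here, and I would still prefer a way to do the reshaping in Hail directly.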

Best,
Jeremy