I'm using Hail 0.2.10-a4870bf102a8 with Spark 2.3.0. The startup banner reports: Running on Apache Spark version 2.3.0.cloudera4, Hail version 0.2.10-a4870bf102a8.
What I am struggling with is running Spark SQL on data from Hail that I have converted to a Spark DataFrame, e.g.:
```python
sqc = SQLContext(sc)
final_vds = hl.import_vcf(vcf)
df = final_vds.rows().to_spark(flatten=True)
df.createOrReplaceTempView("mytable")
dss = sqc.sql("select * from mytable")
```
This does not work: it complains that the view/table is not available. I can do the same thing with an ordinary Spark DataFrame produced by a Spark ingest workflow, but not with one that comes from Hail.
Should I be saving this back to HDFS, then reading it back in as a Spark DataFrame and running the SQL on that, or is there a better way?