Using the image from Docker I've got this error despite the bgen file definitely containing data. Can anyone help?
Singularity> python
Python 3.6.5 (default, Apr 1 2018, 05:46:30)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hail as hl
>>> bed = hl.import_bed("ukb_imp_chr1_v3_pruned.bed")
Initializing Spark and Hail with default parameters…
Spark env exported
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Running on Apache Spark version 2.2.2
SparkUI available at http://172.16.139.8:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version devel-3918a9e47b23
NOTE: This is a beta version. Interfaces may change
during the beta period. We recommend pulling
the latest changes weekly.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "", line 2, in import_bed
File "/opt/hail.zip/hail/typecheck/check.py", line 546, in wrapper
File "/opt/hail.zip/hail/methods/impex.py", line 651, in import_bed
File "", line 2, in import_table
File "/opt/hail.zip/hail/typecheck/check.py", line 546, in wrapper
File "/opt/hail.zip/hail/methods/impex.py", line 1253, in import_table
File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/opt/hail.zip/hail/utils/java.py", line 210, in deco
hail.utils.java.FatalError: MalformedInputException: Input length = 1
Java stack trace:
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:72)
at scala.collection.Iterator$$anon$21.hasNext(Iterator.scala:836)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:461)
at scala.collection.Iterator$class.isEmpty(Iterator.scala:330)
at scala.collection.AbstractIterator.isEmpty(Iterator.scala:1336)
at is.hail.utils.TextTableReader$$anonfun$11.apply(TextTableReader.scala:191)
at is.hail.utils.TextTableReader$$anonfun$11.apply(TextTableReader.scala:188)
at is.hail.utils.richUtils.RichHadoopConfiguration$$anonfun$readLines$extension$1.apply(RichHadoopConfiguration.scala:287)
at is.hail.utils.richUtils.RichHadoopConfiguration$$anonfun$readLines$extension$1.apply(RichHadoopConfiguration.scala:278)
at is.hail.utils.package$.using(package.scala:587)
at is.hail.utils.richUtils.RichHadoopConfiguration$.readFile$extension(RichHadoopConfiguration.scala:271)
at is.hail.utils.richUtils.RichHadoopConfiguration$.readLines$extension(RichHadoopConfiguration.scala:278)
at is.hail.utils.TextTableReader$.read(TextTableReader.scala:188)
at is.hail.HailContext$$anonfun$importTables$3.apply(HailContext.scala:522)
at is.hail.HailContext$$anonfun$importTables$3.apply(HailContext.scala:524)
at is.hail.HailContext.maybeGZipAsBGZip(HailContext.scala:623)
at is.hail.HailContext.importTables(HailContext.scala:521)
at is.hail.HailContext.importTable(HailContext.scala:483)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Sure, I have updated to the current version using the Dockerfile that you provided, but the same error persists. A .bed file is binary, so I'm not sure it makes sense to open it directly?
Singularity> python
Python 3.6.9 (default, Nov 23 2019, 07:02:27)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hail as hl
>>> bed = hl.import_bed("/scratch/zhupy/ukb_imp_chr1_v3_pruned.bed")
Initializing Hail with default parameters…
2020-12-01 21:44:48 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2020-12-01 21:44:50 WARN Hail:37 - This Hail JAR was compiled for Spark 2.4.5, running with Spark 2.4.1.
Compatibility is not guaranteed.
Running on Apache Spark version 2.4.1
SparkUI available at http://cdr860.int.cedar.computecanada.ca:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.2.60-de1845e1c2f6
LOGGING: writing to /scratch/zhupy/hail-20201201-1344-0.2.60-de1845e1c2f6.log
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "", line 2, in import_bed
File "/usr/local/lib/python3.6/site-packages/hail/typecheck/check.py", line 614, in wrapper
return original_func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/hail/methods/impex.py", line 796, in import_bed
**kwargs)
File "", line 2, in import_table
File "/usr/local/lib/python3.6/site-packages/hail/typecheck/check.py", line 614, in wrapper
return original_func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/hail/methods/impex.py", line 1528, in import_table
ht = Table(ir.TableRead(tr))
File "/usr/local/lib/python3.6/site-packages/hail/table.py", line 343, in __init__
self._type = self._tir.typ
File "/usr/local/lib/python3.6/site-packages/hail/ir/base_ir.py", line 339, in typ
self._compute_type()
File "/usr/local/lib/python3.6/site-packages/hail/ir/table_ir.py", line 250, in _compute_type
self._type = Env.backend().table_type(self)
File "/usr/local/lib/python3.6/site-packages/hail/backend/spark_backend.py", line 277, in table_type
jir = self._to_java_table_ir(tir)
File "/usr/local/lib/python3.6/site-packages/hail/backend/spark_backend.py", line 264, in _to_java_table_ir
return self._to_java_ir(ir, self._parse_table_ir)
File "/usr/local/lib/python3.6/site-packages/hail/backend/spark_backend.py", line 257, in _to_java_ir
ir._jir = parse(r(ir), ir_map=r.jirs)
File "/usr/local/lib/python3.6/site-packages/hail/backend/spark_backend.py", line 232, in _parse_table_ir
return self._jbackend.parse_table_ir(code, ref_map, ir_map)
File "/usr/local/lib/python3.6/site-packages/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/usr/local/lib/python3.6/site-packages/hail/backend/py4j_backend.py", line 32, in deco
'Error summary: %s' % (deepest, full, hail.version, deepest), error_id) from None
hail.utils.java.FatalError: MalformedInputException: Input length = 1
Java stack trace:
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:72)
at scala.collection.Iterator$$anon$21.hasNext(Iterator.scala:834)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:462)
at scala.collection.Iterator$class.isEmpty(Iterator.scala:331)
at scala.collection.AbstractIterator.isEmpty(Iterator.scala:1334)
at is.hail.expr.ir.TextTableReader$$anonfun$14.apply(TextTableReader.scala:271)
at is.hail.expr.ir.TextTableReader$$anonfun$14.apply(TextTableReader.scala:268)
at is.hail.io.fs.FS$$anonfun$readLines$1.apply(FS.scala:218)
at is.hail.io.fs.FS$$anonfun$readLines$1.apply(FS.scala:209)
at is.hail.utils.package$.using(package.scala:618)
at is.hail.io.fs.FS$class.readLines(FS.scala:208)
at is.hail.io.fs.HadoopFS.readLines(HadoopFS.scala:70)
at is.hail.expr.ir.TextTableReader$.readMetadata(TextTableReader.scala:268)
at is.hail.expr.ir.TextTableReader$.apply(TextTableReader.scala:306)
at is.hail.expr.ir.TextTableReader$.fromJValue(TextTableReader.scala:313)
at is.hail.expr.ir.TableReader$.fromJValue(TableIR.scala:103)
at is.hail.expr.ir.IRParser$.table_ir_1(Parser.scala:1462)
at is.hail.expr.ir.IRParser$$anonfun$table_ir$1.apply(Parser.scala:1438)
at is.hail.expr.ir.IRParser$$anonfun$table_ir$1.apply(Parser.scala:1438)
at is.hail.utils.StackSafe$More.advance(StackSafe.scala:64)
at is.hail.utils.StackSafe$.run(StackSafe.scala:16)
at is.hail.utils.StackSafe$StackFrame.run(StackSafe.scala:32)
at is.hail.expr.ir.IRParser$$anonfun$parse_table_ir$1.apply(Parser.scala:1957)
at is.hail.expr.ir.IRParser$$anonfun$parse_table_ir$1.apply(Parser.scala:1957)
at is.hail.expr.ir.IRParser$.parse(Parser.scala:1946)
at is.hail.expr.ir.IRParser$.parse_table_ir(Parser.scala:1957)
at is.hail.backend.spark.SparkBackend$$anonfun$parse_table_ir$1$$anonfun$apply$20.apply(SparkBackend.scala:596)
at is.hail.backend.spark.SparkBackend$$anonfun$parse_table_ir$1$$anonfun$apply$20.apply(SparkBackend.scala:595)
at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:25)
at is.hail.expr.ir.ExecuteContext$$anonfun$scoped$1.apply(ExecuteContext.scala:23)
at is.hail.utils.package$.using(package.scala:618)
at is.hail.annotations.Region$.scoped(Region.scala:18)
at is.hail.expr.ir.ExecuteContext$.scoped(ExecuteContext.scala:23)
at is.hail.backend.spark.SparkBackend.withExecuteContext(SparkBackend.scala:247)
at is.hail.backend.spark.SparkBackend$$anonfun$parse_table_ir$1.apply(SparkBackend.scala:595)
at is.hail.backend.spark.SparkBackend$$anonfun$parse_table_ir$1.apply(SparkBackend.scala:594)
at is.hail.utils.ExecutionTimer$.time(ExecutionTimer.scala:52)
at is.hail.utils.ExecutionTimer$.logTime(ExecutionTimer.scala:59)
at is.hail.backend.spark.SparkBackend.parse_table_ir(SparkBackend.scala:594)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
I think you're conflating the UCSC BED file format with the PLINK fileset triplet (.bed, .bim, .fam). import_bed reads UCSC BED files, which are plain text; import_plink reads a PLINK triplet, whose .bed member is binary. Feeding a binary PLINK .bed to the text reader is why you get MalformedInputException: the reader tries to decode binary genotype data as UTF-8 text.
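One quick way to check which kind of .bed you have: PLINK binary .bed files start with the magic bytes 0x6c 0x1b (per the PLINK format spec), while UCSC BED files are tab-separated text. Here's a minimal sketch; the `sniff_bed` helper and the demo file names are mine for illustration, not part of Hail:

```python
def sniff_bed(path):
    """Guess whether a '.bed' file is PLINK binary or UCSC text."""
    with open(path, "rb") as f:
        magic = f.read(3)
    if magic[:2] == b"\x6c\x1b":
        # PLINK binary genotype file (third byte 0x01 means SNP-major order)
        return "plink"
    # No PLINK magic bytes: most likely a text UCSC BED file
    return "ucsc"

# Demo with two tiny stand-in files:
with open("plink_demo.bed", "wb") as f:
    f.write(b"\x6c\x1b\x01" + b"\x00" * 4)    # fake PLINK header + payload
with open("ucsc_demo.bed", "w") as f:
    f.write("chr1\t1000\t2000\tregion1\n")    # one UCSC BED interval

print(sniff_bed("plink_demo.bed"))  # plink
print(sniff_bed("ucsc_demo.bed"))   # ucsc
```

If it turns out to be a PLINK file, the reader to use is hl.import_plink(bed=..., bim=..., fam=...) rather than hl.import_bed.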