Issue running Spark locally: java.net.BindException

Hi there,

I need some help, please, with getting Hail running locally on my Mac. I’ve managed to install Hail successfully, however when trying the tutorials in Jupyter I’m getting the following error message when I run import hail as hl followed by hl.init().

Py4JJavaError                             Traceback (most recent call last)
<ipython-input-3-435693b483a2> in <module>
      1 import hail as hl
----> 2 hl.init()

<decorator-gen-1713> in init(sc, app_name, master, local, log, quiet, append, min_block_size, branching_factor, tmp_dir, default_reference, idempotent, global_seed, spark_conf, skip_logging_configuration, local_tmpdir, _optimizer_iterations)

~/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py in wrapper(__original_func, *args, **kwargs)
    612     def wrapper(__original_func, *args, **kwargs):
    613         args_, kwargs_ = check_all(__original_func, args, kwargs, checkers, is_method=is_method)
--> 614         return __original_func(*args_, **kwargs_)
    615 
    616     return wrapper

~/miniconda3/envs/hail/lib/python3.7/site-packages/hail/context.py in init(sc, app_name, master, local, log, quiet, append, min_block_size, branching_factor, tmp_dir, default_reference, idempotent, global_seed, spark_conf, skip_logging_configuration, local_tmpdir, _optimizer_iterations)
    226         idempotent, sc, spark_conf, app_name, master, local, log,
    227         quiet, append, min_block_size, branching_factor, tmpdir, local_tmpdir,
--> 228         skip_logging_configuration, optimizer_iterations)
    229 
    230     HailContext(

~/miniconda3/envs/hail/lib/python3.7/site-packages/hail/backend/spark_backend.py in __init__(self, idempotent, sc, spark_conf, app_name, master, local, log, quiet, append, min_block_size, branching_factor, tmpdir, local_tmpdir, skip_logging_configuration, optimizer_iterations)
    191         else:
    192             self._jbackend = hail_package.backend.spark.SparkBackend.apply(
--> 193                 jsc, app_name, master, local, True, min_block_size, tmpdir, local_tmpdir)
    194             self._jhc = hail_package.HailContext.apply(
    195                 self._jbackend, log, True, append, branching_factor, skip_logging_configuration, optimizer_iterations)

~/miniconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, self.target_id, self.name)
   1258 
   1259         for temp_arg in temp_args:

~/miniconda3/envs/hail/lib/python3.7/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling z:is.hail.backend.spark.SparkBackend.apply.
: java.net.BindException: Can't assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:433)
	at sun.nio.ch.Net.bind(Net.java:425)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)
	at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)
	at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)
	at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)
	at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)
	at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)

I figure it’s a spark issue… but I’m not smart enough to figure it out!

BW
Ellie

Hey Ellie,

Three diagnostic questions:

  1. How did you install hail? With pip?

  2. What version of hail do you have?

  3. Does this work?

import pyspark
sc = pyspark.SparkContext()

This seems to be caused by Spark not picking the right IP address to bind to on certain network connections. The fix is here:

Hey John,

Thanks for getting back to me…

  1. Yes I installed with pip (no issue)
  2. Re version, I think 0.2.43?
  3. It says command not found…

:frowning:

@lecb John meant you should paste those lines into a Python interpreter. So start Python by running python in your terminal, then copy-paste those two lines. You’ll find that Spark also does not work. You need to follow the recommendations at the StackOverflow post referenced by the Discuss post that Tim linked.
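Concretely, the session should look something like this (the $ is your shell prompt; the >>> lines are typed inside Python):

$ python
>>> import pyspark
>>> sc = pyspark.SparkContext()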

This is the output:

>>> sc = pyspark.SparkContent()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'pyspark' has no attribute 'SparkContent' 

I have looked at the StackOverflow post and I’m not on a VPN or anything similar. I’ve checked my /etc/hosts file and my hostname is mapped to 127.0.0.1.

So not sure what is going on!

Did you add a line to the hosts file as here: https://stackoverflow.com/questions/34601554/mac-spark-shell-error-initializing-sparkcontext/35852781#35852781?

Yes.

For one, I am unable to write to the /etc/hosts file (even when I try to force the write in vim). The hostname entry is already set to 127.0.0.1. Additionally, running sudo hostname -s 127.0.0.1 doesn’t help.

Now I can’t even get Jupyter to work properly; it constantly says “connecting”. This is the output in my terminal:

    HTTPServerRequest(protocol='http', host='localhost:8889', method='GET', uri='/api/kernels/8e9c319e-d0d8-412d-8e75-c30d102f764b/channels?session_id=83299bab0ec7426a984abe162cf8a880', version='HTTP/1.1', remote_ip='::1')
    Traceback (most recent call last):
      File "/Users/eseaby/miniconda3/lib/python3.7/site-packages/tornado/websocket.py", line 956, in _accept_connection
        open_result = handler.open(*handler.open_args, **handler.open_kwargs)
      File "/Users/eseaby/miniconda3/lib/python3.7/site-packages/notebook/services/kernels/handlers.py", line 274, in open
        self.create_stream()
      File "/Users/eseaby/miniconda3/lib/python3.7/site-packages/notebook/services/kernels/handlers.py", line 127, in create_stream
        meth = getattr(km, 'connect_' + channel)
    AttributeError: 'MappingKernelManager' object has no attribute 'connect_control'

Hope that helps.
E

  1. You mis-typed the code snippet, it is: sc = pyspark.SparkContext()
  2. The original error, as noted here, also affects some computers connected to WiFi.
  3. The original error suggests setting spark.driver.bindAddress. You should try setting this. It is a Spark property, so you can set it using hl.init’s spark_conf keyword argument. Try setting it to an IP address that Spark can actually bind to, for example 0.0.0.0 or 127.0.0.1 (see the example just after this list).
  4. I assume, because you cannot edit /etc/hosts, that you do not have root access to your machine. I suggest you ask whoever maintains that machine to help you get pyspark working. PySpark is a dependency of Hail. Until the two pyspark lines that John posted work, you will be unable to use Hail.
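To make item 3 concrete, here is a minimal sketch of such a call (spark_conf is the keyword visible in the hl.init signature in your traceback; swap in whichever address works on your machine):

import hail as hl
hl.init(spark_conf={'spark.driver.bindAddress': '127.0.0.1'})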

Thanks Dan.

Apologies for the typo; the output is as follows:

2020-06-02 21:27:20 WARN  SparkContext:66 - Another SparkContext is being constructed (or threw an exception in its constructor).  This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:238)
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
py4j.GatewayConnection.run(GatewayConnection.java:238)
java.lang.Thread.run(Thread.java:748)
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 WARN  Utils:66 - Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
2020-06-02 21:27:20 ERROR SparkContext:91 - Error initializing SparkContext.
java.net.BindException: Can't assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:433)
	at sun.nio.ch.Net.bind(Net.java:425)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)
	at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)
	at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)
	at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)
	at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)
	at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/eseaby/miniconda3/lib/python3.7/site-packages/pyspark/context.py", line 136, in __init__
    conf, jsc, profiler_cls)
  File "/Users/eseaby/miniconda3/lib/python3.7/site-packages/pyspark/context.py", line 198, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "/Users/eseaby/miniconda3/lib/python3.7/site-packages/pyspark/context.py", line 306, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "/Users/eseaby/Documents/spark-2.4.0-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1525, in __call__
  File "/Users/eseaby/Documents/spark-2.4.0-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.net.BindException: Can't assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:433)
	at sun.nio.ch.Net.bind(Net.java:425)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)
	at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)
	at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)
	at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)
	at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)
	at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)

I’m running on a Broad machine… so I’m limited by their restrictions!

I’m afraid you’ll have to walk me through setting hl.init’s spark_conf argument. I’m a bit confused about how I’m supposed to set it up when I can’t get Hail to work. Please bear in mind you’re talking to a clinician, not a computational software engineer.
BW
Ellie

PS: I can get Hail to work with IPython but not with Jupyter, which seems odd. This is the output when running Jupyter:

[I 21:48:12.279 NotebookApp] The port 8888 is already in use, trying another port.
[I 21:48:12.280 NotebookApp] The port 8889 is already in use, trying another port.
[I 21:48:12.281 NotebookApp] The port 8890 is already in use, trying another port.
[I 21:48:12.282 NotebookApp] The port 8891 is already in use, trying another port.
[I 21:48:12.282 NotebookApp] The port 8892 is already in use, trying another port.
[I 21:48:12.289 NotebookApp] Serving notebooks from local directory: /Users/eseaby/tutorials
[I 21:48:12.289 NotebookApp] The Jupyter Notebook is running at:
[I 21:48:12.289 NotebookApp] http://localhost:8835/?token=f80c5a6907cdef0a5a810d9f90614519d9a4677d4ed67a9f
[I 21:48:12.289 NotebookApp]  or http://127.0.0.1:8835/?token=f80c5a6907cdef0a5a810d9f90614519d9a4677d4ed67a9f
[I 21:48:12.289 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 21:48:12.304 NotebookApp] 
    
    To access the notebook, open this file in a browser:
        file:///Users/eseaby/Library/Jupyter/runtime/nbserver-19903-open.html
    Or copy and paste one of these URLs:
        http://localhost:8835/?token=f80c5a6907cdef0a5a810d9f90614519d9a4677d4ed67a9f
     or http://127.0.0.1:8835/?token=f80c5a6907cdef0a5a810d9f90614519d9a4677d4ed67a9f
[I 21:48:17.916 NotebookApp] Kernel started: 56cd8da1-1326-43f5-8390-396b845c79ab
[I 21:48:19.218 NotebookApp] Adapting from protocol version 5.1 (kernel 56cd8da1-1326-43f5-8390-396b845c79ab) to 5.3 (client).
[E 21:48:19.220 NotebookApp] Uncaught exception GET /api/kernels/56cd8da1-1326-43f5-8390-396b845c79ab/channels?session_id=40e551cc6f4447b98967838cd1f20ec6 (::1)
    HTTPServerRequest(protocol='http', host='localhost:8835', method='GET', uri='/api/kernels/56cd8da1-1326-43f5-8390-396b845c79ab/channels?session_id=40e551cc6f4447b98967838cd1f20ec6', version='HTTP/1.1', remote_ip='::1')
    Traceback (most recent call last):
      File "/Users/eseaby/miniconda3/lib/python3.7/site-packages/tornado/websocket.py", line 546, in _run_callback
        result = callback(*args, **kwargs)
      File "/Users/eseaby/miniconda3/lib/python3.7/site-packages/notebook/services/kernels/handlers.py", line 274, in open
        self.create_stream()
      File "/Users/eseaby/miniconda3/lib/python3.7/site-packages/notebook/services/kernels/handlers.py", line 127, in create_stream
        meth = getattr(km, 'connect_' + channel)
    AttributeError: 'MappingKernelManager' object has no attribute 'connect_control'
[I 21:48:19.223 NotebookApp] Starting buffering for 56cd8da1-1326-43f5-8390-396b845c79ab:40e551cc6f4447b98967838cd1f20ec6
[I 21:48:20.232 NotebookApp] Adapting from protocol version 5.1 (kernel 56cd8da1-1326-43f5-8390-396b845c79ab) to 5.3 (client).
[I 21:48:20.233 NotebookApp] Restoring connection for 56cd8da1-1326-43f5-8390-396b845c79ab:40e551cc6f4447b98967838cd1f20ec6
2020-06-02 21:48:28 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).

EDIT: I fixed an incorrect code snippet that lecb points out below.


The installation of Jupyter on that computer appears broken.

Let’s try this again from scratch and see if we can isolate exactly where the issue is.

  1. It looks like you have miniconda3 installed. Let’s create a fresh miniconda3 environment.
conda create -n hailenv python=3.7
  2. Activate that environment:
conda activate hailenv
  3. Install Hail:
pip3 install hail
  4. Test PySpark:
python3 -c 'from pyspark import SparkContext, SparkConf
sc = SparkContext()'
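# If this works, you should simply get your shell prompt back with no Python traceback
# (a few Spark WARN log lines in the output are normal and harmless).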
  5. If that raises the original error posted above, the fix is to edit /etc/hosts. AFAIK, everyone with a Broad MacBook should have administrator privileges. Try executing the following to fix /etc/hosts.
sudo /bin/sh -c 'echo "127.0.0.1\t$(hostname)" >>/etc/hosts'
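# Optional sanity check: the file should now contain a line mapping 127.0.0.1 to your hostname.
grep "$(hostname)" /etc/hosts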
  6. Now try:
python3 -c 'from pyspark import SparkContext, SparkConf
sc = SparkContext()'
  7. If that does not raise any errors, go to step 8. Otherwise, we can instead try specifying the bindAddress programmatically:
python3 -c 'from pyspark import SparkContext, SparkConf
conf = SparkConf().set("spark.driver.bindAddress", "0.0.0.0")
sc = SparkContext(conf=conf)'
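# If 0.0.0.0 does not help, 127.0.0.1 (the other address suggested earlier in the thread) is worth a try:
python3 -c 'from pyspark import SparkContext, SparkConf
conf = SparkConf().set("spark.driver.bindAddress", "127.0.0.1")
sc = SparkContext(conf=conf)'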
    If that works, try starting Hail with the same configuration. First, start a Python interpreter by executing:
python3
    You should be greeted by a >>>. Inside the Python interpreter, paste the following. If this prints a table of genotypes for a few variants and samples, skip ahead to step 9 (setting up Jupyter).
import hail as hl
hl.init(spark_conf={"spark.driver.bindAddress": "0.0.0.0"})
mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)
mt.show()
  8. Great, try Hail directly now. Start a Python interpreter by executing:
python3
    You should be greeted by a >>>. Inside the Python interpreter, paste the following. If this prints a table of genotypes for a few variants and samples, go to step 9.
import hail as hl
mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)
mt.show()

  9. OK, if we’re here then some portion of the above has successfully installed and configured Hail. Remember that if you had to use hl.init(spark_conf=...), you will now need to do that every time you start Hail.
    Let’s get Jupyter working. Start by installing Jupyter from the terminal. If you’re still inside a python interpreter (with the >>> ), then type exit() and press enter or just press Control-D to exit it. Paste this:
pip3 install jupyter
  10. If that did not produce any errors, try starting Jupyter with:
jupyter notebook
  11. That should open your web browser. If it does not, copy the URL containing 127.0.0.1 and paste it into your web browser. Create a new “Python 3” notebook. Do not re-use an existing notebook. Into the first cell, paste the same successful Hail command you used above. Press Shift and Enter together. Now you should see an HTML table of genotypes.

Let me know if any step above doesn’t work. I’m confident we can get you a working Hail environment if we start from scratch.

And as an aside, Jupyter Notebooks are great for prototyping, but for a Hail project of any substantial size, you’ll want to switch to editing files and executing them separately. A lot of folks have good experiences with MS Visual Studio Code.

Dan, you are a wonderful man! Thank you for the considered response.

I’ve followed your steps and I’m getting stuck at step 6:


> sc = pyspark.SparkContext()'
Traceback (most recent call last):
  File "<string>", line 2, in <module>
NameError: name 'pyspark' is not defined

I can’t get step 7 to work for similar reasons (pyspark not defined). Any ideas?
BW
Ellie

This is because of the way Dan wrote the imports. Just change pyspark.SparkContext to SparkContext (remove the pyspark. part).
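In other words, the full command from the steps above becomes:

python3 -c 'from pyspark import SparkContext, SparkConf
sc = SparkContext()'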

Thanks. Still get an error… this time:

(hailenv) wm06b-5c7:~ eseaby$ python3 -c 'from pyspark import SparkContext, SparkConf
> sc = SparkContext()'
2020-06-03 05:37:37 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2020-06-03 05:37:38 WARN  Utils:66 - Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
2020-06-03 05:37:38 WARN  Utils:66 - Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
2020-06-03 05:37:38 WARN  Utils:66 - Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
2020-06-03 05:37:38 WARN  Utils:66 - Service 'SparkUI' could not bind on port 4043. Attempting port 4044.
2020-06-03 05:37:38 WARN  Utils:66 - Service 'SparkUI' could not bind on port 4044. Attempting port 4045.
2020-06-03 05:37:38 WARN  Utils:66 - Service 'SparkUI' could not bind on port 4045. Attempting port 4046.
2020-06-03 05:37:38 WARN  Utils:66 - Service 'SparkUI' could not bind on port 4046. Attempting port 4047.
2020-06-03 05:37:38 WARN  Utils:66 - Service 'SparkUI' could not bind on port 4047. Attempting port 4048.

Also can’t do step 7 as it throws the same error…
E

I think the problem may still be the /etc/hosts file – mine looks like this:

$ cat /etc/hosts
127.0.0.1	localhost
127.0.0.1   MY_HOSTNAME

Yours probably just has the first line, and this isn’t good enough for spark, it seems.

Can you try adding the second one?

Mine looks like this… seems to be a mess!

127.0.0.1	localhost
255.255.255.255	broadcasthost
::1             localhost
127.0.0.1	wm06b-5c7

Oh, that 4th line should be the one you need. That matches your hostname, right?

Yep it does!

Does this work?

import hail as hl
hl.init(conf={"spark.driver.bindAddress", "127.0.0.1"})
mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)
mt.show()

Hey there. Yes it does, in that I can get a mt table to print out :smiley: but the second line still throws an error in my terminal:

In [1]: import hail as hl                                                                                                                                                                                                                     

In [2]: hl.init(conf={"spark.driver.bindAddress", "127.0.0.1"})                                                                                                                                                                               
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-bfbf3b0af7a7> in <module>
----> 1 hl.init(conf={"spark.driver.bindAddress", "127.0.0.1"})

TypeError: init() got an unexpected keyword argument 'conf'

Hope that helps and thanks again for all your assistance!
E
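For anyone who hits the same TypeError: judging by the hl.init signature in the tracebacks above, the keyword argument is spark_conf rather than conf, and it takes a dict, so the intended call is presumably:

import hail as hl
hl.init(spark_conf={"spark.driver.bindAddress": "127.0.0.1"})
mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)
mt.show()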