Progress output when running on Jupyter notebook

Yes, by default you’ll be running Spark with the configuration local[*], which will use all available CPUs.

I’d also recommend configuring memory as described in this post: Java Heap Space out of memory - #6 by danking

The default Spark memory settings might be throttling the computation a bit.
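
As a rough sketch of one way to raise those limits from inside a notebook: PySpark reads the `PYSPARK_SUBMIT_ARGS` environment variable when it launches the JVM, so setting it in the first cell, before anything starts Spark, should take effect. The helper name `set_spark_memory` and the `8g` values are my own placeholders, not something taken from the linked post; pick values that fit your machine.

```python
import os


def set_spark_memory(driver_memory: str = "8g", executor_memory: str = "8g") -> str:
    """Set PYSPARK_SUBMIT_ARGS so a Spark JVM started later uses these memory limits.

    Hypothetical helper for illustration; the default values are assumptions.
    """
    # "pyspark-shell" must come last so PySpark launches its usual gateway.
    args = (
        f"--driver-memory {driver_memory} "
        f"--executor-memory {executor_memory} "
        "pyspark-shell"
    )
    os.environ["PYSPARK_SUBMIT_ARGS"] = args
    return args


# Run this before Spark is initialized; it has no effect on an already running JVM.
set_spark_memory("8g", "8g")
```

In local mode the executors run inside the driver JVM, so the driver memory setting is the one that matters most here. If Spark has already been started in the notebook, restart the kernel before applying this.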
