Develop locally, scale globally: Dask on Kubernetes with Google Cloud - GoDataDriven

We believe that using the cloud should not compromise your productivity. In this blog post, we show how you can develop locally while running your Dask workload in the cloud. You enjoy the comfort of your own IDE and local version control, while Google Kubernetes Engine (GKE) does the heavy lifting for you. You won't even notice it.


This is a companion discussion topic for the original entry at https://godatadriven.com/blog/develop-locally-scale-globally-dask-on-kubernetes-with-google-cloud/

Great article - trying to implement this right now and having some issues.

I'm using the exact same setup as you, and both the scheduler and the pod are created correctly.

I can port-forward both the Service and the Pod and see the dashboard; however, it shows no workers, likely because Dask cannot attach the scheduler to the pod. Here's my error output:

Creating scheduler pod on cluster. This may take some time.
Traceback (most recent call last):
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/distributed/comm/core.py", line 286, in connect
    comm = await asyncio.wait_for(
  File "/usr/lib/python3.8/asyncio/tasks.py", line 490, in wait_for
    raise exceptions.TimeoutError()
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "create_cluster_basic.py", line 36, in <module>
    cluster = KubeCluster('./dask_yaml.yaml', namespace='dask', deploy_mode="remote")
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/dask_kubernetes/core.py", line 466, in __init__
    super().__init__(**self.kwargs)
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/distributed/deploy/spec.py", line 281, in __init__
    self.sync(self._start)
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/distributed/deploy/cluster.py", line 189, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/distributed/utils.py", line 351, in sync
    raise exc.with_traceback(tb)
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/distributed/utils.py", line 334, in f
    result[0] = yield future
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/dask_kubernetes/core.py", line 595, in _start
    await super()._start()
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/distributed/deploy/spec.py", line 314, in _start
    await super()._start()
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/distributed/deploy/cluster.py", line 73, in _start
    comm = await self.scheduler_comm.live_comm()
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/distributed/core.py", line 746, in live_comm
    comm = await connect(
  File "/home/omarsumadi/.local/lib/python3.8/site-packages/distributed/comm/core.py", line 308, in connect
    raise IOError(
OSError: Timed out trying to connect to tcp://dask-omarsumadi-a409800b-1.dask:8786 after 10 s
ERROR:asyncio:Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7f75d0064310>
ERROR:asyncio:Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7f75d0064eb0>
ERROR:asyncio:Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x7f75d0065580>, 63182.628892771)]']
connector: <aiohttp.connector.TCPConnector object at 0x7f75d0064c40>
ERROR:asyncio:Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x7f75cbfc9220>, 63182.961875471)]']
connector: <aiohttp.connector.TCPConnector object at 0x7f75d0064f40>
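
One way to narrow down a timeout like the one above is to check, from the same machine that runs the script, whether the scheduler address in the error message resolves and accepts TCP connections. The following is only a minimal sketch using the Python standard library; `check_scheduler` is a hypothetical helper name (not part of Dask or dask-kubernetes), and the host and port are copied from the `OSError` line.

```python
import socket


def check_scheduler(host, port, timeout=5.0):
    """Return 'dns-failure', 'unreachable', or 'ok' for host:port.

    Hypothetical diagnostic helper, not part of any Dask API.
    """
    try:
        # Step 1: can this machine resolve the name at all?
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"
    try:
        # Step 2: does anything accept a TCP connection on that port?
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError:
        # Covers refused connections and timeouts alike.
        return "unreachable"


if __name__ == "__main__":
    # Address taken from the error message above.
    print(check_scheduler("dask-omarsumadi-a409800b-1.dask", 8786))
```

If this reports `dns-failure`, that would suggest the name is only resolvable from inside the cluster, which would be consistent with the timeout; `unreachable` would point toward networking or port forwarding instead. Neither result confirms the root cause on its own.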