Jupyter + Dask

Getting Started

Select Jupyter + Dask as the plugin. In the application parameters, choose the Python Version and Type for Jupyter, and the Worker Processes and Python Version for Dask. Note: setting Worker Processes to 0 starts one worker process for each processor on the node.
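The Worker Processes rule can be sketched in a few lines of Python. This is only an illustration of the behavior described in the note, not the plugin's actual code: the helper name `resolve_worker_processes` is hypothetical, and it assumes that "processor" corresponds to what `os.cpu_count()` reports on the node.

```python
import os


def resolve_worker_processes(requested: int) -> int:
    """Hypothetical helper: map the Worker Processes setting to a process count.

    Assumption: 0 means "one process per processor", and the processor
    count is taken from os.cpu_count().
    """
    if requested < 0:
        raise ValueError("Worker Processes must be 0 or a positive integer")
    if requested == 0:
        # os.cpu_count() can return None, so fall back to a single process.
        return os.cpu_count() or 1
    return requested


print(resolve_worker_processes(0))  # one process per processor on this machine
print(resolve_worker_processes(4))  # an explicit request is used as-is
```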

The Dask Diagnostic Dashboard and a Jupyter Notebook/Lab session will launch in separate browser windows.

The Dask Diagnostic Dashboard is an interactive dashboard with plots and tables providing live information on the workers. There are tabs with information about task runtimes, communication, statistical profiling, load balancing, memory use, and more.

Dask Diagnostic Dashboard
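To see activity on the dashboard, some work has to be submitted to the cluster. The sketch below is a minimal example under stated assumptions: it starts a small in-process cluster with `Client(processes=False)` as a stand-in, because the address of the scheduler started by the plugin is not given here. On the platform you would connect to the existing scheduler instead of creating a new one.

```python
import dask.array as da
from dask.distributed import Client

# Assumption: a throwaway in-process cluster stands in for the cluster
# that the plugin starts; it is used here only to keep the example runnable.
client = Client(processes=False, n_workers=1, threads_per_worker=2)

# Address of the diagnostic dashboard served for this client's scheduler.
dashboard_url = client.dashboard_link
print(dashboard_url)

# Submit a small chunked computation so the task and memory plots
# have something to display while it runs.
x = da.ones((2000, 2000), chunks=(500, 500))
total = float(x.sum().compute())
print(total)

client.close()
```

While the computation runs, the dashboard's task stream and memory plots update as the 16 chunks are processed.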

The Jupyter environment includes Dask as well as deep learning packages such as Keras, PyTorch, and TensorFlow, among others.

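Below is a simple example illustrating the limitations of the NumPy library and how switching to a Dask array enables a larger array. A NumPy array must fit entirely in memory, whereas a Dask array is split into chunks that are processed piece by piece. Dask provides similar scaling solutions for pandas and scikit-learn through Dask DataFrames and Dask-ML.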

Dask Array
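The contrast between NumPy and Dask arrays can be sketched as follows. The array sizes and chunk shapes are illustrative assumptions chosen so the example runs quickly on a small machine. In practice the Dask array can be scaled well beyond available memory, because only a few chunks are held in memory at a time.

```python
import numpy as np
import dask.array as da

# NumPy allocates the whole array immediately, so its size is limited
# by the memory available to a single process.
small = np.ones((1000, 1000))  # 1000 * 1000 * 8 bytes = 8 MB
numpy_mean = small.mean()

# Dask describes the array lazily as a grid of chunks. No data is
# allocated until compute() is called, and chunks are processed a few
# at a time rather than all at once.
big = da.ones((10_000, 10_000), chunks=(1_000, 1_000))
print(big.nbytes)     # total size the full array would occupy, in bytes
print(big.numblocks)  # number of chunks along each dimension

# The reduction runs chunk by chunk and combines the partial results.
dask_mean = big.mean().compute()
print(numpy_mean, dask_mean)
```

Increasing the shape of `big` while keeping the chunk size fixed raises the number of chunks, not the memory needed per chunk.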

Version Options & Configurations

The Jupyter application includes Conda with Python 2.7, 3.6, 3.7, and 3.8.

Node Type       CPU                 GPU
Python          3.6 / 3.7 / 3.8     3.6 / 3.7 / 3.8
CUDA Toolkit    N/A                 11.0.221
cuDNN           N/A                 8.0.4
TensorFlow      2.4.1               2.4.0
PyTorch         1.7.0               1.7.0
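To confirm which package versions are active in a running notebook kernel, a check along these lines may help. It is a sketch rather than a platform-provided tool: the helper name `installed_versions` is hypothetical, the package list is an assumption, and `importlib.metadata` requires Python 3.8 or later, so it will not work on the 3.6 and 3.7 kernels.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Hypothetical helper: return {package name: version, or None if absent}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the active environment.
            found[name] = None
    return found


# Assumption: these are the distribution names of interest; "torch" is the
# name under which PyTorch is distributed.
report = installed_versions(["dask", "numpy", "tensorflow", "torch"])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```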

External References

For more information on how to use Dask, please visit dask.org.

For more information on how to use the Jupyter family of products, please visit jupyter.org.