NVIDIA Docker in ADE

Warning

Support for NVIDIA Docker is deprecated and will be dropped in the next release of ade-cli.

NVIDIA Docker allows developers to leverage NVIDIA GPUs inside Docker containers.

ade start automatically detects whether NVIDIA Docker is installed by checking whether /dev/nvidia0 exists. If it does, ade start adds the docker run arguments needed to make the GPUs available inside the container.
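The device check described above can be reproduced by hand to predict what ade start will detect. The following is a minimal sketch, not ade-cli's actual implementation; the helper name has_nvidia_device is hypothetical and exists only for illustration.

```shell
# Hypothetical helper (not part of ade-cli): succeeds if the given device
# node exists. The default path is the one ade start is described as checking.
has_nvidia_device() {
  [ -e "${1:-/dev/nvidia0}" ]
}

if has_nvidia_device; then
  echo "/dev/nvidia0 found"
else
  echo "/dev/nvidia0 not found"
fi
```

If the device node is reported as not found on a machine that has an NVIDIA GPU, the host driver setup is a likely place to look before troubleshooting ADE itself.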

NVIDIA Docker has been deprecated as of Docker version 19.03, so ade-cli may drop support for NVIDIA Docker in future releases.

ADE - NVIDIA Docker Compatibility

The following table describes compatible versions of ADE, NVIDIA Docker, and the container’s base image:

ADE Version > 3.4.1

                         Container Base Image
NVIDIA Docker Version    ubuntu:xenial                  ubuntu:bionic
1                        Supported                      Supported
2                        Supported with a workaround    Supported

  • Note: ADE Version <= 3.4.1 supports only NVIDIA Docker 1

  • Note: ADE Version >= 4.1.0 does not require NVIDIA Docker if Docker 19.03 or newer is installed

Troubleshooting ADE with NVIDIA Docker

  1. ADE fails to start with the following message:

$ ade start
...
subprocess.CalledProcessError: Command 'curl -s http://localhost:3476/docker/cli' returned non-zero exit status 7
  • This error usually means that you are trying to use NVIDIA Docker 2 with ADE Version <= 3.4.1
    • Fix: Upgrade ADE or export ADE_DISABLE_NVIDIA_DOCKER=1

    • Note that if NVIDIA Docker is disabled, GUIs will not work

  2. ADE starts successfully, but fails to open GUIs, for example:

ade$ glxgears
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
  • This error means that the base image does not have the correct libGL libraries available

  • The error can occur if ADE_DISABLE_NVIDIA_DOCKER is set before running ade start
    • Fix: Exit ADE, unset ADE_DISABLE_NVIDIA_DOCKER, and restart ADE (ade start -f)

  • More often, the error occurs when trying to use NVIDIA Docker 2 with an ubuntu:xenial image.
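Both fixes above depend on whether ADE_DISABLE_NVIDIA_DOCKER is set in the shell that runs ade start. The sketch below walks through the two states; the helper ade_nvidia_override_status is hypothetical and only reports the state of the variable, and the ade commands appear as comments so the snippet can be run without ADE installed.

```shell
# Hypothetical helper (not part of ade-cli): reports whether the override
# variable is set in the current shell, regardless of its value.
ade_nvidia_override_status() {
  if [ -n "${ADE_DISABLE_NVIDIA_DOCKER+x}" ]; then
    echo "set"
  else
    echo "unset"
  fi
}

# Fix for problem 1: disable NVIDIA Docker support, then run: ade start
export ADE_DISABLE_NVIDIA_DOCKER=1
echo "ADE_DISABLE_NVIDIA_DOCKER is $(ade_nvidia_override_status)"

# Fix for problem 2: after exiting ADE, clear the override,
# then run: ade start -f
unset ADE_DISABLE_NVIDIA_DOCKER
echo "ADE_DISABLE_NVIDIA_DOCKER is $(ade_nvidia_override_status)"
```

Checking the variable before each ade start is a quick way to rule out a leftover export from an earlier session.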

NVIDIA Docker 2 with an ubuntu:xenial image

This section applies to developers who are trying to run ADE with the following versions:

  • ADE > 3.4.1

  • NVIDIA Docker 2

  • ubuntu:xenial-based base image

A workaround is needed to get NVIDIA Docker 2 to work with an ubuntu:xenial-based Docker container because:

  1. The libGL libraries that NVIDIA Docker 2 expects are different from the libGL libraries shipped with Ubuntu Xenial

  2. NVIDIA Docker 1 ships the expected libraries as a volume, but NVIDIA Docker 2 does not

Therefore, the workaround is to load the expected libGL libraries into the Ubuntu Xenial container using an ADE volume. Add registry.gitlab.com/apexai/ade-nvidia-cudagl:latest to ADE_IMAGES in the .aderc configuration. For example:

export ADE_IMAGES="
  xenial_base_image:latest
  registry.gitlab.com/apexai/ade-nvidia-cudagl:latest
"

For more details on the volume, see the ade-nvidia-cudagl project.
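To confirm that the volume image is actually listed, the value of ADE_IMAGES can be inspected before starting ADE. This is a minimal sketch that assumes ADE_IMAGES has been exported in the current shell; the helper name lists_cudagl_volume is hypothetical.

```shell
# Hypothetical helper (not part of ade-cli): succeeds if the image list
# passed as $1 (default: $ADE_IMAGES) names the ade-nvidia-cudagl volume.
lists_cudagl_volume() {
  case "${1-${ADE_IMAGES-}}" in
    *registry.gitlab.com/apexai/ade-nvidia-cudagl:latest*) return 0 ;;
    *) return 1 ;;
  esac
}

if lists_cudagl_volume; then
  echo "ade-nvidia-cudagl volume is listed in ADE_IMAGES"
else
  echo "ade-nvidia-cudagl volume is missing from ADE_IMAGES"
fi
```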