NVIDIA Docker in ADE
NVIDIA Docker allows developers to leverage NVIDIA GPUs inside Docker containers.
ade start automatically detects whether NVIDIA Docker is installed by checking whether /dev/nvidia0 exists. If it does, ADE adds the docker run arguments needed to make the GPUs available inside the container.
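To see in advance which way this detection is likely to go on a given host, the same device-node test can be reproduced by hand. The following is a minimal sketch, not ADE's actual implementation; the helper name has_nvidia_device is hypothetical, and the only detail taken from the text above is the /dev/nvidia0 path.

```shell
#!/bin/sh
# Illustrative sketch only; ADE's real detection code may differ.
# has_nvidia_device is a hypothetical helper, not part of ADE.
has_nvidia_device() {
    # Check the given path, defaulting to the device node named above.
    [ -e "${1:-/dev/nvidia0}" ]
}

if has_nvidia_device; then
    echo "/dev/nvidia0 found: ade start is expected to enable NVIDIA Docker"
else
    echo "/dev/nvidia0 not found: ade start is not expected to add GPU arguments"
fi
```

The optional path argument exists only so the helper can be exercised on machines without a GPU.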
ADE - NVIDIA Docker Compatibility
The following table describes compatible versions of ADE, NVIDIA Docker, and the container’s base image:
For ADE Version > 3.4.1:
NVIDIA Docker Version | ubuntu:xenial base image | ubuntu:bionic base image
1 | Supported | Supported
2 | Supported with a workaround | Supported
- Note: ADE Version <= 3.4.1 only supports NVIDIA Docker 1
Troubleshooting ADE with NVIDIA Docker
- ADE fails to start with the following message:
$ ade start
...
subprocess.CalledProcessError: Command 'curl -s http://localhost:3476/docker/cli' returned non-zero exit status 7
  - This error usually means that you are trying to use NVIDIA Docker 2 with ADE Version <= 3.4.1
  - Fix: Upgrade ADE, or disable NVIDIA Docker with export ADE_DISABLE_NVIDIA_DOCKER=1
    - Note that if NVIDIA Docker is disabled, GUIs will not work
- ADE starts successfully, but fails to open GUIs, e.g.:
ade$ glxgears
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
- This error means that the base image does not have the correct libGL libraries available
  - The error could occur if ADE_DISABLE_NVIDIA_DOCKER is set before ade start
    - Fix: Exit ADE, unset ADE_DISABLE_NVIDIA_DOCKER, and restart ADE (ade start -f)
  - More often, the error occurs when trying to use NVIDIA Docker 2 with an ubuntu:xenial image
    - See NVIDIA Docker 2 with an ubuntu:xenial image for details on how to work around this issue
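Both troubleshooting cases depend on whether ADE_DISABLE_NVIDIA_DOCKER is set in the shell that runs ade start. The sketch below checks this from the host shell. It assumes that the variable being set at all (even to an empty value) is what matters, which the text above suggests but does not confirm; nvidia_docker_disabled is a hypothetical helper name.

```shell
#!/bin/sh
# Illustrative sketch only; how ADE itself reads the variable is an assumption.
# nvidia_docker_disabled is a hypothetical helper, not part of ADE.
nvidia_docker_disabled() {
    # True when the variable is set at all, even to an empty value.
    [ -n "${ADE_DISABLE_NVIDIA_DOCKER+set}" ]
}

if nvidia_docker_disabled; then
    echo "ADE_DISABLE_NVIDIA_DOCKER is set: GUIs will not work until it is unset"
else
    echo "ADE_DISABLE_NVIDIA_DOCKER is not set"
fi
```

If the variable is reported as set, run unset ADE_DISABLE_NVIDIA_DOCKER and restart ADE with ade start -f, as described in the fix above.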
NVIDIA Docker 2 with an ubuntu:xenial image
This section applies to developers who are trying to run ADE with the following versions:
- ADE > 3.4.1
- NVIDIA Docker 2
- ubuntu:xenial-based base image
A workaround is needed to get NVIDIA Docker 2 to work with an ubuntu:xenial-based Docker container because:
- The libGL libraries that NVIDIA Docker 2 expects are different from the libGL libraries shipped with Ubuntu Xenial
- NVIDIA Docker 1 ships the expected libraries as a volume, but NVIDIA Docker 2 does not
Therefore, the workaround is to load the expected libGL libraries into the
Ubuntu Xenial container using an ADE volume.
Add registry.gitlab.com/apexai/ade-nvidia-cudagl:latest to the ADE_IMAGES list in the .aderc configuration, e.g.:
export ADE_IMAGES="
xenial_base_image:latest
registry.gitlab.com/apexai/ade-nvidia-cudagl:latest
"
For more details on the volume, see the ade-nvidia-cudagl project.
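After editing .aderc, a quick check that the volume image is actually listed can save a restart cycle. This is a minimal sketch that assumes .aderc has already been sourced into the current shell so that ADE_IMAGES is populated; has_cudagl_volume is a hypothetical helper name, not an ADE command.

```shell
#!/bin/sh
# Illustrative sketch only; has_cudagl_volume is a hypothetical helper.
has_cudagl_volume() {
    # Inspect the given image list, defaulting to the current ADE_IMAGES value.
    case "${1:-$ADE_IMAGES}" in
        *registry.gitlab.com/apexai/ade-nvidia-cudagl*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_cudagl_volume; then
    echo "ade-nvidia-cudagl volume is listed in ADE_IMAGES"
else
    echo "ade-nvidia-cudagl volume is missing from ADE_IMAGES"
fi
```

The check is a plain substring match, so it accepts any tag of the image, not only latest.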