.. _nvidia-docker:

NVIDIA Docker in ADE
^^^^^^^^^^^^^^^^^^^^

.. warning::
   Support for NVIDIA Docker is deprecated and will be dropped in the next release of ``ade-cli``.

`NVIDIA Docker`_ allows developers to leverage NVIDIA GPUs inside Docker containers. ``ade start``
automatically detects whether NVIDIA Docker is installed by checking whether ``/dev/nvidia0`` exists.
If it does, ADE adds the ``docker run`` arguments needed to make the GPUs available inside the container.

**NVIDIA Docker has been deprecated as of Docker version 19.03, so** ``ade-cli`` **may drop support
for NVIDIA Docker in future releases.**

.. _NVIDIA Docker: https://github.com/NVIDIA/nvidia-docker

ADE - NVIDIA Docker Compatibility
"""""""""""""""""""""""""""""""""

The following table describes compatible versions of ADE, NVIDIA Docker, and the container's base image:

+-----------------+---------------------------------------+
|                 | **Container Base Image**              |
| ADE Version     +-------------------+-------------------+
| > 3.4.1         | ``ubuntu:xenial`` | ``ubuntu:bionic`` |
+-------------+---+-------------------+-------------------+
|             | 1 | Supported         | Supported         |
| **NVIDIA**  |   |                   |                   |
| **Docker**  +---+-------------------+-------------------+
| **Version** | 2 | Supported with a  | Supported         |
|             |   | `workaround`_     |                   |
+-------------+---+-------------------+-------------------+

- **Note**: ADE Version <= 3.4.1 only supports NVIDIA Docker 1
- **Note**: ADE Version >= 4.1.0 does not require NVIDIA Docker if Docker 19.03 or newer is installed

Troubleshooting ADE with NVIDIA Docker
""""""""""""""""""""""""""""""""""""""

1. ADE fails to start with the following message:

   .. code:: bash

      $ ade start
      ...
      subprocess.CalledProcessError: Command 'curl -s http://localhost:3476/docker/cli' returned non-zero exit status 7

   - This error usually means that you are trying to use NVIDIA Docker 2 with ADE Version <= 3.4.1
   - **Fix:** Upgrade ADE or ``export ADE_DISABLE_NVIDIA_DOCKER=1``

     - Note that if NVIDIA Docker is disabled, GUIs will not work

2. ADE starts successfully, but fails to open GUIs, for example:

   .. code:: bash

      ade$ glxgears
      libGL error: No matching fbConfigs or visuals found
      libGL error: failed to load driver: swrast

   - This error means that the base image does not have the correct libGL libraries available
   - The error can occur if ``ADE_DISABLE_NVIDIA_DOCKER`` is set before ``ade start``

     - **Fix:** Exit ADE, ``unset ADE_DISABLE_NVIDIA_DOCKER``, and restart ADE (``ade start -f``)

   - More often, the error occurs when trying to use NVIDIA Docker 2 with an ``ubuntu:xenial`` image

     - See :ref:`workaround` for details on how to work around this issue

.. _workaround:

NVIDIA Docker 2 with an ``ubuntu:xenial`` image
'''''''''''''''''''''''''''''''''''''''''''''''

This section applies to developers who are trying to run ADE with the following versions:

- ADE > 3.4.1
- NVIDIA Docker 2
- ``ubuntu:xenial``-based base image

A workaround is needed to get NVIDIA Docker 2 to work with an ``ubuntu:xenial``-based Docker container because:

1. The libGL libraries that NVIDIA Docker 2 expects are different from the libGL libraries shipped with Ubuntu Xenial
2. NVIDIA Docker 1 ships the expected libraries as a volume, but NVIDIA Docker 2 does not

Therefore, the workaround is to load the expected libGL libraries into the Ubuntu Xenial container using
an ADE volume. Add ``registry.gitlab.com/apexai/ade-nvidia-cudagl:latest`` to the ``.aderc`` configuration,
for example:

.. code:: bash

   export ADE_IMAGES="
     xenial_base_image:latest
     registry.gitlab.com/apexai/ade-nvidia-cudagl:latest
   "

For more details on the volume, see the `ade-nvidia-cudagl`_ project.

.. _ade-nvidia-cudagl: https://gitlab.com/ApexAI/ade-nvidia-cudagl
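
The two host-side conditions described above, the ``/dev/nvidia0`` check and the
``ADE_DISABLE_NVIDIA_DOCKER`` variable, can be inspected before running ``ade start``.
The following is a minimal sketch and is not part of ``ade-cli``. The function names
``nvidia_device_present`` and ``describe_gpu_setup`` are hypothetical, and treating any
non-empty value of ``ADE_DISABLE_NVIDIA_DOCKER`` as "disabled" is an assumption about
how ADE reads the variable.

.. code:: bash

   #!/usr/bin/env bash
   # Hypothetical helper, not part of ade-cli: summarizes the host-side
   # conditions that this page says affect GPU support in `ade start`.

   # Succeeds if the device node exists. Defaults to /dev/nvidia0, the path
   # this page says `ade start` checks; the argument allows testing.
   nvidia_device_present() {
       [ -e "${1:-/dev/nvidia0}" ]
   }

   # Prints a one-line summary of the expected GPU behaviour.
   # Assumption: any non-empty ADE_DISABLE_NVIDIA_DOCKER means "disabled".
   describe_gpu_setup() {
       local device="${1:-/dev/nvidia0}"
       if [ -n "${ADE_DISABLE_NVIDIA_DOCKER:-}" ]; then
           echo "disabled: ADE_DISABLE_NVIDIA_DOCKER is set, GUIs will not work"
       elif nvidia_device_present "$device"; then
           echo "enabled: $device exists"
       else
           echo "unavailable: $device not found"
       fi
   }

   describe_gpu_setup "$@"

A result of ``disabled`` corresponds to the second troubleshooting case: unsetting the
variable and restarting with ``ade start -f`` is the fix given above.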