Use an Nvidia GPU only for CUDA and Intel integrated graphics for display

3 minute read

Build a budget 3090 deep learning workstation, Part Two: use the GPU only for CUDA computing tasks, not for display

Other parts of “Build a budget ML workstation”

Configuring Ubuntu

The main goal of the software configuration is stability, so that the workstation can run for an extended period of time without thermal throttling or freezing. The first step is installing the relevant packages and their dependencies; this guide by Lambda should suffice:

Install TensorFlow & PyTorch for RTX 3090, 3080, 3070, etc.

The following instructions should work on Ubuntu 20.04.
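As a rough idea of what that setup involves, here is a minimal sketch (the Lambda guide is authoritative; the driver version and CUDA wheel below are only examples, and assume a fresh Ubuntu 20.04 install):

# Install the Nvidia driver from the Ubuntu repository (version is an example)
sudo apt update
sudo apt install -y nvidia-driver-515

# Install a CUDA-enabled PyTorch build into a fresh virtual environment
python3 -m venv ~/venvs/ml && source ~/venvs/ml/bin/activate
pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu116

# Sanity check: the GPU should be visible to both the driver and PyTorch
nvidia-smi
python -c "import torch; print(torch.cuda.is_available())"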

How to use the GPU only for CUDA computing, not for display, on Linux

The reason to use the integrated Intel graphics for display is responsiveness. I found that if the GPU also drives the display (DisplayPort connected to the monitor), the desktop becomes noticeably laggy whenever GPU utilization hits 100% during computation. You might then wonder: why not simply connect the display cable (either HDMI or DP) to the motherboard, so the CPU's integrated graphics is used, and let the Nvidia GPU handle CUDA whenever scientific computing packages are running?

It turned out not to be that straightforward. With this connection scheme alone, the GPU is not used for display in the Nvidia X Server Settings, and CUDA cannot be used either.
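A quick way to check whether CUDA is actually usable in a given configuration is to query the driver and PyTorch directly (a sketch, assuming PyTorch is installed):

# Check that the driver can see the card
nvidia-smi
# Check that PyTorch can use it; in the broken configuration this may print False
python -c "import torch; print(torch.cuda.is_available())"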

(Update as of Sept 2022) On the new workstation in my office (12700K + NVIDIA A4000 16GB), running Ubuntu 20.04 with the latest Nvidia driver (R515.x.xxx) and CUDA Toolkit 11.7 (with PyTorch's own CUDA 11.6), the following tweak seems to happen automatically whenever the integrated graphics is used, and no longer needs the user's attention.

The guide I referred to is: Use integrated graphics for display and NVIDIA GPU for CUDA on Ubuntu 14.04.

Yet some of the info in that guide is outdated and does not apply to Intel 10th-gen CPUs, or any CPU with Intel UHD 630 graphics. After installing the Nvidia drivers and the CUDA Toolkit, this is the /etc/X11/xorg.conf file I am using to make this happen:

Section "ServerLayout"
    Identifier "layout"
    Screen 0 "intel"
    Screen 1 "nvidia"
EndSection

Section "Device"
    Identifier "intel"
    Driver      "modesetting"    
    Option      "AccelMethod"    "glamor"
    BusID       "PCI:0:2:0"
    Option      "TearFree" "true"
    Option  "TripleBuffer" "true"
EndSection

Section "Screen"
    Identifier "intel"
    Device "intel"
EndSection

Section "Device"
    Identifier "nvidia"
    Driver "nvidia"
    BusID "PCI:1:0:0"
EndSection

Section "Screen"
    Identifier "nvidia"
    Device "nvidia"
    Option "AllowEmptyInitialConfiguration" "on"
    Option  "Coolbits" "28"
EndSection

What the configuration above actually does is let X11 configure a second screen (which does not physically exist). That second screen loads the Nvidia driver, so CUDA becomes available.
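To confirm that the split is working, a couple of quick checks help (a sketch; glxinfo is provided by the mesa-utils package):

# The desktop should be rendered by the Intel iGPU
glxinfo | grep "OpenGL renderer"
# X should list both the modesetting and the NVIDIA providers
xrandr --listproviders
# The Nvidia card should still be visible to the driver for CUDA work
nvidia-smi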

The Option "Coolbits" "28" part unlocks the fan setting in the Thermal setting option in the Nvidia X server settings, so that we can manually set certain threshold for the fan kicking offs. By default, the fan is off and the customized setting is grayed out. Having freedom to customize the fan profile is to avoid overheating the GDDR6x memory for an extended period of model training. Note that to make the changes in nvidia-settings permanent, we have to add the following line as a start-up application in Ubuntu.

sh -c '/usr/bin/nvidia-settings --load-config-only'
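The same fan settings can also be applied from the command line once Coolbits is enabled; the attribute names below are what recent drivers expose through nvidia-settings, and the 70% target is only an example:

# Enable manual fan control on GPU 0 and pin the fan to a fixed speed
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=70"

If you prefer not to use the Startup Applications GUI, a .desktop entry under ~/.config/autostart does the same thing (the file name here is arbitrary):

cat > ~/.config/autostart/nvidia-settings-load.desktop <<'EOF'
[Desktop Entry]
Type=Application
Name=Load nvidia-settings configuration
Exec=/usr/bin/nvidia-settings --load-config-only
EOF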
