Use the Nvidia GPU only for CUDA and the Intel integrated graphics for display
Build a budget 3090 deep learning workstation, Part Two: use the GPU only for CUDA computing tasks, not for display
Other parts of “Build a budget ML workstation”
- Part Zero: Random tidbits: a newbie’s story of building a budget machine learning workstation.
- Part One: Building: assembling the parts and how to deal with the sagging of the big and heavy RTX 3090.
- Part Three: Undervolt the GPU: we will see how to configure Ubuntu to achieve an undervolting effect on the GPU, making the system more stable. This serves our need to train models for longer stretches of time.
- Bonus part: productivity for Mac users: as a long-time macOS user, we will learn how to configure an almost macOS-like keyboard setup on Linux.
Configuring Ubuntu
The main goal of the software configuration is stability, so that the workstation can keep running for an extended period of time without thermal throttling or freezing. The first thing to do is install the relevant packages and their dependencies; this instruction by Lambda would suffice:
The following instructions should work on Ubuntu 20.04.
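Once the packages are in place, it is worth verifying the installation before touching any X11 configuration. A minimal sanity check, assuming the driver and the CUDA Toolkit were installed and `nvidia-smi`/`nvcc` are on the PATH:

```
# Driver level: should list the RTX 3090 together with the installed driver version
nvidia-smi

# CUDA Toolkit: should report the toolkit release, e.g. 11.x
nvcc --version
```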
How to use the GPU only for CUDA computing, not for display, on Linux
The reason to use the integrated Intel graphics for display is speed. I found that the display becomes noticeably laggy when GPU usage hits 100% during computing if the GPU also drives the display (i.e. the monitor is connected to the card via DisplayPort). You might then wonder: why not simply connect the display cable (either HDMI or DP) to the motherboard, so that the CPU’s integrated graphics is used, and let the Nvidia GPU handle CUDA whenever scientific computing packages are running?
It turned out not to be that straightforward: with this connection scheme alone, the GPU is not used for display in the Nvidia X Server Settings, but CUDA cannot be used either.
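Before editing any X configuration, it helps to see both sides of the problem: which GPU is actually rendering the desktop, and whether CUDA can still reach the card. A small diagnostic sketch, assuming `mesa-utils` is installed (for `glxinfo`) and PyTorch is the CUDA consumer:

```
# Which device is rendering the desktop? With the monitor plugged into the
# motherboard, this should report the Intel UHD 630, not the RTX 3090.
glxinfo -B | grep "OpenGL renderer"

# Can a CUDA application still see the GPU? Before the xorg.conf tweak below,
# this may print False even though the driver is installed.
python3 -c "import torch; print(torch.cuda.is_available())"
```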
(Update as of Sept 2022) On the new workstation in my office (12700K + NVIDIA A4000 16GB), running Ubuntu 20.04 with the newest Nvidia driver version R515.x.xxx and CUDA Toolkit 11.7 (with PyTorch’s own CUDA 11.6), the following tweak seems to be applied automatically whenever the integrated graphics is used, and it no longer needs the user’s attention.
The instruction I referred to is: Use integrated graphics for display and NVIDIA GPU for CUDA on Ubuntu 14.04.
Yet some of the information in that instruction is outdated and does not apply to Intel 10th-gen CPUs, or to any CPU with Intel UHD 630 graphics. After installing the Nvidia drivers and the CUDA Toolkit, this is the /etc/X11/xorg.conf file I am using to make this happen:
Section "ServerLayout"
Identifier "layout"
Screen 0 "intel"
Screen 1 "nvidia"
EndSection
Section "Device"
Identifier "intel"
Driver "modesetting"
Option "AccelMethod" "glamor"
BusID "PCI:0:2:0"
Option "TearFree" "true"
Option "TripleBuffer" "true"
EndSection
Section "Screen"
Identifier "intel"
Device "intel"
EndSection
Section "Device"
Identifier "nvidia"
Driver "nvidia"
BusID "PCI:1:0:0"
EndSection
Section "Screen"
Identifier "nvidia"
Device "nvidia"
Option "AllowEmptyInitialConfiguration" "on"
Option "Coolbits" "28"
EndSection
What the configuration above actually does is let X11 configure a second screen (which does not physically exist). That second screen loads the Nvidia driver, so CUDA becomes available.
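The configuration takes effect the next time the X server starts. A hedged way to apply and verify it, assuming Ubuntu 20.04 with the default GNOME display manager (gdm3):

```
# Restart the display manager so X re-reads /etc/X11/xorg.conf (or simply reboot)
sudo systemctl restart gdm3

# After logging back in, the GPU should sit nearly idle: apart from the few MiB that
# Xorg may allocate for the empty Nvidia screen, all memory stays free for CUDA jobs
nvidia-smi
```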
The Option "Coolbits" "28"
part unlocks the fan setting in the Thermal setting option in the Nvidia X server settings, so that we can manually set certain threshold for the fan kicking offs. By default, the fan is off and the customized setting is grayed out. Having freedom to customize the fan profile is to avoid overheating the GDDR6x memory for an extended period of model training. Note that to make the changes in nvidia-settings
permanent, we have to add the following line as a start-up application in Ubuntu.
sh -c '/usr/bin/nvidia-settings --load-config-only'
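For reference, the fan settings unlocked by Coolbits can also be applied from the command line, and the start-up entry can be registered as a GNOME autostart file. The 70% fan speed and the .desktop file name below are illustrative choices, not part of the original setup:

```
# With Coolbits active, enable manual fan control and set fan 0 to 70% (illustrative);
# cards with more than one fan controller also expose [fan:1], [fan:2], ...
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" \
                -a "[fan:0]/GPUTargetFanSpeed=70"

# One way to run the load-config command at every login: an autostart entry
mkdir -p ~/.config/autostart
cat > ~/.config/autostart/nvidia-settings-load.desktop <<'EOF'
[Desktop Entry]
Type=Application
Name=Load nvidia-settings config
Exec=sh -c '/usr/bin/nvidia-settings --load-config-only'
X-GNOME-Autostart-enabled=true
EOF
```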