Introduction
Installing NVIDIA Tesla GPUs for deep learning and machine learning workloads requires careful planning and execution. Whether you’re setting up a single GPU workstation or a multi-GPU server cluster, understanding the installation process is crucial for optimal performance and reliability.
Tesla GPUs, particularly the V100, T4, A10, and A100 models, are designed specifically for data center and AI workloads. These enterprise-grade GPUs offer superior reliability, error-correcting memory (ECC), and optimized drivers for continuous operation.
This comprehensive guide covers everything from physical hardware installation to driver configuration, CUDA setup, and production deployment considerations. Whether you’re installing in a Dell R730XD server, a modern rack mount system, or configuring GPU passthrough for virtualization, you’ll find detailed instructions here.
Understanding NVIDIA Tesla GPU Architecture
Tesla GPU Series Overview
NVIDIA’s Tesla GPU lineup has evolved significantly over the years. Understanding the different generations helps you choose the right card for your workload:
Tesla V100 (Volta): Released in 2017, the V100 was the first NVIDIA GPU to ship with Tensor Cores, designed specifically for deep learning training. It features:
- Up to 32GB HBM2 memory with ECC
- 5,120 CUDA cores
- 640 Tensor cores
- NVLink for multi-GPU communication
Tesla T4 (Turing): Launched in 2018, the T4 is optimized for inference workloads:
- 16GB GDDR6 memory
- 2,560 CUDA cores
- 320 Tensor cores
- Single-slot design for density
Tesla A10 (Ampere): The A10 combines training and inference capabilities:
- 24GB GDDR6 memory with ECC
- 9,216 CUDA cores
- 288 Tensor cores
- PCIe Gen 4 support
Tesla A100 (Ampere): NVIDIA's Ampere flagship for data centers:
- 40GB (HBM2) or 80GB (HBM2e) memory
- 6,912 CUDA cores
- 432 Tensor cores
- Multi-instance GPU (MIG) technology
- NVLink and PCIe configurations
Choosing the Right Tesla GPU for Your Workload
Selecting the appropriate GPU depends on your specific use case:
| Use Case | Recommended GPU | Memory | Key Feature |
|---|---|---|---|
| Deep Learning Training | A100, V100 | 40-80GB | Large batch sizes |
| Inference/Production | T4, A10 | 16-24GB | Power efficiency |
| Fine-tuning | A100, V100 | 40-80GB | ECC memory |
| Research/Development | A100, V100 | 40GB+ | Flexibility |
| Cost-sensitive | T4 | 16GB | Best value |
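The selection table above can be expressed as a simple lookup if you script your procurement or provisioning tooling. The `recommend_gpu` helper and its workload keys below are hypothetical, purely for illustration:

```python
# Hypothetical lookup mirroring the use-case table above.
# Category names and the helper itself are illustrative only.
RECOMMENDATIONS = {
    "training":       {"gpus": ["A100", "V100"], "memory_gb": "40-80"},
    "inference":      {"gpus": ["T4", "A10"],    "memory_gb": "16-24"},
    "fine_tuning":    {"gpus": ["A100", "V100"], "memory_gb": "40-80"},
    "research":       {"gpus": ["A100", "V100"], "memory_gb": "40+"},
    "cost_sensitive": {"gpus": ["T4"],           "memory_gb": "16"},
}

def recommend_gpu(use_case: str) -> list:
    """Return the GPUs suggested for a workload category."""
    try:
        return RECOMMENDATIONS[use_case]["gpus"]
    except KeyError:
        raise ValueError("Unknown use case: %r" % use_case)

print(recommend_gpu("inference"))  # ['T4', 'A10']
```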
Physical Hardware Installation
Pre-Installation Checklist
Before beginning the physical installation, ensure you have:
Hardware Requirements:
- Compatible server/workstation with adequate power supply (750W+ recommended)
- Available PCIe x16 slot (full height for Tesla cards)
- Adequate cooling capacity
- GPU power cables (typically 8-pin PCIe connectors)
Software Prerequisites:
- Compatible operating system (Ubuntu 20.04+, CentOS 8+, or Windows Server)
- Root/sudo access
- Internet connection for driver download
Safety Considerations:
- Proper ESD protection
- Clear workspace
- Documentation of current system state
Server Installation: Dell R730XD Example
The Dell PowerEdge R730XD is a popular choice for GPU deployment. Here’s the installation process:
Step 1: Prepare the Server
- Power down the server and disconnect power cables
- Remove the server from the rack if needed
- Ground yourself using an ESD wrist strap
- Remove the server side panel
Step 2: Locate Compatible Riser
Most GPU installations use Riser 2 or Riser 3 on the R730XD:
- Identify the riser slot (typically the rightmost PCIe slots)
- Remove any existing cards in those slots if necessary
- Ensure the riser has adequate power capacity
Step 3: Install the GPU
- Open the riser card retention mechanism
- Remove the GPU from its anti-static bag
- Align the GPU with the PCIe slot, ensuring the notch matches the slot key
- Press down firmly until the GPU clicks into place
- Secure with mounting screws
- Connect power cables (8-pin PCIe connectors)
Important Installation Notes:
- Handle the GPU gently - do not apply excessive force
- Ensure all power connectors are properly seated
- Verify no cables obstruct airflow
- Check that the GPU sits flush in the slot
GPU Passthrough for Virtualization
Many deployments use GPU passthrough to dedicate GPUs to virtual machines. This is common in Proxmox, VMware, and Hyper-V environments.
Proxmox GPU Passthrough Setup:
# Add to /etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1
options kvm_amd avic=1   # AMD hosts only
# Add to /etc/modprobe.d/pve-blacklist.conf
blacklist nouveau
blacklist nvidia
blacklist nvidiafb
# Configure GRUB: add to GRUB_CMDLINE_LINUX_DEFAULT
intel_iommu=on iommu=pt  # use amd_iommu=on on AMD hosts
# Apply the changes, then reboot
update-grub && update-initramfs -u -k all
VM Configuration:
<hostdev mode='subsystem' type='pci' managed='yes'>
<source>
<address domain='0x0000' bus='0x1a' slot='0x00' function='0x0'/>
</source>
</hostdev>
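If you provision VMs from scripts, the `<hostdev>` fragment above can be generated from a PCI address string. A minimal sketch — the `hostdev_xml` helper is hypothetical, not part of any libvirt API:

```python
# Generate a libvirt <hostdev> passthrough fragment from a PCI address
# such as "0000:1a:00.0". Illustrative helper, not a libvirt API call.
def hostdev_xml(pci_address: str) -> str:
    domain, bus, rest = pci_address.split(":")
    slot, function = rest.split(".")
    return (
        "<hostdev mode='subsystem' type='pci' managed='yes'>\n"
        "  <source>\n"
        "    <address domain='0x%s' bus='0x%s' slot='0x%s' function='0x%s'/>\n"
        "  </source>\n"
        "</hostdev>" % (domain, bus, slot, function)
    )

print(hostdev_xml("0000:1a:00.0"))
```

Find the PCI address of your GPU with `lspci | grep -i nvidia` before generating the fragment.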
Common Hardware Installation Issues
Problem: GPU not detected by BIOS
Solutions:
- Verify GPU is properly seated in the slot
- Check power connections
- Update server BIOS
- Verify the PCIe slot is enabled in BIOS settings
- Try a different PCIe slot
Problem: GPU temperature too high
Solutions:
- Ensure adequate case airflow
- Adjust fan speed settings
- Check ambient temperature
- Verify heat sink contact
- Consider liquid cooling for multi-GPU setups
Driver Installation
Ubuntu/Debian Installation
Method 1: Using Ubuntu Repositories (Recommended for Beginners)
# Check available drivers
ubuntu-drivers devices
# Install recommended driver
sudo apt update
sudo apt install nvidia-driver-535 nvidia-dkms-535
# Reboot
sudo reboot
Method 2: Using NVIDIA Repository (Recommended for Production)
# Add NVIDIA repository
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
# Install CUDA Toolkit (includes drivers)
sudo apt install cuda-toolkit-12-2
# Add CUDA to PATH
echo 'export PATH=/usr/local/cuda-12.2/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
# Reboot
sudo reboot
Method 3: Manual Driver Installation (Advanced)
# Download driver from NVIDIA website
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/535.154.05/NVIDIA-Linux-x86_64-535.154.05.run
# Disable Nouveau driver
sudo bash -c 'echo -e "blacklist nouveau\noptions nouveau modeset=0" > /etc/modprobe.d/blacklist-nouveau.conf'
sudo update-initramfs -u
# Run installer
sudo systemctl isolate multi-user.target  # stop the display manager first
sudo sh NVIDIA-Linux-x86_64-535.154.05.run
# Restart
sudo reboot
CentOS/RHEL Installation
# Enable EPEL repository
sudo yum install epel-release
# Add NVIDIA repository
sudo yum config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
# Install driver
sudo yum install nvidia-driver-latest-dkms
# Rebuild initramfs
sudo dracut --force
# Reboot
sudo reboot
Windows Installation
- Download the appropriate driver from NVIDIA’s website
- Run the installer as Administrator
- Choose “Express” or “Custom” installation
- Restart the system when prompted
- Verify installation using Device Manager
Verifying Driver Installation
# Check driver version
nvidia-smi
# Expected output:
# +-----------------------------------------------------------------------------+
# | NVIDIA-SMI 535.154.05 Driver Version: 535.154.05 CUDA Version: 12.2 |
# |-------------------------------+----------------------+----------------------+
# | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
# | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
# |===============================+======================+======================|
# | 0 Tesla V100-SXM2... Off | 00000000:17:00.0 Off | 0 |
# | N/A 42C P0 56W / 300W | 0MiB / 32510MiB | 0% Default |
# +-------------------------------+----------------------+----------------------+
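For automated health checks, nvidia-smi's CSV query mode is much easier to parse than the table above. The sketch below wraps that query; the parser itself is pure and needs no GPU to test (the field list matches the `--query-gpu` flags used later in this guide):

```python
# Parse `nvidia-smi --query-gpu=... --format=csv,noheader` output.
import csv
import io
import subprocess

FIELDS = ["name", "driver_version", "memory.total", "memory.used",
          "temperature.gpu"]

def parse_smi_csv(text: str) -> list:
    """Turn CSV rows from nvidia-smi into a list of dicts keyed by FIELDS."""
    rows = csv.reader(io.StringIO(text))
    return [dict(zip(FIELDS, (v.strip() for v in row))) for row in rows]

def query_gpus() -> list:
    """Run nvidia-smi and return one dict per installed GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_smi_csv(out)

# Offline example: parse a captured line
sample = "Tesla V100-SXM2-32GB, 535.154.05, 32510 MiB, 0 MiB, 42\n"
print(parse_smi_csv(sample)[0]["temperature.gpu"])  # 42
```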
CUDA and cuDNN Installation
CUDA Toolkit Installation
CUDA (Compute Unified Device Architecture) is required for GPU-accelerated applications:
# Ubuntu 22.04
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt install cuda
# Set environment variables
echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bashrc
echo 'export PATH=$CUDA_HOME/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
# Verify CUDA installation
nvcc --version
cuDNN Installation
cuDNN (CUDA Deep Neural Network Library) is essential for deep learning frameworks:
# Download cuDNN from NVIDIA website (requires registration)
# Extract and copy to CUDA directory
tar -xzvf cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz
sudo cp cudnn-linux-x86_64-8.9.7.29_cuda12-archive/lib/* /usr/local/cuda/lib64/
sudo cp cudnn-linux-x86_64-8.9.7.29_cuda12-archive/include/* /usr/local/cuda/include/
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
Testing CUDA Installation
# test_cuda.py
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"GPU count: {torch.cuda.device_count()}")
print(f"GPU name: {torch.cuda.get_device_name(0)}")
print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
# Simple computation test
x = torch.randn(1000, 1000).cuda()
y = torch.randn(1000, 1000).cuda()
z = torch.matmul(x, y)
print("GPU computation successful!")
GPU Monitoring and Management
Using nvidia-smi
The NVIDIA System Management Interface (nvidia-smi) is your primary tool:
# Basic monitoring
nvidia-smi
# Continuous monitoring (every 2 seconds)
nvidia-smi -l 2
# Detailed query
nvidia-smi --query-gpu=name,driver_version,memory.total,memory.used,temperature.gpu --format=csv
# Set power limit
nvidia-smi -pl 200 # Set to 200 Watts
# Note: Tesla data center cards are passively cooled; nvidia-smi cannot
# set fan speed. Chassis fans are controlled by the server BMC.
# Query utilization
nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv
Monitoring with Prometheus and Grafana
For production environments, integrate GPU metrics:
# node_exporter config (nvidia_smi exporter)
scrape_configs:
- job_name: 'nvidia_gpu'
static_configs:
- targets: ['localhost:9835']
# Simple GPU metrics exporter
from prometheus_client import start_http_server, Gauge
import pynvml
pynvml.nvmlInit()
gpu_count = pynvml.nvmlDeviceGetCount()
temperature = Gauge('gpu_temperature', 'GPU temperature', ['index'])
utilization = Gauge('gpu_utilization', 'GPU utilization', ['index'])
memory_used = Gauge('gpu_memory_used', 'GPU memory used (bytes)', ['index'])
while True:
for i in range(gpu_count):
handle = pynvml.nvmlDeviceGetHandleByIndex(i)
temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
util = pynvml.nvmlDeviceGetUtilizationRates(handle)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
temperature.labels(index=i).set(temp)
utilization.labels(index=i).set(util.gpu)
memory_used.labels(index=i).set(mem.used)
Thermal Management and Cooling
Understanding GPU Temperature Limits
Tesla GPUs have specific temperature thresholds:
| Model | Max Temperature | Thermal Throttling |
|---|---|---|
| V100 | 83°C | Starts at 83°C |
| T4 | 91°C | Starts at 91°C |
| A10 | 93°C | Starts at 93°C |
| A100 | 83°C | Starts at 83°C |
When GPU temperature exceeds the threshold, the GPU will throttle its clock speed to reduce heat output.
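The thresholds in the table above lend themselves to a simple programmatic check for use in monitoring scripts. A minimal sketch (the `is_throttling` helper is illustrative; values are transcribed from the table):

```python
# Thermal throttle thresholds (in degrees C) from the table above.
THROTTLE_TEMP_C = {"V100": 83, "T4": 91, "A10": 93, "A100": 83}

def is_throttling(model: str, temp_c: float) -> bool:
    """True if the GPU is at or above its throttle threshold."""
    return temp_c >= THROTTLE_TEMP_C[model]

print(is_throttling("V100", 85))  # True
print(is_throttling("T4", 85))    # False
```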
Cooling Solutions
Air Cooling:
- Ensure adequate case airflow (minimum 200 CFM per GPU)
- Install case fans to direct cool air to GPU slots
- Maintain ambient temperature below 30°C
- Consider front-to-back airflow patterns
Advanced Cooling:
- Liquid cooling loops for multi-GPU setups
- Rack-level cooling solutions
- Direct-to-chip (D2C) cooling
- Immersion cooling for data centers
Monitoring Temperature
# Continuous temperature monitoring
watch -n 1 nvidia-smi --query-gpu=timestamp,temperature.gpu,power.draw --format=csv
# Reduce heat output by lowering the power limit
# (thermal limits themselves are enforced automatically by the driver)
nvidia-smi -pl 250
# Check thermal throttling events
nvidia-smi -q | grep -i throttle
Troubleshooting Common Issues
Issue: NVIDIA-SMI Command Not Found
Cause: Driver not installed or PATH not set
Solution:
# Check if driver module is loaded
lsmod | grep nvidia
# Load module manually
sudo modprobe nvidia
# If still not working, reinstall driver
sudo apt install --reinstall nvidia-driver-535
Issue: GPU Not Detected
Cause: Hardware or driver issue
Solutions:
- Check PCIe detection: lspci | grep -i nvidia
- Verify power connections
- Try a different PCIe slot
- Update the BIOS
- Check for hardware faults
Issue: CUDA Out of Memory
Cause: Running out of GPU memory
Solutions:
- Reduce batch size
- Enable gradient checkpointing
- Use mixed precision training
- Clear GPU cache: torch.cuda.empty_cache()
- Use smaller model variants
# Memory optimization techniques
import torch
# Clear cache
torch.cuda.empty_cache()
# Gradient checkpointing
torch.utils.checkpoint.checkpoint(model, *inputs)
# Mixed precision training
scaler = torch.cuda.amp.GradScaler()
with torch.cuda.amp.autocast():
outputs = model(inputs)
Issue: Thermal Throttling
Cause: GPU overheating
Solutions:
- Increase fan speed
- Reduce power limit
- Improve case airflow
- Lower ambient temperature
- Check thermal paste application
Issue: Driver Version Mismatch
Cause: CUDA and driver version incompatibility
Solutions:
# Check version compatibility: the driver's supported CUDA version
# (shown in the nvidia-smi header) must be >= the installed toolkit version
nvidia-smi # Shows driver version and the CUDA version it supports
nvcc --version # Shows the installed CUDA toolkit version
# Update to matching versions
sudo apt install cuda-toolkit-12-2 nvidia-driver-535
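This compatibility rule can be automated when validating a fleet of machines. The sketch below compares only major.minor version numbers; the helper names are illustrative:

```python
# Check that the driver's supported CUDA version is new enough for the
# installed toolkit. Compares major.minor only; helper names are
# illustrative.
def parse_version(v: str) -> tuple:
    major, minor = v.split(".")[:2]
    return (int(major), int(minor))

def toolkit_supported(driver_cuda: str, toolkit_cuda: str) -> bool:
    """driver_cuda: CUDA version from the nvidia-smi header.
    toolkit_cuda: version reported by nvcc --version."""
    return parse_version(driver_cuda) >= parse_version(toolkit_cuda)

print(toolkit_supported("12.2", "12.2"))  # True
print(toolkit_supported("12.2", "12.4"))  # False
```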
Production Deployment Best Practices
Security Considerations
- Disable NVIDIA Persistence Daemon when not needed
- Enable ECC memory for error detection
- Use secure boot with signed drivers
- Implement proper access controls via udev rules
- Monitor for hardware tampering
High Availability Setup
For critical workloads:
# Configure GPU monitoring with alerting
# Use nvidia-smi in a monitoring loop
while true; do
temp=$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits)
if [ "$temp" -gt 85 ]; then
echo "ALERT: GPU temperature ${temp}°C" | mail -s "GPU Alert" [email protected]
fi
sleep 60
done
Multi-GPU Configuration
For systems with multiple GPUs:
# Configure NVLink (if available)
nvidia-smi topo -m # Show topology
# Set GPU visibility
export CUDA_VISIBLE_DEVICES=0,1,2,3
# Pin GPUs to processes
# Use CUDA_VISIBLE_DEVICES in systemd service
Environment="CUDA_VISIBLE_DEVICES=0,1"
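Note that CUDA_VISIBLE_DEVICES remaps device indices: logical device 0 inside the process is the first physical GPU listed. A small sketch of that mapping (assumes integer indices, though the variable can also hold GPU UUIDs), including the key caveat that the variable must be set before the CUDA runtime initializes:

```python
import os

def visible_devices(env_value: str) -> list:
    """Parse a CUDA_VISIBLE_DEVICES value into the logical->physical
    mapping: logical device k is the k-th entry of the returned list.
    Assumes integer indices (the variable may also contain GPU UUIDs)."""
    if not env_value:
        return []
    return [int(x) for x in env_value.split(",")]

# Restrict a process to physical GPUs 0 and 1; this must happen before
# the CUDA runtime initializes (e.g., before importing torch).
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
print(visible_devices(os.environ["CUDA_VISIBLE_DEVICES"]))  # [0, 1]
```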
Conclusion
Installing NVIDIA Tesla GPUs for deep learning and machine learning requires attention to detail at every step - from physical installation to driver configuration. This guide has covered the essential aspects of getting your GPU infrastructure up and running.
Key takeaways:
- Choose the right GPU based on your workload (training vs. inference)
- Handle hardware carefully - proper installation prevents issues
- Use official repositories for driver installation when possible
- Monitor continuously - temperature, utilization, and memory usage
- Plan for production - implement monitoring, alerting, and redundancy
With your GPU properly configured, you’re ready to accelerate your machine learning workloads. Whether you’re training large language models, running inference at scale, or performing data analysis, NVIDIA Tesla GPUs provide the performance and reliability needed for demanding AI applications.
Resources
- NVIDIA Driver Download
- CUDA Toolkit Documentation
- cuDNN Download
- NVIDIA System Management Interface
- Tesla V100 Product Guide
- GPU Computing SDK