Unable to run vGPU manager on Dell R740 and ESX
We currently receive almost daily reports from customers whose GRID deployments are stuck on the new Dell R740.
Symptom:
After installing the vGPU manager VIB file, the vGPU manager does not run properly.
nvidia-smi fails to initialize, even though the nvidia module loads and the GPUs are listed by lspci:
[root@esx65:~] nvidia-smi
Failed to initialize NVML: Unknown Error
[root@esx65:~]
[root@esx65:~] vmkload_mod nvidia
Module nvidia loaded successfully
[root@esx65:~] lspci | grep -i nvidia
0000:3d:00.0 Display controller: NVIDIA Corporation NVIDIATesla M10 [vmgfx0]
0000:3e:00.0 Display controller: NVIDIA Corporation NVIDIATesla M10 [vmgfx1]
0000:3f:00.0 Display controller: NVIDIA Corporation NVIDIATesla M10 [vmgfx2]
0000:40:00.0 Display controller: NVIDIA Corporation NVIDIATesla M10 [vmgfx3]
[root@esx65:~]
If you run the dmesg command, you may see an issue with IOMMU:

This indicates that there is an issue with the IOMMU settings in the system BIOS (SBIOS).
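To narrow the dmesg output down to the relevant lines, a small filter can help. This is only a sketch: the function name filter_gpu_messages is hypothetical, and the search terms "iommu" and "nvidia" are an assumption about which words the relevant messages contain on your host.

```shell
# Hypothetical helper: keep only log lines that mention IOMMU or the NVIDIA driver.
# The search terms are an assumption; adjust them to match the messages on your host.
filter_gpu_messages() {
    grep -i -E 'iommu|nvidia'
}

# Usage on the ESXi host:
#   dmesg | filter_gpu_messages
```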
Solution:
The default IOMMU settings in the SBIOS need to be modified. The new settings should look like this:

The MMIO Base value needs to be less than 16 TB.
For a detailed explanation of why this is necessary, you can have a look here:
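As a quick illustration of the 16 TB limit, the following sketch checks a candidate MMIO Base value given in whole TB. The helper name mmio_base_ok is hypothetical, the sample values in the usage lines are illustrative rather than taken from the BIOS menu, and the byte conversion assumes 1 TB = 2^40 bytes.

```shell
# Limit stated above: the MMIO Base must be below 16 TB.
LIMIT_TB=16

# Hypothetical helper: succeeds if the given MMIO Base (whole TB) is below the limit.
mmio_base_ok() {
    [ "$1" -lt "$LIMIT_TB" ]
}

# The same limit in bytes, assuming 1 TB = 2^40 bytes: 17592186044416 (2^44).
limit_bytes=$((LIMIT_TB * 1024 * 1024 * 1024 * 1024))

# Usage (sample values are illustrative only):
#   mmio_base_ok 12 && echo "12 TB is below the limit"
#   mmio_base_ok 56 || echo "56 TB is not below the limit"
```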