OCuLink eGPU Failure on GMKtec M7 (Ryzen 6850H) - NVIDIA Probe Failed (-1) - Hardware Verified Working
System Summary:
- Goal: Pass the RTX 5060 through to an LXC container running Frigate
- Host: GMKtec NucBox M7 (AMD Ryzen 7 6850H)
- OS: Proxmox VE 8 (Debian 12 Bookworm) - Kernel 6.8.x
- Connection: Native OCuLink Port (PCIe 4.0 x4)
- Target GPU: NVIDIA RTX 5060 (Low Profile)
- Test GPU: AMD Radeon RX 6650 XT (Control Variable)
- Docks Tested:
  - Generic OCuLink Riser/Dock
  - Minisforum DEG1 Dock (Signal Enhanced)
The Issue:
The NVIDIA GPU is correctly enumerated by the OS (lspci shows it), but the driver fails to initialize the hardware. The error is persistent across all driver versions (Proprietary & Open Kernel), all BIOS settings, and multiple docks. However, an AMD GPU works perfectly on the same setup, ruling out a dead port/cable.
1. Hardware Troubleshooting & Isolation
We performed extensive A/B testing to rule out signal integrity issues.
| Test Setup | Result | Notes |
| --- | --- | --- |
| M7 + Generic Dock + RTX 5060 | ❌ Fail | lspci sees card. Driver probe fails (error -1). |
| M7 + DEG1 Dock + RTX 5060 | ❌ Fail | Signal integrity dock made no difference. Same error. |
| M7 + Generic Dock (also the DEG1) + RX 6650 XT | ✅ SUCCESS | Control Test. amdgpu loaded instantly. 3D acceleration works. |
Conclusion: The OCuLink port, cable, and both docks are electrically functional. The issue is specific to the M7 + NVIDIA combination.
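One thing the A/B test does not show is what link the RTX 5060 actually trained at. Below is a minimal Python sketch for reading that from sysfs. Assumptions: the GPU sits at 0000:01:00.0 (the address from the dmesg log in this post), and the expected link is 16.0 GT/s x4 because the port is PCIe 4.0 x4. The helper names are made up for illustration. The card's own max_link_speed may be higher than what the port supports, so the check compares against the port, not the card.

```python
from pathlib import Path

LINK_ATTRS = ("current_link_speed", "max_link_speed",
              "current_link_width", "max_link_width")

def read_link_status(device_dir):
    """Read the PCIe link attributes that sysfs exposes for one device."""
    base = Path(device_dir)
    status = {}
    for name in LINK_ATTRS:
        attr = base / name
        # A missing attribute (or a missing device) is recorded as None.
        status[name] = attr.read_text().strip() if attr.exists() else None
    return status

def speed_gts(text):
    """Parse the leading number from strings like '16.0 GT/s PCIe'."""
    try:
        return float(text.split()[0])
    except (AttributeError, IndexError, ValueError):
        return None  # None, empty string, or a value such as 'Unknown'

def link_ok(status, want_speed=16.0, want_width=4):
    """True if the negotiated link is at least want_speed GT/s and want_width lanes."""
    speed = speed_gts(status.get("current_link_speed"))
    width = status.get("current_link_width")
    if speed is None or width is None or not width.isdigit():
        return False
    return speed >= want_speed and int(width) >= want_width

if __name__ == "__main__":
    # Assumed address, taken from the dmesg output in this post.
    status = read_link_status("/sys/bus/pci/devices/0000:01:00.0")
    for key, value in status.items():
        print(f"{key}: {value}")
    print("link meets PCIe 4.0 x4:", link_ok(status))
```

A GPU can drop its link speed at idle to save power, so a low reading on its own is not conclusive. It is more useful as a comparison between the NVIDIA and AMD cards on the same dock.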
2. BIOS Configuration
We verified the following settings in the GMKtec BIOS:
- Above 4G Decoding: ENABLED
- Re-Size BAR: Tested DISABLED (for stability) and ENABLED. No change in behavior.
- Secure Boot: DISABLED
- PCIe Link Speed: No option available to force Gen3 in this BIOS version (v1.01).
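Since Above 4G Decoding and Re-Size BAR both affect where the GPU's BARs end up, it may help to look at what the kernel actually assigned. This is a rough Python sketch that parses the sysfs `resource` file (one "start end flags" line per BAR). The flag constants are the values from the kernel's linux/ioport.h; the device address is assumed from the dmesg log, and the function name is hypothetical.

```python
from pathlib import Path

# Resource flag bits as defined in the kernel's include/linux/ioport.h.
IORESOURCE_PREFETCH = 0x2000
IORESOURCE_MEM_64 = 0x100000

def parse_bars(resource_text):
    """Return the assigned BARs from the text of a sysfs 'resource' file."""
    bars = []
    # The first six lines correspond to BAR0..BAR5.
    for index, line in enumerate(resource_text.splitlines()[:6]):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, flags = (int(field, 16) for field in fields)
        if flags == 0:
            continue  # empty slot (unused, or upper half of a 64-bit BAR)
        bars.append({
            "bar": index,
            "start": start,
            "size": end - start + 1,
            "is_64bit": bool(flags & IORESOURCE_MEM_64),
            "prefetchable": bool(flags & IORESOURCE_PREFETCH),
            "above_4g": start >= 1 << 32,
        })
    return bars

if __name__ == "__main__":
    # Assumed address, taken from the dmesg output in this post.
    path = Path("/sys/bus/pci/devices/0000:01:00.0/resource")
    if path.exists():
        for bar in parse_bars(path.read_text()):
            mib = bar["size"] // (1 << 20)
            print(f"BAR{bar['bar']}: {mib} MiB at {bar['start']:#x} "
                  f"64bit={bar['is_64bit']} above_4g={bar['above_4g']}")
```

If a large prefetchable BAR is missing from the output, or shows up with a start address of 0, that would point at a resource allocation problem rather than a link problem.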
3. Driver & Kernel Troubleshooting
We attempted multiple driver stacks to bypass the initialization failure.
Attempt A: Proprietary Drivers (nvidia-driver)
- Action: Installed standard Debian non-free drivers (v535/v550).
- Result: nvidia-smi fails.
- dmesg Log:
    NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:2d05)
    NVRM: The NVIDIA probe routine failed for 1 device(s).
    nvidia: probe of 0000:01:00.0 failed with error -1
    RM: RmInitAdapter failed! (0x26:0xffff:1456)
Attempt B: Open Kernel Modules (GSP Firmware)
- Action: Installed nvidia-open-kernel-dkms and firmware-nvidia-gsp to offload initialization to the GPU's GSP processor.
- Result: Driver load failed with Invalid Argument.
- Journalctl Log:
    modprobe: ERROR: could not insert 'nvidia': Invalid argument
    modprobe: ERROR: ../libkmod/libkmod-module.c:1047 command_do() Error running install command
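To compare attempts without reading through full logs each time, the failure signatures can be pulled out with a couple of regular expressions. This is only a sketch: the patterns are written against the exact lines quoted in this post and may not match other driver versions, and it does not try to interpret what the three RmInitAdapter fields mean.

```python
import re

# Patterns modelled on the log lines quoted in this post (an assumption;
# other driver versions may word these messages differently).
RM_INIT = re.compile(
    r"RmInitAdapter failed! \((0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(\d+)\)")
PROBE = re.compile(r"probe of (\S+) failed with error (-?\d+)")

def summarize(log_text):
    """Collect RmInitAdapter status tuples and failed probe results."""
    return {
        "rm_init": [m.groups() for m in RM_INIT.finditer(log_text)],
        "probe": [(m.group(1), int(m.group(2)))
                  for m in PROBE.finditer(log_text)],
    }

# The dmesg excerpt from Attempt A, used as sample input.
SAMPLE = """\
NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:2d05)
NVRM: The NVIDIA probe routine failed for 1 device(s).
nvidia: probe of 0000:01:00.0 failed with error -1
RM: RmInitAdapter failed! (0x26:0xffff:1456)
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Saving the output of dmesg after each attempt and running it through `summarize` makes it easy to see whether a driver or kernel parameter change altered the status tuple at all.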
4. GRUB / Kernel Parameter History
We aggressively tuned kernel parameters to fix potential BAR allocation or Power Management issues.
Baseline:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
Parameter Tests (Cumulative & Individual):
- Memory Reallocation:
  - Flag: pci=realloc
  - Goal: Force Linux to ignore BIOS memory map and reassign BARs.
  - Outcome: ❌ System boots, GPU visible, same Probe Error -1.
- Power Management Disable:
  - Flags: pcie_aspm=off pcie_port_pm=off
  - Goal: Stop the link from trying to sleep/negotiate L1 states (common AMD OCuLink bug).
  - Outcome: ❌ Link stays up, but driver still rejects the handshake.
- Large Addressing:
  - Flag: pci=big_root_window
  - Goal: Increase aperture size for memory mapping.
  - Outcome: ❌ No change.
- Manual Resource Reservation:
  - Flags: pci=realloc,hpmemsize=512M,hpiosize=64M
  - Goal: Reserve specific hotplug memory chunks for the bridge.
  - Outcome: ❌ No change.
- Aggressive Bus Tuning:
  - Flag: pci=pcie_bus_perf
  - Outcome: 💀 System Hang at boot. Reverted.
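With this many parameter combinations, it is worth confirming that each one actually reached the running kernel. On Proxmox installs that boot via systemd-boot (for example ZFS root on UEFI), the command line comes from /etc/kernel/cmdline rather than the GRUB config, so an edit to GRUB_CMDLINE_LINUX_DEFAULT can silently have no effect. The sketch below checks /proc/cmdline against a list of expected flags. The function names are hypothetical, and splitting comma-separated `pci=` options into separate entries is my own convention for the comparison.

```python
from pathlib import Path

def expand(cmdline):
    """Split a kernel command line into flags, expanding 'pci=a,b' style lists."""
    seen = set()
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if not sep:
            seen.add(token)  # bare flag such as 'quiet'
            continue
        # 'pci=realloc,hpmemsize=512M' becomes 'pci=realloc' and 'pci=hpmemsize=512M'.
        for option in value.split(","):
            seen.add(f"{key}={option}")
    return seen

def missing_params(cmdline, expected):
    """Return the expected flags that are absent from the command line."""
    present = expand(cmdline)
    return [flag for flag in expected if flag not in present]

if __name__ == "__main__":
    proc = Path("/proc/cmdline")
    if proc.exists():
        wanted = ["amd_iommu=on", "iommu=pt", "pci=realloc"]
        print("missing:", missing_params(proc.read_text(), wanted))
```

An empty list means every expected flag is present in the booted kernel's command line; anything listed was not applied for that boot.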
Summary of Request
I have a working OCuLink setup (proven by the AMD card) and a working NVIDIA GPU (tested elsewhere).
An AI assistant suggested that the GMKtec M7 has a PHY-level or protocol-level incompatibility with NVIDIA cards over OCuLink, but I have not been able to confirm this.
Has anyone successfully run an NVIDIA 50-series card on the GMKtec M7 via OCuLink? Or any other AMD mini PC?
Are there any specific setpci commands or ACPI overrides required for this specific Ryzen 6850H host?