Cannot access more than a few virtual functions using SRIOV
Working with SRIOV poses quite a few challenges. All of them can be solved, but the complexity is rather high, as debugging requires a fairly deep understanding of computer architecture.
We had one case on a test setup that had issues creating more than a few virtual functions (six in this case) - or, to be more precise, interacting with them.
Unbinding virtual functions for interface enp5s0f0np0
Unbinding virtual function at interface enp5s0f0np0, address 0000:05:00.2
Unbinding virtual function at interface enp5s0f0np0, address 0000:05:00.3
Unbinding virtual function at interface enp5s0f0np0, address 0000:05:00.4
Unbinding virtual function at interface enp5s0f0np0, address 0000:05:00.5
Unbinding virtual function at interface enp5s0f0np0, address 0000:05:00.6
Unbinding virtual function at interface enp5s0f0np0, address 0000:05:00.7
Unbinding virtual function at interface enp5s0f0np0, address 0000:05:01.0
Output of the script configuring the VFs
As we're working with VFs and OVS, we obviously want to change the eswitch mode on the NIC. To do so, we need to unbind the VFs before changing the mode.
Basically:
- Configure VF count
- Unbind VFs
- Configure physical function settings (like eswitch mode)
- Bind VFs
- Configure VFs
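The steps above can be sketched as a dry-run plan. This is only a sketch, not the script from the test setup: the driver name `mlx5_core`, the target mode `switchdev`, and the PF address `0000:05:00.0` are assumptions (the PF address is guessed from the interface name). The function `build_plan` is hypothetical and only builds the list of actions, it does not touch the system. The last step (configuring the VFs) is setup-specific and left out.

```python
from typing import List, Tuple

# One action is (kind, target, value):
#   ("write", sysfs_path, value_to_write) or ("run", shell_command, "")
Action = Tuple[str, str, str]


def build_plan(
    iface: str,
    pf_addr: str,
    vf_addrs: List[str],
    driver: str = "mlx5_core",       # assumption: VF driver of the NIC
    eswitch_mode: str = "switchdev",  # assumption: mode wanted for OVS
) -> List[Action]:
    """Return the ordered actions for the VF setup without executing them."""
    plan: List[Action] = []

    # 1. Configure VF count
    plan.append(
        ("write", f"/sys/class/net/{iface}/device/sriov_numvfs", str(len(vf_addrs)))
    )

    # 2. Unbind VFs from their driver
    for addr in vf_addrs:
        plan.append(("write", f"/sys/bus/pci/drivers/{driver}/unbind", addr))

    # 3. Configure physical function settings (here: the eswitch mode)
    plan.append(
        ("run", f"devlink dev eswitch set pci/{pf_addr} mode {eswitch_mode}", "")
    )

    # 4. Bind VFs again
    for addr in vf_addrs:
        plan.append(("write", f"/sys/bus/pci/drivers/{driver}/bind", addr))

    # 5. Configure VFs: setup-specific, intentionally not part of this sketch
    return plan


if __name__ == "__main__":
    # VF addresses as they appear in the log output above
    vfs = [f"0000:05:00.{fn}" for fn in range(2, 8)] + ["0000:05:01.0"]
    for kind, target, value in build_plan("enp5s0f0np0", "0000:05:00.0", vfs):
        print(kind, target, value)
```

Keeping the plan separate from its execution makes it easy to check the order of the steps before anything is written to sysfs.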
So the failure showed up exactly when the script moved on to another device on the PCI bus: the VFs from 0000:05:00.2 to 0000:05:00.7 worked, 0000:05:01.0 did not.
The reason was simple: SRIOV had not been enabled in the BIOS.
But it worked, somewhat - didn't it?
The much more interesting question: why did it work partially?
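This is where SRIOV comes into play. SRIOV allows one PCI device to show up as multiple devices on the bus. A PCI address has the form domain:bus:device.function, and a single device number only offers eight function numbers (0 to 7). Since the VFs start at function 2 here, only six of them fit into device 00 (0000:05:00.2 to 0000:05:00.7); the seventh shows up as 0000:05:01.0.
Look at the exact failure above: as long as we stayed within the same device, everything worked and we could interact with the functions. But as soon as the VFs spanned multiple devices on the bus, it failed.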
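To make the addressing concrete, here is a small sketch that computes VF addresses from the PF address. In SRIOV, the address of a VF is derived from the address of its PF plus a "First VF Offset" and a "VF Stride", both part of the PF's SRIOV capability. The values used below (offset 2, stride 1) and the PF address `0000:05:00.0` are assumptions inferred from the log output, not read from the NIC. The names `PciAddress`, `parse` and `vf_address` are made up for this example.

```python
from typing import NamedTuple


class PciAddress(NamedTuple):
    domain: int
    bus: int
    device: int    # 5 bits: 0..31
    function: int  # 3 bits: 0..7

    def __str__(self) -> str:
        return f"{self.domain:04x}:{self.bus:02x}:{self.device:02x}.{self.function:x}"


def parse(addr: str) -> PciAddress:
    """Parse an address like '0000:05:00.2' (all fields are hexadecimal)."""
    domain, bus, devfn = addr.split(":")
    device, function = devfn.split(".")
    return PciAddress(int(domain, 16), int(bus, 16), int(device, 16), int(function, 16))


def vf_address(pf: PciAddress, index: int, first_vf_offset: int, vf_stride: int) -> PciAddress:
    """Address of the VF with the given 0-based index.

    The 16-bit routing ID packs bus (8 bits), device (5 bits) and
    function (3 bits); the VF's routing ID is the PF's routing ID
    plus first_vf_offset plus index * vf_stride.
    """
    rid = (pf.bus << 8) | (pf.device << 3) | pf.function
    rid += first_vf_offset + index * vf_stride
    return PciAddress(pf.domain, (rid >> 8) & 0xFF, (rid >> 3) & 0x1F, rid & 0x7)


if __name__ == "__main__":
    pf = parse("0000:05:00.0")  # assumption: PF behind enp5s0f0np0
    for i in range(7):
        # offset 2 and stride 1 are inferred from the log, not from the NIC
        print(i, vf_address(pf, i, first_vf_offset=2, vf_stride=1))
```

With these assumed values, the first six VFs stay within device 00, and the seventh overflows the 3-bit function number and ends up on device 01, which matches the point where the script started to fail.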
All in all, this is a nice example to see how SRIOV works on the bus and to explain a little more of the concept behind it.