I have a custom board controlled by an NXP LS1043 CPU (64-bit ARM cores). The CPU is connected to an FPGA via PCI 3.0 x1 (2.5 GT/s).
On a cold start the CPU boots with no PCI device present on the FPGA side; I then flash the FPGA and bring up the PCI driver. On this first attempt everything works properly.
Next, I reset the FPGA and re-flash its firmware. During this time the PCI link is physically DOWN, but logically nothing changes: if I write to /sys/bus/pci/rescan, the device still reports that everything is OK, which is not true (the FPGA is in its reset state).
After the FPGA restarts, I use /sys/bus/pci/devices/%dev_num%/remove followed by /sys/bus/pci/rescan to restart the kernel driver, and it works properly.
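The remove/rescan cycle described above amounts to roughly the following sketch. The function name `pci_reprobe`, the `PCI_SYSFS` variable and the device address in the usage line are placeholders chosen for illustration, not values from the real setup.

```shell
#!/bin/sh
# Sketch of the remove/rescan cycle; names and address are illustrative only.
# PCI_SYSFS can be overridden to dry-run the sequence against a fake directory tree.
PCI_SYSFS="${PCI_SYSFS:-/sys/bus/pci}"

pci_reprobe() {
    dev="$1"    # PCI address of the FPGA endpoint (placeholder)
    # Writing 1 to "remove" unbinds the driver and drops the device from the PCI tree.
    if [ -e "$PCI_SYSFS/devices/$dev/remove" ]; then
        echo 1 > "$PCI_SYSFS/devices/$dev/remove" || return 1
    fi
    # Writing 1 to "rescan" re-enumerates the bus, so the driver is probed again.
    echo 1 > "$PCI_SYSFS/rescan"
}
```

It would be called as root once the FPGA is out of reset, for example `pci_reprobe 0000:01:00.0`.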
BUT after the first driver remove, the device starts hitting random kernel segfaults, mostly in the network stack (as it is the most heavily used part).
BEFORE the first driver remove/probe cycle there are no errors.
I removed all logic from the driver, so the probe/remove functions now ONLY allocate memory for the private struct. The problem is still there.
This is not my first driver (although it is my first for a PCI device), but it is the first one that has caused this type of problem.
Question: what can I do to detect problems with the use of shared resources? What kinds of resources could I be releasing improperly?