August 21, 2008
sneaky pci_dev refcounts
Normal friends and family, just skip this post. Trust me.
Recently, while doing some PCI slot symlink work, I noticed that we weren’t calling pci_release_dev() after taking a card offline (via sysfs). Well, at least not the first time, due to a leaked refcount in pci_get_dev_by_id(). gregkh fixed it, so that’s good, but I had a slightly difficult time tracking down the exact problem.
See, I had instrumented pci_dev_get() and pci_dev_put() to call dump_stack() every time the device that I cared about was touched:
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -789,6 +789,11 @@ struct pci_dev *pci_dev_get(struct pci_dev *dev)
{
if (dev)
get_device(&dev->dev);
+ if (dev && (dev->bus->number == 1) && (PCI_SLOT(dev->devfn) == 2)) {
+ dump_stack();
+ printk(KERN_INFO "dev refcount: %d\n",
+ atomic_read(&dev->dev.kobj.kref.refcount));
+ }
return dev;
}
[mutatis mutandis for pci_dev_put()]
But I wasn’t getting all the information I needed. Other pieces of code were playing with that device’s refcount too. So I did this instead (thanks to willy for the hint!):
@@ -983,7 +984,20 @@ int device_register(struct device *dev)
*/
struct device *get_device(struct device *dev)
{
- return dev ? to_dev(kobject_get(&dev->kobj)) : NULL;
+
+ if (!dev)
+ return NULL;
+
+ if (dev->bus == &pci_bus_type) {
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+ if ((pci_dev->bus->number == 1) && (PCI_SLOT(pci_dev->devfn) == 2)) {
+ dump_stack();
+ printk(KERN_INFO "dev refcount: %d\n",
+ atomic_read(&pci_dev->dev.kobj.kref.refcount));
+ }
+ }
+
+ return to_dev(kobject_get(&dev->kobj));
}
Armed with that patch, this is the list of code paths (as of 2.6.27) that can affect a struct pci_dev’s refcount:
* pci_dev_get * pci_dev_put * device_add -> get_device * device_add -> klist_add_tail -> ... -> get_device * device_del -> put_device * device_del -> klist_del -> ... -> put_device * bus_attach_device -> klist_add_tail -> ... -> get_device * bus_remove_device -> klist_del -> ... -> put_device * acpi_platform_notify -> acpi_bind_one -> get_device
There are lots of code paths that can potentially directly call device_add/del so if you’re debugging that, you’ll need to figure out which ones affect you. I included that acpi_bind_one() as an example of a surprising place where we might directly call get_device().
The only surprising gotcha might be the calls to klist_add_tail/klist_del. Those calls will be non-obvious to grep for while debugging your device’s lifetime. So if you are wondering why your struct pci_dev isn’t calling its ->release() properly, just patch get_device() directly, and be prepared to stare at the output for a while. Good luck.
[Note that device_add bumps the refcount by 2, but then calls put_device() during cleanup, for a net gain of +1.]





Leave a Reply