Fixing a Xend startup error

     2014-03-10       磊磊syh       Ops notes -> Xen virtualization       xend 

I installed Xend 4 on Ubuntu, and running sudo /etc/init.d/xend start reported Fail.
So I checked the xend log, /var/log/xen/xend.log, and found the following error:

[2014-03-09 22:22:40 3970] INFO (SrvDaemon:336) Xend changeset: unavailable.
[2014-03-09 22:22:40 3970] ERROR (SrvDaemon:349) Exception starting xend (Looped capability chain: 0000:03:00.0)
Traceback (most recent call last):
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/server/SrvDaemon.py", line 341, in run
    servers = SrvServer.create()
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/server/SrvServer.py", line 258, in create
    root.putChild('xend', SrvRoot())
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/server/SrvRoot.py", line 40, in __init__
    self.get(name)
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/web/SrvDir.py", line 84, in get
    val = val.getobj()
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/web/SrvDir.py", line 52, in getobj
    self.obj = klassobj()
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/server/SrvNode.py", line 30, in __init__
    self.xn = XendNode.instance()
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendNode.py", line 1181, in instance
    inst = XendNode()
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendNode.py", line 159, in __init__
    self._init_PPCIs()
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendNode.py", line 282, in _init_PPCIs
    for pci_dev in PciUtil.get_all_pci_devices():
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/util/pci.py", line 474, in get_all_pci_devices
    return map(PciDevice, get_all_pci_dict())
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/util/pci.py", line 699, in __init__
    self.get_info_from_sysfs()
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/util/pci.py", line 1269, in get_info_from_sysfs
    self.find_capability(0x11)
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/util/pci.py", line 1236, in find_capability
    ('Looped capability chain: %s' % self.name))
PciDeviceParseError: Looped capability chain: 0000:03:00.0

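The traceback shows that find_capability in /usr/lib/xen-4.1/bin/../lib/python/xen/util/pci.py raised an exception (PciDeviceParseError) that nothing caught, so it propagated all the way up and aborted xend startup.
I found a bug report about this problem online: https://bugzilla.redhat.com/show_bug.cgi?id=767742#c3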

The lspci command shows the details of the device that triggers the "Looped capability chain" error:

sun@localhost:~$ lspci -vvv -xxx -s 0000:03:00.0
03:00.0 SD Host controller: Ricoh Co Ltd MMC/SD Host Controller (rev ff) (prog-if ff)
    !!! Unknown header type 7f
    Kernel modules: sdhci-pci
00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
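The dump above is all 0xff, which is what a read typically returns when the device is not responding. The sketch below illustrates why such a config space can look like a looped capability chain. It is a simplified walk based on the standard PCI capability list layout, not xend's actual code: the names find_capability, CapabilityChainError and MAX_CAPABILITIES are chosen here for illustration, and Python 3 is assumed.

```python
class CapabilityChainError(Exception):
    """Raised when the capability list cannot be walked (illustrative name)."""


PCI_STATUS = 0x06            # low byte of the status register
PCI_STATUS_CAP_LIST = 0x10   # bit 4: the device has a capability list
PCI_CAPABILITY_PTR = 0x34    # offset holding the first capability pointer
MAX_CAPABILITIES = 48        # assumed bound: dword-aligned slots in 0x40..0xff


def find_capability(config, cap_id):
    """Return the offset of capability cap_id, or None if it is absent."""
    if not config[PCI_STATUS] & PCI_STATUS_CAP_LIST:
        return None
    pointer = config[PCI_CAPABILITY_PTR] & 0xFC  # low two bits are reserved
    seen = 0
    while pointer:
        if pointer < 0x40:
            raise CapabilityChainError("broken capability chain")
        seen += 1
        if seen > MAX_CAPABILITIES:
            raise CapabilityChainError("looped capability chain")
        if config[pointer] == cap_id:            # byte 0: capability ID
            return pointer
        pointer = config[pointer + 1] & 0xFC     # byte 1: next pointer
    return None


# A healthy device: MSI (0x05) at 0x40, followed by MSI-X (0x11) at 0x50.
healthy = bytearray(256)
healthy[PCI_STATUS] = PCI_STATUS_CAP_LIST
healthy[PCI_CAPABILITY_PTR] = 0x40
healthy[0x40], healthy[0x41] = 0x05, 0x50
healthy[0x50], healthy[0x51] = 0x11, 0x00

# A device that answers every read with 0xff, like 0000:03:00.0 above.
dead = bytes([0xFF] * 256)

if __name__ == "__main__":
    print(hex(find_capability(healthy, 0x11)))
    try:
        find_capability(dead, 0x11)
    except CapabilityChainError as exc:
        print("dead device:", exc)
```

With every byte set to 0xff, the capability-list bit appears set, the first pointer masks to 0xfc, and the "next" pointer read from there is 0xfc again, so the walk never terminates on its own and only the iteration cap stops it.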

For now, one fairly dirty workaround is to edit /usr/lib/xen-4.1/bin/../lib/python/xen/util/pci.py and wrap the code that calls find_capability in a try/except.
For the actual patch, see https://bugzilla.redhat.com/attachment.cgi?id=546883&action=diff
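To show the general shape of that workaround, here is a minimal, self-contained sketch. It is not the patch from the Bugzilla attachment and not xend's real code: find_capability below is a stand-in that only mimics the failure seen in the traceback, and get_info and the "parse_error" key are hypothetical names. It is written for Python 3.

```python
class PciDeviceParseError(Exception):
    """Stand-in for the exception type seen in the traceback."""


def find_capability(name, config, cap_id):
    """Stand-in lookup: fails like the traceback for an all-0xff device.

    Hypothetical simplification: for any other device it just reports
    the capability as absent.
    """
    if all(byte == 0xFF for byte in config):
        raise PciDeviceParseError("Looped capability chain: %s" % name)
    return False


def get_info(name, config):
    """Collect device info, tolerating a device whose config is unreadable."""
    info = {"name": name, "msix": False, "parse_error": None}
    try:
        # 0x11 is the capability ID looked up in the traceback (MSI-X).
        info["msix"] = bool(find_capability(name, config, 0x11))
    except PciDeviceParseError as exc:
        # Record the problem and carry on instead of aborting startup.
        info["parse_error"] = str(exc)
    return info


if __name__ == "__main__":
    print(get_info("0000:03:00.0", bytes([0xFF] * 64)))
```

The trade-off is that the broken device is silently treated as having no such capability, which is why this counts as a dirty fix: it hides the symptom rather than explaining why the device returns 0xff.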