CentOS 5.5: CMAN service fails to start and the GFS filesystem can no longer be mounted

Last edited by chinaboycj on 2011-02-23 22:46

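Hi everyone. I run a cluster of 6 KVM hosts on CentOS 5.5, and all 6 share the same storage. It had always worked fine. When I logged in to one node yesterday, df -h showed that the disk on the shared storage was not mounted. Mounting it by hand does not work either, and restarting the service produces the errors below. I don't know what the problem is. The other 5 nodes are all fine, and df -h on them shows the shared-storage disk as usual. Thanks, everyone.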

[root@KVMHost-B8AC6FD29380 ~]# /etc/init.d/cman start
Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Starting ccsd... done
   Starting cman... done
   Starting daemons... done
   Starting fencing... failed

                                                           [FAILED]




Feb 23 21:55:15 KVMHost-B8AC6FD29380 fenced[20413]: groupd is down, exiting
Feb 23 21:55:15 KVMHost-B8AC6FD29380 gfs_controld[20425]: groupd_dispatch error -1 errno 104
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SYNC ] This node is within the primary component and will provide service.
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node
Feb 23 21:55:16 KVMHost-B8AC6FD29380 gfs_controld[20425]: groupd connection died
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [TOTEM] entering OPERATIONAL state.
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node 4
Feb 23 21:55:16 KVMHost-B8AC6FD29380 gfs_controld[20425]: cluster is down, exiting
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum regained, resuming activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum regained, resuming activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node 3
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum lost, blocking activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node 2
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum regained, resuming activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node 1
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum lost, blocking activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum regained, resuming activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM  ] got nodejoin message 192.168.211.1
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM  ] got nodejoin message 192.168.211.2
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM  ] got nodejoin message 192.168.211.3
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM  ] got nodejoin message 192.168.211.4
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM  ] got nodejoin message 192.168.211.5
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG  ] got joinlist message from node 1
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG  ] got joinlist message from node 1
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG  ] got joinlist message from node 2
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG  ] got joinlist message from node 3
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG  ] got joinlist message from node 4
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] cman killed by node 5 because we were killed by cman_tool or other application
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SERV ] Unloading all openais components
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SERV ] Unloading openais component: openais_confdb v0 (19/10)
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SERV ] Unloading openais component: openais_cpg v0 (18/
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SERV ] Unloading openais component: openais_cfg v0 (17/7)
Feb 23 21:55:43 KVMHost-B8AC6FD29380 ccsd[13186]: Unable to connect to cluster infrastructure after 342150 seconds.
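
With this many interleaved messages it can help to pull out only the lines where a daemon says why it is exiting. Below is a minimal sketch in Python, assuming the syslog layout shown above ("date host daemon[pid]: message"). The keyword list and the names `LINE_RE`, `KEYWORDS`, `summarize` and `SAMPLE` are illustrative choices, not part of any cluster tool, and the script only filters text; it does not diagnose anything.

```python
import re

# Syslog lines shaped like the ones above:
# "Feb 23 21:55:16 HOST daemon[pid]: message" (the [pid] part is optional).
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+"
    r"(?P<daemon>[\w-]+)(?:\[(?P<pid>\d+)\])?:\s+(?P<msg>.*)$"
)

# Phrases copied from the log in this post; an assumed, non-exhaustive list.
KEYWORDS = ("killed by", "is down", "died", "quorum lost", "Unable to connect")


def summarize(lines):
    """Return (timestamp, daemon, message) for lines that mention a failure."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and any(k in m.group("msg") for k in KEYWORDS):
            hits.append((m.group("ts"), m.group("daemon"), m.group("msg")))
    return hits


# A few lines taken verbatim from the log above.
SAMPLE = [
    "Feb 23 21:55:15 KVMHost-B8AC6FD29380 fenced[20413]: groupd is down, exiting",
    "Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SYNC ] This node is within the primary component and will provide service.",
    "Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] cman killed by node 5 because we were killed by cman_tool or other application",
    "Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM  ] got nodejoin message 192.168.211.1",
]

if __name__ == "__main__":
    for ts, daemon, msg in summarize(SAMPLE):
        print(ts, daemon, msg)
```

To run it against a real log, pass the lines of the log file to `summarize` instead of `SAMPLE`. The path of that file depends on the syslog configuration.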

Author: chinaboycj   Posted: 2011-02-23

The OP actually built a 6-node cluster on KVM. Seriously impressive.

Author: nagaregawa   Posted: 2011-02-24