CentOS 5.5: CMAN service fails to start and the GFS filesystem can no longer be mounted
Last edited by chinaboycj on 2011-02-23 22:46
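Hi all. I run a cluster of 6 CentOS 5.5 KVM hosts, all sharing the same storage, and it had always worked fine. Yesterday I logged in to one node and found that df -h no longer showed the disk on the shared storage. Mounting it manually also failed, and restarting the service produced the errors below. I don't know what is wrong. The other 5 nodes are fine, and df -h on them shows the shared-storage disk as usual. Thanks, everyone.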
[root@KVMHost-B8AC6FD29380 ~]# /etc/init.d/cman start
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... done
Starting daemons... done
Starting fencing... failed
[FAILED]
Feb 23 21:55:15 KVMHost-B8AC6FD29380 fenced[20413]: groupd is down, exiting
Feb 23 21:55:15 KVMHost-B8AC6FD29380 gfs_controld[20425]: groupd_dispatch error -1 errno 104
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SYNC ] This node is within the primary component and will provide service.
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node
Feb 23 21:55:16 KVMHost-B8AC6FD29380 gfs_controld[20425]: groupd connection died
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [TOTEM] entering OPERATIONAL state.
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node 4
Feb 23 21:55:16 KVMHost-B8AC6FD29380 gfs_controld[20425]: cluster is down, exiting
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum regained, resuming activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum regained, resuming activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node 3
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum lost, blocking activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node 2
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum regained, resuming activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 kernel: dlm: closing connection to node 1
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum lost, blocking activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] quorum regained, resuming activity
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM ] got nodejoin message 192.168.211.1
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM ] got nodejoin message 192.168.211.2
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM ] got nodejoin message 192.168.211.3
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM ] got nodejoin message 192.168.211.4
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CLM ] got nodejoin message 192.168.211.5
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG ] got joinlist message from node 1
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG ] got joinlist message from node 1
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG ] got joinlist message from node 2
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG ] got joinlist message from node 3
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CPG ] got joinlist message from node 4
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [CMAN ] cman killed by node 5 because we were killed by cman_tool or other application
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SERV ] Unloading all openais components
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SERV ] Unloading openais component: openais_confdb v0 (19/10)
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SERV ] Unloading openais component: openais_cpg v0 (18/
Feb 23 21:55:16 KVMHost-B8AC6FD29380 openais[20395]: [SERV ] Unloading openais component: openais_cfg v0 (17/7)
Feb 23 21:55:43 KVMHost-B8AC6FD29380 ccsd[13186]: Unable to connect to cluster infrastructure after 342150 seconds.
Author: chinaboycj  Posted: 2011-02-23
The OP actually built a 6-node cluster on KVM. That's seriously impressive.
Author: nagaregawa  Posted: 2011-02-24