Red Hat / Fedora :: Cluster Suite: Service Is Not Migrated When a Node Is Powered Off?
Nov 20, 2009
I am using the Red Hat Cluster Suite (luci and ricci) on CentOS 5.4. I have 2 nodes in a cluster and have clustered an Apache server. The service is up and running, and I can stop, start and switch it on both nodes. The problem comes when I try to simulate a fault on one node. For example: the Apache resource is on the first cluster node. If I power off the first node (not halt or init 0, but cutting the electrical power), the second node does not take over the resource. According to the clustat command, the service is still running on the first node, but the service is down and the first node is dead. Only once the first node joins the cluster again does the resource come up on the second node.
So far, I can ping a virtual IP and manually relocate it between the nodes, but I haven't figured out how to do this automatically. So this is my question: how can I set up the cluster so that it automatically fails over a service to another node in case one node fails?
I am trying to build a GFS2 cluster with 2 or 3 Fedora 14 nodes, but I've encountered some problems from the start. First, luci does not work at all in Fedora 14. There is no luci_admin, and even if I manage to start the luci service, I get a blank white screen when I try to open it in the browser. I've googled a bit and found that I might be able to set up GFS if I manage to build cluster.conf manually and start the cluster suite, but I cannot find documentation on how to create cluster.conf anywhere. Does anyone know how to set up GFS2 without a cluster suite, or how to configure cluster.conf?
I am working on a project that needs to set up an Apache web cluster. The cluster needs to be both a high-availability (HA) cluster and a load-balancing cluster. Someone mentioned Red Hat Cluster Suite, but, honestly, I can't figure out how it works and haven't been able to configure it correctly. The project currently has two nodes, but we need to support at least three nodes in the cluster.
I have created a simple menu-driven script for our Operations team to take care of the basic monitoring and managing of our production application from the back end. The script was fine when tested in the UAT environment, but when deployed to production it behaved oddly. When the operator chooses an option from the menu, he is given the output and at the end is prompted to return to the main menu by pressing Ctrl+C. In production, this return does not occur for some reason and the program just sits there. The session becomes unresponsive after that, and I'm forced to terminate it by closing PuTTY. I tried enabling debug mode too (set -x) and still was not able to find any useful hints as to why.
I don't have much experience in clustering, and I'm deploying a cluster system on CentOS. But I don't know how long it should take for a node to fail and another node to take over its resources and continue running the service. What counts as good, fast or slow? 1s, 10s or
I have a lack of understanding of CentOS in general. I have looked for a remedy on other forums and Google, but haven't been able to find the answer. I have a 3-node cluster that was functioning great until I decided to go offline for a while. My config is as follows: node 2: vh1; node 3: vh2; node 4: vh6. All nodes connect to a common shared area on an iSCSI device (vguests_root).
Currently vh2 and vh6 connect fine; however, since putting the machines back online I can no longer connect with vh1. A dmesg command on vh1 reveals the following:
GFS2: fsid=: Trying to join cluster "lock_dlm", "Cluster1:vguest_roots"
GFS2: fsid=Cluster1:vguest_roots.2: Joined cluster. Now mounting FS...
GFS2: fsid=Cluster1:vguest_roots.2: can't mount journal #2
GFS2: fsid=Cluster1:vguest_roots.2: there are only 2 journals (0 - 1)
.....
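A note on reading the dmesg lines above: GFS2 needs one journal for each node that mounts the filesystem, and the log reports 2 journals (0 and 1) while this node is asking for journal #2. Below is a minimal Python sketch that works out how many journals appear to be missing. It assumes the log wording is exactly as quoted in the post; the function name journals_missing is made up for illustration.

```python
import re

def journals_missing(dmesg_text):
    """Return how many journals the failed mount is short of, or 0 if unknown."""
    wanted = re.search(r"can't mount journal #(\d+)", dmesg_text)
    have = re.search(r"there are only (\d+) journals", dmesg_text)
    if not wanted or not have:
        return 0  # the expected messages are not in this log
    needed = int(wanted.group(1)) + 1  # journal ids start at 0
    return max(0, needed - int(have.group(1)))

# Log lines copied from the post above.
sample = """\
GFS2: fsid=Cluster1:vguest_roots.2: can't mount journal #2
GFS2: fsid=Cluster1:vguest_roots.2: there are only 2 journals (0 - 1)
"""

print(journals_missing(sample))
```

If the result is non-zero, the usual route is to add journals with gfs2_jadd from a node that already has the filesystem mounted; check the man page for your version before running it.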
I need to build a 3-node web server cluster to run a PHP application. Since the app requires users to log in (which means session state must be maintained), I will be sharing the session save path; I also need to share the application directory across the 3 nodes. I am having trouble deciding which cluster file system to select.
I was following [URL] for the cluster setup, but things didn't work out. One question: do I need to run the services cman, ccsd and rgmanager one at a time on both machines? I have been running them through a script.
I have just installed a two-server cluster with ricci, luci and Conga on CentOS 5.6 32-bit. Both servers are VMware guests and have a shared storage disk connected to them both, with a GFS2 file system on it and fencing agents configured to work with VMware vCenter.
(This is supported by VMware and works great on 4 other CentOS clusters I have been running for 4 months without CLVMD.)
In this setup I used CLVMD for the first time, as recommended by Red Hat, so I could have the flexibility of LVM under the GFS2 file system, but I have been getting a strange problem with it: sometimes, after a developer has done an I/O-heavy task like unzipping a file or a simple tar, the load goes to 10-15 and no task can be killed; trying to reboot the server hangs.
After a hard shutdown of the server, everything works OK until the next time someone does the same I/O work as before.
I am using the Red Hat Cluster Suite (luci and ricci) on CentOS 5.3. I have 2 nodes in a cluster. When I power off the first node, on which a VM service is running, the service switches to node2. So far, so good :) But when I restart node1, the service does not fail back to node1! I have created a failover domain with both nodes and prioritized it so that node1 has priority 1 and node2 has priority 2.
This is my first post, and I have a question about the fence software for VMware ESXi. The fence_vmware agent only works with ESX, and Red Hat (in their Git repository) has submitted a new agent called fence_vmware_ng that claims to work with ESXi. The problem is that they do not specify which version it works with. Has anybody tested the fence_vmware_ng agent with VMware ESXi 4.0? I followed the instructions here: [URL]... and I could install the software, the API from the VMware site, etc., but when I run the agent nothing happens. The agent connects to the server, as I can see in the logs, but the off, reboot and on operations do not work. Only the status operation works, returning the state of a virtual machine. I have CentOS 5.3 (fully updated as of today) with RHCS.
I have made a cluster between two servers. In luci I can see that my cluster is green, and the two nodes too. I have created an IP resource and associated it with a service (green): I can relocate the service from one node to the other, and the IP appears in the list of IP addresses. The problem is that I have done the same in order to configure Tomcat and PostgreSQL, and it does not work... Here is my configuration for the IP and Tomcat only:
After a few days of hard work on Red Hat cluster and Piranha, I have hopefully done 90%, but I am stuck with the iptables rules. I am attaching full Piranha server screenshots of my network. Please have a look and tell me what else to do on the Piranha server:
a) On the firewall (Linksys), which IP should I forward port 80 to (192.168.1.66 or 192.168.1.50)?
b) Currently it looks like HTTP requests are not being forwarded from the virtual server to the real server. What iptables rules should I write? (Please have a look at the iptables rules.)
Also, this link is for my Piranha server setup. I guess I am stuck somewhere where I need an expert's eye to catch it, so please look at all the pictures, the ifconfig output and the iptables rules. ifconfig:
ifconfig eth0 Link encap:Ethernet HWaddr 00:0F:3D:CB:0A:8C inet addr:192.168.1.66 Bcast:192.168.1.255 Mask:255.255.255.0
I am having an issue with LVM on a 2-node cluster. We are using PowerPath for external drives. We had a request to increase the /apps filesystem, which is ext3.
On the first node we ran:
pvcreate /dev/emcpowercx1 /dev/emcpowercw2
vgextend apps_vg /dev/emcpowercw2 /dev/emcpowercx1
lvresize -L +60G /dev/apps_vg/apps_lv
resize2fs /dev/apps_vg/apps_lv
Everything went well and /apps was increased. But when I run pvs on the second node, I get the following error:
WARNING: Duplicate VG name apps_vg: RnD1W1-peb1-JWay-MyMa-WJfb-41TE-cLwvzL (created here) takes precedence over ttOYXY-dY4h-l91r-bokz-1q5c-kn3k-MCvzUX
How can I proceed from here?
I'm building a 3-node cluster. I have created OCFS2 filesystems and mounted them on the first two nodes. But while mounting them on the third node, I'm getting this error for 8 of the LUNs. All 8 of these LUNs are 1 GB in size.
I unmounted these 8 LUNs from the other node and tried to mount them on the third node, and then it worked, but the error then occurred on the second node. My observation is that, for some reason, these particular LUNs are not allowing a third node to mount them.
mount.ocfs2: Invalid argument while mounting /dev/mapper/voting1 on /voting1. Check 'dmesg' for more information on this error.
I am using CentOS. I have read in places that you can use DRBD + Heartbeat + NFS to make a simple failover NFS server. I can't find any document that works, though. I've tried 20 or so, including some Debian ones. So, does anyone have any other ideas on how to do this? Point me in the right direction, please. I want 2 nodes: one actively serving an NFS share, the other ready for failover. If the first one goes down, the second takes over, meaning the filesystem is in sync, the IP must change, and NFS must come up.
I have a two-node Red Hat cluster for a MySQL database. The problem is that the service cannot be relocated to the second node, both before and after updating the packages on both nodes; the problem persists even after rebooting the server. When I start the service on the second node, it starts on the first one instead. Other services are running fine on both nodes. I have checked /etc/hosts, the bonding configuration and many other files, and they seem fine. Please find the log below for reference.
<notice> Starting stopped service service:hell Oct 22 14:35:51 indls0040 kernel: kjournald starting. Commit interval 5 seconds Oct 22 14:35:51 indls0040 kernel: EXT3-fs warning: maximal mount count reached, running
I'm having some trouble configuring clustering in a 2-node cluster with no shared FS. The application is video streaming, so outbound traffic only... The cluster is generally OK: if I kill -9 one of the resource applications, the failover works as expected. But it does not fail over when I disconnect the power from the node that owns the service (simulating a hardware failure). clustat on the remaining node shows that the powered-down node has status "Offline", so it knows the node is not responding, but the remaining node does not become the owner, nor does it start up the cluster services/resource applications. eth0 on each node is connected via a crossover cable for heartbeat, etc. Each eth1 connects to a switch.
I want to configure a two-node cluster for qmail-toaster. My idea is that if one server's hardware fails, the service should be transferred/migrated to the other qmail-toaster server with all its settings, such as domains, users, passwords, etc.
I was given the task of installing Red Hat Linux on one of the compute nodes of a server that doesn't have a CD/DVD drive or USB port. I have the installation media as well as an ISO image. The server is on the network, so I can access it from my PC, which is running Windows 7. I think I have 2 choices for the install:
1. Copy the ISO image to the head node on the server and then install Linux on the compute node via NFS, or
2. Use my PC's DVD drive to install Linux on the compute node over the network.
But I don't know how to do either.
I am trying to use a Rocks cluster for large-scale computing. All my slave nodes are connected to the Rocks cluster master node, but I want to run a graphical application on a cluster node. I don't understand how to do this.
Using Google with the search terms: cman not started: Can't find local node name in cluster.conf /usr/sbin/cman_tool: aisexec daemon didn't start. I found this URL... I have found the config_version in cluster.conf. Unfortunately, as everyone may have noticed, English is not my native tongue, so I am having trouble understanding the part "Make sure you bumped the cluster config version number". Can anyone enlighten me on what I should be doing to "bump" the cluster config version?
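For what it's worth, "bumping" here just means raising the number in the config_version attribute of the <cluster> tag by one each time cluster.conf is edited, so the nodes can tell the new file from the old one. Below is a minimal Python sketch of that edit on the file's text. The function name bump_config_version and the sample cluster name "example" are made up for illustration, and it assumes config_version sits on the opening <cluster> tag.

```python
import re

def bump_config_version(conf_text):
    """Return conf_text with the <cluster> config_version raised by one."""
    def plus_one(match):
        # Keep everything around the number, replace only the digits.
        return "%s%d%s" % (match.group(1), int(match.group(2)) + 1, match.group(3))

    new_text, count = re.subn(
        r'(<cluster\b[^>]*\bconfig_version=")(\d+)(")',
        plus_one,
        conf_text,
        count=1,
    )
    if count != 1:
        raise ValueError("no config_version attribute found on <cluster>")
    return new_text

# Hypothetical, stripped-down cluster.conf used only to show the change.
sample = '<?xml version="1.0"?>\n<cluster name="example" config_version="7">\n</cluster>\n'
print(bump_config_version(sample))
```

Editing the number by hand in a text editor does the same thing; either way the updated file still has to be propagated to the other nodes with whatever tooling your cluster version provides.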
I need to set up a Linux cluster. I prefer Ubuntu because of its support and because I personally use Ubuntu. Can anyone explain in brief what is needed to set up an Ubuntu-based cluster? Each node (6 in total) will be a Core 2 Duo with 4 GB RAM; I need 4 nodes plus 2 for load balancing.