So far, I can ping a virtual IP and manually relocate it between the nodes, but I haven't figured out how to do this automatically. So this is my question: how can I set up the cluster so that it automatically fails over a service to another node in case one node fails?
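For reference, a minimal sketch of what the automatic part usually looks like, assuming a Pacemaker/Corosync stack managed with the pcs shell (the IP address, resource names and the Apache example are placeholders, not taken from the question); once the VIP and the service are defined as cluster resources and grouped, the cluster restarts them on a surviving node when monitoring fails:

    # VIP as a cluster resource, monitored every 10s
    pcs resource create cluster_vip ocf:heartbeat:IPaddr2 \
        ip=192.168.1.100 cidr_netmask=24 op monitor interval=10s
    # The service that should follow the VIP (Apache used here as a placeholder)
    pcs resource create web_service ocf:heartbeat:apache \
        configfile=/etc/httpd/conf/httpd.conf op monitor interval=20s
    # Group them so they always run together and fail over as one unit
    pcs resource group add web_group cluster_vip web_service
    # For a test setup without a fence device only
    pcs property set stonith-enabled=false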
I have created a simple menu-driven script for our operations team to take care of the basic monitoring and managing of our production application from the back end. The script was fine when tested in the UAT environment, but when deployed to production it behaved oddly. When the operator chooses an option from the menu, he is given the output and at the end is prompted to return to the main menu with ctrl+c. In production, this return does not happen for some strange reason and the program just sits there. The session becomes unresponsive after that and I'm forced to terminate it by closing PuTTY. I tried enabling debug mode (set -x) too, and still was not able to find any useful hints/trails as to why.
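For comparison, a minimal sketch of the kind of SIGINT handling such a menu usually relies on (the commands and log path below are placeholders, not the real script). One possible cause of the production behaviour: if the script is launched with SIGINT already ignored (for example under nohup or from a wrapper), a non-interactive bash will silently refuse to install the trap and ctrl+c appears to do nothing.

    #!/bin/bash
    # Return to the menu on ctrl+c instead of letting SIGINT wedge the session.
    trap 'echo; echo "returning to menu"' INT

    show_menu() {
        echo "1) show uptime"
        echo "2) follow application log (ctrl+c returns to menu)"
        read -r -p "choice: " choice
        case "$choice" in
            1) uptime ;;
            2) tail -f /var/log/messages ;;   # placeholder log path
        esac
    }

    while true; do
        show_menu
    done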
I don't have much experience in clustering, and I'm deploying a cluster system on CentOS. I don't know how long it should take for a node to fail and another node to take over its resources and continue running the service. What counts as good, fast, or slow: 1s, 10s, or ...?
I have a directory from my app server (Solaris) mounted on my DB server (Solaris). The reason for the mount is that Oracle writes files with UTL_FILE on the DB server only. The mount is done and I can create a file in the mount point with vi, but UTL_FILE is not able to create a file. The reason might be that Oracle writes only as the oracle user and my app server has no such user. For that I have given permission 777 on that particular folder, but it made no difference, so I wonder whether I need some additional permission for this.
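A hedged sketch of the two things usually worth checking in this situation, assuming the app server is the NFS server; the export path and UID below are placeholders, not values from the question:

    # On the DB server: find the numeric UID Oracle writes as
    id oracle
    # On the app server (the NFS server): create a user with that same UID
    # (1001 is a placeholder - use the value reported by "id oracle")
    useradd -u 1001 oracle
    chown oracle /export/oradata          # placeholder export path
    # Or, as a test only, allow unmapped users to map to root on the Solaris share
    share -F nfs -o rw,anon=0 /export/oradata

If the write starts working with anon=0, the problem is UID mapping on the NFS server rather than directory permissions.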
I'm building a 3-node cluster. I have created OCFS2 filesystems and mounted them on the first two nodes, but while mounting them on the third node I'm getting the error below for 8 of the LUNs. All 8 of these LUNs are 1GB in size.
I've unmounted these 8 LUNs from one of the other nodes and tried to mount them on the third node; then it worked, but the error appeared on the second node instead. My observation is that, for some reason, these particular LUNs will not allow a third node to mount them.
mount.ocfs2: Invalid argument while mounting /dev/mapper/voting1 on /voting1. Check 'dmesg' for more information on this error.
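One common cause of this pattern (any two nodes can mount, a third cannot) is that those filesystems were formatted with only two node slots. A hedged sketch of how to check and raise the slot count, using the device name from the error above; check dmesg for the definite reason first, and change the slot count only while the filesystem is unmounted on all nodes:

    # Show the current number of node slots on one of the affected LUNs
    debugfs.ocfs2 -R "stats" /dev/mapper/voting1 | grep -i slot
    # Raise the slot count so a third node can join
    tunefs.ocfs2 -N 3 /dev/mapper/voting1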
I want to configure a two-node cluster for qmail-toaster. My idea is: if one server's hardware fails, it should transfer/migrate the service to the other qmail-toaster server with all settings, like domains/users/passwords etc.
I was given the task of installing Red Hat Linux on one of the compute nodes of a server that has no CD/DVD drive or USB port. I have the installation media as well as the ISO image. The server is on the network, so I can access it from my PC, which runs Windows 7. I think I have two choices for the install: 1. copy the ISO image to the head node of the server and then install Linux on the compute node via NFS, or 2. use my PC's DVD drive to install Linux on the compute node over the network. But I don't know how to do either.
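A hedged sketch of the first option, assuming the head node runs Linux; the paths and hostname are placeholders, and the compute node still needs some way to start the installer kernel (typically PXE from the head node):

    # On the head node: make the ISO contents available and export them over NFS
    mkdir -p /srv/install/rhel
    mount -o loop /path/to/rhel-dvd.iso /srv/install/rhel    # placeholder ISO path
    echo "/srv/install/rhel *(ro,no_root_squash)" >> /etc/exports
    exportfs -ra
    service nfs start
    # On the compute node: boot the installer, choose NFS as the installation
    # source, and point it at headnode:/srv/install/rhel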
I am (still) trying to create a 2-node cluster on CentOS 5.2 with a Dell MD3000 as storage. However, I get this when I try to probe for storage in luci: An error has occured while probing storage:
I have a machine with an integrated USB flash drive somewhere inside it that I cannot seem to mount. The OS knows it is there, as I can see it in the dmesg output, but the actual "node" in /dev doesn't exist. I'm baffled. Will you help me figure this out? Here's what I have come up with so far: the disk is attached to the USB bus and used as a SCSI block device. In dmesg it is referred to at one point as sdf, and it also shows up under /sys/class/scsi_device. But when I check /dev, /dev/sdf does not exist!
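A hedged sketch of what can be tried when the kernel registers a disk but no node appears, assuming a udev-based system (on very old distributions the udev commands are named differently); the major:minor numbers below are placeholders to be read from sysfs, not guesses:

    # Ask udev to (re)create missing block device nodes
    udevadm trigger --subsystem-match=block
    udevadm settle
    # If /dev/sdf still doesn't appear, read the device numbers from sysfs
    cat /sys/class/block/sdf/dev          # prints something like 8:80
    # ...and create the node by hand with the numbers printed above
    mknod /dev/sdf b 8 80                 # 8 80 is a placeholder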
I have a RHEL cluster with two nodes, and the cluster is working fine. But suddenly, this morning, I'm not able to ssh to one of the nodes; it fails with the following error:

debug2: we sent a hostbased packet, wait for reply
Connection closed by 10.125.104.162

After a while I was not able to ssh within that node either, and after some time the second node also started behaving like this. Now ssh within the nodes and between the nodes is not working, but I am able to open a PuTTY session. Error from /var/log/messages:

vsftpd(pam_unix): authentication failure; logname= uid=0 euid=0 tty= ruser= rhost=
After a few days of hard work on Red Hat Cluster and Piranha, I have hopefully done 90% of it, but I am stuck with the iptables rules. I am attaching a full screenshot of the Piranha server and of my network; please have a look and tell me what else to do on the Piranha server: a) on the firewall (Linksys), which IP should I forward port 80 to (192.168.1.66 or 192.168.1.50)? b) Currently it looks like the HTTP request is not being forwarded from the virtual server to the real server; what iptables rules should I write? (Please have a look at the iptables rules.) There is also a link to my Piranha server setup. I guess I am stuck somewhere where I need an expert's eye to catch it, so please look at all the pictures, the ifconfig output and the iptables rules. ifconfig:
ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0F:3D:CB:0A:8C
          inet addr:192.168.1.66  Bcast:192.168.1.255  Mask:255.255.255.0
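On question a), the port-forward from the firewall normally points at the virtual server address (the VIP that Piranha brings up), not at a real server's own address. For question b), a hedged sketch of the rules usually needed on an LVS-NAT director so that replies from the real servers return through the virtual server; the 192.168.2.0/24 real-server network and the eth0 outside interface are assumptions and must be adapted to the actual Piranha topology shown in the screenshots:

    # On the director: enable packet forwarding
    echo 1 > /proc/sys/net/ipv4/ip_forward
    # Masquerade traffic from the real-server network so replies go back via the director
    iptables -t nat -A POSTROUTING -s 192.168.2.0/24 -o eth0 -j MASQUERADE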
I have a two-node DRBD cluster. While drbd1 is primary and drbd2 is secondary, to see the content on drbd2 I have to make drbd1 secondary and unmount the DRBD partition, then make drbd2 primary and mount the partition there. Is there any way to automate this, so that when drbd1 goes down, drbd2 makes itself primary and mounts the partition?
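A hedged sketch of the classic way to automate this with Heartbeat v1-style resources, assuming the DRBD resource is called r0, the device is /dev/drbd0, the mount point is /data and the service IP is 192.168.1.200 (all placeholders); Heartbeat then promotes DRBD and mounts the partition on whichever node is alive:

    # /etc/ha.d/haresources (identical on both nodes; drbd1 is the preferred node)
    # drbddisk promotes r0 to primary, Filesystem mounts it, IPaddr raises the service address
    drbd1 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext3 IPaddr::192.168.1.200/24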
I've configured DRBD with Heartbeat. The nodes connect at first, but when I run "/usr/lib/heartbeat/hb_standby" on node 1, node 2 won't take over, and after a while both nodes end up in "WFConnection".
I figure this happens because I created "/dev/drbd0" on both nodes! Do you know how to remove it? I googled it but found nothing.
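For what it's worth, having /dev/drbd0 on both nodes is normal; both nodes sitting in WFConnection (or StandAlone) after a failed takeover is more typically a sign of a DRBD split-brain. A hedged sketch of the manual recovery, assuming the resource is named r0 and that node 2 is the one whose local changes can be discarded (both are assumptions to verify against your configuration):

    # On the node whose data will be discarded (assumed: node 2)
    drbdadm secondary r0
    drbdadm disconnect r0
    drbdadm -- --discard-my-data connect r0
    # On the surviving node (assumed: node 1)
    drbdadm connect r0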
I am having an issue with LVM on a 2-node cluster. We are using PowerPath for the external drives. We had a request to increase the /apps filesystem, which is ext3.
On the first node we did:

pvcreate /dev/emcpowercx1 /dev/emcpowercw2
vgextend apps_vg /dev/emcpowercw2 /dev/emcpowercx1
lvresize -L +60G /dev/apps_vg/apps_lv
resize2fs /dev/apps_vg/apps_lv

Everything went well and /apps was increased. But on the second node, when I run pvs,
I get the following warning:

WARNING: Duplicate VG name apps_vg: RnD1W1-peb1-JWay-MyMa-WJfb-41TE-cLwvzL (created here) takes precedence over ttOYXY-dY4h-l91r-bokz-1q5c-kn3k-MCvzUX

How can I proceed from here?
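A hedged reading of what this usually means on PowerPath hosts: LVM on the second node is scanning both the /dev/emcpower* pseudo-devices and the underlying /dev/sd* paths, so the same VG is seen twice under two UUIDs. The usual cure is to restrict the LVM filter to the PowerPath devices and rescan; the filter line below is an assumption to adapt (in particular, keep an accept pattern for any internal /dev/sd* disk that holds your root VG):

    # /etc/lvm/lvm.conf on the second node - accept emcpower devices, reject sd* paths
    filter = [ "a|^/dev/emcpower.*|", "r|^/dev/sd.*|" ]

    # Then rebuild the LVM view and check
    vgscan
    pvs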
I am running a 5-node 11g RAC cluster with Cluster Ready Services. The hardware is HP BL460 G6 with Virtual Connect modules on each chassis. The OS is RHEL 5.5, with 2 QLogic 4G fibre cards and 2 Broadcom NICs bonded, and the back-end disk is all Hitachi 15k. The issue I'm seeing is that periodically each day there are spikes in I/O waits for disk writes, and sometimes that causes a node or two to reboot when CRS can't communicate with the other nodes. Both nodes that have been evicted are on the same chassis. What I've checked: the back-end storage is not seeing ANY high utilization; in fact it runs at less than 10% all the time. The network is 10G and shows no errors on the switch or on the blades themselves. The redo logs have a 10M buffer cache and run on ASM disk. What other information would be beneficial to check? I see nothing in the logs as to any errors or waits for disk writes. I believe it's a software issue, but am lost as to how to prove it. These aren't the only nodes experiencing periodic high I/O waits, and it happens at exactly the same time on all systems, whether they run ASM or are part of a cluster.
I want to change my server's node name, which is the output of "uname -n". The server runs CentOS 5. I searched but couldn't find how. There were some search results about /etc/nodename, but I don't have a file at that path. Some also said "uname -S", which doesn't work.
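On CentOS 5 the node name doesn't live in /etc/nodename (that's a Solaris path). A sketch of the usual procedure, with newname.example.com as a placeholder:

    # Change it for the running system
    hostname newname.example.com
    # Make it persistent across reboots
    sed -i 's/^HOSTNAME=.*/HOSTNAME=newname.example.com/' /etc/sysconfig/network
    # Keep /etc/hosts consistent so sudo, cluster tools, etc. can still resolve the name,
    # e.g. add or adjust:  192.168.1.10  newname.example.com  newname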
On an HA cluster, the 2nd node decided that the 1st node was down, but the first node wasn't down. As a result, the 2nd node tried to take over the resources but failed, because the resources were still in use by the first node. This left the first node behind in a fuzzy state. I had no choice but to kill the heartbeat service and reboot the server to solve the issue. There were no network issues and all the hardware is OK. Are there any known bugs? Is there a way to avoid this from happening again?
There are 3 nodes: A, B, and C. Node A wants to send information to node B, but it does so by sending it to node C first, which then forwards it to B; similarly, node B sends to A via C. In this scheme C does not send to both A and B simultaneously. This is the built-in algorithm, but I want to change it so that A and B send their packets to C, and C sends both packets to A and B combined by ORing them, so that on the receiving side node A can recover the wanted packet and so can B. Where do I change the algorithm?
I made a diskless image from Fedora 15; during boot it displayed the following error message and dropped into emergency mode.
The error message:
Starting Relabel all filesystems, if necessary... aborted because a dependency failed.
[  107.607155] systemd: Job fedora-autorelabel.service/start failed with result 'dependency'.
Starting Mark the need to relabel after reboot... aborted because a dependency failed.
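This is the SELinux autorelabel unit failing, which is common on a diskless/read-only root. A hedged workaround sketch, assuming the node boots via PXE and that disabling SELinux is acceptable for this image; the pxelinux path and root= argument below are placeholders, not taken from the setup described:

    # e.g. in /tftpboot/pxelinux.cfg/default, append selinux=0 to the existing
    # kernel command line so the fedora-autorelabel units never run
    APPEND initrd=initramfs.img root=nfs:headnode:/diskless/fc15 selinux=0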
I have a general lack of understanding of CentOS. I have looked for a remedy on other forums and Google, but haven't been able to find the answer. I have a 3-node cluster that was functioning great until I decided to take it offline for a while. My config is as follows:

node 2: vh1
node 3: vh2
node 4: vh6

All nodes connect to a common shared area on an iSCSI device (vguests_root).
Currently vh2 and vh6 connect fine; however, since putting the machines back online I can no longer connect with vh1. A dmesg command on vh1 reveals the following:

GFS2: fsid=: Trying to join cluster "lock_dlm", "Cluster1:vguest_roots"
GFS2: fsid=Cluster1:vguest_roots.2: Joined cluster. Now mounting FS...
GFS2: fsid=Cluster1:vguest_roots.2: can't mount journal #2
GFS2: fsid=Cluster1:vguest_roots.2: there are only 2 journals (0 - 1)
.....
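A hedged reading of that dmesg output: the GFS2 filesystem was created with only 2 journals, so only two nodes can mount it at a time, and vh1 (slot 2) is asking for a third journal that doesn't exist. A sketch of the usual fix, assuming /vguests_root is the mount point on one of the nodes that still has it mounted (placeholder path):

    # On a node where the filesystem is currently mounted (e.g. vh2)
    gfs2_tool journals /vguests_root       # confirm how many journals exist
    gfs2_jadd -j 1 /vguests_root           # add one more journal so a third node can mount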
I am using CentOS. I have read in places that you can use DRBD + Heartbeat + NFS to make a simple failover NFS server, but I can't find any document that actually works; I've tried 20 or so, including some Debian ones. So, does anyone have any other ideas on how to do this? Point me in the right direction, please. I want 2 nodes: one actively serving an NFS share, the other ready for failover. If the first one goes out, the second takes over: the filesystem stays in sync, the IP moves, and NFS comes up.
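For what it's worth, a hedged sketch of the Heartbeat v1 pieces that usually tie this together on CentOS, assuming a DRBD resource r0 backing /dev/drbd0, an export directory /export, a floating IP 192.168.1.210 and node names nfs1/nfs2 (all placeholders): DRBD keeps the data in sync, and Heartbeat moves the mount, the IP and the NFS service as one unit.

    # /etc/ha.d/haresources (identical on both nodes; nfs1 is the preferred node)
    nfs1 drbddisk::r0 Filesystem::/dev/drbd0::/export::ext3 IPaddr::192.168.1.210/24 nfslock nfs

    # /etc/exports (identical on both nodes)
    /export 192.168.1.0/24(rw,sync,no_root_squash)

For clients to ride through a failover cleanly, the NFS state directory (/var/lib/nfs) generally also needs to live on the DRBD-backed filesystem so lock/client state follows the active node.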