CentOS 5 :: Northbridge - Node 2<0>K8 ECC Error ?

Feb 2, 2011

Getting the following errors on a new server:

They appear randomly while the system is idle and in use. The system configuration is as follows:

Running the latest CentOS 5.5 plus kernel (because we are using XFS).

The errors appear randomly several times a day. I tried removing some memory, but no combination seems to stop the errors.

I see this error posted in many places all over the web; the suggested causes include faulty RAM, a faulty CPU, a faulty motherboard, and a kernel bug.

Has anyone seen this error and been able to nail it down?

CentOS 5 Networking :: Setup The Cluster To Automatically Failover The Service To Another Node Case One Node Fails?

Mar 1, 2011

I am familiar with Windows 2008 cluster servers, and I just started testing with CentOS clustering. I am creating a simple 2-node cluster for a simple ping test.

Node 1: 10.0.0.1
Node 2: 10.0.0.1
Virtual ip: 10.0.0.10

So far, I can ping the virtual IP and manually relocate it between the nodes, but I haven't figured out how to do this automatically. So this is my question: how can I set up the cluster so that it automatically fails over a service to another node in case one node fails?

General :: Make A DRBD Node Start Itself As A Primary Node Automatically?

Jan 28, 2010

I've set up DRBD on 2 machines, 1 of them is the master, another is the slave.

After each bootup, I need to run the following on the master machine:

Code:
drbdadm -- --overwrite-data-of-peer primary all

Do we need to specify which machine should be the primary node every time? Is there a way to make the machine "know" that it is the primary node?

Programming :: KSH Script Behaving Differently On An HACMP Cluster Node (prod) & A Single Node (UAT)?

Dec 16, 2010

I have created a simple menu-driven script for our Operations team to handle the basic monitoring and management of our production application from the back end. The script worked fine when tested in the UAT environment, but when deployed to production it behaved oddly. When the operator chooses an option from the menu, he is given the output and at the end is prompted to return to the main menu by pressing Ctrl+C. In production, this return does not occur for some reason and the program just sits there. The session becomes unresponsive after that and I'm forced to terminate it by closing PuTTY. I also tried enabling debug mode (set -x) but was still not able to find any useful hints as to why.

Server :: Node Failover And Another Node Take Over Resources On HA Cluster?

Oct 27, 2010

I don't have much experience with clustering, and I'm deploying a cluster system on CentOS. How long should it take for a node to fail and another node to take over its resources and continue running the service? What counts as good, fast, or slow? 1s, 10s or

Server :: LVM On 2 Node Cluster - Getting Error

Oct 26, 2010

I am having an issue with LVM on a 2 node cluster. We are using powerpath for external drives. We had a request to increase /apps filesystem which is EXT3.

On the first node we did:
pvcreate /dev/emcpowercx1 /dev/emcpowercw2
Then:
vgextend apps_vg /dev/emcpowercw2 /dev/emcpowercx1
lvresize -L +60G /dev/apps_vg/apps_lv
resize2fs /dev/apps_vg/apps_lv
Everything went well and /apps was enlarged. But when I run pvs on the second node, I get the following error:
WARNING: Duplicate VG name apps_vg: RnD1W1-peb1-JWay-MyMa-WJfb-41TE-cLwvzL (created here) takes precedence over ttOYXY-dY4h-l91r-bokz-1q5c-kn3k-MCvzUX
How can I proceed from here?

Red Hat / Fedora :: CentOS 5.4 GFS2 - 3 Node Cluster

Feb 19, 2010

I lack a general understanding of CentOS. I have looked for a remedy on other forums and Google, but haven't been able to find the answer. I have a 3-node cluster that was functioning great until I decided to go offline for a while. My config is as follows:
node 2: vh1
node 3: vh2
node 4: vh6
All nodes connect to a common shared area on an iscsi device (vguests_root)

Currently vh2 and vh6 connect great, however since putting the machines back online I can no longer connect with vh1. A dmesg command on vh1 reveals the following:
GFS2: fsid=: Trying to join cluster "lock_dlm", "Cluster1:vguest_roots"
GFS2: fsid=Cluster1:vguest_roots.2: Joined cluster. Now mounting FS...
GFS2: fsid=Cluster1:vguest_roots.2: can't mount journal #2
GFS2: fsid=Cluster1:vguest_roots.2: there are only 2 journals (0 - 1) .....
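
The last two dmesg lines suggest that this node was handed journal index 2 while the filesystem reports only journals 0 to 1. The following is a minimal Python sketch, assuming the log format is exactly as quoted above, that pulls both numbers out of the text and reports how many journals appear to be missing. The function name `journal_shortfall` is made up for illustration.

```python
import re

def journal_shortfall(dmesg_text):
    """Return how many journals are missing for the requested journal id.

    Assumes the two GFS2 message formats quoted in the post; returns None
    if either message is absent.
    """
    wanted = re.search(r"can't mount journal #(\d+)", dmesg_text)
    have = re.search(r"there are only (\d+) journals", dmesg_text)
    if not wanted or not have:
        return None
    requested_id = int(wanted.group(1))   # journal ids start at 0
    available = int(have.group(1))
    # Journal id N needs at least N + 1 journals on the filesystem.
    return max(0, requested_id + 1 - available)

sample = """GFS2: fsid=Cluster1:vguest_roots.2: can't mount journal #2
GFS2: fsid=Cluster1:vguest_roots.2: there are only 2 journals (0 - 1)"""
print(journal_shortfall(sample))
```

GFS2 needs one journal for each node that mounts the filesystem at the same time, so a three-node cluster with two journals would run short. The `gfs2_jadd` tool from gfs2-utils is the usual way to add journals; check its man page for the exact invocation on your release.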

CentOS 5 Server :: 2-node Non-shared-FS Cluster On 5.2/3?

Apr 15, 2009

I'm having some trouble configuring clustering in a 2-node cluster with no shared FS. The application is video streaming, so outbound traffic only. The cluster is generally OK: if I kill -9 one of the resource applications, the failover works as expected. But it does not fail over when I disconnect the power from the node that owns the service (simulating a hardware failure). clustat on the remaining node shows that the powered-down node has status "Offline", so it knows the node is not responding, but the remaining node does not become the owner, nor does it start up the cluster services/resource applications. eth0 on each node is connected via a crossover cable for heartbeat, etc. Each eth1 connects to a switch.

[root@lmshw01 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="lmshw-clust" config_version="35" name="lmshw-clust">

[code].....

CentOS 5 Networking :: Rsh From One Node To Other Without Providing The Password

Feb 6, 2010

I am trying to set up a cluster of 4 nodes, each with CentOS 5.3.

I want to rsh from one node to another without providing a password.

So far I have done the following:

1) Configured the /etc/hosts file on the nodes so that each node can be pinged from the others.

$cat /etc/hosts
10.0.2.100 node1
10.0.2.101 node2
10.0.2.102 node3

[Code]....

But it is still asking for the password!

I am not able to figure out what to do next. Do I need to be in /home/mygrid on node1 in order to rsh into node2?

CentOS 5 Server :: Cluster Node Does Not Want To Cooperate

Sep 30, 2010

I have a two node cluster, and a third system which has luci installed.

node1 is nfs0
node2 is nfs1

Both nodes have identical configurations: a fresh installation of CentOS 5.5 plus yum update. I am unable to join nfs1 to the cluster, as it is giving me the following issue:

Sep 29 23:28:00 nfs0 ccsd[6009]: Starting ccsd 2.0.115:
Sep 29 23:28:00 nfs0 ccsd[6009]: Built: Aug 11 2010 08:25:53
Sep 29 23:28:00 nfs0 ccsd[6009]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.

[code].....

General :: Error While Building A 3-node Cluster ?

Jan 6, 2010

I'm building a 3-node cluster. I have created ocfs filesystems and mounted them on the first two nodes. But while mounting them on the third node, I'm getting this error for 8 of the LUNs. All 8 of these LUNs are 1GB in size.

I unmounted these 8 LUNs from the other node and tried to mount them on the third node, and then it worked, but the error then occurred on the second node. My observation is that, for some reason, these particular LUNs do not allow a third node.

mount.ocfs2: Invalid argument while mounting /dev/mapper/voting1 on /voting1. Check 'dmesg' for more information on this error.

The dmesg output was:

I'm using RHEL 5.4

CentOS 5 :: Can't Find Local Node Name In Cluster.conf

Apr 21, 2009

I searched Google for: cman not started: Can't find local node name in cluster.conf /usr/sbin/cman_tool: aisexec daemon didn't start. I found this URL... I have found the config_version in cluster.conf. Unfortunately, as everyone may have noticed, English is not my native tongue, so I am having trouble understanding the part "Make sure you bumped the cluster config version number". Can anyone enlighten me on what I should be doing to "bump" the cluster config version?
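
"Bumping" the config version normally just means increasing the `config_version` attribute on the `<cluster>` element by one whenever cluster.conf is edited. Here is a minimal Python sketch of that step, assuming the file has the same shape as the cluster.conf excerpts elsewhere on this page. The function name and the sample values are made up, and distributing the updated file to the other nodes is not shown.

```python
import xml.etree.ElementTree as ET

def bump_config_version(xml_text):
    """Return cluster.conf text with config_version incremented by one."""
    root = ET.fromstring(xml_text)
    if root.tag != "cluster":
        raise ValueError("root element is not <cluster>")
    current = int(root.get("config_version"))
    root.set("config_version", str(current + 1))
    return ET.tostring(root, encoding="unicode")

# Hypothetical minimal document; a real cluster.conf has child elements.
sample = '<cluster alias="demo" config_version="35" name="demo"></cluster>'
bumped = bump_config_version(sample)
print(bumped)
```

Note that ElementTree drops the XML declaration and any comments when it re-serializes, so for a real file it may be simpler to change the number by hand in a text editor.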

Hardware :: Northbridge Need Drivers In Debian?

Jan 16, 2010

Does the Northbridge need drivers in Debian? In Windows I used to install Northbridge drivers because they were on the driver CD. Do I need to install a Northbridge driver in Debian, or is it limited to the kernel's support?

Ubuntu Servers :: Error When Adding Cloud Node?

Jan 4, 2010

First time setting up Ubuntu Cloud system.

I get an error during install when searching for and trying to add a node from the Cloud Controller: New node found on 192.168.1.182:add it? [Yn] y Connecting to 127.0.0:8774...failed: Connection refused. Error: you need to be on the CC host and the CC needs to be running.

CentOS 5 :: Implementing A Two-node Cluster With Shared Storage (GFS) And IP Address?

Jul 8, 2009

I am working on the beginning of implementing a two-node cluster with shared storage (GFS) and IP address. Both machines are virtual on VMware ESX 3.5; that should not make a difference, but that is the background. The current status is that I have a single-node cluster built with only the IP address configured within the cluster. The issue I am having is that I have configured a service to contain only the IP address resource; however, when I go into cluster management, that "service" does not register. As such, I cannot bring it online, ping it, etc. Below is my cluster.conf configuration:

<?xml version="1.0"?>
<cluster alias="tmbackup" config_version="10" name="tmbackup">
<quorumd device="/dev/sdb1" interval="1" min_score="3" tko="10" votes="3">

[code]....

Software :: HDD Recovery (Invalid B-Tree Node Size Error)

Mar 25, 2010

Basically, I am running diskwarrior on my mac now to try and fix invalid b-tree node size error. It is my boot drive and therefore cannot get onto the operating system. I have used live linux distros to try and get at the hdd that way, I have installed HFStools and testdisk and tried almost everything possible from within Linux itself to recover this HFS+ drive, but they were all a no go. My last resort was to try diskwarrior and it is rebuilding the drive now, well it has been for about 38 hours now but doesn't give any indication as to how long it will take.

I am pretty sure I am right in thinking data is never truly deleted until it is overwritten, is that right? Even if reformatted, would I be correct in thinking that if I were to format the drive and install linux on my machine, some of the data on the Drive would still be technically recoverable, even though the file system has changed? i.e. from HFS to EXT3. If this is technically possible what software would I need to try and recover files from a drive that has been formatted from HFS+ to EXT3?

It doesn't matter if I can't get it all, or if it turns out to be more hassle than it's worth; my girlfriend has another laptop to use for now and is in no hurry, and I like getting to play around with Linux. The data is somewhat important, but not enough to go through sending off the drive and paying hefty fees. It would be very nice to get it back, and I enjoy working in Linux and learning more about software and hardware.

Server :: Unable To Use Northbridge EDAC Amd64 On HDAMA Mobo?

Oct 1, 2010

I have a 'Rackable Systems' server with an HDAMA mobo - Dual CPU Opteron 250 2.4GHz with 2x memory modules per CPU. It seems to run fine, but it hangs every hour or so! I am running Ubuntu 64-bit 10.10, which is currently a beta release, so I haven't discounted that as the problem yet, but I suspect it is unlikely. However, I am downloading 10.04 as I type... dmesg spits out lots of awful messages like these:

[ 1314.920127] EDAC MC1: CE - no information available: amd64_edacError Overflow
[ 1315.920047] Northbridge Error, node 0, core: 0
[ 1315.920060] ECC/ChipKill ECC error.
[ 1315.920066] EDAC amd64 MC0: CE ERROR_ADDRESS= 0x1484410
[ 1315.920082] EDAC MC0: CE page 0x1484, offset 0x410, grain 0, syndrome 0x11c1, row 0, channel 0, label "": amd64_edac

(There are variations on the node, core, address, offset, syndrome, etc.) I have tried swapping the CPUs over and running with only one CPU. I have also swapped all the memory around in almost every permutation. Another worrying symptom is that when I run memtest86+ from a boot disk, it shows zero errors up until the point where the server turns itself off without warning; it hasn't yet completed the test...
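
Because the node, row and channel vary between messages, it can help to extract them and see whether the corrected errors cluster on one memory controller, row or channel. The sketch below parses one `EDAC MCn: CE page ...` line in the format quoted above and rebuilds the physical address from the page and offset. It assumes 4 KiB pages, and the function name `parse_edac_ce` is made up for illustration.

```python
import re

PAGE_SHIFT = 12  # assumes 4 KiB pages, typical for x86-64

def parse_edac_ce(line):
    """Extract numeric fields from an 'EDAC MCn: CE page ...' line."""
    m = re.search(
        r"EDAC MC(\d+): CE page (0x[0-9a-fA-F]+), offset (0x[0-9a-fA-F]+), "
        r"grain \d+, syndrome (0x[0-9a-fA-F]+), row (\d+), channel (\d+)",
        line,
    )
    if m is None:
        return None
    page = int(m.group(2), 16)
    offset = int(m.group(3), 16)
    return {
        "mc": int(m.group(1)),
        "address": (page << PAGE_SHIFT) | offset,
        "syndrome": int(m.group(4), 16),
        "row": int(m.group(5)),
        "channel": int(m.group(6)),
    }

line = ('[ 1315.920082] EDAC MC0: CE page 0x1484, offset 0x410, grain 0, '
        'syndrome 0x11c1, row 0, channel 0, label "": amd64_edac')
info = parse_edac_ce(line)
print(hex(info["address"]))
```

For the quoted line the rebuilt address matches the ERROR_ADDRESS value 0x1484410 reported just before it. Counting the (mc, row, channel) tuples over a full dmesg dump would show whether the errors follow a particular slot or move around with the modules.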

CentOS 5 :: Eucalyptus Node Controller And Cloud Controller Apps Won't Bind To IP

Jun 18, 2010

I cannot get the node or cloud controllers to start up using the init.d scripts. I have a fresh install of CentOS 5.4 with Eucalyptus 1.6.2. I installed Eucalyptus and all packages from the RPMs supplied by Eucalyptus, using the yum installer. I do not currently have any processes or applications listening on these ports on the boxes. I think it may be a permissions issue, because I get a "permission denied" error, but I am not sure whether it comes from Eucalyptus or CentOS. It looks as if it is not binding to the address on the NIC's interface, though it may be something else. I have the Node controller, Cloud controller, and Cluster controller on separate physical boxes. When I try to run either the cloud controller or the node controller I get this message:

Cloud Controller:

[root@cluster-cont ~]# /etc/init.d/eucalyptus-cc start
Starting Eucalyptus cluster controller: (13)Permission denied: make_sock: could not bind to address [::]:8774
(13)Permission denied: make_sock: could not bind to address 0.0.0.0:8774

[code]...
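
One way to narrow this down is to test whether a plain process can bind the same port at all, independently of Eucalyptus. The following is a minimal sketch, written for Python 3 (the stock Python on CentOS 5 is older and would need adjusting); the function name `try_bind` is made up. It returns `None` if the bind succeeds and the symbolic errno name otherwise.

```python
import errno
import socket

def try_bind(host, port):
    """Try to bind a TCP socket; return None on success or the errno name."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()

print(try_bind("0.0.0.0", 8774))
```

Port 8774 is above the privileged range (below 1024), so if this test binds successfully as root while the init script still reports errno 13, the restriction may come from a policy layer that applies to the daemon specifically rather than from ordinary port permissions. That is only a hint at where to look, not a diagnosis.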

CentOS 5 :: Figure Out The "Device" Name From The "Sysfs" Node?

Jun 12, 2011

How can I figure out the "Device" name from the "Sysfs" node? For example:
[root@baba 0000:00:01.0]# lspci | grep -i Ethernet
00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network Connection (rev 02)
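
For a network card, the interface name can usually be read from the PCI device's own sysfs directory. The sketch below checks two layouts: a `net/` subdirectory (newer kernels) and `net:<ifname>` entries (assumed here for 2.6.18-era kernels such as CentOS 5). The function name is made up, and the path combines the `00:19.0` address from the lspci output with an assumed PCI domain of `0000`.

```python
import os

def net_names_for_pci_device(device_dir):
    """List network interface names attached to a PCI device's sysfs dir."""
    names = []
    net_dir = os.path.join(device_dir, "net")
    if os.path.isdir(net_dir):
        # Newer layout: <device>/net/<ifname>/
        names.extend(sorted(os.listdir(net_dir)))
    for entry in sorted(os.listdir(device_dir)):
        # Older layout (assumed for 2.6.18-era kernels): <device>/net:<ifname>
        if entry.startswith("net:"):
            names.append(entry[len("net:"):])
    return names

path = "/sys/bus/pci/devices/0000:00:19.0"  # assumed domain 0000
if os.path.isdir(path):
    print(net_names_for_pci_device(path))
```

The same lookup can be done by hand by listing the device directory and looking for the `net` entry.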

Ubuntu :: Debug 10.04 UEC Node?

Jun 3, 2010

I installed a UEC node from 10.04 which does not boot (it stops at loading Apache), and the logs are empty.

Software :: Ssh With In The Node Is Not Working?

Feb 3, 2011

I have a RHEL cluster with two nodes. The cluster is working fine, but suddenly, since this morning, I am not able to ssh to one of the nodes; it fails with the following error:
debug2: we sent a hostbased packet, wait for reply
Connection closed by 10.125.104.162
After some time I was not able to ssh within the node either, and after some more time the second node also started behaving like this. Now ssh within the nodes and between the nodes does not work, but I am able to open a PuTTY session. Error from /var/log/messages:
vsftpd(pam_unix)[14260]: authentication failure; logname= uid=0 euid=0 tty= ruser= rhost=

Red Hat :: Piranha With 2 Node Cluster?

Oct 2, 2009

After a few days of hard work on Red Hat cluster and Piranha, I am hopefully 90% done, but I am stuck on the iptables rules. I am attaching a full set of Piranha server screenshots of my network. Please have a look and tell me what else needs to be done on the Piranha server:

a) On the firewall (Linksys), which IP should I forward port 80 to (192.168.1.66 or 192.168.1.50)?
b) Currently it looks like HTTP requests are not being forwarded from the virtual server to the real server. Which iptables rules should I write? (Please have a look at the iptables rules.)

Also, here is the link for my Piranha server setup. I guess I am stuck somewhere and need an expert's eye to catch the problem, so please look at all the pictures, the ifconfig output, and the iptables rules.

ifconfig:

ifconfig
eth0 Link encap:Ethernet HWaddr 00:0F:3D:CB:0A:8C
inet addr:192.168.1.66 Bcast:192.168.1.255 Mask:255.255.255.0

[code]...

Ubuntu Servers :: Can VM's Use More Than One Node In Cloud

Jun 7, 2011

Can a Virtual Machine, instantiated in the Ubuntu Cloud (UEC), use resources from more than one node?

Networking :: How To Remove /dev/drbd0 From A Node

Jan 21, 2010

I've configured drbd with Heartbeat. The nodes are connected at first, but when I issue "/usr/lib/heartbeat/hb_standby" on node 1, node 2 won't take over, and after a while, both nodes become "WFConnection".

I figure this happens because I created "/dev/drbd0" on both nodes! Do you know how to remove it? I googled it but got nothing.

Server :: Oracle CRS Rebooting Node?

Nov 2, 2010

I am running a 5 node 11g RAC cluster with cluster ready services. The hardware is HP BL460g6 with virtual connect modules on each chassis. The OS is RHEL5.5, 2 Qlogic 4G fibre cards, 2 Broadcom NICs bonded, and the backend disk is all Hitachi 15k. The issue that I'm seeing is periodically each day there are spikes in I/O waits for disk writes, and sometimes that will cause a node or 2 to reboot when CRS can't communicate with the other nodes. Both nodes that have been evicted, are on the same chassis. What I've checked is, the backend storage is not seeing ANY high utilization, in fact it's running at less than 10% all the time. The network is 10G, and is not showing errors on the switch, or on the blades themselves. The redo logs have a 10M buffer cache, and are running on ASM disk. What other information would be beneficial to check? I am seeing nothing in the logs as to any errors, or waits for disk writes. I believe it's a software issue, but am lost as to how to prove it. These aren't the only nodes experiencing periodic high I/O waits, and it happens at the exact same time on all systems, whether they run ASM or are part of a cluster.

Server :: Uname -n Node Name Change?

Feb 7, 2010

I want to change my server's node name, which is the output of "uname -n". The server is CentOS 5. I searched but couldn't find how. Some search results mentioned /etc/nodename, but I don't have a file at that path. Others suggested uname -S, which doesn't work.
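
On Linux the value printed by `uname -n` is the kernel's node name, which is the same string the system hostname calls return. A small Python 3 sketch that shows the two side by side:

```python
import os
import socket

# os.uname().nodename is the value that `uname -n` prints.
node_name = os.uname().nodename
print(node_name)

# socket.gethostname() reads the same kernel hostname.
print(socket.gethostname())
```

Changing the hostname therefore changes the `uname -n` output. On CentOS 5 this is normally done with the `hostname` command for the running system and the `HOSTNAME=` line in /etc/sysconfig/network for the value used at boot.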

Software :: 2nd Node Tried To Take Over Resources But Failed

Sep 9, 2010

On an HA cluster, the 2nd node decided that the 1st node was down, although the first node wasn't actually down. As a result, the 2nd node tried to take over the resources but failed, because the resource was still in use by the first node. This left the first node in a fuzzy state. I had no other choice but to kill the heartbeat service and reboot the server to solve the issue. There were no network issues, and all hardware is OK. Are there any known bugs? Is there a way to prevent this from happening again?

Software :: Change The Algorithm Of A Node In Ns2?

Jan 3, 2011

There are 3 nodes: A, B and C. Node A wants to send information to Node B, but it does so by sending it to Node C first, which then sends it to B. Node B sends to A in the same way. In this scheme C doesn't send to both A and B simultaneously. This is the built-in algorithm, but I want to change it so that A and B send their packets to C, and C sends both of these packets to A and B by ORing them. On the receiver side, node A can extract the packet it wants, and so can B. Where do I change the algorithm?
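
The scheme described resembles relay network coding, where the relay combines the two packets and each end node removes its own contribution. The sketch below assumes that "ORing" means XOR, since a plain bitwise OR cannot be undone by the receivers. It only illustrates the coding step in Python and says nothing about where ns2 would need to be modified; the packet contents are made up.

```python
def xor_bytes(x, y):
    """XOR two equal-length byte strings."""
    if len(x) != len(y):
        raise ValueError("packets must have equal length")
    return bytes(a ^ b for a, b in zip(x, y))

packet_from_a = b"hello B!"
packet_from_b = b"hi, A :)"

# Relay C combines both packets and broadcasts one coded packet.
coded = xor_bytes(packet_from_a, packet_from_b)

# Each end node removes its own packet to recover the other's.
received_at_a = xor_bytes(coded, packet_from_a)
received_at_b = xor_bytes(coded, packet_from_b)
print(received_at_a, received_at_b)
```

With this coding C needs a single broadcast instead of two separate transmissions, which is the saving the post appears to be after.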

CentOS 5 Hardware :: Error Installing CentOS 5.2 On ICH7R With 4 SATA HDDs RAID 5?

Feb 15, 2009

The installer can't see my RAID controller (I assume), as I'm getting the following error: "Error opening /dev/mapper/isw_jbhgjgjj_Vol0: No such device or address". It just sees them as 4 individual drives: sda, sdb, sdc and sdd. Please note that I have set up the RAID 5 in the controller BIOS interface and the image name is Vol0, which it seems to try to load but for some reason can't. I have also tried different BIOS settings and nothing worked.

CentOS 5 :: Booting CentOS V5.2 Fails With GRUB Error 13: Invalid Executable Format?

May 5, 2009

I'm trying to install a dual-booting machine with OpenSUSE v11.1 32bit and CentOS v5.2 64bit. I installed OpenSUSE first and allowed it to install and configure grub in the MBR, and after that I wanted to proceed with CentOS v5.2. The installation went fine with two notable exceptions:

- when I had to configure the grub installation parameters, CentOS offered me only 2 options: either install it on the MBR of the first hard disk or not install it at all. Other distributions are more flexible, allowing you to install it in the boot sector of the root partition, for example. Because I didn't want to ruin the existing grub configuration, I reluctantly accepted not installing it for CentOS, assuming that I could manually configure the entry later in grub's menu.lst file.

- when I was presented with the options for software components installation, I've clicked on virtualization category/function because I intend to use the machine as a VMware host. There was no guidance on screen at that point and I blindly assumed that by choosing the virtualization function I would get necessary tools and drivers that will help me further on. It seems that this was a wrong move as you can see it below.

After completing the installation, I tried to search for a template or guidance on how the menu entry in menu.lst should look, but the grub directory was empty, not surprisingly, because I had told CentOS earlier not to install it. Using the files in the /boot directory from the CentOS installation I tried to improvise a menu entry, but it's not working. The boot stops with the famous Error 13: Invalid or unsupported executable format. Using the file command to check what kind of files I'm trying to load as kernels, I'm getting:

marte:~ # file /mnt/vmlinuz-2.6.18-92.el5xen
/mnt/vmlinuz-2.6.18-92.el5xen: gzip compressed data, from Unix, last modified: Tue Jun 10 19:20:51 2008, max compression

[code]....
