Debian Configuration :: Fault Tolerance / High Availability
Aug 19, 2015
I work for a data center where we use a whole lot of VMware's "Fault Tolerant" Solution. As a personal project, I want to piece together an alternative solution out of open source software.
Here's my punchlist:
* Configure Debian (or build from scratch) to be as slimmed down as possible so that I may call it "Just Enough OS" or "baremetal".
* Install and configure Xen with XenMotion.
* Prove that both the physical and virtual fail-overs work as intended.
* Prove the automatic repair of failed virtual machines.
Just for the sake of learning, I was going through this document on Bonding of Two Ethernet Cards for Fault-Tolerance (active-backup). There, /etc/modprobe.conf is as follows:
Code:
alias bond0 bonding
options bond0 mode=1 arp_ip_target=192.168.1.1 arp_interval=200 primary=eth0
Now, I just wanted to know:
- What the arp_ip_target parameter is and why it is used.
- Why the IP is 192.168.1.1, whereas the IP in /etc/sysconfig/network-scripts/ifcfg-bond0 is 192.168.1.5.
- Can I use the /etc/modprobe.conf below?
Code:
alias bond0 bonding
options bond0 miimon=100 mode=1 primary=eth0
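For what it is worth, arp_ip_target is the address the bonding driver's ARP monitor probes every arp_interval milliseconds to decide whether a slave link is still usable. It is normally a neighbour such as the default gateway, which is presumably why it is 192.168.1.1 while bond0 itself holds 192.168.1.5. miimon=100 instead polls the NIC's link state every 100 ms. Whichever monitor is used, the outcome can be read from the driver's status file. A minimal sketch, assuming the bond is named bond0 and the file has the usual "Currently Active Slave:" line (the function name is made up for illustration):

```shell
# Print the slave currently carrying traffic for a bond.
# $1: path to the bonding status file (defaults to bond0's; an assumption).
active_slave() {
    awk -F': ' '/^Currently Active Slave:/ { print $2 }' "${1:-/proc/net/bonding/bond0}"
}

# Usage (on the bonded host):
#   active_slave        # e.g. eth0
#   ...unplug eth0's cable...
#   active_slave        # should now name the backup slave
```

Running it before and after pulling a cable is a quick way to see whether the chosen monitor (ARP or MII) actually triggers the switch to the backup interface.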
I have 2 web servers running Apache, hosted at 2 data centres on 2 different IP ranges. The 2 servers are exact clones of each other, hosting www.example.com. What I am trying to achieve is automatic failover. Say my first data centre gets wiped out: how would customers reach my website on my second server in the second data centre by still typing www.example.com? The aim is for the customer not to notice any difference.
Now that I have set up a proxy server, as a next step I want to run it in fail-over high availability mode, so that if one proxy is down for any reason, the second proxy automatically comes up and starts serving requests.
Is there a way to create a high availability environment between two CentOS machines? I don't mean just the HTTP service or just one other thing. I need the entire server synced in real time, ready to take over if the other goes down.
I'm trying to configure an iSCSI/DRBD high-availability cluster and I'd like to know which is the better option between OpenAIS and Heartbeat. I've seen they are both included in the CentOS repos, yet OpenAIS requires adding 2 additional repos to install Pacemaker (EPEL and the ClusterLabs repo).
I am trying to set up a High-Availability HTTP Load Balancer with HAProxy & Heartbeat using the links below.
I have all RHEL 5.4 servers hosted on VMware.
[url] [url]
This is the scenario, as given in the links as well as in my setup.
Load Balancer 1
Load Balancer 2
Web Server 1
Web Server 2
I have followed all the steps mentioned in the links religiously except step 2.2 here, which asks me to configure the vhosts. I could not really understand what is to be placed in the /etc/httpd/conf.d/vhosts.conf file, and on which Web Server.
Because of this step alone, I think I am failing the Failover test given in Point 4.1 here. I am able to open the webpage via [url], which gives the content of Web Server 1 (http1.example.com). But when I shut down the http service (to check failover), it does not show the contents of Web Server 2 (http2.example.com).
However, I do succeed in Failover Test 4.2, in which the shared IP 192.168.0.120 switches over when I start/stop either of the Load Balancers.
I am kinda stuck on providing a solution for the above problem. I have achieved failover using keepalived, but I am not sure how we can replicate the data from one server to the other seamlessly and keep them in sync with each other. My prime requirement for this project is that the end user should not notice the failover, and that a replicated copy of the data should be available on the secondary as well.
I have a relatively fresh install of jessie on which I see high CPU usage by java (top shows about 165% CPU and 12% MEM). The problem occurs right after booting the computer. These values stay constantly high for days if I leave the box running. This happens even if the computer is just sitting there without doing anything.
I have to kill java to go back to normal. So, when I do
Code:
killall -KILL java
the problem goes away. After that it doesn't reappear and I can use all apps installed without a problem.
Currently I am based on openjdk
Code:
update-alternatives --display java
java - auto mode
  link currently points to /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java - priority 1071
  slave java.1.gz: /usr/lib/jvm/java-7-openjdk-amd64/jre/man/man1/java.1.gz
Current 'best' version is '/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java'.
But I have also tried the SUN version with the same result.
Where should I look to find out exactly which java app is using so many resources, and how can I solve it? I guess I could just put a kill java command somewhere in rc.d and forget about it, but I would really like to find out what's going on...
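One way to narrow this down is to find the busy thread rather than just the process: top -H -p <pid> lists per-thread CPU usage with decimal thread IDs, while a jstack thread dump labels each thread with the same ID in hex, as nid=0x.... A minimal sketch of the conversion follows. The helper name is invented for illustration, and jstack ships with the JDK rather than the JRE, so it may need to be installed first.

```shell
# Convert a decimal thread ID (as shown by top -H) to the hex form
# that jstack prints after "nid=".
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Usage:
#   ps -C java -o pid,pcpu,args     # which java process it is, with its full command line
#   top -H -p <pid>                 # note the ID of the hottest thread
#   jstack <pid> | grep -A 10 "nid=$(tid_to_nid <tid>)"
```

The ps line alone often answers the first question, since the arguments usually reveal which application started the JVM at boot.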
My sid had been running fine for months until recently. It now freezes more and more frequently. Briefly looking at the processes, I found kswapd0 constantly consuming 50% CPU right after booting, even when swap usage is only 3%.
I added my linux server to a windows AD using winbind / samba. Everything worked just fine. After changing the OS to Debian lenny x64 I get a "segmentation fault" when trying to change user passwords. I am using the exact same configuration, on my 32 bit Server everything works.
debian:~# passwd <user>
Segmentation fault
tail /var/log/syslog:
kernel: [689689.005934] passwd[11209]: segfault at 0 ip b7b84418 sp bfc37fc0 error 4 in pam_winbind.so[b7b7e000+b000]
Debian Lenny 5.0
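The kernel line already carries enough to locate the crash inside the library: "segfault at 0" with error 4 is a user-mode read of address 0 (a NULL pointer dereference), and the faulting offset inside pam_winbind.so is the ip value minus the mapping base given in the brackets. A small sketch of that arithmetic; the function name is made up here, and the library path in the usage comment is an assumption that varies between systems:

```shell
# Offset of the faulting instruction inside the mapped object.
# $1: ip value from the segfault line (hex, no 0x prefix)
# $2: mapping base from the [base+size] part (hex, no 0x prefix)
segv_offset() {
    printf '%x\n' $(( 0x$1 - 0x$2 ))
}

# Usage with the values from the syslog line above:
#   segv_offset b7b84418 b7b7e000      # prints 6418
#   addr2line -f -e /lib/security/pam_winbind.so 0x6418   # path assumed; needs debug symbols
```

An offset that resolves to a function name is also a useful detail to include in a bug report against the winbind package.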
I am noticing some rather high CPU temps while using Debian 7.8 KDE on a Lenovo X200 laptop.
Even when it's not doing anything, it sometimes gets as high as 70C. I opened it up and went in with a can of compressed air; the temperature went down to 36C for a while, but a day later it is back to 70C.
Then I realised the fan was hardly spinning. Is there any way I can control how much (or how little) the fan spins at certain temps?
My guess is that it could be done with a bash / shell script run at boot, but I am unsure.
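On ThinkPads such as the X200 this is usually done from user space: the thinkpad_acpi module exposes /proc/acpi/ibm/fan, and if the module is loaded with the fan_control=1 option, root can write "level 0" through "level 7" or "level auto" to that file. A rough sketch of a temperature-to-level mapping follows. The thresholds, the function name and the choice of thermal_zone0 are made-up assumptions, not recommended values, and the packaged thinkfan daemon does the same job more carefully.

```shell
# Map a CPU temperature in degrees C to a thinkpad_acpi fan level.
# Thresholds are illustrative assumptions only.
fan_level_for_temp() {
    if   [ "$1" -ge 65 ]; then echo 7
    elif [ "$1" -ge 55 ]; then echo 4
    elif [ "$1" -ge 45 ]; then echo 2
    else                       echo auto
    fi
}

# Usage (as root, with "options thinkpad_acpi fan_control=1" in /etc/modprobe.d/):
#   t=$(( $(cat /sys/class/thermal/thermal_zone0/temp) / 1000 ))   # millidegrees -> degrees
#   echo "level $(fan_level_for_temp "$t")" > /proc/acpi/ibm/fan
```

Falling back to "auto" below the lowest threshold hands control back to the firmware, which is the safer default if the script ever stops running.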
VLC was behaving weirdly recently, and when I tried to run it with the primusrun command (since I have an Optimus card) it gave me a segmentation fault:
Code:
VLC media player 2.2.1 Terry Pratchett (Weatherwax) (revision 2.2.1-0-ga425c42)
Segmentation fault
I've read on Google that the issue has been solved by a few people by updating the microcode, but I don't even understand what microcode is. I'm also not sure whether I should install the amd64 or the intel package for it.
Here is my lscpu
Code:
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
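On the package question: despite the name, amd64-microcode is for AMD processors and intel-microcode is for Intel ones; "amd64" there does not refer to the 64-bit architecture. The vendor appears on the "Vendor ID" line of lscpu, which is cut off in the paste above. A small sketch that picks the package from that line (the function name is invented here, and both packages live in Debian's non-free section on this release):

```shell
# Pick the Debian microcode package for a CPU vendor string.
microcode_pkg() {
    case "$1" in
        GenuineIntel) echo intel-microcode ;;
        AuthenticAMD) echo amd64-microcode ;;
        *)            echo unknown; return 1 ;;
    esac
}

# Usage:
#   vendor=$(lscpu | awk -F': *' '/^Vendor ID/ { print $2 }')
#   microcode_pkg "$vendor"
```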
My laptop's been locking up in Linux (Ubuntu, Backtrack, Puppy) periodically for a while now. When it locked up, it was always immune to the magic of SysRq, which I thought might indicate a hardware problem. It became so bad that I had to stop using the laptop.
Today, when I turned it on and tried to boot into Fedora 12, I got the following error (just once, it just locked up at various points during the splash screen after this once):
double fault: 0000 [#1] SMP last sysfs file: CPU 0 odules linked in: Pid: 1, co m: swapper Not ta nted 2.6.32.11-99.fc 2.x86_64 #VGN-T 250N RIP: 0010:[<ff
All the seemingly missing letters were really missing, not my typos.
As you can see, kernel version is 2.6.32.11-99.fc12.x86_64 and my laptop is a Sony Vaio TZ 250N (Core 2 Duo ULV 1.2GHZ). Note that with the other remaining kernels from the updates, nothing ever happened other than the locking up. The core temperatures hover pretty high, about 55-60C peak but this is still below the critical temp. Memtest came up clean when the problem first started happening.
I have a Debian system with the version you can see below. My problem is that one single core of 4 is running at 100% all the time and I can't seem to find out why. The load is also high (load average: 0.91, 0.75, 0.40) because of this. This keeps happening even after reboots. The system has been freshly installed twice and the same problem occurs as soon as it boots. Something called kworker is running and causing the load / CPU load. Is that normal for a fresh install?
root@Cyberdyne:/# lsb_release -da
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 8.2 (jessie)
Release: 8.2
Codename: jessie
I'm running Debian 8 with no problems. After I recently upgraded some packages, Skype did not open, so I used a terminal to open it and it gave this output:
Code:
Segmentation fault
Then I opened it as superuser and it gave this output:
Code:
(process:1500): GConf-WARNING **: Client failed to connect to the D-BUS daemon:
Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
I noticed weird behavior with some of the installed applications. It all started with Guake when I wanted to use it. My shortcut key is F4; I hit F4 and nothing showed up. So I ran Guake in a terminal, and everything was fine. Then I hit the F4 key and after a few seconds Guake crashed with the message "Segmentation fault". Nothing more, just that. Then I noticed that Audacity does the same, Gdebi too, and even the Reportbug app does that (I'm not sure about "segmentation fault", but the app crashes). I don't remember doing anything special or installing any app. All I can think of is this message when booting the system
Code:
systemd[1]: Job rpcbind.service/start deleted to break ordering cycle starting with basic.target/start
No idea if there is any connection. Is there any way I can see more behind that "Segmentation fault" error? I found history.log in /var/log/apt/, so I'm thinking about looking into it to see whether I really did not install anything. I have a feeling that it might have started after some update, but I'm not sure about that.
Synaptic does not run. When I start it from Menu/system/administration/synaptic package manager, it asks for the root password, then crashes. Nor does aptitude from the root terminal.
localhost:/home/user# aptitude
Running programs in X (WindowMaker) as another user.
Until recently (Debian Stretch) I was able to run graphical programs in X as a second user, like so:
- log into X as user1
- run "xhost +"
- run "su - user2"
- run any graphical program (gthumb, konqueror, ...) as user2
Now when I try to run a program as user2, I get
Code:
user2@localhost:~$ gthumb
error: XDG_RUNTIME_DIR not set in the environment.
Segmentation fault
XDG_RUNTIME_DIR for user1 points to /run/user/1000 which I believe gets created by pam_systemd when logging into X. However, doing "su - user2" does not create /run/user/1001. Why not? And how do I get on-the-fly user switching working again?
Related to this, /run/user/1001 not being created on user switch has always caused me some grief b/c audio would not work for user2. But I could work around that by piping pulse through a socket. But now graphics are also broken and that I can't seem to work around.
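A likely explanation is that pam_systemd does not register a new login session when su is run from inside an existing one, so nothing creates /run/user/<uid> for user2. Two workarounds that may be worth trying are "loginctl enable-linger user2", which keeps a user manager and its runtime directory around for that account, or opening the shell with "machinectl shell user2@" instead of su. The sketch below only exports XDG_RUNTIME_DIR when the directory really exists; both function names are invented for illustration.

```shell
# Print the expected runtime directory for a user (default: current user).
runtime_dir_for() {
    echo "/run/user/$(id -u "${1:-$(id -un)}")"
}

# Export XDG_RUNTIME_DIR only if the directory has actually been created.
use_runtime_dir() {
    dir=$(runtime_dir_for "${1:-}")
    if [ -d "$dir" ]; then
        export XDG_RUNTIME_DIR="$dir"
    else
        echo "missing: $dir" >&2
        return 1
    fi
}

# Usage, as user2 after "su - user2":
#   use_runtime_dir && gthumb
```

Pointing the variable at a directory that logind never created tends to trade one error for another, which is why the helper refuses to export it in that case.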
I have discovered that epiphany cannot be launched. When I try, either from the gnome desktop or from a terminal, I get a box which reports that epiphany has terminated unexpectedly, and would I like to recover the last session. If I click on either from the desktop, nothing happens. Doing it from a terminal, I get the following:
Skype was running okay when I was using it on Lenny and was still running okay on squeeze (after I upgraded) until yesterday, when it stopped starting after login. It simply crashes. I ran it in a terminal to capture the error and the result is:
I'm trying to use GPA (GNU Privacy Assistant), installed from the Debian repository. When I start the program, an error message appears saying that the GPGME library returned an unexpected error, and that the error is a general error in Assuan. The program itself then has a lot of problems while running. "The GPGME library returned an unexpected error. The error was: General Assuan error. This is probably a bug in GPA. GPA will now try to recover from this error."
I would like to add a new user with useradd (on Debian 4.0), but I get the message Segmentation fault. I ran strace, which says: access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory). I have read that libc6 may be missing or damaged, so I installed it again (apt-get install libc6, install was successful), but the problem is still there. I touched it (touch /etc/ld.so.nohwcap)
My server normally has a low load of 0.80; now it is 4.0. 'top' shows that programs are not using much CPU, but %wa is very high (80-90%). It has been like this for hours now. How can I find out which program has caused this?
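High %wa means the CPUs are idle while waiting on I/O, so the culprits are usually processes stuck in uninterruptible sleep (state D) rather than the ones at the top of the CPU column. A minimal sketch that filters those out of ps output; the function name is made up, and iotop -o (if installed) shows actual per-process throughput:

```shell
# Read "STATE PID COMMAND" lines on stdin and print PID and command for
# the ones whose state starts with D (uninterruptible sleep, typically
# blocked on I/O). Only the first word of the command is printed.
blocked_on_io() {
    awk '$1 ~ /^D/ { print $2, $3 }'
}

# Usage: sample a few times, since D state is often brief.
#   for i in 1 2 3 4 5; do
#       ps -e -o state= -o pid= -o comm= | blocked_on_io
#       sleep 1
#   done
```

A process that shows up in most samples is the first candidate to investigate.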
A few days ago I bought a Raspberry Pi B with Debian Wheezy (7.0, I think) on it. Before installing a media centre on it I wanted to do some basic configuration/upgrade and decided to upgrade to Debian Jessie. I followed the instructions provided on [URL] .....
Before moving to Jessie I upgraded the original Wheezy; after the upgrade the version was 7.8. Everything went well until I executed "apt-get dist-upgrade". Errors were generated. As suggested, I tried "apt-get -f install", but it did not go smoothly either. However, so far I can access the desktop and everything seems fine (although I have not done anything fancy yet). The version recorded is 8.0. So, should I worry about the error messages generated?
Please find the log file here: [URL] ......
Note that I put the log file on Google Drive because each time I clicked on "add file" in the "Upload Attachment" tab when editing this message I got:
Internal Server Error: The server encountered an internal error or misconfiguration and was unable to complete your request. Please contact the server administrator, forum-admin@forums.debian.net and inform them of the time the error occurred, and anything you might have done that may have caused the error. More information about this error may be available in the server error log.
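Whether the dist-upgrade errors matter depends largely on whether they left packages half-installed. dpkg -l marks fully installed packages with "ii" in the first column, so any other status among the package lines deserves a look; dpkg --audit and another run of apt-get -f install cover similar ground. A small sketch, with an invented function name, that filters dpkg -l output:

```shell
# Read `dpkg -l` output on stdin and print status and name of packages
# that are neither "ii" (installed) nor "rc" (removed, config files remain).
# Header lines are skipped because their first field is not a status code.
not_cleanly_installed() {
    awk '$1 ~ /^[a-zA-Z][a-zA-Z][A-Z]?$/ && $1 != "ii" && $1 != "rc" { print $1, $2 }'
}

# Usage:
#   dpkg -l | not_cleanly_installed
```

Empty output suggests the package database is consistent, in which case the earlier errors were probably resolved by the -f install run.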
I recently decided to build up a Linux computer with the following configuration:
- Motherboard: MSI KT4 Ultra
- CPU: Athlon XP 2000+
- RAM: Kingston ValueRAM (2 DIMMs, I put them in all possible arrangements)
I doubt other component details are relevant, but if you think they are, feel free to ask.
I installed Debian 5.05 on that machine ; it worked all fine.
However, my mobo did not support wake on LAN, which I needed given that my goal was to build some sort of server, so I searched for a new one that did and installed an ASRock K7S41GX.
Since doing so, I have not been able to boot Linux. After some fixing on my part, I have come to the point where I get the very same error time and time again every time I try to boot. Here's the error message I get and how I get to it.
GRUB loads like a charm, then Linux starts booting and stuff happens :
Checks filesystem, initrd and stuff (I actually don't understand that part, but it seems to work OK).
Where "########" stands for a series of eight hex chars (I assume these might be memory addresses?).
A few more details:
- The ########'s are NOT the same on each attempt to boot.
- When I try to reinstall Linux, I also get various error messages (mostly "PANIC: attempted to kill init" or "PANIC: attempted to kill the idle task").
- Rescue mode fails the same way re-install does.
- For some reason I just can't explain I ONCE happened to manage to start the installer process, which unfortunately ended up hanging on "Detecting Network hardware". (I attempted to disable the onboard network device to no avail).