#FORMAT IRC
The old model of high availability is "fault tolerance", usually hardware-based. Expensive, proprietary.
The goal of this old model is to keep the hardware system running.
so basically, a single computer is an unreliable piece of shit (relatively speaking) ...
... and High Availability is the collection of methods to make the job the computer does more reliable
you can do that with better hardware structures or with better software structures
usually a combination of both
the Linux model of high availability is software-based.
Now let me explain some basic concepts of HA
First, it's very important that we don't rely on unique hardware components in a High Availability system
for example, you can have two network cards connected to a network
In case one of the cards fails, the system tries to use the other card.
A hardware component that must not fail, because the whole system depends on it, is called a "Single Point of Failure"
SPOF, to make it short. :)
Another important concept which must be known before we continue is "failover"
Failover is the process by which one machine takes over the job of another node
"machine" in this context can be anything, btw ...
if a disk fails, another disk will take over
if a machine from a cluster fails, the other machines take over the task
but to have failover, you need to have good software support
because most of the time you will be using standard computer components
well, this is all the "theory" needed to explain the next parts.
so let me make a quick condensation of this introduction
1. normal computers are not reliable enough for some people (like: internet shop), so we need a trick .. umm method ... to make the system more reliable
2. high availability is the collection of these methods
3. you can do high availability by using special hardware (very expensive) or by using a combination of normal hardware and software
4. if one point in the system breaks and it makes the whole system break, that point is a single point of failure .. SPOF
5. for high availability, you should have no SPOFs ... if one part of the system breaks, another part of the system should take over (this is called "failover")
now I think we should explain a bit about how high availability works .. the technical side
umm wait ... sorry marcelo ;)
ok
Let's talk about the basic components of HA
Or at least some of them.
A single disk holding a filesystem is clearly a SPOF
If the disk fails, every part of the system which depends on the data contained on it will stop.
To keep a disk from being a SPOF of a system, RAID can be used.
RAID-1, which is a feature of the Linux kernel, allows "mirroring" of all data on the RAID device to a given number of disks...
So, when data is written to the RAID device, it's replicated to all disks which are part of the RAID1 array.
This way, if one disk fails, the other disk (or disks) in the RAID1 array will be able to continue working
because the system has a copy of the data on each disk
and can just use the other copies of the data
this is another example of "failover" ... when one component fails, another component is used to fulfill its function
and the system administrator can replace (or reformat/reboot/...) the faulty component
this looks really simple when you don't look at it too closely
but there is one big problem ... when do you need to do failover?
in some situations, you would have _2_ machines working at the same time and corrupting all data ... when you are not careful
think for example of 2 machines which are fileservers for the same data
at any time, one of the machines is working and the other is on standby
when the main machine fails, the standby machine takes over
... BUT ...
what if the standby machine only _thinks_ the main machine is dead and both machines do something with the data?
which copy of the data is right, which copy of the data is wrong?
or worse ... what if _both_ copies of the data are wrong?
for this, there is a special kind of program, called a "heartbeating" program, which checks which parts of the system are alive
for Linux, one of these programs is called "heartbeat" ... marcelo and lclaudio have helped write this program
marcelo: could you tell us some of the things "heartbeat" does?
sure
"heartbeat" is a piece of software which monitors the availability of nodes
it "pings" the node which it wants to monitor, and, in case this node doesn't answer the "pings", it considers it to be dead.
when a node is considered to be dead, we can fail over the services which it was running
the services which we take over are configured beforehand on both systems.
Currently heartbeat works only with 2 nodes.
It's been used in production environments in a lot of situations...
there is one small problem, however
what if the cleaning lady takes away the network cable between the cluster nodes by accident?
and both nodes *think* they are the only one alive?
... and both nodes start messing with the data...
unfortunately there is no way you can prevent this 100%
but you can increase the reliability by simply having multiple means of communication
say, 2 network cables and a serial cable
and this is reliable enough that the failure of 1 component still allows good communication between the nodes
so they can reliably tell if the other node is alive or not
this was the introduction to HA
now we will give some examples of HA software on Linux
and show you how they are used ...
... ... ;)
Ok
Now let's talk about the available software for Linux
.. ok, the translators have caught up .. we can continue again ;)
Note that I'll be talking about the open source software for Linux
As I said above, the "heartbeat" program provides monitoring and basic failover of services
for two nodes only
As a practical example...
The web server at Conectiva (www.conectiva.com.br) has a standby node running heartbeat
In case our primary web server fails, the standby node will detect it and start the apache daemon,
making the service available again
any service can be used, in theory, with heartbeat.
so if one machine breaks, everybody can still go to our website ;)
It only depends on the init scripts to start the service
So any service which has an init script can be used with heartbeat
arjan asked if it takes over the IP address
There is a virtual IP address used by the service, which is the "virtual server" IP address.
So, in our webserver case...
the apache daemon does not use the real IP address of the first node
but the virtual IP address, which the standby node will take over in case failover happens
Heartbeat, however, is limited to two nodes.
This is a big problem for a lot of big systems.
SGI has ported its FailSafe HA system to Linux recently (http://oss.sgi.com/projects/failsafe)
FailSafe is a complete cluster manager which supports up to 16 nodes.
Right now it's not ready for production environments
But that's being worked on by the Linux HA project people :)
SGI's FailSafe is GPL.
another type of clustering is LVS ... the Linux Virtual Server project
LVS uses a very different approach to clustering
you have 1 (maybe 2) machines that receive http (www) requests
but those machines don't do anything, except send the requests to a whole bunch of machines that do the real work
so called "working nodes"
if one (or even more) of the working nodes fail, the others will do the work
and all the routers (the machines sitting at the front) do is:
1. keep track of which working nodes are available
2. give the http requests to the working nodes
the kernel needs a special TCP/IP patch and a set of usermode utilities for this to work
RedHat's "piranha" tool is a configuration tool for LVS that people can use to set up LVS clusters more easily
in Conectiva, we are also working on a very nice HA project
the project marcelo and Olive are working on is called "drbd"
the distributed replicated block device
this is almost the same as RAID1, only over the network
to go back to RAID1 (mirroring) ... RAID1 is using 2 (or more) disks to store your data
with one copy of the data on every disk
drbd extends this idea to use disks on different machines on the network
so if one disk (on one machine) fails, the other machines still have the data
and if one complete machine fails, the data is on another machine ... and the system as a whole continues to run
if you use this together with ext3 or reiserfs, the machine that is still running can very quickly take over the filesystem that it has copied to its own disk
and your programs can continue to run
(with ext2, you would have to do an fsck first, which can take a long time)
this can be used for fileservers, databases, webservers, ...
everything where you need the very latest data to work
...
this is the end of our part of the lecture, if you have any questions, you can ask them and we will try to give you a good answer ;)

See also http://www.linux-ha.org/
----
CategoryDocs
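
As a small appendix to the lecture: the heartbeating idea described above (a node counts as dead only when it has gone silent, and several independent communication links make that verdict more trustworthy) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the real "heartbeat" program is implemented. The class name `PeerMonitor`, the link names and the 10-second timeout are all made up for the example, and time is passed in by the caller so the logic stays easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class PeerMonitor:
    """Tracks heartbeats from one peer node over several independent links.

    Hypothetical sketch: names and timeout values are assumptions,
    not taken from the real "heartbeat" program.
    """

    links: tuple              # e.g. ("eth0", "eth1", "serial")
    dead_after: float         # seconds of silence before a link counts as down
    started_at: float = 0.0   # start-up time counts as the first contact
    last_seen: dict = field(default_factory=dict)

    def __post_init__(self):
        # Until a real heartbeat arrives, use the start-up time for every link.
        for link in self.links:
            self.last_seen.setdefault(link, self.started_at)

    def heartbeat(self, link, now):
        """Record that a heartbeat arrived on `link` at time `now`."""
        if link not in self.last_seen:
            raise KeyError(f"unknown link: {link}")
        self.last_seen[link] = now

    def live_links(self, now):
        """Links on which the peer was heard within the last `dead_after` seconds."""
        return [l for l in self.links
                if now - self.last_seen[l] <= self.dead_after]

    def peer_is_dead(self, now):
        """The peer is declared dead only if it is silent on every link."""
        return not self.live_links(now)


monitor = PeerMonitor(links=("eth0", "eth1", "serial"), dead_after=10.0)
monitor.heartbeat("eth0", now=5.0)
monitor.heartbeat("eth1", now=5.0)
monitor.heartbeat("serial", now=12.0)

# Both network links have been silent for 11 seconds (say, a pulled cable),
# but the serial link still shows the peer is alive, so no failover yet.
print(monitor.live_links(now=16.0))
print(monitor.peer_is_dead(now=16.0))
```

With a single link, the pulled cable in this example would have looked exactly like a dead node. With three links, the standby only takes over once all of them have been silent for longer than the timeout, which lowers the chance of both nodes working on the same data at once but, as said in the lecture, never removes it 100%.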