

marcelo The old model of high availability is "fault tolerance", usually hardware-based.
marcelo Expensive, proprietary.
marcelo The goal of this old model is to keep the hardware system running
andres *claps*
riel so basically, a single computer is an unreliable piece of shit (relatively speaking) ...
riel ... and High Availability is the collection of methods to make the job the computer does more reliable
riel you can do that by better hardware structures
riel or by better software structures
riel usually a combination of both
marcelo the Linux model of high availability is software based.
marcelo Now let me explain some basic concepts of HA
marcelo First, it's very important that we don't rely on unique hardware components in a High Availability system
marcelo for example, you can have two network cards connected to a network
marcelo In case one of the cards fails, the system tries to use the other card.
marcelo A hardware component that must not fail because the whole system depends on it is called a "Single Point of Failure"
marcelo SPOF, to make it short. :)
marcelo Another important concept which must be known before we continue is "failover"
marcelo Failover is the process by which one machine takes over the job of another node
riel "machine" in this context can be anything, btw ...
riel if a disk fails, another disk will take over
riel if a machine from a cluster fails, the other machines take over the task
riel but to have failover, you need to have good software support
riel because most of the time you will be using standard computer components
marcelo well, this is all the "theory" needed to explain the next parts.
riel so let me make a quick condensation of this introduction
riel 1. normal computers are not reliable enough for some people (like: internet shop), so we need a trick .. umm method ... to make the system more reliable
riel 2. high availability is the collection of these methods
riel 3. you can do high availability by using special hardware (very expensive) or by using a combination of normal hardware and software
riel 4. if one point in the system breaks and it makes the whole system break, that point is a single point of failure .. SPOF
riel 5. for high availability, you should have no SPOFs ... if one part of the system breaks, another part of the system should take over
riel (this is called "failover")
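The failover idea summarized above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Component` class, the `send_with_failover` function and the `eth0`/`eth1` names are hypothetical and do not come from any real HA package.

```python
class Component:
    """A hypothetical redundant part, e.g. one of two network cards."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def send(self, data):
        if not self.healthy:
            raise IOError(f"{self.name} has failed")
        return f"{data} sent via {self.name}"


def send_with_failover(components, data):
    """Try each component in order; give up only if all of them are broken."""
    for component in components:
        try:
            return component.send(data)
        except IOError:
            continue  # failover: move on to the next component
    raise IOError("all components failed")


# The first card is broken, so the second one takes over the job.
cards = [Component("eth0", healthy=False), Component("eth1")]
result = send_with_failover(cards, "hello")
print(result)
```

With only one card in the list, that card would be a SPOF: its failure makes the whole call fail.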
riel now I think we should explain a bit about how high availability works .. the technical side
riel umm wait ... sorry marcelo ;)
marcelo ok
marcelo Let's talk about the basic components of HA
marcelo Or at least some of them.
marcelo A simple disk running a filesystem is clearly an SPOF
marcelo If the disk fails, every part of the system which depends on the data contained on it will stop.
marcelo To keep a disk from being a SPOF of a system, RAID can be used.
marcelo RAID-1, which is a feature of the Linux kernel...
marcelo Allows "mirroring" of all data on the RAID device to a given number of disks...
marcelo So, when data is written to the RAID device, it's replicated between all disks which are part of the RAID1 array.
marcelo This way, if one disk fails, the other disk (or disks) on the RAID1 array will be able to continue working
riel because the system has a copy of the data on each disk
riel and can just use the other copies of the data
riel this is another example of "failover" ... when one component fails, another component is used to fulfill this function
riel and the system administrator can replace (or reformat/reboot/...) the faulty component
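A rough sketch of the mirroring idea described above, not of the Linux kernel's actual RAID1 code: every write goes to all working disks, and a read is served by the first disk that still works. The `Disk` and `Mirror` classes are hypothetical, and a real implementation would also have to resynchronize a replaced disk, which is left out here.

```python
class Disk:
    """A hypothetical in-memory disk that can be marked as failed."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False

    def write(self, block_no, data):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        self.blocks[block_no] = data

    def read(self, block_no):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        return self.blocks[block_no]


class Mirror:
    """Keeps one copy of every block on each working disk."""

    def __init__(self, disks):
        self.disks = list(disks)

    def write(self, block_no, data):
        written = 0
        for disk in self.disks:
            try:
                disk.write(block_no, data)
                written += 1
            except IOError:
                pass  # a failed disk is skipped; the other copies still get the data
        if written == 0:
            raise IOError("no working disk left in the mirror")

    def read(self, block_no):
        for disk in self.disks:
            try:
                return disk.read(block_no)
            except IOError:
                continue  # failover to the next copy
        raise IOError("no working disk left in the mirror")


disk_a, disk_b = Disk("a"), Disk("b")
mirror = Mirror([disk_a, disk_b])
mirror.write(0, b"important data")
disk_a.failed = True            # one disk breaks ...
recovered = mirror.read(0)      # ... and the data is read from the other copy
print(recovered)
```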
riel this looks really simple when you don't look at it too much
riel but there is one big problem ... when do you need to do failover?
riel in some situations, you would have _2_ machines working at the same time and corrupting all data ... when you are not careful
riel think for example of 2 machines which are fileservers for the same data
riel at any time, one of the machines is working and the other is on standby
riel when the main machine fails, the standby machine takes over
riel ... BUT ...
riel what if the standby machine only _thinks_ the main machine is dead and both machines do something with the data?
riel which copy of the data is right, which copy of the data is wrong?
riel or worse ... what if _both_ copies of the data are wrong?
riel for this, there is a special kind of program, called a "heartbeating" program, which checks which parts of the system are alive
riel for Linux, one of these programs is called "heartbeat" ... marcelo and lclaudio have helped write this program
riel marcelo: could you tell us some of the things "heartbeat" does?
marcelo sure
marcelo "heartbeat" is a piece of software which monitors the availability of nodes
marcelo it "pings" the node which it wants to monitor, and, in case this node doesn't answer the "pings", it considers it to be dead.
marcelo when a node is considered to be dead, we can fail over the services which it was running
marcelo the services which we take over are configured beforehand on both systems.
marcelo Currently heartbeat works only with 2 nodes.
marcelo It's been used in production environments in a lot of situations...
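The monitoring logic described above might look roughly like the following. This is a sketch of the general idea and not the code or API of the real "heartbeat" program: the `HeartbeatMonitor` class, the `deadtime` parameter and the `services_to_take_over` function are assumptions made for illustration, and time is passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Tracks when each node last answered a ping (hypothetical helper)."""

    def __init__(self, deadtime):
        self.deadtime = deadtime  # seconds of silence before a node counts as dead
        self.last_seen = {}       # node name -> time of its last answer

    def beat(self, node, now):
        """Record that `node` answered a ping at time `now`."""
        self.last_seen[node] = now

    def is_dead(self, node, now):
        """A node counts as dead if it never answered or has been silent too long."""
        last = self.last_seen.get(node)
        return last is None or now - last > self.deadtime


def services_to_take_over(monitor, peer, configured_services, now):
    """Return the preconfigured services to start locally if the peer looks dead."""
    if monitor.is_dead(peer, now):
        return list(configured_services)
    return []


monitor = HeartbeatMonitor(deadtime=10)
monitor.beat("primary", now=100)
alive_case = services_to_take_over(monitor, "primary", ["apache"], now=105)
dead_case = services_to_take_over(monitor, "primary", ["apache"], now=120)
print(alive_case, dead_case)
```

Note that the monitor can only say that the peer stopped answering, not why it stopped answering.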
riel there is one small problem, however
riel what if the cleaning lady takes away the network cable between the cluster nodes by accident?
riel and both nodes *think* they are the only one alive?
riel ... and both nodes start messing with the data...
riel unfortunately there is no way you can prevent this 100%
riel but you can increase the reliability by simply having multiple means of communication
riel say, 2 network cables and a serial cable
riel and this is reliable enough that the failure of 1 component still allows good communication between the nodes
riel so they can reliably tell if the other node is alive or not
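One way to picture the redundant communication paths: the peer is declared dead only when it has been silent on every link. The function name, the link names and the `deadtime` parameter below are hypothetical, chosen only to illustrate the point.

```python
def peer_is_dead(last_seen_by_link, now, deadtime):
    """Declare the peer dead only if it has been silent on every link.

    `last_seen_by_link` maps a link name to the time the peer was last
    heard on that link. An empty mapping counts as "dead", since there
    is no evidence at all that the peer is alive.
    """
    return all(now - last > deadtime for last in last_seen_by_link.values())


# Two network cables and a serial cable, as in the example above.
# Someone unplugged eth0, but the peer still answers on the other links.
links = {"eth0": 0, "eth1": 9, "ttyS0": 8}
cable_pulled = peer_is_dead(links, now=10, deadtime=5)

# Silence on every link: now a failover is justified.
links = {"eth0": 0, "eth1": 1, "ttyS0": 2}
really_dead = peer_is_dead(links, now=10, deadtime=5)
print(cable_pulled, really_dead)
```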
riel this was the introduction to HA
riel now we will give some examples of HA software on Linux
riel and show you how they are used ...
riel ... <we will wait shortly until the people doing the translation to Español have caught up> ... ;)
marcelo Ok
marcelo Now let's talk about the available software for Linux
riel .. ok, the translators have caught up .. we can continue again ;)
marcelo Note that I'll be talking about the opensource software for Linux
marcelo As I said above, the "heartbeat" program provides monitoring and basic failover of services
marcelo for two nodes only
marcelo As a practical example...
marcelo The web server at Conectiva (www.conectiva.com.br) has a standby node running heartbeat
marcelo In case our primary web server fails, the standby node will detect this and start the apache daemon
marcelo making the service available again
marcelo any service can be used, in theory, with heartbeat.
riel so if one machine breaks, everybody can still go to our website ;)
marcelo It only depends on the init scripts to start the service
marcelo So any service which has an init script can be used with heartbeat
marcelo arjan asked if it takes over the IP address
marcelo There is a virtual IP address used by the service
marcelo which is the "virtual server" IP address.
marcelo So, in our webserver case...
marcelo the apache daemon does not use the real IP address of the first node
marcelo but the virtual IP address, which the standby node will take over in case failover happens
marcelo Heartbeat, however, is limited to two nodes.
marcelo This is a big problem for a lot of big systems.
marcelo SGI has ported its FailSafe HA system to Linux recently (http://oss.sgi.com/projects/failsafe)
marcelo FailSafe is a complete cluster manager which supports up to 16 nodes.
marcelo Right now it's not ready for production environments
marcelo But that's being worked on by the Linux HA project people :)
marcelo SGI's FailSafe is GPL.
riel another type of clustering is LVS ... the Linux Virtual Server project
riel LVS uses a very different approach to clustering
riel you have 1 (maybe 2) machines that receive http (www) requests
riel but those machines don't do anything, except send the requests to a whole bunch of machines that do the real work
riel so called "working nodes"
riel if one (or even more) of the working nodes fails, the others will do the work
riel and all the routers (the machines sitting at the front) do is:
riel 1. keep track of which working nodes are available
riel 2. give the http requests to the working nodes
riel the kernel needs a special TCP/IP patch and a set of usermode utilities for this to work
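The two jobs of the front machines listed above can be sketched as follows. This is not LVS code: the `Director` class is hypothetical, and handing out requests in round-robin order is an assumption made for the example, since the lecture does not say how requests are distributed.

```python
class Director:
    """Hypothetical front machine that hands requests to working nodes."""

    def __init__(self, workers):
        self.workers = list(workers)
        self.available = set(self.workers)
        self._next = 0

    def mark_down(self, worker):
        """Job 1: keep track of which working nodes are available."""
        self.available.discard(worker)

    def mark_up(self, worker):
        if worker in self.workers:
            self.available.add(worker)

    def pick(self):
        """Job 2: choose the working node that gets the next request."""
        for _ in range(len(self.workers)):
            worker = self.workers[self._next]
            self._next = (self._next + 1) % len(self.workers)
            if worker in self.available:
                return worker
        raise RuntimeError("no working nodes available")


director = Director(["w1", "w2", "w3"])
director.mark_down("w2")                      # one working node fails ...
picks = [director.pick() for _ in range(4)]   # ... the others do the work
print(picks)
```

With only one director, the director itself is a SPOF, which is presumably why the lecture mentions "1 (maybe 2)" front machines.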
riel RedHat's "piranha" tool is a configuration tool for LVS, that people can use to set up LVS clusters more easily
riel in Conectiva, we are also working on a very nice HA project
riel the project marcelo and Olive are working on is called "drbd"
riel the distributed redundant block device
riel this is almost the same as RAID1, only over the network
riel to go back to RAID1 (mirroring) ... RAID1 is using 2 (or more) disks to store your data
riel with one copy of the data on every disk
riel drbd extends this idea to use disks on different machines on the network
riel so if one disk (on one machine) fails, the other machines still have the data
riel and if one complete machine fails, the data is on another machine ... and the system as a whole continues to run
riel if you use this together with ext3 or reiserfs, the machine that is still running can very quickly take over the filesystem that it has copied to its own disk
riel and your programs can continue to run
riel (with ext2, you would have to do an fsck first, which can take a long time)
riel this can be used for fileservers, databases, webservers, ...
riel everything where you need the very latest data to work
riel ...
riel this is the end of our part of the lecture, if you have any questions, you can ask them and we will try to give you a good answer ;)
See also http://www.linux-ha.org/

last edited 2009-02-28 11:27:09 by narendramind