DRBD and Heartbeat for high availability on Linux

June 18, 2007

I’ve been trying to get an HA solution put together for one of our software projects here at EMC, and I figured I’d share the configuration of these two products in the environment we’re using. I have to write the documentation for it anyway, so I might as well post it here for everyone else to see and learn from first ;)

We are going to configure 2 machines in an Active/Passive failover setup, which means that if the primary machine dies, the secondary will take over its identity and continue functioning as before.

Primary: lava2042 (10.5.140.42) (192.168.1.1 for crossover interface)
Secondary: lava2138 (10.5.140.138) (192.168.1.2 for crossover interface)
HA-address: lava2222 (10.5.140.222)

Configuring heartbeat

Step 1.
Install Heartbeat and DRBD on BOTH machines that you are planning on configuring. This should be a very straightforward step and I’m not going to go into detail.

Step 2.
We’re going to need a way to connect the machines. You can use either a crossover cable from an additional ethernet port to the other machine, or a serial cable. In this example I’m using a crossover cable.

Step 3.
Now we’re going to configure the /etc/ha.d/ha.cf file for our machine. Here is what I’ve put into the /etc/ha.d/ha.cf file ON EACH MACHINE:
bcast eth1          # send heartbeats as broadcasts over the crossover interface
keepalive 2         # seconds between heartbeats
warntime 10         # seconds before logging a "late heartbeat" warning
deadtime 30         # seconds of silence before declaring the other node dead
initdead 120        # like deadtime, but used at startup while interfaces come up
udpport 694         # UDP port for the broadcast heartbeats
auto_failback on    # when the primary comes back, it reclaims its resources
node lava2042       # node names must match the output of uname -n
node lava2138

Check this page if you have trouble or are using a serial connection instead of a crossover cable. It has instructions on how to configure this file for a serial interface.

Step 4.
Now configure the /etc/ha.d/authkeys file ON EACH MACHINE for the kind of security and packet signing you want. I don’t care about security in this example, so I put this in the file since it’s the fastest:
auth 2
2 crc

(See here for more information)
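One gotcha worth knowing: heartbeat will refuse to start if authkeys is readable by anyone other than root, so make sure it is mode 600. And if you do care about security, here is a rough sketch of generating a sha1-signed authkeys instead of crc (this writes to a local file named authkeys; copy it to /etc/ha.d/authkeys on BOTH machines yourself):

```shell
# Sketch: build an authkeys that uses sha1 signing instead of crc.
# The key is just random data hashed into hex.
KEY=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | awk '{print $1}')
cat > authkeys <<EOF
auth 1
1 sha1 $KEY
EOF
# heartbeat refuses to start unless this file is readable only by root
chmod 600 authkeys
```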

We’ll also need to configure the /etc/ha.d/haresources file, but we won’t do that until we get DRBD working correctly.

Configuring DRBD

Step 1.
The /etc/drbd.conf file needs to be configured. It should already have an example setup in the file. I used the already existing resource r0 and edited the nodes. Inside the “resource r0 {” bracket there should be a part that says “on <something>”. Here is what I put for my 2 nodes:
on lava2042 {
device /dev/drbd0;
disk /dev/sda1;
address 10.5.140.42:7788;
meta-disk internal;
}

on lava2138 {
device /dev/drbd0;
disk /dev/sda8;
address 10.5.140.138:7788;
meta-disk internal;
}

Now let me give a little background. I had already made the /dev/sda1 partition on lava2042 and the /dev/sda8 partition on lava2138, each 1 gig, to store the data that was going to be shared. /dev/drbd0 is the device that will actually be mounted and read from. Other than that, I left the entire file at its defaults. Make sure to comment out any other resources unless you need more than one filesystem replicated.
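For reference, here is a rough sketch of what the whole resource ends up looking like with those nodes filled in. Treat the protocol and syncer values as assumptions; they’re the defaults from the example config that I left alone, not something I tuned:

```
resource r0 {
  protocol C;          # synchronous replication
  syncer {
    rate 10M;          # resync bandwidth cap; worth raising on a dedicated Gb link
  }
  on lava2042 {
    device    /dev/drbd0;
    disk      /dev/sda1;
    address   10.5.140.42:7788;
    meta-disk internal;
  }
  on lava2138 {
    device    /dev/drbd0;
    disk      /dev/sda8;
    address   10.5.140.138:7788;
    meta-disk internal;
  }
}
```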

Step 2.
Make sure to load the drbd module by running modprobe drbd, then check dmesg to make sure the output looks correct (sorry, I don’t have what it should look like; I’ll keep better notes in the future).

Step 3.
Now we need to initialize our metadata for DRBD. We do this by running this on EACH machine:
drbdmeta create-md r0
Where r0 is the name of the resource from the /etc/drbd.conf file. You should now be able to run the following on each machine:
drbdadm up all
After running these two commands, you should be able to check dmesg and /proc/drbd to see the status of your filesystem.

Step 4.
The next step is to force one of the machines to be the primary and create a filesystem. In this case I’m choosing lava2042 as the primary, so I will run this on the machine:
lava2042# drbdsetup /dev/drbd0 primary -o
This will do the initial sync between the machines, you should only need to do this once. After that, run this command:
lava2042# drbdadm primary all
To force lava2042 into the primary state and make /dev/drbd0 usable. From here you can create a filesystem by doing a:
lava2042# mkfs.ext3 /dev/drbd0 (or whatever filesystem you want)
And mount the filesystem to check it out (make sure to unmount it after you’re done)

You should now be able to do a drbdadm primary all on either machine (while both are in a Secondary/Secondary state; check /proc/drbd) and mount the filesystem.

Step 5.
Okay, now let’s drop back into secondary mode for lava2042 by doing this:
lava2042# drbdadm secondary all
The /proc/drbd file should look something like this:
version: 8.0.3 (api:86/proto:86)
SVN Revision: 2881 build by root@lava2138, 2007-06-18 09:50:33
0: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---
ns:316952 nr:1221300 dw:1222380 dr:346211 al:8 bm:107 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:81456 misses:98 starving:0 dirty:0 changed:98
act_log: used:0/257 hits:262 misses:8 starving:0 dirty:0 changed:8

(The important part is the st:Secondary/Secondary field.) The filesystem needs to be in the Secondary state on both machines in order for heartbeat to work properly.
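If you’d rather check that state from a script than eyeball /proc/drbd, something like this rough sketch works. The status line is hardcoded from the example above so you can see the idea; on a real node you’d read /proc/drbd instead:

```shell
# Sketch: pull the connection state and roles out of a DRBD status line.
# On a real node, replace the hardcoded line with:
#   STATUS=$(grep ' cs:' /proc/drbd)
STATUS="0: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---"
CS=$(echo "$STATUS" | sed -n 's|.*cs:\([A-Za-z]*\).*|\1|p')
ST=$(echo "$STATUS" | sed -n 's|.*st:\([A-Za-z/]*\).*|\1|p')
echo "connection=$CS roles=$ST"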

And now we’re going to edit the /etc/ha.d/haresources file to take care of sharing the filesystem. Here’s what I have in the file:
lava2042 drbddisk::r0 Filesystem::/dev/drbd0::/opt/EMC::ext3 10.5.140.222 httpd
Let’s go through it line by line:
lava2042 – the machine that will be the primary node
drbddisk::r0 – activate the r0 resource disk (make sure r0 corresponds to whatever your resource is named)
Filesystem::/dev/drbd0::/opt/EMC::ext3 – mount /dev/drbd0 on /opt/EMC as an ext3 filesystem
10.5.140.222 – the IP address for our solution (see the beginning of the post)
httpd – the service we’re going to watch over and take care of, in this case httpd (which wasn’t really what I was configuring, but it’s the easiest to show as an example)
Don’t forget that this file has to be the same on BOTH MACHINES.
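Since the whole thing is one whitespace-separated line, a quick throwaway sketch for sanity-checking the fields (using the line from this post):

```shell
# Sketch: split the haresources line into its fields to double-check it
LINE="lava2042 drbddisk::r0 Filesystem::/dev/drbd0::/opt/EMC::ext3 10.5.140.222 httpd"
set -- $LINE
echo "node=$1 drbd=$2 fs=$3 ip=$4 service=$5"
```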

Step 6.
Make sure heartbeat and the service(s) you’re watching DO NOT start at boot, otherwise things get really ugly when you screw up:
chkconfig heartbeat off
chkconfig httpd off
/etc/init.d/httpd stop
(on both machines)

Step 7. (The cross your fingers step)
Alright, it’s finally time to test your failover configuration. First, we need to start heartbeat on the primary machine:
lava2042# /etc/init.d/heartbeat start
Then, start it on the secondary machine
lava2138# /etc/init.d/heartbeat start

You should now be able to ping the cluster IP (lava2222 or 10.5.140.222). You can also check that the /dev/drbd0 filesystem is mounted on the primary node using df. Check the /var/log/messages file on either machine for debugging information.

The moment of truth
Go to your primary node and yank the power cable out of the back. Head back to your machine and carefully watch the /var/log/messages file on the secondary node. You should see information about the link being down and DRBD having trouble reaching its peer; then heartbeat should kick in and start taking over, mounting the filesystem and finally starting your httpd service. Congratulations, you have now successfully failed over.

If you have an error, check the error messages and see if you can figure out what to do; if you need any help, leave a comment or email me and I’ll try to help. Hopefully this helps somebody, as it took me quite a while to figure out, having never worked with either piece of software before.

Additional links:
Information mostly pulled from:
http://linux-ha.org/GettingStarted
http://www.linux-ha.org/DRBD/GettingStarted
http://www.linux-ha.org/DRBD/HowTo

P.S. Ralf Ramge emailed me an updated version of his bash ZFS backup script. I am still working on getting it put together to post. Thanks for the email, Ralf!

posted in drbd, failover, heartbeat, linux, tutorials, work by Lee

8 Comments to "DRBD and Heartbeat for high availability on Linux"

  1. ragini wrote:

    I want to set up two DRBD resources with one machine as the primary server and two secondary servers, and both secondaries should replicate the same data.
    Can you help?

  2. Fernando Serer wrote:

    Good article and thank you for sharing your experience!

    I’m interested in setting up something similar, but the crossover cable is connected to Gb NIC’s and the other local network NIC’s are connected to a 100Mbps switch.

    Do you know if it’s possible (or if it’s safe) to let DRBD replicate through the crossover cable instead of the 100 Mbps NICs?

    In your example, the configuration of /etc/drbd.conf would be:

    for lava2042
    address 192.168.1.1:7788;

    and for lava2138
    address 192.168.1.2:7788;

    I don’t know if this would be possible or safe to do. Thanks!

  3. Lee Hinman wrote:

    Fernando: You should be able to let DRBD replicate through a crossover cable. I am fairly sure that it will be okay as long as both machines can see each other.

    In the case of replicating over large distances, however, latency is going to be a bit problematic and you might not see the performance you are hoping for (depending on disk read/write numbers). A direct-connect crossover is going to be faster compared to a switch because of the lack of switch overhead.

    Give it a shot and let me know how it works!

  4. Fernando Serer wrote:

    Thanks Lee! I will try it this weekend :)

    what i would like to be able to do is to have 2 servers, with apache in server A and mysql in server B.

    They are connected to internet with eth0 and crossover linked with eth1.

    Now apache is using the crossover link to read/write in mysql. But I’ve been reading that I could use these 2 servers to achieve high availability, with each one failing over the other’s service.

    Is the active/active option in this link:

    http://www.linux-ha.org/DRBD/GettingStarted

    I’ve been googling this option and I found no more information than that page. All the information I found is about having a spare server waiting (like yours) and I would like to go one step beyond :)

    I hope to be able doing it and i’ll tell you!

    thanks again

  5. Glynn Bird wrote:

    A very useful tutorial. Many thanks. One small typo:

    drbdisk::r0

    should read

    drbddisk::r0

    on your line-by-line explanation.

  6. Lee Hinman wrote:

    I’ve updated the post with the correction, thanks Glynn!

  7. Douglas Lochart wrote:

    First off, thanks for the tutorial; it has been very helpful. If possible I would like some clarification. After you specify the IP addresses for eth1 (192.168.1.x), I see no mention of them in the config files, save your mention of eth1 as the value for the bcast parameter. I see that this is the interface heartbeat uses to send the UDP heartbeat packets. So the eth0 interface is used for DRBD syncing and eth1 is used for heartbeat. Since, like your example, I have a crossover cable on my eth1 interface, does it make sense to also add eth0 to the bcast line?

    bcast eth1 eth0

    Or is that not necessary?

    Also, I am syncing a 1 terabyte partition over a gigabit switch as of now. Should I tweak the syncer parameters up to 100M or more? Also, al-extents is set to 257, but the comment in the default config file says that this will handle a 1 gig active set, and I see no definition of what that is. Since I have such a large partition to sync, should I increase that?

    Thanks

    Doug Lochart

  8. ilyas wrote:

    It’s great, Lee! I’ve been trying to build a clustered server with two nodes. I have a question: what happens if the power suddenly goes out on the primary node (e.g. a disaster)? Does synchronization still happen? When does it happen, and how do I handle it?

    thanks a lot!

 