UPDATE: High Availability for the Ubuntu Enterprise Cloud (UEC) – Cloud Controller (CLC)

UPDATE
So I finally had the time to write the OCF Resource Agent for the Cloud Controller as promised. It is an early Resource Agent and has currently been tested ONLY with CLCs running on Ubuntu (UEC).

But first, what is an OCF Resource Agent? An OCF RA is an executable script that is used to manage a resource within a cluster. In this case, the RA is a script that manages the Cloud Controller as a resource in a two-node Pacemaker-based HA cluster. The RA starts, stops, and monitors the service (the Cloud Controller) when the Cluster Resource Manager (Pacemaker) tells it to (which means that upstart will NOT start the CLC).
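To make that concrete, here is a heavily simplified, hypothetical skeleton of what an OCF RA looks like. This is NOT the real eucaclc agent, and the upstart job name used here is just a placeholder; it only shows that every RA implements at least the start, stop, monitor and meta-data actions and reports its result through the standard OCF exit codes:

#!/bin/sh
# Illustrative OCF RA skeleton -- NOT the real eucaclc agent.
# Standard OCF exit codes (a real agent sources the ocf-shellfuncs helpers instead).
OCF_SUCCESS=0
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

JOB="eucalyptus-cloud"   # placeholder: the upstart job this example would manage

case "$1" in
  start)    start "$JOB" >/dev/null 2>&1; exit $OCF_SUCCESS ;;   # a real agent also verifies it came up
  stop)     stop "$JOB"  >/dev/null 2>&1; exit $OCF_SUCCESS ;;
  monitor)  status "$JOB" 2>/dev/null | grep -q "running" \
              && exit $OCF_SUCCESS || exit $OCF_NOT_RUNNING ;;
  meta-data)
            # a real agent prints a full XML description of itself and its parameters here
            echo '<?xml version="1.0"?><resource-agent name="eucaclc-example"/>'
            exit $OCF_SUCCESS ;;
  *)        exit $OCF_ERR_UNIMPLEMENTED ;;
esac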

Now that we all know what OCF RAs are, let's test it. First, download the RA from HERE and move it into place:

wget -c http://people.ubuntu.com/~andreserl/eucaclc
sudo mkdir /usr/lib/ocf/resource.d/ubuntu
sudo mv eucaclc /usr/lib/ocf/resource.d/ubuntu/eucaclc
sudo chmod 755 /usr/lib/ocf/resource.d/ubuntu/eucaclc

Then, change the cluster configuration (sudo crm configure edit) for res_uec resource as follows:

primitive res_uec ocf:ubuntu:eucaclc op monitor interval="20s"

And the new RA should start the Cloud Controller automatically and keep monitoring it.
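If you want to double-check that the cluster actually picked up the new agent, a quick way (using the standard Pacemaker CLI tools) is, for example:

sudo crm_mon -1                      # one-shot cluster status; res_uec should show as Started
sudo crm resource cleanup res_uec    # clear old failure counts if the resource shows errors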

NOTE: Please note that this Resource Agent is an initial draft and might be buggy. If you find any bugs or things don’t work as expected, please don’t hesitate to contact me.

At UDS-M, I raised the concern about the lack of High Availability for the Ubuntu Enterprise Cloud (UEC). As part of the Cluster Stack Blueprint, the effort to bring HA to UEC was defined; however, it was barely discussed due to lack of time, and the work on HA for the UEC has been deferred to Natty. However, in preparation for the next release cycle, I've been able to set up a two-node HA cluster (Master/Slave) for the Cloud Controller (CLC).

NOTE: Note that this tutorial is an early draft and might contain typos/errors that I have not noticed. It also might not work for you, which is why I recommend first having a UEC up and running with one CLC, and then adding the second CLC. If you need help or guidance, you know where to find me :). Also note that this is only for testing purposes! I'll be moving this HowTo to an Ubuntu Wiki page soon, since the formatting seems to be somewhat annoying :).

1. Installation Considerations
I'll show you how to configure two UEC (Eucalyptus) Cloud Controllers in High Availability (Active/Passive), using the HA clustering tools (Pacemaker, Heartbeat) and DRBD for replication between the CLCs. This is shown in the following image.

The setup I used is a 4-node setup (1 CLC, 1 Walrus, 1 CC/SC, 1 NC), as detailed in the UEC Advanced Installation Doc; however, I installed the packages from the Ubuntu Server Installer. Now, as per the UEC Advanced Installation Doc, it is assumed that there is only one network interface (eth0) in the Cloud Controller, connected to a “public network” that links it to both the outside world and the other components in the Cloud. However, to be able to provide HA we need the following:

  • First, we need a Virtual IP (VIP) to allow both the clients and the other controllers to access either one of the CLCs through that single IP. In this case, we are assuming that the “public network” is 192.168.0.0/24 and that the VIP is 192.168.0.100. This VIP will also be used to generate the new certificates.
  • Second, we need to add a second network interface to the CLCs to use as the DRBD replication link. This second interface is eth1 and will have an address in the 10.10.10.0/30 range.

2. Install Second Cloud Controller (CLC2)
Once you finish setting up the UEC and everything is working as expected, please install a second cloud controller.
Once installed, it is desirable not to start the services just yet. However, you will need to exchange the CLC ssh keys with both the CC and the Walrus, as specified in SSH Key Authentication Setup under STEP4 of the UEC Advanced Installation doc. Please note that this second CLC will also have two interfaces, eth0 and eth1. Leave eth1 unconfigured, but configure eth0 with an IP address in the same network as the other controllers.
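For reference, the key exchange mentioned above is the same procedure as in the Advanced Installation doc. A rough sketch, assuming the default eucalyptus user and key location and using walrus1/cc1 as placeholder hostnames (follow the doc if your layout differs), run from CLC2:

sudo -u eucalyptus ssh-copy-id -i /var/lib/eucalyptus/.ssh/id_rsa.pub eucalyptus@walrus1
sudo -u eucalyptus ssh-copy-id -i /var/lib/eucalyptus/.ssh/id_rsa.pub eucalyptus@cc1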

3. Configure Second Network Interface
Once the two CLCs are installed (CLC1 and CLC2), we need to configure eth1. This interface will be used as a direct link between CLC1 and CLC2 and will be used by DRBD as the replication link. In this example, we'll be using 10.10.10.0/30. Add the following to /etc/network/interfaces.

On CLC1:

auto eth1
iface eth1 inet static
address 10.10.10.1
netmask 255.255.255.252

On CLC2:

auto eth1
iface eth1 inet static
address 10.10.10.2
netmask 255.255.255.252

NOTE: Do *NOT* add a gateway, because this is a direct link between the CLCs. If we add it, it will create a default route and the configuration of the resources will fail further along the way.

4. Setting up DRBD

Once the CLC2 is installed and configured, we need to setup DRBD for replication between CLC’s.

4.1. Create Partitions (CLC1/CLC2)
For this, we need either a new disk or a new disk partition. In my case, I'll be using /dev/vdb1. Please note that the partitions need to be exactly the same size on both nodes. You can create them whichever way you prefer.
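If you want a quick sketch of creating it with fdisk, assuming (as in my case) that the spare disk is /dev/vdb, something like this works; adapt it to your own disk layout:

sudo fdisk /dev/vdb      # inside fdisk: n (new primary partition, accept the defaults), then w (write)
sudo partprobe /dev/vdb  # re-read the partition table (partprobe comes from the parted package), or just reboot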

4.2. Install DRBD and load module (CLC1/CLC2)
Now we need to install the DRBD utilities:

sudo apt-get install drbd8-utils

Once it is installed, we need to load the kernel module and add it to /etc/modules. Please note that the DRBD kernel module is now included in the mainline kernel.

sudo modprobe drbd
sudo -i
echo drbd >> /etc/modules

4.3. Configuring the DRBD resource (CLC1/CLC2)
Add a new resource for DRBD by editing the following file:

sudo vim /etc/drbd.d/uec-clc.res

The configuration looks similar to the following:

resource uec-clc {
  device /dev/drbd0;
  disk /dev/vdb1;
  meta-disk internal;
  on clc1 {
    address 10.10.10.1:7788;
  }
  on clc2 {
    address 10.10.10.2:7788;
  }
  syncer {
    rate 10M;
  }
}

4.4. Creating the resource (CLC1/CLC2)
Now we need to do the following on CLC1 and CLC2:

sudo drbdadm create-md uec-clc
sudo drbdadm up uec-clc
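Before moving on, it is worth checking on both nodes that the resource came up and that the peers see each other:

cat /proc/drbd   # expect cs:Connected, ro:Secondary/Secondary, ds:Inconsistent/Inconsistent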

4.5. Establishing initial communication (CLC1)
Now, we need to do the following:

sudo drbdadm -- --clear-bitmap new-current-uuid uec-clc
sudo drbdadm primary uec-clc
sudo mkfs -t ext4 /dev/drbd0

4.6. Copying the Cloud Controller Data for DRBD Replication (CLC1)
Once the DRBD nodes are in sync, we need to have the data replicated between CLC1 and CLC2 and make the necessary changes so that both can access the data at any given point in time. To do this, do the following on CLC1:

sudo mkdir /mnt/uecdata
sudo mount -t ext4 /dev/drbd0 /mnt/uecdata
sudo mv /var/lib/eucalyptus/ /mnt/uecdata
sudo mv /var/lib/image-store-proxy/ /mnt/uecdata
sudo ln -s /mnt/uecdata/eucalyptus/ /var/lib/eucalyptus
sudo ln -s /mnt/uecdata/image-store-proxy/ /var/lib/image-store-proxy
sudo umount /mnt/uecdata

What we did here is move the Cloud Controller data to the DRBD mount point so that it gets replicated to the second CLC, and then symlink the original data folders to the mountpoint.

4.7. Preparing the second Cloud Controller (CLC2)
Once we have prepared the data on CLC1, we can discard the data on CLC2 and create the symlinks the same way we did on CLC1. We do this as follows:

sudo mkdir /mnt/uecdata
sudo rm -fr /var/lib/eucalyptus
sudo rm -fr /var/lib/image-store-proxy
sudo ln -s /mnt/uecdata/eucalyptus/ /var/lib/eucalyptus
sudo ln -s /mnt/uecdata/image-store-proxy/ /var/lib/image-store-proxy

After this, the data will be replicated via DRBD, so whenever CLC1 fails, CLC2 will have an up-to-date copy of the Cloud Controller data to take over with.
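A quick sanity check on either node is to make sure the symlinks point where we expect (the targets will only resolve on the node where the cluster has the DRBD device mounted):

ls -ld /var/lib/eucalyptus /var/lib/image-store-proxy   # both should be symlinks into /mnt/uecdata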

5. Setup the Cluster

5.1. Install the Cluster Tools
First we need to install the clustering tools:

sudo apt-get install heartbeat pacemaker

5.2. Configure Heartbeat
Then we need to configure Heartbeat. First, create /etc/ha.d/ha.cf and add the following:

autojoin none
mcast eth0 239.0.0.43 649 1 0
warntime 5
deadtime 15
initdead 60
keepalive 2
node clc1
node clc2
crm respawn

Then create the authentication file (/etc/ha.d/authkeys) and add the following:

auth 1
1 md5 password

and change the permissions:

sudo chmod 600 /etc/ha.d/authkeys

5.3. Removing Startup of services at boot up
We need to let the Cluster manage the resources, instead of starting them at bootup.

sudo update-rc.d -f eucalyptus remove
sudo update-rc.d -f eucalyptus-cloud remove
sudo update-rc.d -f eucalyptus-network remove
sudo update-rc.d -f image-store-proxy remove

And we also need to change “start on” to “stop on” in the upstart configuration files at /etc/init/ for the following (a sed one-liner for this is shown after the list):

eucalyptus.conf
eucalyptus-cloud.conf
eucalyptus-network.conf
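For example, a quick way to make that change on both CLCs is the following one-liner (double-check the files afterwards):

sudo sed -i 's/^start on/stop on/' /etc/init/eucalyptus.conf /etc/init/eucalyptus-cloud.conf /etc/init/eucalyptus-network.conf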

5.4. Configuring the resources
Then, we need to configure the cluster resources. For this do the following:

sudo crm configure

and paste the following:

primitive res_fs_clc ocf:heartbeat:Filesystem params device=/dev/drbd/by-res/uec-clc directory=/mnt/uecdata fstype=ext4 options=noatime
primitive res_ip_clc ocf:heartbeat:IPaddr2 params ip=192.168.0.100 cidr_netmask=24 nic=eth0
primitive res_ip_clc_src ocf:heartbeat:IPsrcaddr params ipaddress="192.168.0.100"
primitive res_uec upstart:eucalyptus  op start timeout=120s op stop timeout=120s op monitor interval=30s
primitive res_uec_image_store_proxy lsb:image-store-proxy
group rg_uec res_fs_clc res_ip_clc res_ip_clc_src res_uec res_uec_image_store_proxy
primitive res_drbd_uec-clc ocf:linbit:drbd params drbd_resource=uec-clc
ms ms_drbd_uec res_drbd_uec-clc meta notify=true
order o_drbd_before_uec inf: ms_drbd_uec:promote rg_uec:start
colocation c_uec_on_drbd inf: rg_uec ms_drbd_uec:Master
property stonith-enabled=False
property no-quorum-policy=ignore
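After committing the configuration, you can watch the cluster bring everything up; for example:

sudo crm_mon -1   # ms_drbd_uec should be Master on one node, with the rg_uec group started on that same node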

6. Specify the Cloud IP for the CC, NC, and in the CLC.
Once you finish the configuration above, one of the CLCs will be the active one and the other will be the passive one. The Cluster Resource Manager will decide which one becomes the primary; however, it is expected that CLC1 will become the primary.

Now, as specified in the UEC Advanced Installation Doc, we need to specify the Cloud Controller VIP in the CC. However it is also important to do it in the NC. This is done in /etc/eucalyptus/eucalyptus.conf by adding:

VNET_CLOUDIP="192.168.0.100"

Then, log into the web front end (https://192.168.0.100:8443) and change the Cloud Configuration to have the VIP as the Cloud Host.

By doing this, the new certificates will be generated with the VIP, which will allow you to connect to the cloud even if the primary Cloud Controller fails and the second one takes control of the service.

Finally, restart the Walrus, CC/SC, and NC and enjoy.
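How exactly you restart them depends on your install; with the standard UEC packages the components are upstart jobs, so something along these lines should do it (job names assumed from a default UEC install, adjust to yours):

sudo restart eucalyptus-walrus   # on the Walrus node
sudo restart eucalyptus-cc       # on the CC/SC node
sudo restart eucalyptus-sc       # on the CC/SC node
sudo restart eucalyptus-nc       # on the Node Controller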

7. Final Thoughts
The cluster resource manager is using the upstart job to manage the Cloud Controller. However, this is not optimal and is only used for testing purposes. An OCF Resource Agent will be required to adequately start/stop and monitor Eucalyptus. The OCF RA will be developed soon, and this will be discussed at the Ubuntu Developer Summit – Natty.

DRBD and NFS

Hey all. Sorry for the delay in posting this tutorial, I’ve been pretty busy and I finally had some time to finish it. Enjoy :).

Well, as you may know, in previous posts (Post 1, Post 2) I've shown you how to install and configure DRBD in an active/passive configuration with automatic failover using Heartbeat. Now, I'm going to show you how to use that configuration to export the data stored on the DRBD device (make it available to other servers on the network) using NFS.

So, the first thing to do is to install NFS on both drbd1 and drbd2, stop the daemon, and remove it from the boot process (i.e., the NFS daemon must no longer start automatically at boot). We do this as follows on both servers:

:~$ sudo apt-get install nfs-kernel-server nfs-common
:~$ sudo /etc/init.d/nfs-kernel-server stop
:~$ sudo update-rc.d -f nfs-kernel-server remove
:~$ sudo update-rc.d -f nfs-common remove

Now, you may wonder how NFS is going to work. The NFS daemon will run only on the active (or primary) server, and this is going to be controlled by Heartbeat. But, since NFS stores state in /var/lib/nfs on each server, we have to make sure both servers share the same information. If drbd1 goes down, drbd2 will take over, but its information in /var/lib/nfs will be different from drbd1's, and this will stop NFS from working. So, to make both servers share the same information in /var/lib/nfs, we are going to move it onto the DRBD device and create a symbolic link, so that the information gets stored on the DRBD device and not on the local disk. This way, both servers will have the same information. To do this, we do as follows on the primary server (drbd1):

sudo mv /var/lib/nfs/ /data/
sudo ln -s /data/nfs/ /var/lib/nfs
sudo mkdir /data/export

After that, since we already moved the NFS lib files to the DRBD device on the primary server (drbd1), we have to remove them from the secondary server (drbd2) and create the link:

sudo rm -rf /var/lib/nfs/
sudo ln -s /data/nfs/ /var/lib/nfs

Now, since Heartbeat is going to control the NFS daemon, we have to tell Heartbeat to start the nfs-kernel-server daemon whenever it takes control of the service. We do this in /etc/ha.d/haresources by adding nfs-kernel-server at the end. The file should look like this:

drbd1 IPaddr::172.16.0.135/24/eth0 drbddisk::testing Filesystem::/dev/drbd0::/data::ext3 nfs-kernel-server

Now that we've configured everything, we have to power off both servers, first the secondary and then the primary. Then we start the primary server, and during the boot-up process we'll see a message that requires us to type “yes” (this is the same message shown during the installation of DRBD in my first post). After confirming, and if you have stonith configured, it is probable that drbd1 won't start its DRBD device, so it will remain secondary and won't be able to mount it. This is because stonith is waiting for us to confirm that the other node has been reset before the resources can be taken over (to see if stonith is the problem, take a look at /var/log/ha-log). So, we do as follows on the primary server (drbd1):

sudo meatclient -c drbd2

After doing this, we have to confirm. After the confirmation, Heartbeat will take control, change the DRBD device to primary, and start NFS. Then, we can boot up the secondary server (drbd2). Enjoy :-).

Note: I made the choice of powering off both servers. You could just restart them, one at a time, and see what happens :).

Installing DRBD on Hardy Part 2.

As you know, in a previous post I showed how to install DRBD in Hardy Heron, in an active/passive configuration. Now, I’m gonna show you how to install and configure Heartbeat to automatically monitor this active/passive configuration, and provide High Availability. This means I’ll show you how to integrate DRBD in a simple Heartbeat V1 configuration, and as a plus, I’ll show you how to use the meatware software provided by STONITH.

To do this, we have to install Heartbeat and make changes in three files, which are /etc/ha.d/ha.cf, /etc/ha.d/haresources and /etc/ha.d/authkeys. First of all we install Heartbeat as follows, in both nodes:

sudo apt-get install heartbeat-2

After the installation is completed, the first file we need to configure, in both nodes, is /etc/ha.d/ha.cf as follows:

logfile /var/log/ha-log
keepalive 2
deadtime 30
udpport 695
bcast eth0
auto_failback off
node drbd1 drbd2

Note: Notice that the auto_failback option is set to off. This means that if drbd1 fails, drbd2 will take control of the service; if drbd1 comes back online, the service will not fail back to drbd1, and drbd2 will remain the active node.

Now, as you know, this is an active/passive configuration, so we have to decide which node is going to be the primary and which the secondary one for the Heartbeat configuration. (If you have followed my previous post, drbd1 is going to be our primary node and drbd2 our secondary node.) We also have to decide where we are going to mount the DRBD resource in our filesystem, and which IP address is going to be used as the VIP (the Virtual IP is going to be used to access a service, or a DRBD resource, over the network, since we are going to use DRBD for NFS and/or MySQL).

So, assuming drbd1 is the primary node, the VIP is 172.16.0.135 and we are going to mount the DRBD resource in /data (so create the directory in both nodes), we edit /etc/ha.d/haresources as follows, in both nodes:

drbd1  IPaddr::172.16.0.135/24/eth0 drbddisk::testing Filesystem::/dev/drbd0::/data::ext3

Note: Notice that we are specifying the DRBD resource with the drbddisk option.

Then, we have to edit the /etc/ha.d/authkeys file, which is going to be used by Heartbeat to authenticate with the other node. So, we edit it as follows in both nodes:

auth 3
3 md5 DesiredPassword

Finally, we change file permissions to this last file as follows:

sudo chmod 600 /etc/ha.d/authkeys

Now that we have both nodes configured, I recommend powering off both nodes and booting the node we want as primary. In our case, that is drbd1. After booting this node, we need to verify that Heartbeat has started the DRBD resource (we can see this with cat /proc/drbd) and mounted it on /data. If it has, start the secondary node and verify that it comes up as the secondary one.
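In practice, on the node that should be primary, the check boils down to something like:

cat /proc/drbd        # the DRBD resource should show up as Primary on this node
mount | grep /data    # /dev/drbd0 should be mounted on /data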

If everything has gone right, try the failover process by powering off the primary node. You will notice that the node that had the DRBD resource as secondary is now the primary one and has control of the service. Also verify that the VIP address is working (it should appear as eth0:0 when issuing ifconfig).

After verifying that everything is working as expected, it is always recommended to have a fencing device to ensure data integrity. This fencing device will prevent a split-brain condition. A well-known fencing mechanism is STONITH (Shoot The Other Node In The Head). This mechanism will basically power off or reset a node that is supposed to be dead. This means that if drbd1 is supposed to be dead, drbd2 will take control of the service or, in this case, the DRBD resource. But, if drbd1 is not actually dead and drbd2 tries to take control of the shared DRBD resource, a split-brain condition will occur. So STONITH ensures that drbd1 has been reset or powered off so that drbd2 can take control of the DRBD resource.

To do this, there is a stonith package that is used to work with STONITH/fencing devices in Heartbeat. But, since we don't have a real fencing device, we will use meatware. Meatware is software provided by STONITH that simulates the use of a STONITH/fencing device by not allowing the secondary node (drbd2) to take control of the shared resources until there has been a confirmation that the primary node (drbd1) has been powered off or rebooted. This requires operator intervention. So, to integrate the meatware software with Heartbeat, we do as follows:

sudo apt-get install stonith

Then, we have to modify /etc/ha.d/ha.cf like this (in both nodes):

... [Output Omitted]
auto_failback off
stonith_host drbd1 meatware drbd2
stonith_host drbd2 meatware drbd1

node drbd1 drbd2

Then we power off the primary node (in this case drbd1) and reboot the secondary node (drbd2). After the reboot, take a look at /var/log/ha-log until you see something like this:

[Screenshot: /var/log/ha-log output showing the STONITH meatware message]

As you can see, STONITH is sending a message saying that we need to confirm that drbd1 has been rebooted so that drbd2 can take control of the service. If you take a look at /proc/drbd, you will see that the DRBD resource is still secondary. So, we do as follows:

sudo meatclient -c drbd1

And we will see something like this:

[Screenshot: meatclient asking for confirmation that the node has been rebooted]

Now, we confirm that it has been rebooted, and we can see that drbd2 takes control of the resource by setting the DRBD resource as primary (cat /proc/drbd).

So we can now start drbd1. Every time the primary node fails, the secondary node will display a message in /var/log/ha-log saying that we should confirm that the primary has been rebooted, so that the secondary can take control of the service and become the new primary.

In a next post I'll cover how to make NFS use DRBD. Any comments, suggestions, etc., you know where to find me :).

Installing DRBD On Hardy!

DRBD (Distributed Replicated Block Device) is a technology that is used to replicate data over TCP/IP. It is used to build HA Clusters and it can be seen as a RAID-1 implementation over the network.

As you may all know, the DRBD kernel module has now been included in the Hardy Heron Server Edition kernel, so there is no more source downloading and compiling, which makes it easier to install and configure. Here I'll show you how to install and make a simple configuration of DRBD, using one resource (testing). I'll not cover how to install and configure Heartbeat for automatic failover (this will be shown in a later post).

First of all, we will have to install Ubuntu Hardy Heron Server Edition on two servers and manually edit the partition table. We do this to leave FREE SPACE that will be used later on as the block device for DRBD. If you've seen the DRBD + NFS HowTo on HowToForge.com, creating the partitions for DRBD and leaving them unmounted will NOT work, and we won't be able to create the resource for DRBD. This is why we leave the FREE SPACE and create the partition later on, once the system is installed.

So, after the installation we will have to create the partition, or partitions (in case we are creating an external partition for the metadata, but in this case it will be internal), that DRBD will use as a block device. For this we will use fdisk and do as follows:


sudo fdisk /dev/sda
n (to create a new partition; choose l to make it a logical partition)
t (to set the partition type to 83, Linux, if it is not already)
w (to write the changes)

After creating the partitions we will have to REBOOT both servers so that the kernel uses the new partition table. After reboot we have to install drbd8-utils on both servers:

sudo apt-get install drbd8-utils

Now that we have drbd8-utils installed, we can configure /etc/drbd.conf, in which we will define a simple DRBD resource, as follows:

resource testing { # name of the resource

  protocol C;

  on drbd1 { # first server's hostname
    device /dev/drbd0; # name of the DRBD device
    disk /dev/sda7; # partition to use, which was created using fdisk
    address 172.16.0.130:7788; # IP address and port number used by DRBD
    meta-disk internal; # where to store the metadata
  }

  on drbd2 { # second server's hostname
    device /dev/drbd0;
    disk /dev/sda7;
    address 172.16.0.131:7788;
    meta-disk internal;
  }

  disk {
    on-io-error detach;
  }

  net {
    max-buffers 2048;
    ko-count 4;
  }

  syncer {
    rate 10M;
    al-extents 257;
  }

  startup {
    wfc-timeout 0;
    degr-wfc-timeout 120; # 2 minutes
  }
}

Note that we are using drbd1 and drbd2 as hostnames. These hostnames must be configured, and the servers should be able to ping each other via those hostnames (which means we either need a DNS server or must configure both hostnames in /etc/hosts on both servers).
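If you go the /etc/hosts route, the entries (using the addresses from the configuration above) are simply, on both servers:

172.16.0.130    drbd1
172.16.0.131    drbd2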

After creating the configuration in /etc/drbd.conf, we can now create the DRBD resource. For this, we issue the following on both servers:

sudo drbdadm create-md testing

After issuing this, we will be asked for confirmation to create the meta-data in the block device.

Now we have to power off both servers. After powering them off, we start our first server; during boot, DRBD will wait for its peer and show a prompt asking us to type ‘yes’.

After confirming with ‘yes’, we can now start the second server. Once the second server is running, both nodes' resources are secondary, so we have to make one of them primary. For this, we issue the following on the server we would like to have the resource as primary:

sudo drbdadm -- --overwrite-data-of-peer primary all

We verify this by issuing:

cat /proc/drbd

This should show the local node's resource as Primary and the initial sync to the peer in progress.

Well, up to this point I've shown you how to install and configure DRBD on Hardy, and how to make one of the servers have its resource as primary. But we still don't have automatic failover or automatic mounting. In a next post I'll show how to configure Heartbeat to provide automatic failover and take control of the resources, as well as how to configure STONITH to use the meatware device, so that we won't have a split-brain condition (or at least try). I'll also show how to configure NFS and MySQL to use this DRBD resource.

BTW, if you have questions you know where to find me :).