Introduction
DRBD is a system that allows you to create software RAID1 over a local network.
This enables high availability and resource sharing on a cluster without a disk array.
Here we will install DRBD8, with the goal of implementing a cluster filesystem (see the documentation on OCFS2), which DRBD7 does not support.
We’ll use the DRBD8 packages from Debian repositories. We’ll work on a 2-node cluster.
Installation
First, install the following package:
aptitude install drbd8-utils
Then we’ll load the module and make it persistent (for future reboots):
modprobe drbd
echo "drbd" >> /etc/modules
Configuration
drbd.conf
The default drbd.conf is fine as it is: it only includes other files, which keeps the configuration extensible:
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
I didn’t modify it.
global_common.conf
This is the default file; it can contain host configurations, but it also lets you define settings shared by all your DRBD resources (the common section):
# Global configuration
global {
    # Do not report usage statistics to LinBit
    usage-count no;
}
# All resources inherit the options set in this section
common {
    # Protocol C (synchronous replication)
    protocol C;
    startup {
        # Wait for connection timeout (in seconds)
        wfc-timeout 1;
        # Wait for connection timeout if this node was part of a degraded cluster (in seconds)
        degr-wfc-timeout 1;
    }
    net {
        # Maximum number of requests to be allocated by DRBD
        max-buffers 8192;
        # The highest number of data blocks between two write barriers
        max-epoch-size 8192;
        # The size of the TCP socket send buffer
        sndbuf-size 512k;
        # How often the I/O subsystem's controller is forced to process pending I/O requests
        unplug-watermark 8192;
        # The HMAC algorithm used for peer authentication
        cram-hmac-alg sha1;
        # The shared secret used in peer authentication
        shared-secret "xxx";
        # Split-brain policies
        # Split brain detected, resource is not in the Primary role on any host
        after-sb-0pri disconnect;
        # Split brain detected, resource is in the Primary role on one host
        after-sb-1pri disconnect;
        # Split brain detected, resource is in the Primary role on both hosts
        after-sb-2pri disconnect;
        # Helps to solve cases where the outcome of the resync decision is incompatible with the current role assignment
        rr-conflict disconnect;
    }
    handlers {
        # Run if the node is primary, degraded and the local copy of the data is inconsistent
        pri-on-incon-degr "echo Current node is primary, degraded and the local copy of the data is inconsistent | wall ";
    }
    disk {
        # Downgrade the disk status to inconsistent on I/O errors
        on-io-error pass_on;
        # Disable write barriers (protection against power failure is handled by the hardware)
        no-disk-barrier;
        # Disable disk flushes on the backing device
        no-disk-flushes;
        # Do not let write requests drain before write requests of a new reordering domain are issued
        no-disk-drain;
        # Disable the use of disk flushes and barrier BIOs when accessing the meta-data device
        no-md-flushes;
    }
    syncer {
        # The maximum bandwidth a resource uses for background re-synchronization
        rate 500M;
        # Control how big the hot area (= active set) can get
        al-extents 3833;
    }
}
I’ve commented all my changes.
Now we’ll create a file (for example /etc/drbd.d/r0.res, so that it matches the drbd.d/*.res include above) to define our resource r0:
resource r0 {
    # Node 1
    on srv1 {
        device /dev/drbd0;
        # Disk containing the DRBD partition
        disk /dev/mapper/datas-drbd;
        # IP address of this host
        address 192.168.100.1:7788;
        # Store metadata on the same device
        meta-disk internal;
    }
    # Node 2
    on srv2 {
        device /dev/drbd0;
        disk /dev/mapper/lvm-drbd;
        address 192.168.20.4:7788;
        meta-disk internal;
    }
}
Synchronization
We need to launch the first sync now.
On both nodes, start by creating the DRBD metadata for the resource (shown here for the resource r0 defined above):
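# initialize the DRBD metadata on the backing device
drbdadm create-md r0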
Still on both nodes, run this command to activate (attach and connect) the resource:
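# attach the backing disk and connect to the peer
drbdadm up r0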
Then, on the first node only, we’ll start the initial block-by-block replication:
drbdadm -- --overwrite-data-of-peer primary r0
Then we’ll have to wait for the sync to finish before continuing:
> cat /proc/drbd
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
ns:912248 nr:0 dw:0 dr:920640 al:0 bm:55 lo:1 pe:388 ua:2048 ap:0 ep:1 wo:b oos:3283604
[===>................] sync'ed: 21.9% (3283604/4194304)K
finish: 1:08:24 speed: 580 (452) K/sec
Reading /proc/drbd lets you follow the replication status. At the end, you should have something like this:
> cat /proc/drbd
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
ns:0 nr:4194304 dw:4194304 dr:0 al:0 bm:256 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
If you want to run in dual-primary (dual master) mode, these options must be set in the configuration:
resource <resource> {
    startup {
        become-primary-on both;
    }
    net {
        protocol C;
        allow-two-primaries yes;
    }
}
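If you add these options to an already running cluster, you can apply them without restarting DRBD, for example by running this on both nodes:

# tell DRBD to pick up the changed configuration
drbdadm adjust all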
Now we can activate the other node as primary as well; on the second node (assuming resource r0):
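# promote this node to primary (allowed on both nodes once allow-two-primaries is set)
drbdadm primary r0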
Once the synchronization is complete, DRBD is installed and properly configured.
You now need to format the device /dev/drbd0 with a filesystem: ext3, for example, for active/passive, or a cluster filesystem such as OCFS2 (there are others, like GFS2) if you want active/active.
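For example, depending on the mode you chose:

# active/passive
mkfs.ext3 /dev/drbd0
# active/active (requires a working OCFS2 cluster configuration, see the OCFS2 documentation)
mkfs.ocfs2 /dev/drbd0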
Then mount the volume in a folder to access the data:
mount /dev/drbd0 /mnt/data
Only a primary node can mount and access the data on the DRBD volume.
When DRBD works with HeartBeat in CRM mode, if the primary node goes down, the cluster is able to switch the secondary node to primary.
When the old primary is “UP” again, it will synchronize and become a secondary in turn.
Become master
To set all volumes as primary:
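# set every DRBD resource on this node to primary
drbdadm primary all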
Note: Replace all with the name of your volume if you only want to operate on one.
Become slave
To set a volume as slave:
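# replace r0 with the name of your resource (or use "all" for every resource)
drbdadm secondary r0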
Manual synchronization
To start a manual synchronization (will invalidate all your data):
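# mark the local data as outdated and force a full resync from the peer
drbdadm invalidate all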
To do the same from the other side (invalidate the peer's data instead):
drbdadm invalidate_remote all
My sync doesn’t work, I have: Secondary/Unknown
If you have this type of message:
> cat /proc/drbd
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
0: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:4194304
You need to check that the machines are properly configured for your resources and that they can reach each other on the DRBD port (firewall rules, etc.), for example with telnet.
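A quick way to test this, using the hostname and port from the resource configuration above:

telnet srv2 7788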
What to do in case of split brain?
If you find yourself in this situation:
> cat /proc/drbd
primary/unknown
Proceed as follows:
- Unmount the drbd volumes
- On the primary:
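# on the node whose data you want to keep, simply reconnect the resource
drbdadm connect all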
- On the secondary (this will destroy its local data and resynchronize everything from the primary):
drbdadm -- --discard-my-data connect all