Getting Started
Clone the GlusterFS repo containing the necessary Kubernetes specs:
```
git clone https://github.com/nds-org/gluster.git
cd gluster/
```
Server Setup
Create the gluster-server DaemonSet using kubectl:
...
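The spec filename is elided above; assuming the DaemonSet spec in the cloned repo is named gluster-server-ds.yaml (a hypothetical name), the command would look something like:

```
kubectl create -f gluster-server-ds.yaml
```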
The steps below only need to be done from inside a single glusterfs-server container.
Alternative: Raw Docker
```
docker run --name=gfs --net=host --pid=host --privileged \
    -v /dev:/dev \
    -v <ABSOLUTE_PATH_TO_SHARED_DATA>:/var/glfs \
    -v /run:/run \
    -v /:/media/host \
    -it -d gluster:local
```
Getting into a Server Container
Using kubectl, exec into one of the GlusterFS servers:
...
Take note of all node IPs that are running glusterfs-server pods. You will need these IPs to finish configuring GlusterFS.
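For example (the pod name and label below are assumptions; substitute the values from your own cluster):

```
# List the glusterfs-server pods along with the node IPs they are running on
kubectl get pods -o wide | grep glusterfs-server

# Exec into one of the listed pods
kubectl exec -it <GLUSTERFS_SERVER_POD_NAME> -- /bin/bash
```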
Peer Probe
Once inside the gluster server container, perform a peer probe on all of the other gluster nodes.
...
```
root@willis-k8-test-gluster:/# gluster peer probe 192.168.100.156
```
Create Volume
Ansible has already created the placeholder directories for the bricks; we just need to create and start a Gluster volume pointing to the brick directories on each node.
...
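The create command itself is elided above; a sketch of a two-node replica-2 creation for the ndslabs volume, using hypothetical node IPs and brick paths that you should replace with your own, might look like:

```
root@willis-k8-test-gluster:/# gluster volume create ndslabs replica 2 \
    192.168.100.156:/path/to/brick0 \
    192.168.100.157:/path/to/brick0
```

You can then confirm that the volume exists but has not yet been started: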
```
root@willis-k8-test-gluster:/# gluster volume list
ndslabs
root@willis-k8-test-gluster:/# gluster volume status
Volume ndslabs is not started
```
Reusing a Volume
Simply add force to the end of your volume create command to make GlusterFS reuse a volume that is no longer accessible:
...
```
root@willis-k8-test-gluster:/# rm -rf /path/to/brick0
root@willis-k8-test-gluster:/# mkdir -p /path/to/brick0
```
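A sketch of the same hypothetical create command with force appended:

```
root@willis-k8-test-gluster:/# gluster volume create ndslabs replica 2 \
    192.168.100.156:/path/to/brick0 \
    192.168.100.157:/path/to/brick0 force
```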
Start Volume
Now that we have created our volume, we must start it in order for clients to mount it:
...
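The elided command here is the standard start invocation for the volume we created (typical output shown):

```
root@willis-k8-test-gluster:/# gluster volume start ndslabs
volume start: ndslabs: success
```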
Our volume is now being served out to the cluster over NFS, and we are ready for our clients to mount the volume.
Adding a Brick
Suppose we have a simple replicated gluster volume with 2 bricks and we are running low on space; we want to expand the storage it contains:
```
# On the host node, via SSH
core@workshop1-node1 ~ $ df
Filesystem 1K-blocks Used Available Use% Mounted on
devtmpfs 16460056 0 16460056 0% /dev
tmpfs 16476132 0 16476132 0% /dev/shm
tmpfs 16476132 1872 16474260 1% /run
tmpfs 16476132 0 16476132 0% /sys/fs/cgroup
/dev/vda9 38216204 256716 36301140 1% /
/dev/mapper/usr 1007760 639352 316392 67% /usr
tmpfs 16476132 17140 16458992 1% /tmp
tmpfs 16476132 0 16476132 0% /media
/dev/vda1 130798 39292 91506 31% /boot
/dev/vda6 110576 64 101340 1% /usr/share/oem
/dev/vdb 41922560 6023596 35898964 15% /var/lib/docker
/dev/vdc 10475520 626268 9849252 6% /media/storage
/dev/vdd 104806400 49157880 55648520 47% /media/brick0
192.168.100.122:global 104806400 87618944 17187456 84% /var/glfs/global
tmpfs 3295224 0 3295224 0% /run/user/500
/dev/vde 209612800 32928 209579872 1% /media/brick1
# Inside of the GLFS server pod
root@workshop1-node1:/# gluster volume info global
Volume Name: global
Type: Replicate
Volume ID: ca59a98e-c959-454e-8ac3-9082b0ed2856
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.100.122:/media/brick0/brick
Brick2: 192.168.100.116:/media/brick0/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
```
Provision and attach a new OpenStack volume to your existing instance, then format it with XFS:
```
core@workshop1-node1 ~ $ sudo mkfs -t xfs /dev/vde
meta-data=/dev/vde isize=256 agcount=4, agsize=13107200 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=52428800, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=25600, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
```
You will then need to create a *.mount unit file as below:
```
$ vi media-brick1.mount
[Unit]
Description=Mount OS_DEVICE on MOUNT_PATH
After=local-fs.target
[Mount]
What=OS_DEVICE
Where=MOUNT_PATH
Type=FS_TYPE
Options=noatime
[Install]
WantedBy=multi-user.target
```
where:
- OS_DEVICE is the source device in /dev where your raw volume is attached (e.g. /dev/vde)
- MOUNT_PATH is the target path where the volume should be mounted (e.g. /media/brick1)
- FS_TYPE is the filesystem type the new volume was formatted with (e.g. xfs)
Place this file in /etc/systemd/system/
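Note that systemd requires a mount unit's file name to match its escaped Where= path; you can generate the correct name with systemd-escape:

```
$ systemd-escape -p --suffix=mount /media/brick1
media-brick1.mount
```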
Finally, start and enable the mount unit so that CoreOS mounts the volume now and remounts it on restart:
```
sudo mv media-brick1.mount /etc/systemd/system/media-brick1.mount
sudo systemctl daemon-reload
sudo systemctl start media-brick1.mount
sudo systemctl enable media-brick1.mount
sudo systemctl unmask media-brick1.mount
```
You will need to perform the above steps on each of your GLFS servers before continuing.
Now you'll need to exec into one of the GLFS server pods and perform the following:
```
# Peer probe the other IP in the cluster (gluster service IP also seems to work)
$ gluster peer probe 10.254.202.236
peer probe: success. Host 192.168.100.1 port 24007 already in peer list
# This one fails because we did not include our new brick's second replica
$ gluster volume add-brick global 192.168.100.122:/media/brick1
volume add-brick: failed: Incorrect number of bricks supplied 1 with count 2
# This one fails because we need a sub-directory of the mount point
$ gluster volume add-brick global 192.168.100.122:/media/brick1 192.168.100.116:/media/brick1
volume add-brick: failed: The brick 192.168.100.116:/media/brick1 is a mount point. Please create a sub-directory under the mount point and use that as the brick directory. Or use 'force' at the end of the command if you want to override this behavior.
# This one works! :D
$ gluster volume add-brick global 192.168.100.122:/media/brick1/brick 192.168.100.116:/media/brick1/brick
volume add-brick: success
```
And now we can see that our new brick has been added to the existing volume:
```
core@workshop1-node1 ~ $ df
Filesystem 1K-blocks Used Available Use% Mounted on
devtmpfs 16460056 0 16460056 0% /dev
tmpfs 16476132 0 16476132 0% /dev/shm
tmpfs 16476132 1792 16474340 1% /run
tmpfs 16476132 0 16476132 0% /sys/fs/cgroup
/dev/vda9 38216204 256736 36301120 1% /
/dev/mapper/usr 1007760 639352 316392 67% /usr
tmpfs 16476132 17140 16458992 1% /tmp
tmpfs 16476132 0 16476132 0% /media
/dev/vda1 130798 39292 91506 31% /boot
/dev/vda6 110576 64 101340 1% /usr/share/oem
/dev/vdb 41922560 6023732 35898828 15% /var/lib/docker
/dev/vdc 10475520 626360 9849160 6% /media/storage
/dev/vdd 104806400 49157820 55648580 47% /media/brick0
192.168.100.122:global 314419200 49191424 265227776 16% /var/glfs/global
tmpfs 3295224 0 3295224 0% /run/user/500
/dev/vde 209612800 33088 209579712 1% /media/brick1
root@workshop1-node1:/# gluster volume info global
Volume Name: global
Type: Distributed-Replicate
Volume ID: ca59a98e-c959-454e-8ac3-9082b0ed2856
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.100.122:/media/brick0/brick
Brick2: 192.168.100.116:/media/brick0/brick
Brick3: 192.168.100.122:/media/brick1/brick
Brick4: 192.168.100.116:/media/brick1/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
```
Client Setup
Create the gluster-client DaemonSet using kubectl:
...
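As with the server, the spec filename is elided; assuming it is named gluster-client-ds.yaml (a hypothetical name), the command would look something like:

```
kubectl create -f gluster-client-ds.yaml
```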
Once each client container starts, it will mount the GlusterFS volume on its compute host using NFS.
Testing
Once the clients are online, we can run a simple test of GlusterFS to ensure that it is correctly serving and synchronizing the volume.
...
Create two SSH sessions - one into each compute node (in this case, 192.168.100.65 and 192.168.100.66).
First Session
In one SSH session, run a BusyBox image with our shared volume mounted:
...
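The exact run command is elided above; one plausible invocation, assuming the clients mount the GlusterFS volume at /var/glfs on each host, is:

```
# Hypothetical: bind-mount the shared volume into a BusyBox shell
docker run -it -v /var/glfs/ndslabs:/var/glfs/ndslabs busybox /bin/sh
```

Then, inside the container, write a test file to the shared volume: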
```
echo "testing!" > /var/glfs/ndslabs/test.file
```
Second Session
On the other machine, map the same directory into BusyBox and verify that we can see the changes made on the first host:
...
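Under the same assumptions as above, the check on the second host might look like:

```
# Hypothetical: read the test file back through the same bind-mount
$ docker run -it -v /var/glfs/ndslabs:/var/glfs/ndslabs busybox cat /var/glfs/ndslabs/test.file
testing!
```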