Clone the GlusterFS repo containing the necessary Kubernetes specs:
git clone https://github.com/nds-org/gluster.git
cd gluster/
Create the gluster-server DaemonSet using kubectl:
kubectl create -f kubernetes/gluster-server-ds.yaml
This spec runs the ndslabs/gluster container in "server mode" on Kubernetes nodes labeled with ndslabs-role=storage.
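If your storage nodes do not yet carry this label, it can be applied with kubectl (the node name below is a placeholder for one of your nodes):

kubectl label nodes <node-name> ndslabs-role=storage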
Once all of the server containers are up, we must tell them to cooperate with each other using the gluster CLI.
The steps below only need to be done from inside a single glusterfs-server container.
docker run --name=gfs --net=host --pid=host --privileged -v /dev:/dev -v <ABSOLUTE_PATH_TO_SHARED_DATA>:/var/glfs -v /run:/run -v /:/media/host -it -d gluster:local
Using kubectl, exec into one of the GlusterFS servers:
core@willis8-k8-test-1 ~ $ kubectl get pods -o wide
NAME                         READY     STATUS    RESTARTS   AGE       NODE
coffee-rc-4u3pb              1/1       Running   0          12d       192.168.100.65
coffee-rc-5m4t6              1/1       Running   0          12d       192.168.100.65
default-http-backend-y98iw   1/1       Running   0          22h       192.168.100.64
glusterfs-server-hh5rm       1/1       Running   0          5d        192.168.100.156
glusterfs-server-zoefs       1/1       Running   0          5d        192.168.100.89
ndslabs-apiserver-zqgj8      1/1       Running   0          1d        192.168.100.66
ndslabs-gui-p0hjh            1/1       Running   0          23h       192.168.100.66
nginx-ilb-rc-x853y           1/1       Running   0          6d        192.168.100.64
tea-rc-8saiu                 1/1       Running   0          12d       192.168.100.65
tea-rc-t403k                 1/1       Running   0          12d       192.168.100.65

core@willis8-k8-test-1 ~ $ kubectl exec -it glusterfs-server-zoefs bash
Take note of all node IPs that are running glusterfs-server pods. You will need these IPs to finish configuring GlusterFS.
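If the pod list is long, you can narrow it down to just the server pods (a convenience, not a required step):

kubectl get pods -o wide | grep glusterfs-server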
Once inside of the gluster server container, perform a peer probe on all other gluster nodes.
Do not probe the host's own IP.
For example, since we are executing from 192.168.100.89, we must probe our other storage node:
root@willis-k8-test-gluster:/# gluster peer probe 192.168.100.156
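To confirm that the probe succeeded, you can list the peers from the same container:

root@willis-k8-test-gluster:/# gluster peer status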
Ansible has already created the placeholder directories for bricks; we just need to create and start a Gluster volume pointing to the brick directories on each node.
This is done using gluster volume create, as outlined below:
root@willis-k8-test-gluster:/# gluster volume create ndslabs transport tcp 192.168.100.89:/var/glfs/brick0 192.168.100.156:/var/glfs/ndslabs/brick0
NOTE: Our Ansible playbook mounts GlusterFS bricks at /media/brick0. We will need to update this in the future to be consistent throughout.
To be sure the volume was created successfully, you can run the following commands and see your new volume:
root@willis-k8-test-gluster:/# gluster volume list
ndslabs
root@willis-k8-test-gluster:/# gluster volume status
Volume ndslabs is not started
If volume creation fails because a brick directory was already part of a previous volume, simply add force to the end of your volume create command to make GlusterFS reuse a brick from a volume that is no longer accessible:
root@willis-k8-test-gluster:/# gluster volume create ndslabs transport tcp 192.168.100.89:/media/brick0/brick/ndslabs 192.168.100.156:/media/brick0/brick/ndslabs
volume create: ndslabs: failed: /media/brick0/brick/ndslabs is already part of a volume
root@willis-k8-test-gluster:/# gluster volume create ndslabs transport tcp 192.168.100.89:/media/brick0/brick/ndslabs 192.168.100.156:/media/brick0/brick/ndslabs force
volume create: ndslabs: success: please start the volume to access data
The alternative solution would be to delete and recreate the mount point:
root@willis-k8-test-gluster:/# rm -rf /path/to/brick0
root@willis-k8-test-gluster:/# mkdir -p /path/to/brick0
Now that we have created our volume, we must start it in order for clients to mount it:
root@willis-k8-test-gluster:/# gluster volume start ndslabs
volume start: ndslabs: success
Our volume is now being served out to the cluster over NFS, and we are ready for our clients to mount the volume.
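If you want to sanity-check the export before the client containers come up, the volume can be mounted by hand from any node with an NFS client. This is a minimal sketch assuming Gluster's built-in NFS server is enabled for the volume and /mnt/ndslabs is a scratch mount point:

mkdir -p /mnt/ndslabs
mount -t nfs -o vers=3,mountproto=tcp 192.168.100.89:/ndslabs /mnt/ndslabs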
Suppose we have a simple replicated Gluster volume with 2 bricks and we are running low on space; we want to expand the storage it contains:
# On the host node, via SSH
core@workshop1-node1 ~ $ df
Filesystem              1K-blocks     Used  Available Use% Mounted on
devtmpfs                 16460056        0   16460056   0% /dev
tmpfs                    16476132        0   16476132   0% /dev/shm
tmpfs                    16476132     1872   16474260   1% /run
tmpfs                    16476132        0   16476132   0% /sys/fs/cgroup
/dev/vda9                38216204   256716   36301140   1% /
/dev/mapper/usr           1007760   639352     316392  67% /usr
tmpfs                    16476132    17140   16458992   1% /tmp
tmpfs                    16476132        0   16476132   0% /media
/dev/vda1                  130798    39292      91506  31% /boot
/dev/vda6                  110576       64     101340   1% /usr/share/oem
/dev/vdb                 41922560  6023596   35898964  15% /var/lib/docker
/dev/vdc                 10475520   626268    9849252   6% /media/storage
/dev/vdd                104806400 49157880   55648520  47% /media/brick0
192.168.100.122:global  104806400 87618944   17187456  84% /var/glfs/global
tmpfs                     3295224        0    3295224   0% /run/user/500
/dev/vde                209612800    32928  209579872   1% /media/brick1

# Inside of the GLFS server pod
root@workshop1-node1:/# gluster volume info global

Volume Name: global
Type: Replicate
Volume ID: ca59a98e-c959-454e-8ac3-9082b0ed2856
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.100.122:/media/brick0/brick
Brick2: 192.168.100.116:/media/brick0/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
Provision and attach a new OpenStack volume to your existing instance, then format it with XFS:
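If you prefer the command line over the Horizon dashboard, the OpenStack CLI can handle the provisioning; the volume name, size, and instance name below are placeholders:

openstack volume create --size 200 brick1-volume
openstack server add volume workshop1-node1 brick1-volume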
core@workshop1-node1 ~ $ sudo mkfs -t xfs /dev/vde
meta-data=/dev/vde               isize=256    agcount=4, agsize=13107200 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=52428800, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=25600, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
You will then need to create a *.mount file as shown below:
$ vi media-brick1.mount

[Unit]
Description=Mount OS_DEVICE on MOUNT_PATH
After=local-fs.target

[Mount]
What=OS_DEVICE
Where=MOUNT_PATH
Type=FS_TYPE
Options=noatime

[Install]
WantedBy=multi-user.target
where:
OS_DEVICE is the attached block device (e.g. /dev/vde)
MOUNT_PATH is the desired mount point (e.g. /media/brick1)
FS_TYPE is the filesystem type (e.g. xfs)
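For example, for the /dev/vde device formatted above and mounted at /media/brick1, the unit would read:

[Unit]
Description=Mount /dev/vde on /media/brick1
After=local-fs.target

[Mount]
What=/dev/vde
Where=/media/brick1
Type=xfs
Options=noatime

[Install]
WantedBy=multi-user.target

Note that systemd requires the unit file name to match the mount path, so /media/brick1 must be described by a unit named media-brick1.mount.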
Place this file in /etc/systemd/system/
Finally, start and enable your service to mount the volume to CoreOS and ensure it is remounted on restart:
sudo mv media-brick1.mount /etc/systemd/system/media-brick1.mount
sudo systemctl daemon-reload
sudo systemctl start media-brick1.mount
sudo systemctl enable media-brick1.mount
sudo systemctl unmask media-brick1.mount
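To confirm that the unit is active and the new volume is mounted, you can check:

systemctl status media-brick1.mount
df -h /media/brick1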
You will need to perform the above steps on each of your GLFS servers before continuing.
Now you'll need to exec into one of the GLFS server pods, for example as shown below, and perform the gluster commands that follow:
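The pod name below is a placeholder; find yours with kubectl get pods:

kubectl exec -it <glusterfs-server-pod> bash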
# Peer probe the other IP in the cluster (gluster service IP also seems to work)
$ gluster peer probe 10.254.202.236
peer probe: success. Host 192.168.100.1 port 24007 already in peer list

# This one fails because we did not include our new brick's second replica
$ gluster volume add-brick global 192.168.100.122:/media/brick1
volume add-brick: failed: Incorrect number of bricks supplied 1 with count 2

# This one fails because we need a sub-directory of the mount point
$ gluster volume add-brick global 192.168.100.122:/media/brick1 192.168.100.116:/media/brick1
volume add-brick: failed: The brick 192.168.100.116:/media/brick1 is a mount point. Please create a sub-directory under the mount point and use that as the brick directory. Or use 'force' at the end of the command if you want to override this behavior.

# This one works! :D
$ gluster volume add-brick global 192.168.100.122:/media/brick1/brick 192.168.100.116:/media/brick1/brick
volume add-brick: success
And now we can see that our new bricks have been added to the existing volume:
core@workshop1-node1 ~ $ df
Filesystem              1K-blocks     Used  Available Use% Mounted on
devtmpfs                 16460056        0   16460056   0% /dev
tmpfs                    16476132        0   16476132   0% /dev/shm
tmpfs                    16476132     1792   16474340   1% /run
tmpfs                    16476132        0   16476132   0% /sys/fs/cgroup
/dev/vda9                38216204   256736   36301120   1% /
/dev/mapper/usr           1007760   639352     316392  67% /usr
tmpfs                    16476132    17140   16458992   1% /tmp
tmpfs                    16476132        0   16476132   0% /media
/dev/vda1                  130798    39292      91506  31% /boot
/dev/vda6                  110576       64     101340   1% /usr/share/oem
/dev/vdb                 41922560  6023732   35898828  15% /var/lib/docker
/dev/vdc                 10475520   626360    9849160   6% /media/storage
/dev/vdd                104806400 49157820   55648580  47% /media/brick0
192.168.100.122:global  314419200 49191424  265227776  16% /var/glfs/global
tmpfs                     3295224        0    3295224   0% /run/user/500
/dev/vde                209612800    33088  209579712   1% /media/brick1

root@workshop1-node1:/# gluster volume info global

Volume Name: global
Type: Distributed-Replicate
Volume ID: ca59a98e-c959-454e-8ac3-9082b0ed2856
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.100.122:/media/brick0/brick
Brick2: 192.168.100.116:/media/brick0/brick
Brick3: 192.168.100.122:/media/brick1/brick
Brick4: 192.168.100.116:/media/brick1/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
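Optionally, since the volume is now Distributed-Replicate, you can start a rebalance so that existing data is spread across the new brick pair as well:

gluster volume rebalance global start
gluster volume rebalance global status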
Create the gluster-client DaemonSet using kubectl:
kubectl create -f kubernetes/gluster-client-ds.yaml
This spec runs the ndslabs/gluster container in "client mode" on Kubernetes nodes labeled with ndslabs-role=compute.
Once each client container starts, it will mount the GlusterFS volume on its compute host using NFS.
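You can spot-check the mount on any compute host before running the full test below, assuming the client mounts under /var/glfs as in the examples that follow:

mount | grep glfs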
Once the clients are online, we can run a simple test of GlusterFS to ensure that it is correctly serving and synchronizing the volume.
From the Kubernetes master, run the following command to see which nodes are running the glusterfs-client containers:
core@willis8-k8-test-1 ~ $ kubectl get pods -o wide
NAME                         READY     STATUS    RESTARTS   AGE       NODE
coffee-rc-4u3pb              1/1       Running   0          12d       192.168.100.65
coffee-rc-5m4t6              1/1       Running   0          12d       192.168.100.65
default-http-backend-y98iw   1/1       Running   0          23h       192.168.100.64
glusterfs-client-4hm9y       1/1       Running   0          5d        192.168.100.65
glusterfs-client-6c12y       1/1       Running   0          5d        192.168.100.66
glusterfs-server-hh5rm       1/1       Running   0          5d        192.168.100.156
glusterfs-server-zoefs       1/1       Running   0          5d        192.168.100.89
ndslabs-apiserver-zqgj8      1/1       Running   0          1d        192.168.100.66
ndslabs-gui-p0hjh            1/1       Running   0          23h       192.168.100.66
nginx-ilb-rc-x853y           1/1       Running   0          6d        192.168.100.64
tea-rc-8saiu                 1/1       Running   0          12d       192.168.100.65
tea-rc-t403k                 1/1       Running   0          12d       192.168.100.65
Create two SSH sessions - one into each compute node (in this case, 192.168.100.65 and 192.168.100.66).
In one SSH session, run a BusyBox image mounted with our shared volume:
docker run -v /var/glfs:/var/glfs --rm -it busybox
Inside of the BusyBox container, create a test file:
echo "t= esting!" > /var/glfs/ndslabs/test.file
On the other machine, map the same directory into BusyBox and verify that you can see the changes made from the first host:
docker run -v /var/glfs:/var/glfs --rm -it busybox
Running an ls on /var/glfs/ndslabs/ should show the test file created on the other node:
ls -al /var/glfs/ndslabs
This proves that we can mount via NFS onto each node, map the NFS mount into containers, and allow those containers to ingest or modify the data from the NFS mount.