Clone the GlusterFS repo containing the necessary Kubernetes specs:
git clone https://github.com/nds-org/gluster.git
cd gluster/
Create the gluster-server DaemonSet using kubectl:
kubectl create -f kubernetes/gluster-server-ds.yaml
This spec runs the ndslabs/gluster container in "server mode" on Kubernetes nodes labeled with ndslabs-role=storage.
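If your storage nodes do not yet carry this label, it can be applied with kubectl (the node name below is a placeholder for one of your nodes):

kubectl label nodes <node-name> ndslabs-role=storage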
Once all of the server containers are up, we must tell them to cooperate with each other using the gluster CLI.
The steps below only need to be done from inside a single glusterfs-server container.
docker run --name=gfs --net=host --pid=host --privileged -v /dev:/dev -v <ABSOLUTE_PATH_TO_SHARED_DATA>:/var/glfs -v /run:/run -v /:/media/host -it -d gluster:local
Using kubectl, exec into one of the GlusterFS servers:
core@willis8-k8-test-1 ~ $ kubectl get pods -o wide
NAME                         READY     STATUS    RESTARTS   AGE       NODE
coffee-rc-4u3pb              1/1       Running   0          12d       192.168.100.65
coffee-rc-5m4t6              1/1       Running   0          12d       192.168.100.65
default-http-backend-y98iw   1/1       Running   0          22h       192.168.100.64
glusterfs-server-hh5rm       1/1       Running   0          5d        192.168.100.156
glusterfs-server-zoefs       1/1       Running   0          5d        192.168.100.89
ndslabs-apiserver-zqgj8      1/1       Running   0          1d        192.168.100.66
ndslabs-gui-p0hjh            1/1       Running   0          23h       192.168.100.66
nginx-ilb-rc-x853y           1/1       Running   0          6d        192.168.100.64
tea-rc-8saiu                 1/1       Running   0          12d       192.168.100.65
tea-rc-t403k                 1/1       Running   0          12d       192.168.100.65

core@willis8-k8-test-1 ~ $ kubectl exec -it glusterfs-server-zoefs bash
Take note of all node IPs that are running glusterfs-server pods. You will need these IPs to finish configuring GlusterFS.
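If the pod list is long, you can narrow it down to just the server pods (a convenience, not a required step):

kubectl get pods -o wide | grep glusterfs-server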
Once inside of the gluster server container, perform a peer probe on all other gluster nodes.
Do not probe the host's own IP.
For example, since we are executing from 192.168.100.89, we must probe our other storage node:
root@willis-k8-test-gluster:/# gluster peer probe 192.168.100.156
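To confirm that the probe succeeded, you can list the peers from the same container:

root@willis-k8-test-gluster:/# gluster peer status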
Ansible has already created the placeholder directories for bricks; we just need to create and start a Gluster volume pointing to the brick directories on each node.
This is done using gluster volume create, as outlined below:
root@willis-k8-test-gluster:/# gluster volume create ndslabs transport tcp 192.168.100.89:/var/glfs/brick0 192.168.100.156:/var/glfs/ndslabs/brick0
NOTE: Our Ansible playbook mounts GlusterFS bricks at /media/brick0. We will need to update this in the future to be consistent throughout.
To be sure the volume was created successfully, you can run the following commands and see your new volume:
root@willis-k8-test-gluster:/# gluster volume list
ndslabs
root@willis-k8-test-gluster:/# gluster volume status
Volume ndslabs is not started
If volume creation fails because a brick directory was already part of a previous volume, simply add force to the end of your volume create command to make GlusterFS reuse a brick from a volume that is no longer accessible:
root@willis-k8-test-gluster:/# gluster volume create ndslabs transport tcp 192.168.100.89:/media/brick0/brick/ndslabs 192.168.100.156:/media/brick0/brick/ndslabs
volume create: ndslabs: failed: /media/brick0/brick/ndslabs is already part of a volume
root@willis-k8-test-gluster:/# gluster volume create ndslabs transport tcp 192.168.100.89:/media/brick0/brick/ndslabs 192.168.100.156:/media/brick0/brick/ndslabs force
volume create: ndslabs: success: please start the volume to access data
The alternative solution would be to delete and recreate the mount point:
root@willis-k8-test-gluster:/# rm -rf /path/to/brick0
root@willis-k8-test-gluster:/# mkdir -p /path/to/brick0
Now that we have created our volume, we must start it in order for clients to mount it:
root@willis-k8-test-gluster:/# gluster volume start ndslabs
volume start: ndslabs: success
Our volume is now being served out to the cluster over NFS, and we are ready for our clients to mount the volume.
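If you want to sanity-check the export before the client containers come up, the volume can be mounted by hand from any node with an NFS client. This is a minimal sketch assuming Gluster's built-in NFS server is enabled for the volume and /mnt/ndslabs is a scratch mount point:

mkdir -p /mnt/ndslabs
mount -t nfs -o vers=3,mountproto=tcp 192.168.100.89:/ndslabs /mnt/ndslabs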
Suppose we have a simple replicated Gluster volume with 2 bricks and we are running low on space; we want to expand the storage it contains:
# On the host node, via SSH
core@workshop1-node1 ~ $ df
Filesystem              1K-blocks     Used  Available Use% Mounted on
devtmpfs                 16460056        0   16460056   0% /dev
tmpfs                    16476132        0   16476132   0% /dev/shm
tmpfs                    16476132     1872   16474260   1% /run
tmpfs                    16476132        0   16476132   0% /sys/fs/cgroup
/dev/vda9                38216204   256716   36301140   1% /
/dev/mapper/usr           1007760   639352     316392  67% /usr
tmpfs                    16476132    17140   16458992   1% /tmp
tmpfs                    16476132        0   16476132   0% /media
/dev/vda1                  130798    39292      91506  31% /boot
/dev/vda6                  110576       64     101340   1% /usr/share/oem
/dev/vdb                 41922560  6023596   35898964  15% /var/lib/docker
/dev/vdc                 10475520   626268    9849252   6% /media/storage
/dev/vdd                104806400 49157880   55648520  47% /media/brick0
192.168.100.122:global  104806400 87618944   17187456  84% /var/glfs/global
tmpfs                     3295224        0    3295224   0% /run/user/500
/dev/vde                209612800    32928  209579872   1% /media/brick1

# Inside of the GLFS server pod
root@workshop1-node1:/# gluster volume info global

Volume Name: global
Type: Replicate
Volume ID: ca59a98e-c959-454e-8ac3-9082b0ed2856
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.100.122:/media/brick0/brick
Brick2: 192.168.100.116:/media/brick0/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
Provision and attach a new OpenStack volume to your existing instance, then format it with XFS:
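If you prefer the command line over the Horizon dashboard, the OpenStack CLI can handle the provisioning; the volume name, size, and instance name below are placeholders:

openstack volume create --size 200 brick1-volume
openstack server add volume workshop1-node1 brick1-volume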
core@workshop1-node1 ~ $ sudo mkfs -t xfs /dev/vde
meta-data=/dev/vde               isize=256    agcount=4, agsize=13107200 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=52428800, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=25600, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
You will then need to create a *.mount file as shown below:
$ vi media-brick1.mount

[Unit]
Description=Mount OS_DEVICE on MOUNT_PATH
After=local-fs.target

[Mount]
What=OS_DEVICE
Where=MOUNT_PATH
Type=FS_TYPE
Options=noatime

[Install]
WantedBy=multi-user.target
where:
OS_DEVICE is the attached block device (e.g. /dev/vde)
MOUNT_PATH is the desired mount point (e.g. /media/brick1)
FS_TYPE is the filesystem type (e.g. xfs)
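For example, for the /dev/vde device formatted above and mounted at /media/brick1, the unit would read:

[Unit]
Description=Mount /dev/vde on /media/brick1
After=local-fs.target

[Mount]
What=/dev/vde
Where=/media/brick1
Type=xfs
Options=noatime

[Install]
WantedBy=multi-user.target

Note that systemd requires the unit file name to match the mount path, so /media/brick1 must be described by a unit named media-brick1.mount.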
Place this file in /etc/systemd/system/
Finally, start and enable your service to mount the volume to CoreOS and ensure it is remounted on restart:
sudo mv media-brick1.mount /etc/systemd/system/media-brick1.mount
sudo systemctl daemon-reload
sudo systemctl start media-brick1.mount
sudo systemctl enable media-brick1.mount
sudo systemctl unmask media-brick1.mount
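To confirm that the unit is active and the new volume is mounted, you can check:

systemctl status media-brick1.mount
df -h /media/brick1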
You will need to perform the above steps on each of your GLFS servers before continuing.
Now you'll need to exec into one of the GLFS server pods, for example as shown below, and perform the gluster commands that follow:
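The pod name below is a placeholder; find yours with kubectl get pods:

kubectl exec -it <glusterfs-server-pod> bash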
# Peer probe the other IP in the cluster (gluster service IP also seems to work)
$ gluster peer probe 10.254.202.236
peer probe: success. Host 192.168.100.1 port 24007 already in peer list

# This one fails because we did not include our new brick's second replica
$ gluster volume add-brick global 192.168.100.122:/media/brick1
volume add-brick: failed: Incorrect number of bricks supplied 1 with count 2

# This one fails because we need a sub-directory of the mount point
$ gluster volume add-brick global 192.168.100.122:/media/brick1 192.168.100.116:/media/brick1
volume add-brick: failed: The brick 192.168.100.116:/media/brick1 is a mount point. Please create a sub-directory under the mount point and use that as the brick directory. Or use 'force' at the end of the command if you want to override this behavior.

# This one works! :D
$ gluster volume add-brick global 192.168.100.122:/media/brick1/brick 192.168.100.116:/media/brick1/brick
volume add-brick: success
And now we can see that our new bricks have been added to the existing volume:
core@workshop1-node1 ~ $ df
Filesystem              1K-blocks     Used  Available Use% Mounted on
devtmpfs                 16460056        0   16460056   0% /dev
tmpfs                    16476132        0   16476132   0% /dev/shm
tmpfs                    16476132     1792   16474340   1% /run
tmpfs                    16476132        0   16476132   0% /sys/fs/cgroup
/dev/vda9                38216204   256736   36301120   1% /
/dev/mapper/usr           1007760   639352     316392  67% /usr
tmpfs                    16476132    17140   16458992   1% /tmp
tmpfs                    16476132        0   16476132   0% /media
/dev/vda1                  130798    39292      91506  31% /boot
/dev/vda6                  110576       64     101340   1% /usr/share/oem
/dev/vdb                 41922560  6023732   35898828  15% /var/lib/docker
/dev/vdc                 10475520   626360    9849160   6% /media/storage
/dev/vdd                104806400 49157820   55648580  47% /media/brick0
192.168.100.122:global  314419200 49191424  265227776  16% /var/glfs/global
tmpfs                     3295224        0    3295224   0% /run/user/500
/dev/vde                209612800    33088  209579712   1% /media/brick1

root@workshop1-node1:/# gluster volume info global

Volume Name: global
Type: Distributed-Replicate
Volume ID: ca59a98e-c959-454e-8ac3-9082b0ed2856
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.100.122:/media/brick0/brick
Brick2: 192.168.100.116:/media/brick0/brick
Brick3: 192.168.100.122:/media/brick1/brick
Brick4: 192.168.100.116:/media/brick1/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
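Optionally, since the volume is now Distributed-Replicate, you can start a rebalance so that existing data is spread across the new brick pair as well:

gluster volume rebalance global start
gluster volume rebalance global status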
Create the gluster-client DaemonSet using kubectl:
kubectl create -f kubernetes/gluster-client-ds.yaml
This spec runs the ndslabs/gluster container in "client mode" on Kubernetes nodes labeled with ndslabs-role=compute.
Once each client container starts, it will mount the GlusterFS volume on its compute host using NFS.
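You can spot-check the mount on any compute host before running the full test below, assuming the client mounts under /var/glfs as in the examples that follow:

mount | grep glfs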
Once the clients are online, we can run a simple test of GlusterFS to ensure that it is correctly serving and synchronizing the volume.
From the Kubernetes master, run the following command to see which nodes are running the glusterfs-client containers:
core@willis8-k8-test-1 ~ $ kubectl get pods -o wide
NAME                         READY     STATUS    RESTARTS   AGE       NODE
coffee-rc-4u3pb              1/1       Running   0          12d       192.168.100.65
coffee-rc-5m4t6              1/1       Running   0          12d       192.168.100.65
default-http-backend-y98iw   1/1       Running   0          23h       192.168.100.64
glusterfs-client-4hm9y       1/1       Running   0          5d        192.168.100.65
glusterfs-client-6c12y       1/1       Running   0          5d        192.168.100.66
glusterfs-server-hh5rm       1/1       Running   0          5d        192.168.100.156
glusterfs-server-zoefs       1/1       Running   0          5d        192.168.100.89
ndslabs-apiserver-zqgj8      1/1       Running   0          1d        192.168.100.66
ndslabs-gui-p0hjh            1/1       Running   0          23h       192.168.100.66
nginx-ilb-rc-x853y           1/1       Running   0          6d        192.168.100.64
tea-rc-8saiu                 1/1       Running   0          12d       192.168.100.65
tea-rc-t403k                 1/1       Running   0          12d       192.168.100.65
Create two SSH sessions - one into each compute node (in this case, 192.168.100.65 and 192.168.100.66).
In one SSH session, run a BusyBox image mounted with our shared volume:
docker run -v /var/glfs:/var/glfs --rm -it busybox
Inside of the BusyBox container, create a test file:
echo "t= esting!" > /var/glfs/ndslabs/test.file
On the other machine, map the same directory into BusyBox and verify that you can see the changes made from the first host:
docker run -v /var/glfs:/var/glfs --rm -it busybox
Running an ls on /var/glfs/ndslabs/ should show the test file created on the other node:
ls -al /var/glfs/ndslabs
This proves that we can mount via NFS onto each node, map the NFS mount into containers, and allow those containers to ingest or modify the data from the NFS mount.