...
Minimum Version: Kubernetes v1.7 or higher is supported by Rook.
You will need to choose a hostPath for the dataDirHostPath (ensure that there is at least 5GB of free space available on the specified path).
You will also need to set up RBAC, and ensure that the Flex volume plugin has been configured.
Set the dataDirHostPath
If you are using dataDirHostPath to persist Rook data on Kubernetes hosts, make sure your host has at least 5GB of space available on the specified path.
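A quick way to check the free-space requirement on a host is sketched below. This is only a rough sketch: `has_free_gb` is a hypothetical helper name, and `/var/lib/rook` is assumed as the path because it is the dataDirHostPath used later on this page. Adjust `DATA_DIR` to match your own configuration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding path $1 has at least $2 GB available.
has_free_gb() {
    path=$1
    min_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
    avail_kb=$(df -Pk "$path" 2>/dev/null | awk 'NR==2 {print $4}')
    # An empty value means df failed (for example, the path does not exist yet).
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Assumption: /var/lib/rook is the dataDirHostPath; override with DATA_DIR if yours differs.
if has_free_gb "${DATA_DIR:-/var/lib/rook}" 5; then
    echo "OK: at least 5GB available"
else
    echo "WARNING: less than 5GB available, or the path does not exist"
fi
```

This would need to be run on each host that persists Rook data, since the check only looks at the local filesystem.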
Setting up RBAC
On Kubernetes 1.7+, you will need to configure Rook to use RBAC appropriately.
See https://rook.github.io/docs/rook/master/rbac.html
...
Flex Volume Configuration
The Rook agent requires setup as a Flex volume plugin to manage the storage attachments in your cluster. See the Flex Volume Configuration topic to configure your Kubernetes deployment to load the Rook volume plugin.
...
- Log Collection
- OSD Information
- Separate Storage Groups
- Configuring Pools
- Custom ceph.conf Settings
- OSD CRUSH Settings
- Phantom OSD Removal
...
Debugging
See https://github.com/rook/rook/blob/master/Documentation/toolbox.md for help debugging
Cluster Teardown
See https://rook.github.io/docs/rook/master/teardown.html for thorough steps on destroying / cleaning up your Rook cluster
...
Code Block | ||
---|---|---|
| ||
apiVersion: v1
kind: Service
metadata:
  name: deployment-demo
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    demo: deployment
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: deployment-demo
spec:
  selector:
    matchLabels:
      demo: deployment
  replicas: 2
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        demo: deployment
        version: v1
    spec:
      containers:
      - name: busybox
        image: busybox
        command: [ "sh", "-c", "while true; do echo $(hostname) v1 > /data/index.html; sleep 60; done" ]
        volumeMounts:
        - name: content
          mountPath: /data
      - name: nginx
        image: nginx
        volumeMounts:
        - name: content
          mountPath: /usr/share/nginx/html
          readOnly: true
      volumes:
      - name: content
        flexVolume:
          driver: rook.io/rook
          fsType: ceph
          options:
            fsName: myfs # name of the filesystem specified in the filesystem CRD.
            clusterNamespace: rook # namespace where the Rook cluster is deployed
            clusterName: rook # name of the Rook cluster
            # by default the path is /, but you can override and mount a specific path of the filesystem by using the path attribute
            # path: /some/path/inside/cephfs |
NOTE: I had to explicitly specify clusterName in the YAML above. Newer versions of Rook will fall back to clusterNamespace.
Kernel Version Requirement
If the Rook cluster has more than one filesystem and the application pod is scheduled to a node with kernel version older than 4.7, inconsistent results may arise since kernels older than 4.7 do not support specifying filesystem namespaces.
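To see whether a given node is affected, the running kernel version can be compared against 4.7. The snippet below is a minimal sketch, assuming a `sort` that supports `-V` (version sort, as in GNU coreutils); `version_ge` is a hypothetical helper name, not something provided by Rook.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

kernel=$(uname -r)          # e.g. 4.4.0-31-generic
kernel_base=${kernel%%-*}   # strip the distribution suffix, e.g. 4.4.0

if version_ge "$kernel_base" 4.7; then
    echo "kernel $kernel: 4.7 or newer"
else
    echo "kernel $kernel: older than 4.7, multiple Ceph filesystems may give inconsistent results"
fi
```

Since pods can be scheduled to any node, this check is only meaningful if it is run on every node that may host an application pod.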
Testing Shared Storage
After creating the example above, we should now have 2 pods, each with 2 containers, running on 2 separate nodes:
Code Block | ||
---|---|---|
| ||
ubuntu@mldev-master:~$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
deployment-demo-c59b896c8-8jph6 2/2 Running 0 33m 10.244.1.6 mldev-storage0
deployment-demo-c59b896c8-fnp49 2/2 Running 0 33m 10.244.3.7 mldev-worker0
rook-agent-8c74k 1/1 Running 0 4h 192.168.0.3 mldev-storage0
rook-agent-bl5sr 1/1 Running 0 4h 192.168.0.4 mldev-worker0
rook-agent-clxfl 1/1 Running 0 4h 192.168.0.5 mldev-storage1
rook-agent-gll69 1/1 Running 0 4h 192.168.0.6 mldev-master
rook-operator-7db5d7b9b8-svmfk 1/1 Running 0 4h 10.244.0.5 mldev-master |
The nginx containers mount the shared filesystem read-only into /usr/share/nginx/html/
The busybox containers mount the shared filesystem read-write into /data/
To test that everything is working, we can exec into one busybox container and modify a file:
Code Block | ||
---|---|---|
| ||
# Exec into one of our busybox containers in a pod
ubuntu@mldev-master:~$ kubectl exec -it deployment-demo-c59b896c8-fnp49 -c busybox -- sh
# List the current contents of the shared directory
/ # ls -al /data/
total 4
drwxr-xr-x 1 root root 0 May 22 21:28 .
drwxr-xr-x 1 root root 4096 May 22 21:26 ..
-rw-r--r-- 1 root root 35 May 22 22:03 index.html
# Create a new file
/ # echo '<div>Hello, World!</div>' > /data/testing.html
# Ensure that the file was created successfully
/ # cat /data/testing.html
<div>Hello, World!</div> |
Make sure that this appeared in the nginx container within the same pod:
Code Block | ||
---|---|---|
| ||
# Exec into the same pod's nginx container (as a sanity check)
ubuntu@mldev-master:~$ kubectl exec -it deployment-demo-c59b896c8-fnp49 -c nginx -- sh
# Ensure that the file we created previously exists
/ # ls -al /usr/share/nginx/html/
total 5
drwxr-xr-x 1 root root 0 May 22 21:28 .
drwxr-xr-x 1 root root 4096 May 22 21:26 ..
-rw-r--r-- 1 root root 35 May 22 22:03 index.html
-rw-r--r-- 1 root root 25 May 22 21:28 testing.html
# Ensure that the file has the correct contents
/ # cat /usr/share/nginx/html/testing.html
<div>Hello, World!</div> |
Perform the same steps in both containers in the other pod:
Code Block | ||
---|---|---|
| ||
# Verify the same file contents in both containers of the other pod (on another node)
ubuntu@mldev-master:~$ kubectl exec -it deployment-demo-c59b896c8-8jph6 -c busybox -- cat /data/testing.html
<div>Hello, World!</div>
ubuntu@mldev-master:~$ kubectl exec -it deployment-demo-c59b896c8-8jph6 -c nginx -- cat /usr/share/nginx/html/testing.html
<div>Hello, World!</div> |
If all of the file contents match, then congratulations!
You have just set up your first shared filesystem under Rook!
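The manual checks above can also be scripted. The following is a rough sketch, assuming the container names (busybox, nginx) and mount paths from the example Deployment; `read_shared_file` and `check_shared_file` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: print file $3 from container $2 of pod $1.
read_shared_file() {
    kubectl exec "$1" -c "$2" -- cat "$3"
}

# Hypothetical helper: succeed only if both containers of every listed pod
# return the expected content for testing.html.
# Usage: check_shared_file EXPECTED_CONTENT POD [POD...]
check_shared_file() {
    expected=$1
    shift
    for pod in "$@"; do
        [ "$(read_shared_file "$pod" busybox /data/testing.html)" = "$expected" ] || return 1
        [ "$(read_shared_file "$pod" nginx /usr/share/nginx/html/testing.html)" = "$expected" ] || return 1
    done
}

# Example invocation (assumes the demo=deployment label from the example Deployment):
# check_shared_file '<div>Hello, World!</div>' \
#     $(kubectl get pods -l demo=deployment -o jsonpath='{.items[*].metadata.name}')
```

The function returns a non-zero status as soon as one container reports different content, so it can be used directly in an `if` statement or a CI step.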
Under the Hood
For more information on the low-level processes involved in the above example, see https://github.com/rook/rook/blob/master/design/filesystem.md
After running our above example, we can SSH into the storage0 and worker0 nodes to get a better sense of where Rook stores its data.
Our current confusion seems to come from some hard-coded values in Zonca's deployment of rook-cluster.yaml:
Code Block | ||
---|---|---|
| ||
apiVersion: v1
kind: Namespace
metadata:
  name: rook
---
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: rook
  namespace: rook
spec:
  versionTag: v0.6.2
  dataDirHostPath: /var/lib/rook # <------ this seems to be important
  storage:
    useAllNodes: true
    useAllDevices: false
    storeConfig:
      storeType: bluestore
      databaseSizeMB: 1024
      journalSizeMB: 1024
    directories:
    - path: "/vol_b" |
The dataDirHostPath above is the same one mentioned in the quick start and at the top of this document - this seems to tell Rook where on the host it should persist its configuration.
The directories section is supposed to list the paths that will be included in the storage cluster. (Note that using two directories on the same physical device can cause a negative performance impact.)
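If more than one directory is listed, a rough first check for the same-device situation is to compare the device numbers reported by `stat`. This is only a sketch: `same_device` is a hypothetical helper, it assumes GNU `stat` (the `-c` flag), and it only detects paths on the same filesystem. Two partitions of one physical disk would still report different device numbers, so a negative result is not conclusive.

```shell
#!/bin/sh
# Hypothetical helper: succeed if both paths live on the same filesystem.
# Assumes GNU stat; %d prints the device number of the filesystem holding the path.
same_device() {
    [ "$(stat -c '%d' "$1")" = "$(stat -c '%d' "$2")" ]
}

# Example: compare two candidate storage directories passed as arguments.
if [ "$#" -eq 2 ]; then
    if same_device "$1" "$2"; then
        echo "$1 and $2 are on the same filesystem"
    else
        echo "$1 and $2 are on different filesystems"
    fi
fi
```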
Investigating Storage directories
Checking the logs for one of the Rook agents, we can see a success message that shows us where the data really lives:
Code Block | ||
---|---|---|
| ||
# Check logs of rook-agent running on mldev-storage0
ubuntu@mldev-master:~$ kubectl logs -f rook-agent-8c74k
2018-05-22 17:37:46.340923 I | rook: starting Rook v0.6.2 with arguments '/usr/local/bin/rook agent'
2018-05-22 17:37:46.340990 I | rook: flag values: --help=false, --log-level=INFO
2018-05-22 17:37:46.341634 I | rook: starting rook agent
2018-05-22 17:37:47.963857 I | exec: Running command: modinfo -F parm rbd
2018-05-22 17:37:48.451205 I | exec: Running command: modprobe rbd single_major=Y
2018-05-22 17:37:48.945393 I | rook-flexvolume: Rook Flexvolume configured
2018-05-22 17:37:48.945572 I | rook-flexvolume: Listening on unix socket for Kubernetes volume attach commands.
2018-05-22 17:37:48.947679 I | opkit: start watching cluster resource in all namespaces at v1alpha1
2018-05-22 21:03:02.504533 I | rook-flexdriver: mounting ceph filesystem myfs on /var/lib/kubelet/pods/83ad3107-5e03-11e8-b20b-fa163e9f32d5/volumes/rook.io~rook/content
2018-05-22 21:03:02.589501 I | rook-flexdriver: mounting ceph filesystem myfs on /var/lib/kubelet/pods/83b0ed53-5e03-11e8-b20b-fa163e9f32d5/volumes/rook.io~rook/content
2018-05-22 21:03:03.084507 I | rook-flexdriver: mounting ceph filesystem myfs on /var/lib/kubelet/pods/83ad3107-5e03-11e8-b20b-fa163e9f32d5/volumes/rook.io~rook/content
2018-05-22 21:03:03.196658 I | rook-flexdriver: mounting ceph filesystem myfs on /var/lib/kubelet/pods/83b0ed53-5e03-11e8-b20b-fa163e9f32d5/volumes/rook.io~rook/content
2018-05-22 21:03:04.188182 I | rook-flexdriver: mounting ceph filesystem myfs on /var/lib/kubelet/pods/83ad3107-5e03-11e8-b20b-fa163e9f32d5/volumes/rook.io~rook/content
... ... ... ... ... ... ... ... ... ...
2018-05-22 21:26:46.887549 I | cephmon: parsing mon endpoints: rook-ceph-mon0=10.101.163.226:6790,rook-ceph-mon1=10.107.75.148:6790,rook-ceph-mon2=10.101.87.159:6790
2018-05-22 21:26:46.887596 I | op-mon: loaded: maxMonID=2, mons=map[rook-ceph-mon0:0xc42028e980 rook-ceph-mon1:0xc42028e9c0 rook-ceph-mon2:0xc42028ea20]
2018-05-22 21:26:46.890128 I | rook-flexdriver: WARNING: The node kernel version is 4.4.0-31-generic, which do not support multiple ceph filesystems. The kernel version has to be at least 4.7. If you have multiple ceph filesystems, the result could be inconsistent
2018-05-22 21:26:46.890236 I | rook-flexdriver: mounting ceph filesystem myfs on 10.101.163.226:6790,10.107.75.148:6790,10.101.87.159:6790:/ to /var/lib/kubelet/pods/d40d5673-5e06-11e8-b20b-fa163e9f32d5/volumes/rook.io~rook/content
2018-05-22 21:26:49.446669 I | rook-flexdriver:
2018-05-22 21:26:49.446832 I | rook-flexdriver: ceph filesystem myfs has been attached and mounted
# SSH into mldev-storage0, elevate privileges, and check the specified sub-folder
ubuntu@mldev-master:~$ ssh ubuntu@192.168.0.3
ubuntu@mldev-storage0:~$ sudo su
root@mldev-storage0:/home/ubuntu# ls -al /var/lib/kubelet/pods/d40d5673-5e06-11e8-b20b-fa163e9f32d5/volumes/rook.io~rook/content
total 5
drwxr-xr-x 1 root root 0 May 22 21:28 .
drwxr-x--- 3 root root 4096 May 22 21:26 ..
-rw-r--r-- 1 root root 35 May 22 22:33 index.html
-rw-r--r-- 1 root root 25 May 22 21:28 testing.html
|
Obviously this is not where we want the shared filesystem data stored long-term, so I'll need to figure out why these files are persisted into /var/lib/kubelet and not into the directories specified in the Cluster configuration.
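One thing worth checking here is whether that path is a mount point at all, since the agent log above reports "mounting ceph filesystem myfs on ... to /var/lib/kubelet/pods/...". The sketch below looks up the filesystem type of a mount point from `/proc/mounts`-format input. `mount_fstype` is a hypothetical helper, and the pod UID in the usage comment is left as a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem type of the mount whose mount point is exactly $1.
# Reads /proc/mounts-format lines from stdin: device mountpoint fstype options dump pass.
# If the same mount point appears more than once, the last (most recent) entry wins.
mount_fstype() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }'
}

# Usage on the node (placeholder pod UID, fill in your own):
# mount_fstype '/var/lib/kubelet/pods/<pod-uid>/volumes/rook.io~rook/content' < /proc/mounts
```

A kernel CephFS mount reports the type `ceph`. If that is what comes back, the listing above would be a view of the shared filesystem through a mount point rather than files stored on the node's local disk. If nothing is printed, the path is not a mount point and the files really are local.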
Digging Deeper into dataDirHostPath
Checking the /var/lib/rook directory, we see a few sub-directories:
Code Block | ||
---|---|---|
| ||
root@mldev-master:/var/lib/rook# ls -al
total 20
drwxr-xr-x 5 root root 4096 May 22 17:40 .
drwxr-xr-x 46 root root 4096 May 22 17:38 ..
drwxr--r-- 3 root root 4096 May 22 17:41 osd1
drwxr--r-- 2 root root 4096 May 22 17:40 rook
drwxr--r-- 3 root root 4096 May 22 17:38 rook-ceph-mon0
root@mldev-master:/var/lib/rook# ls -al rook
total 16
drwxr--r-- 2 root root 4096 May 22 17:40 .
drwxr-xr-x 5 root root 4096 May 22 17:40 ..
-rw-r--r-- 1 root root 168 May 22 17:40 client.admin.keyring
-rw-r--r-- 1 root root 1123 May 22 17:40 rook.config
root@mldev-master:/var/lib/rook# ls -al rook-ceph-mon0/
total 24
drwxr--r-- 3 root root 4096 May 22 17:38 .
drwxr-xr-x 5 root root 4096 May 22 17:40 ..
drwxr--r-- 3 root root 4096 May 22 17:38 data
-rw-r--r-- 1 root root 248 May 22 17:38 keyring
-rw-r--r-- 1 root root 210 May 22 17:38 monmap
-rw-r--r-- 1 root root 1072 May 22 17:38 rook.config
srwxr-xr-x 1 root root 0 May 22 17:38 rook-mon.rook-ceph-mon0.asok
root@mldev-master:/var/lib/rook# ls -al osd1
total 7896
drwxr--r-- 3 root root 4096 May 22 17:41 .
drwxr-xr-x 5 root root 4096 May 22 17:40 ..
lrwxrwxrwx 1 root root 34 May 22 17:40 block -> /var/lib/rook/osd1/bluestore-block
lrwxrwxrwx 1 root root 31 May 22 17:40 block.db -> /var/lib/rook/osd1/bluestore-db
lrwxrwxrwx 1 root root 32 May 22 17:40 block.wal -> /var/lib/rook/osd1/bluestore-wal
-rw-r--r-- 1 root root 2 May 22 17:40 bluefs
-rw-r--r-- 1 root root 74883726950 May 22 22:13 bluestore-block
-rw-r--r-- 1 root root 1073741824 May 22 17:41 bluestore-db
-rw-r--r-- 1 root root 603979776 May 22 22:14 bluestore-wal
-rw-r--r-- 1 root root 37 May 22 17:41 ceph_fsid
-rw-r--r-- 1 root root 37 May 22 17:40 fsid
-rw-r--r-- 1 root root 56 May 22 17:40 keyring
-rw-r--r-- 1 root root 8 May 22 17:40 kv_backend
-rw-r--r-- 1 root root 21 May 22 17:41 magic
-rw-r--r-- 1 root root 4 May 22 17:40 mkfs_done
-rw-r--r-- 1 root root 6 May 22 17:41 ready
-rw-r--r-- 1 root root 1693 May 22 17:40 rook.config
srwxr-xr-x 1 root root 0 May 22 17:41 rook-osd.1.asok
drwxr--r-- 2 root root 4096 May 22 17:40 tmp
-rw-r--r-- 1 root root 10 May 22 17:40 type
-rw-r--r-- 1 root root 2 May 22 17:41 whoami
|
As you can see, the configuration files do not appear to be readable on disk and would likely need to be un-mangled by Rook to properly perform a backup.