...
Date/Time | What happened | How was it resolved |
---|---|---|
1/19/2018 | Disk space warnings on gfs3 | The registry cache was using ~34GB of disk. Exec into the registry pod: `kubectl exec -it registry sh`. List the images in the cache: `wget localhost:5001/v2/_catalog -O -`. Change to the storage directory: `cd /var/lib/registry/docker/registry/v2`. Find something that can be removed (e.g., `repositories/craigwillis/apiserver`) and delete it: `rm -r repositories/craigwillis/apiserver`. Then delete the cached blobs: `/bin/registry garbage-collect /etc/docker/registry/config.yml`. |
1/8/2018 | transport connection errors | Started receiving alerts about exceeded pod restart thresholds for two mongo containers. Noticed I/O errors in mongo logs. Exec'd into Gluster server and noted that two bricks (node1, node2) were offline. Restarted both pods, one at a time. |
1/14/2018 | gfs4 load warnings | Ongoing load warning on gfs4. Noticed gfs2 brick not connected. Restarted gfs2 gluster server. Rebooted gfs4 node. Ran `gluster volume heal global info` and `gluster volume heal global` to heal files. |
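
The registry cleanup (1/19/2018) and the volume heal (1/14/2018) recorded above could be wrapped in small shell functions along the following lines. This is only a sketch under assumptions: the pod name `registry` is taken from the log entry and may not match the real pod name, the `POD` and `DRY_RUN` variables and the function names are hypothetical, and the `gluster` commands are assumed to run somewhere the Gluster CLI is available (for example, inside a Gluster server pod). Setting `DRY_RUN=1` prints each command instead of running it.

```shell
#!/bin/sh
# Hypothetical helpers; names and variables are illustrative, not part of the cluster tooling.

# Assumed registry cache pod name (from the log entry); override with POD=...
POD="${POD:-registry}"

# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# List the images currently held in the registry cache.
list_cached_images() {
    run kubectl exec "$POD" -- wget localhost:5001/v2/_catalog -O -
}

# Remove one cached repository (e.g. craigwillis/apiserver),
# then garbage-collect so the cached blobs are deleted.
purge_repository() {
    repo="$1"
    run kubectl exec "$POD" -- rm -r "/var/lib/registry/docker/registry/v2/repositories/$repo"
    run kubectl exec "$POD" -- /bin/registry garbage-collect /etc/docker/registry/config.yml
}

# Show pending heal entries for a volume, then trigger the heal.
heal_volume() {
    vol="$1"
    run gluster volume heal "$vol" info
    run gluster volume heal "$vol"
}
```

For example, `DRY_RUN=1 purge_repository craigwillis/apiserver` prints the two `kubectl exec` commands without touching the cache, which makes it easy to review them before running for real.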
...