  1. National Data Service
  2. NDS-395

Starting Mongo with GFS quota enabled crashes GFS


    • NDS Sprint 9, NDS Sprint 11

      Find the bug and create a new ticket to fix it.


      Overview:

      • Create a new cluster with deploy-tools:NDS-261 – this just enables GFS quota
      • Create a project via NDS Labs UI – this creates a quota for a directory
      • Launch mongo for the project
      • Error: transport endpoint is not connected
      • GFS is unusable until server pods are deleted/recreated

      Detailed test case

      Run deploy-tools NDS-261:

      docker run --name deploy-tools -v `pwd`/deploy-tools:/root/SAVED_AND_SENSITIVE_VOLUME  -it ndslabs/deploy-tools:NDS-261 bash

      Modify /usr/local/lib/ndslabs/ansible/roles/ndslabs-api-gui/templates/ndslabs-apiserver.yaml.j2, replacing latest with NDS-261.

      SSH into the master node and run kubectl:

      ssh -i cwtest2.pem core@master
       
      kubectl exec glusterfs-server-globalfs-pod gluster volume quota global list
      quota: No quota configured on volume global

      Create project "nds261" via the NDS UI – this creates a quota for /nds261. Confirm via kubectl:

      kubectl exec glusterfs-server-globalfs-fpvd0 gluster volume quota global list
                        Path                   Hard-limit Soft-limit   Used  Available  Soft-limit exceeded? Hard-limit exceeded?
      ---------------------------------------------------------------------------------------------------------------------------
      /nds261                                   10.0GB       80%      0Bytes  10.0GB              No                   No
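
The quota check above can also be scripted. The following is a minimal sketch, not part of the original test case: `has_quota` is a hypothetical helper that reads the `gluster volume quota ... list` output on stdin and succeeds only if a row for the given path is present.

```shell
#!/usr/bin/env bash
# has_quota PATH (hypothetical helper): reads quota-list output on stdin and
# exits 0 if some row's first column equals PATH, 1 otherwise.
has_quota() {
  local path="$1"
  awk -v p="$path" '$1 == p { found = 1 } END { exit !found }'
}

# Example against the cluster (pod name taken from the listing above):
#   kubectl exec glusterfs-server-globalfs-fpvd0 gluster volume quota global list | has_quota /nds261
```

Before the project is created, the "No quota configured" message contains no row for /nds261, so the check fails as expected.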

      Tail the apiserver logs:

      kubectl logs -f ndslabs-apiserver-pod

      Add service RabbitMQ, launch it, and note that the service starts without error. Stop it and delete it.
      Add service iCAT, launch it and note that the service starts without error. Stop and delete it.

      Add service Mongo, launch mongo:

      Failed to start container with docker id fd1f173e892e with error: API error (500): Cannot start container fd1f173e892e6517c34a493182cdb66c3e2b42cc8686c7646ff44ed19670b905: stat /var/glfs/global/nds261: transport endpoint is not connected
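
Because the failure surfaces as a failed stat on the GFS mount, the state of the mount can be probed directly on the node. This is a sketch under assumptions: `mount_status` is a hypothetical helper, and it relies on the standard Linux error text for a disconnected FUSE mount.

```shell
#!/usr/bin/env bash
# mount_status DIR (hypothetical helper): prints "ok" if DIR can be stat'ed,
# "disconnected" if stat fails with the error seen in this ticket,
# and "error" for any other failure.
mount_status() {
  local dir="$1" err
  # Capture stderr only; LC_ALL=C keeps the message in English.
  if err=$(LC_ALL=C stat "$dir" 2>&1 >/dev/null); then
    echo ok
  elif printf '%s' "$err" | grep -qi 'transport endpoint is not connected'; then
    echo disconnected
  else
    echo error
  fi
}

# Example on a node (path taken from the error above):
#   mount_status /var/glfs/global/nds261
```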

      Launch iCAT again and you'll see the same error. Delete the gluster server pods with kubectl delete and you'll then be able to start anything but Mongo normally.
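
The recovery step can be sketched as follows. This is an illustration only: `gluster_server_pods` is a hypothetical helper, and the `glusterfs-server-` prefix is assumed from the pod names shown earlier in this ticket.

```shell
#!/usr/bin/env bash
# gluster_server_pods (hypothetical helper): reads `kubectl get pods` output
# on stdin and prints the names of pods starting with "glusterfs-server-".
gluster_server_pods() {
  awk '$1 ~ /^glusterfs-server-/ { print $1 }'
}

# Example (deletes the pods; assumes they are recreated automatically,
# and uses GNU xargs -r to skip the delete when nothing matches):
#   kubectl get pods | gluster_server_pods | xargs -r kubectl delete pod
```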

              raila David Raila
              willis8 Craig Willis

                  Estimated: 4h
                  Remaining: 0m
                  Logged: 5h