National Data Service / NDS-705

Kubernetes services/ingress are sometimes not cleaned up properly on delete


    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Component/s: Backend, Infrastructure
    • Sprint: NDS Sprint 19, NDS Sprint 21

      Logged into my test node (where I did the bulk of the Selenium testing) and, despite having deleted all of my applications, found many lingering services left in Kubernetes:

      core@lambert8-dev ~ $ kubectl get svc --namespace=lambert8
      NAME                    CLUSTER-IP   EXTERNAL-IP   PORT(S)              AGE
      s3mcak-rabbitmq         10.0.0.44    <none>        5672/TCP,15672/TCP   19d
      s48qeo-rabbitmq         10.0.0.218   <none>        5672/TCP,15672/TCP   18d
      s7txia-rabbitmq         10.0.0.39    <none>        5672/TCP,15672/TCP   19d
      sa1l1c-rabbitmq         10.0.0.88    <none>        5672/TCP,15672/TCP   19d
      sahwe4-rabbitmq         10.0.0.120   <none>        5672/TCP,15672/TCP   19d
      sbvgh5-rabbitmq         10.0.0.61    <none>        5672/TCP,15672/TCP   18d
      scjwha-rabbitmq         10.0.0.47    <none>        5672/TCP,15672/TCP   17d
      scsxiq-rabbitmq         10.0.0.246   <none>        5672/TCP,15672/TCP   17d
      sii2gy-rabbitmq         10.0.0.214   <none>        5672/TCP,15672/TCP   19d
      sjh1ry-rabbitmq         10.0.0.208   <none>        5672/TCP,15672/TCP   19d
      skl5hm-cloudbrowserui   10.0.0.72    <none>        80/TCP               19d
      sl21j2-rabbitmq         10.0.0.163   <none>        5672/TCP,15672/TCP   19d
      ssblma-rabbitmq         10.0.0.250   <none>        5672/TCP,15672/TCP   19d
      sw3ulv-rabbitmq         10.0.0.73    <none>        5672/TCP,15672/TCP   18d
      szn5g5-rabbitmq         10.0.0.235   <none>        5672/TCP,15672/TCP   19d
      szraf2-rabbitmq         10.0.0.114   <none>        5672/TCP,15672/TCP   19d

      This is most likely due to the e2e test suite creating and deleting applications in rapid succession, but that is behavior we will want to correct if we intend to scale this platform up any further.

      This feels like a threading problem (a locking issue, race condition, etc.), but I am not as familiar with how Go handles such things at a low level as I am with Java.
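
      To make that suspicion concrete, below is a minimal Go sketch of one way the race could play out, assuming the delete handler works from an in-memory record that the create handler is still filling in. Every name in it (registry, createStack, deleteStack, the per-stack lock) is a hypothetical illustration rather than the actual apiserver code; the point is only that serializing create and delete per stack lets the delete see everything that was actually created.

      package main

      import (
          "fmt"
          "sync"
          "time"
      )

      type registry struct {
          mu       sync.Mutex          // guards services
          locks    sync.Map            // one *sync.Mutex per stack ID
          services map[string][]string // stack ID -> Kubernetes objects created for it
      }

      func newRegistry() *registry {
          return &registry{services: map[string][]string{}}
      }

      // stackLock returns the mutex that serializes create/delete for one stack.
      func (r *registry) stackLock(id string) *sync.Mutex {
          m, _ := r.locks.LoadOrStore(id, &sync.Mutex{})
          return m.(*sync.Mutex)
      }

      // createStack simulates creating two Kubernetes objects (a Service and an
      // Ingress) with some latency between them, recording each as it is made.
      func (r *registry) createStack(id string) {
          l := r.stackLock(id)
          l.Lock()
          defer l.Unlock()

          for _, obj := range []string{id + "-rabbitmq", id + "-ingress"} {
              time.Sleep(5 * time.Millisecond) // stand-in for a kube-apiserver round trip
              r.mu.Lock()
              r.services[id] = append(r.services[id], obj)
              r.mu.Unlock()
          }
      }

      // deleteStack removes everything recorded for the stack. If it ran while
      // createStack was still mid-flight (i.e. without the per-stack lock),
      // objects created after this point would never be cleaned up, which is
      // the lingering-service behavior seen above.
      func (r *registry) deleteStack(id string) {
          l := r.stackLock(id)
          l.Lock()
          defer l.Unlock()

          r.mu.Lock()
          delete(r.services, id) // stand-in for deleting each recorded object
          r.mu.Unlock()
      }

      func main() {
          r := newRegistry()
          var wg sync.WaitGroup
          for _, id := range []string{"s3mcak", "s48qeo", "s7txia"} {
              wg.Add(2)
              go func(id string) { defer wg.Done(); r.createStack(id) }(id)
              go func(id string) {
                  defer wg.Done()
                  time.Sleep(2 * time.Millisecond) // the e2e suite deletes almost immediately
                  r.deleteStack(id)
              }(id)
          }
          wg.Wait()
          fmt.Println("lingering objects:", r.services)
      }

      With the per-stack lock held in both paths, the delete waits out the in-flight create and the final map comes out empty; dropping the two lock calls in deleteStack reproduces the leak, which looks a lot like the lingering -rabbitmq services above.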

      This ticket is complete when the above behavior has been diagnosed and the potential damage mitigated to the best of our ability.
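
      In the meantime, a mitigation that does not depend on winning the race would be a periodic sweep that removes whatever a delete left behind. The sketch below is one way to do that with client-go; the namespace and stack-prefix arguments, the default kubeconfig path, and the use of a recent context-aware client-go API (with the networking.k8s.io/v1 Ingress type) are all assumptions on my part, not how the backend currently does it.

      package main

      import (
          "context"
          "fmt"
          "log"
          "os"
          "path/filepath"
          "strings"

          metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
          "k8s.io/client-go/kubernetes"
          "k8s.io/client-go/tools/clientcmd"
      )

      func main() {
          if len(os.Args) != 3 {
              log.Fatalf("usage: %s <namespace> <stack-prefix>", os.Args[0])
          }
          namespace, prefix := os.Args[1], os.Args[2]

          // Build a client from the local kubeconfig (assumed path).
          kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
          config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
          if err != nil {
              log.Fatal(err)
          }
          clientset, err := kubernetes.NewForConfig(config)
          if err != nil {
              log.Fatal(err)
          }
          ctx := context.Background()

          // Sweep Services left behind by a failed delete.
          svcs, err := clientset.CoreV1().Services(namespace).List(ctx, metav1.ListOptions{})
          if err != nil {
              log.Fatal(err)
          }
          for _, svc := range svcs.Items {
              if strings.HasPrefix(svc.Name, prefix) {
                  if err := clientset.CoreV1().Services(namespace).Delete(ctx, svc.Name, metav1.DeleteOptions{}); err != nil {
                      log.Printf("delete service %s: %v", svc.Name, err)
                      continue
                  }
                  fmt.Println("deleted service", svc.Name)
              }
          }

          // Sweep Ingresses the same way.
          ings, err := clientset.NetworkingV1().Ingresses(namespace).List(ctx, metav1.ListOptions{})
          if err != nil {
              log.Fatal(err)
          }
          for _, ing := range ings.Items {
              if strings.HasPrefix(ing.Name, prefix) {
                  if err := clientset.NetworkingV1().Ingresses(namespace).Delete(ctx, ing.Name, metav1.DeleteOptions{}); err != nil {
                      log.Printf("delete ingress %s: %v", ing.Name, err)
                      continue
                  }
                  fmt.Println("deleted ingress", ing.Name)
              }
          }
      }

      Run against the namespace above with each leftover prefix (e.g. go run sweep.go lambert8 s3mcak), this would clear out the output shown earlier, though the real fix still belongs in the delete path itself.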

              Assignee: Craig Willis (willis8)
              Reporter: Sara Lambert (lambert8)
              Votes: 0
              Watchers: 2

                Created:
                Updated:
                Resolved:

                  Estimated: 2h
                  Remaining: 15m
                  Logged: 1h 45m