  1. National Data Service
  2. NDS-705

Kubernetes services/ingress are sometimes not cleaned up properly on delete

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Component/s: Backend, Infrastructure
    • Sprint: NDS Sprint 19, NDS Sprint 21

    Description

      I logged into my test node (where I did the bulk of the Selenium testing) and found that, despite my having deleted all of my applications, many lingering services are still left in Kubernetes:

      core@lambert8-dev ~ $ kubectl get svc --namespace=lambert8
      NAME                    CLUSTER-IP   EXTERNAL-IP   PORT(S)              AGE
      s3mcak-rabbitmq         10.0.0.44    <none>        5672/TCP,15672/TCP   19d
      s48qeo-rabbitmq         10.0.0.218   <none>        5672/TCP,15672/TCP   18d
      s7txia-rabbitmq         10.0.0.39    <none>        5672/TCP,15672/TCP   19d
      sa1l1c-rabbitmq         10.0.0.88    <none>        5672/TCP,15672/TCP   19d
      sahwe4-rabbitmq         10.0.0.120   <none>        5672/TCP,15672/TCP   19d
      sbvgh5-rabbitmq         10.0.0.61    <none>        5672/TCP,15672/TCP   18d
      scjwha-rabbitmq         10.0.0.47    <none>        5672/TCP,15672/TCP   17d
      scsxiq-rabbitmq         10.0.0.246   <none>        5672/TCP,15672/TCP   17d
      sii2gy-rabbitmq         10.0.0.214   <none>        5672/TCP,15672/TCP   19d
      sjh1ry-rabbitmq         10.0.0.208   <none>        5672/TCP,15672/TCP   19d
      skl5hm-cloudbrowserui   10.0.0.72    <none>        80/TCP               19d
      sl21j2-rabbitmq         10.0.0.163   <none>        5672/TCP,15672/TCP   19d
      ssblma-rabbitmq         10.0.0.250   <none>        5672/TCP,15672/TCP   19d
      sw3ulv-rabbitmq         10.0.0.73    <none>        5672/TCP,15672/TCP   18d
      szn5g5-rabbitmq         10.0.0.235   <none>        5672/TCP,15672/TCP   19d
      szraf2-rabbitmq         10.0.0.114   <none>        5672/TCP,15672/TCP   19d

      This is likely due to the e2e test suite creating and deleting them in rapid succession, but it is behavior we should correct if we want to scale this platform up any further.

      This feels like a threading problem (locking issue, race condition, etc.), but I am less familiar with Go than with Java, and with how Go handles such things at a low level.

      This ticket is complete when the above behavior is diagnosed and the potential damage is mitigated to the best of our ability.

              People

                willis8 Craig Willis
                lambert8 Sara Lambert

                  Time Tracking

                    Original Estimate: 2h
                    Time Spent: 1h 45m
                    Remaining Estimate: 15m
