National Data Service / NDS-1125

API server errors and restarts while trying to shut down inactive service


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Component: Backend
    • Sprint: NDS Sprint 42, NDS Sprint 43

      I am seeing the following in the logs of the apiserver:

      core@my_vm01 ~/ndslabs/gui $ kubectl logs ndslabs-apiserver-2rb9m --previous
      Cloning into '/specs'...
      Cloned master https://github.com/nds-org/ndslabs-specs.git
      I1208 21:55:19.995954      18 server.go:109] Connecting to etcd on 10.0.0.18:4001
      I1208 21:55:20.201375      18 server.go:115] Connected to etcd
      I1208 21:55:20.201399      18 server.go:117] Connecting to Kubernetes API https://10.0.0.1:443
      I1208 21:55:20.493665      18 server.go:124] Connected to Kubernetes
      I1208 21:55:20.493782      18 server.go:163] Starting Workbench API server (1.0.13  2017-12-08 21:04)
      I1208 21:55:20.493827      18 server.go:164] Using etcd 10.0.0.18:4001 
      I1208 21:55:20.493864      18 server.go:165] Using kube-apiserver https://10.0.0.1:443
      I1208 21:55:20.493890      18 server.go:166] Using ome volume global
      I1208 21:55:20.493915      18 server.go:167] Using specs dir /specs
      I1208 21:55:20.493941      18 server.go:168] Listening on port 30001
      I1208 21:55:20.493985      18 server.go:177] prefix /api/
      I1208 21:55:20.516567      18 server.go:180] CORS origin https://www.cis.ndslabs.org
      I1208 21:55:20.516705      18 server.go:201] session timeout 30m0s
      I1208 21:55:20.522947      18 server.go:203] domain cis.ndslabs.org
      I1208 21:55:20.523404      18 server.go:204] ingress LoadBalancer
      I1208 21:55:20.523813      18 server.go:342] Loading service specs from /specs
      I1208 21:55:21.813244      18 server.go:361] Listening on 30001
      I1208 21:55:21.826336      18 server.go:3241] Stopping stack sb07xq for lambert8 due to inactivity
      panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x424bb1]
       
      goroutine 168 [running]:
      panic(0x103dd00, 0xc420018060)
      	/usr/local/go/src/runtime/panic.go:500 +0x1a1
      main.(*Server).stopStack(0xc420094a50, 0xc420355480, 0x8, 0xc42021c048, 0x6, 0x0, 0xc4200203d8, 0x45ddfa)
      	/go/src/github.com/ndslabs/apiserver/cmd/server/server.go:2394 +0x481
      main.(*Server).shutdownInactiveServices(0xc420094a50)
      	/go/src/github.com/ndslabs/apiserver/cmd/server/server.go:3242 +0x45a
      created by main.(*Server).start
      	/go/src/github.com/ndslabs/apiserver/cmd/server/server.go:356 +0x43f5
      

      The stack for the problematic service can be found below:

      / # etcdctl get /ndslabs/accounts/lambert8/stacks/sb07xq
      {"id":"sb07xq","key":"clowder","name":"Clowder","services":[{"id":"sb07xq-clowder","stack":"clowder","service":"clowder","imageTag":"","status":"starting","statusMessages":["Reason=, Message=","Reason=Scheduled, Message=Successfully assigned sb07xq-clowder-19l0w to 127.0.0.1","Reason=Pulling, Message=pulling image \"clowder/clowder:1.1.1\"","Reason=Pulled, Message=Successfully pulled image \"clowder/clowder:1.1.1\"","Reason=Created, Message=Created container with docker id bc5cb2f0668b; Security:[seccomp=unconfined]","Reason=Started, Message=Started container with docker id bc5cb2f0668b","Reason=Unhealthy, Message=Readiness probe failed: Get http://172.17.0.12:9000/assets/stylesheets/main.css: dial tcp 172.17.0.12:9000: getsockopt: connection refused","Reason=Unhealthy, Message=Readiness probe failed: Get http://172.17.0.12:9000/assets/stylesheets/main.css: dial tcp 172.17.0.12:9000: getsockopt: connection refused","Reason=Unhealthy, Message=Readiness probe failed: Get http://172.17.0.12:9000/assets/stylesheets/main.css: dial tcp 172.17.0.12:9000: getsockopt: connection refused","Reason=Unhealthy, Message=Readiness probe failed: Get http://172.17.0.12:9000/assets/stylesheets/main.css: dial tcp 172.17.0.12:9000: getsockopt: connection refused","Reason=Unhealthy, Message=Readiness probe failed: Get http://172.17.0.12:9000/assets/stylesheets/main.css: dial tcp 172.17.0.12:9000: getsockopt: connection refused","Reason=Unhealthy, Message=Readiness probe failed: Get http://172.17.0.12:9000/assets/stylesheets/main.css: dial tcp 172.17.0.12:9000: getsockopt: connection refused","Reason=Unhealthy, Message=Readiness probe failed: Get http://172.17.0.12:9000/assets/stylesheets/main.css: dial tcp 172.17.0.12:9000: getsockopt: connection refused"],"endpoints":[{"host":"sb07xq-clowder.local.ndslabs.org","port":9000,"nodePort":0,"protocol":"http","path":"/","url":"sb07xq-clowder.local.ndslabs.org/"}],"createdTime":0,"updateTime":0,"config":{"ELASTICSEARCH_CLUSTERNAME":"clowder","RABBITMQ_EXCHANGE":"clowder","SMTP_HOST":"outbound.ucsd.edu","TOOLMANAGER_URI":"http://localhost:8082"},"volumeMounts":null,"internalIP":"10.0.0.154"},{"id":"sb07xq-mongo","stack":"clowder","service":"mongo","imageTag":"","status":"ready","statusMessages":["Reason=, Message=","Reason=SuccessfulCreate, Message=Created pod: sb07xq-mongo-kn5rm","Reason=Scheduled, Message=Successfully assigned sb07xq-mongo-kn5rm to 127.0.0.1","Reason=Pulling, Message=pulling image \"mongo:3.2.4\"","Reason=Pulled, Message=Successfully pulled image \"mongo:3.2.4\"","Reason=Created, Message=Created container with docker id cd5b8450156a; Security:[seccomp=unconfined]","Reason=Started, Message=Started container with docker id cd5b8450156a"],"endpoints":[{"host":"","port":27017,"nodePort":0,"protocol":"tcp","path":"","url":""}],"createdTime":0,"updateTime":0,"config":{},"volumeMounts":{"AppData/sb07xq-mongo-b8s9f":"/data/db"},"internalIP":"10.0.0.29"}],"status":"stopping","createdTime":0,"updateTime":0,"secure":false}
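
The record is easier to read once it is decoded. A minimal sketch, assuming only the `id`, `status`, and `services` fields visible in the record above; the type and function names (`stack`, `service`, `notReady`) are made up for illustration and are not taken from the apiserver source:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// service and stack are hypothetical, trimmed-down views of the etcd record.
// Only the fields this sketch reads are declared; the rest are ignored.
type service struct {
	ID     string `json:"id"`
	Status string `json:"status"`
}

type stack struct {
	ID       string    `json:"id"`
	Status   string    `json:"status"`
	Services []service `json:"services"`
}

// notReady decodes a stack record and returns the IDs of services whose
// status is anything other than "ready".
func notReady(raw []byte) (stack, []string, error) {
	var st stack
	if err := json.Unmarshal(raw, &st); err != nil {
		return st, nil, err
	}
	var ids []string
	for _, svc := range st.Services {
		if svc.Status != "ready" {
			ids = append(ids, svc.ID)
		}
	}
	return st, ids, nil
}

func main() {
	// Trimmed copy of the record above, keeping only id and status fields.
	raw := []byte(`{"id":"sb07xq","status":"stopping","services":[
		{"id":"sb07xq-clowder","status":"starting"},
		{"id":"sb07xq-mongo","status":"ready"}]}`)

	st, ids, err := notReady(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("stack %s is %s; services not ready: %v\n", st.ID, st.Status, ids)
}
```

For this record the stack is marked "stopping" while sb07xq-clowder is still "starting" and sb07xq-mongo is "ready".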
      

      This ticket is complete when the above error no longer occurs.

              Assignee: Unassigned
              Reporter: Sara Lambert (lambert8)