-
Bug
-
Resolution: Fixed
-
Critical
-
None
-
None
-
None
-
NDS Sprint 16, NDS Sprint 18
This ticket has evolved from general issues with master into one identified problem: the number of messages stored in etcd.
This ticket is complete when we have a way to control the number of Kubernetes status messages stored for a stack service.
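One possible shape for that control is a fixed cap on the messages kept per stack service, dropping the oldest first. The sketch below is only an illustration of the idea, not the implemented fix: the function name, the default limit of 100, and the plain-list representation are all assumptions, and it does not touch etcd itself.

```python
from collections import deque

# Hypothetical cap; the real limit would be a configuration choice.
MAX_STATUS_MESSAGES = 100


def append_status_message(messages, message, limit=MAX_STATUS_MESSAGES):
    """Append `message` and return at most the `limit` most recent messages."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # A deque with maxlen discards the oldest entries once the cap is reached.
    bounded = deque(messages, maxlen=limit)
    bounded.append(message)
    return list(bounded)
```

The caller would write the returned list back as the stored status for the stack service, so the stored size stays bounded no matter how many status updates arrive.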
—
Historic information
Since 10/3, about once per week, all Kubernetes services stop on master1 on the beta cluster. This prevents users from starting or stopping services, although existing services keep running without problems. Rebooting the node resolves the problem.
This best-effort task is to investigate the root cause, which is expected to be an out-of-memory error in etcd, and to identify and implement a solution if possible, or otherwise to create additional tickets as needed.