- Bug
- Resolution: Fixed
- Normal
- NDS Sprint 30, NDS Sprint 32
The error encountered was:
ErrImagePull: "net/http: request canceled"
Is this timeout value configurable, or is it hard-coded? We could raise the timeout, but I would much prefer to investigate whatever is causing the network slowness in the first place.
A few theories have surfaced:
- it could be MTU-related, as we've seen previously
- this could be a misconfiguration of our deployments resulting from misunderstanding the "layer-cake" of volumes and FS types
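To check the MTU theory quickly, something like the following could be run on each node. It is only a rough sketch: it assumes a Linux host that exposes interface MTUs under `/sys/class/net`, and the function names and the idea of comparing every interface against a single "uplink" interface are illustrative choices, not anything taken from our deployment tooling. A bridge or overlay interface with a larger MTU than the uplink would be worth a closer look.

```python
import os


def read_mtus(sysfs_root="/sys/class/net"):
    """Return {interface_name: mtu} for every interface found under sysfs_root."""
    mtus = {}
    for name in sorted(os.listdir(sysfs_root)):
        mtu_path = os.path.join(sysfs_root, name, "mtu")
        try:
            with open(mtu_path) as handle:
                mtus[name] = int(handle.read().strip())
        except (OSError, ValueError):
            # Not a network interface entry, or the MTU file is unreadable.
            continue
    return mtus


def oversized_interfaces(mtus, uplink, ignore=("lo",)):
    """Return interfaces whose MTU is larger than the uplink interface's MTU.

    `uplink` is the name of the interface that carries traffic off the host
    (an assumption the caller supplies); loopback is ignored by default
    because its MTU is normally far larger than any physical link.
    """
    limit = mtus[uplink]
    return {
        name: mtu
        for name, mtu in mtus.items()
        if name != uplink and name not in ignore and mtu > limit
    }
```

For example, `oversized_interfaces(read_mtus(), "eth0")` would list every interface whose MTU exceeds that of `eth0`, where `eth0` stands in for whatever the node's real uplink is called.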
We have seen this occur in several scenarios:
- multi-node on OpenStack volumes with underlying XFS
- single-node without an OpenStack volume at all
As I recall, changing our XFS Docker volume to ext4 was a slight improvement, but that doesn't explain the shoddy performance on a single-node installation when no volumes are involved. Perhaps we should default to ext4 going forward? More discussion will certainly be needed.
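To compare nodes, it may help to confirm which filesystem actually backs Docker's storage in each scenario. Below is a minimal sketch, assuming a Linux host with a readable `/proc/mounts` and Docker's default data directory of `/var/lib/docker`. The helper names are made up for illustration, and the parser ignores edge cases such as escaped spaces in mount points.

```python
import os


def parse_mounts(text):
    """Parse /proc/mounts-style text into (mount_point, fs_type) pairs."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Field order: device, mount point, filesystem type, options, ...
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, entries):
    """Return the filesystem type of the deepest mount point containing path."""
    path = os.path.normpath(path)
    best = None
    for mount_point, fs_type in entries:
        prefix = mount_point.rstrip("/") or "/"
        inside = path == prefix or prefix == "/" or path.startswith(prefix + "/")
        # ">=" lets a later mount on the same mount point win, since it shadows
        # the earlier one.
        if inside and (best is None or len(prefix) >= len(best[0])):
            best = (prefix, fs_type)
    return best[1] if best else None
```

On a node, this could be used as `fs_type_for("/var/lib/docker", parse_mounts(open("/proc/mounts").read()))`. On a single-node install without a separate volume, the result would simply be whatever filesystem the root disk uses.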
Should we just abandon CoreOS? That would be a reasonable solution.