Bug
Resolution: Won't Fix
Normal
During bi-hourly testing on dts-dev, we found two OpenCV faces extractor containers on a cloud VM that had lost their RabbitMQ connections but were still running. We need a way to detect such idle extractors/converters and remove them.
Root cause and analysis: Clowder on dts-dev was unresponsive. Each faces extractor received a message and tried to download the file from dts-dev Clowder, but got no reply and blocked waiting for one. After a while, RabbitMQ closed the connection because of missed heartbeats. The container kept running and kept occupying a slot, so the elasticity module would not start more containers on the VM.
Normal extractors have connections to the RabbitMQ server at port 5672:
ubuntu@dts-dev-docker-4:~$ docker exec -i -t opencv-faces-3 netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 172.17.0.3:57330 141.142.227.65:5672 ESTABLISHED
The extractors that lost their RabbitMQ connection had no connection to port 5672, but were still connected to Clowder at port 9000:
ubuntu@dts-dev-docker-4:~$ docker exec -i -t opencv-faces-2 netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 172.17.0.2:46349 141.142.227.82:9000 ESTABLISHED
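The difference between the two netstat outputs above suggests one possible detection check: treat a container as idle when it has no ESTABLISHED connection to port 5672. The following is only a minimal sketch under that assumption, not the elasticity module's actual logic. The function names `has_established_connection` and `container_is_idle` are hypothetical, and the `docker exec` call drops the `-i -t` flags used in the interactive sessions because a script has no terminal attached.

```python
import subprocess

RABBITMQ_PORT = 5672  # RabbitMQ port seen in the netstat output above


def has_established_connection(netstat_output, port=RABBITMQ_PORT):
    """Return True if any TCP line shows an ESTABLISHED connection to `port`."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        # Header lines and non-TCP lines are skipped.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign_address, state = fields[4], fields[5]
        foreign_port = foreign_address.rsplit(":", 1)[-1]
        if state == "ESTABLISHED" and foreign_port == str(port):
            return True
    return False


def container_is_idle(container_name):
    """Hypothetical helper: run netstat inside the container and check for RabbitMQ.

    Assumes netstat is installed in the container image, as it was in the
    opencv-faces containers shown above.
    """
    result = subprocess.run(
        ["docker", "exec", container_name, "netstat", "-an"],
        capture_output=True,
        text=True,
        check=True,
    )
    return not has_established_connection(result.stdout)
```

A container that is still starting up or reconnecting would also show no connection to port 5672, so a real check would probably require several consecutive misses before removing a container.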