
Bing's external monitor can't call Clowder, because it has to keep working even when Clowder is down. Instead, the monitors in different regions collect their datapoints and post them to the Flask API, which bypasses Clowder and publishes directly to RabbitMQ.
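As an illustration of the monitor side, here is a minimal sketch of how a regional monitor might post one datapoint to the Flask API. The endpoint URL, the datapoint shape (measurement, tags, fields, time) and the function names are assumptions made for this example; the actual API contract is not defined on this page.

```python
import json
import urllib.request


def build_request(api_url: str, datapoint: dict) -> urllib.request.Request:
    """Build a JSON POST request carrying one datapoint.

    The payload layout is hypothetical; adapt it to whatever the Flask API accepts.
    """
    body = json.dumps(datapoint).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_datapoint(api_url: str, datapoint: dict, timeout: float = 5.0) -> int:
    """Send the datapoint and return the HTTP status code of the reply."""
    request = build_request(api_url, datapoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical datapoint and endpoint, for illustration only.
    point = {
        "measurement": "clowder_check",
        "tags": {"region": "us-east"},
        "fields": {"up": True, "elapsed": 0.25},
        "time": 1700000000,
    }
    print(build_request("http://monitor.example/datapoints", point).data.decode())
```

Because the monitor only talks to the Flask API, it keeps reporting even while Clowder itself is unreachable.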


We run a service as a Docker container that periodically fetches statistics about the Clowder service, e.g., the uptime, the response time, and the number of active connections to Clowder. The data is stored in a backend service, e.g., InfluxDB (this will require extra service endpoints), and Grafana retrieves the data and renders it in its web interface for visualization.

Uptime: the uptime of the Clowder website tells us whether the Clowder service is alive. This metric is collected by pinging the target Clowder website with a fixed timeout.

Response time: we also record the response time of the ping command, as well as the elapsed time to download the Clowder homepage.
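The homepage check could be sketched roughly as follows. This covers only the HTTP download timing, not the ICMP ping, and treats any network error or timeout as "down". The names CheckResult and check_homepage, and the injectable fetch parameter (useful for testing without a network), are assumptions made for this example.

```python
import http.client
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    url: str
    up: bool                      # True if the homepage was downloaded in time
    elapsed_seconds: float        # wall time spent on the attempt
    status: Optional[int] = None  # HTTP status when the download succeeded
    error: Optional[str] = None   # error text when it failed


def check_homepage(url: str, timeout: float = 5.0,
                   fetch: Callable = urllib.request.urlopen) -> CheckResult:
    """Download the page once and time it.

    urlopen raises for timeouts, connection errors and HTTP 4xx/5xx replies,
    so all of those are reported as up=False.
    """
    start = time.monotonic()
    try:
        with fetch(url, timeout=timeout) as response:
            response.read()  # include the body download in the timing
            status = response.status
        return CheckResult(url, True, time.monotonic() - start, status=status)
    except (OSError, http.client.HTTPException) as exc:
        return CheckResult(url, False, time.monotonic() - start, error=str(exc))


if __name__ == "__main__":
    # Hypothetical target; replace with the real Clowder URL.
    print(check_homepage("http://clowder.example/", timeout=3.0))
```

Running this on a schedule gives both a liveness signal (up) and a response-time series (elapsed_seconds) from the same request.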

Number of connections: it would be useful to see how many connections are made to the Clowder website. We can measure the number of connections within a given period of time by analyzing the Nginx log.
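A rough sketch of the log analysis, assuming the access log uses Nginx's default "combined" format. Note that the access log records requests rather than TCP connections, so the result approximates connection activity. The function name count_requests and its return shape are assumptions made for this example.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Iterable, Tuple

# Leading part of the "combined" format: address, two fields, [time_local].
LOG_LINE = re.compile(r"^(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] ")
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # e.g. 10/Oct/2023:13:55:36 +0000


def count_requests(lines: Iterable[str], start: datetime,
                   window: timedelta) -> Tuple[int, int]:
    """Return (requests, distinct client addresses) seen in [start, start + window).

    `start` must be timezone-aware. Lines that do not parse are skipped.
    """
    end = start + window
    total = 0
    clients = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        try:
            stamp = datetime.strptime(match.group("time"), TIME_FORMAT)
        except ValueError:
            continue
        if start <= stamp < end:
            total += 1
            clients.add(match.group("addr"))
    return total, len(clients)


if __name__ == "__main__":
    sample = [
        '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
        '10.0.0.2 - - [10/Oct/2023:13:56:00 +0000] "GET /api HTTP/1.1" 200 12 "-" "curl/8.0"',
    ]
    begin = datetime(2023, 10, 10, 13, 55, tzinfo=timezone.utc)
    print(count_requests(sample, begin, timedelta(minutes=5)))
```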



Database monitor(s)

Finally, we need a service that pulls the messages from RabbitMQ and writes them into a database, whether that is MongoDB, InfluxDB, or another store. These services could perhaps register with Clowder the way extractors do, so that each one gets its own queue and several of them can log to different destinations at once.
