BrownDog / BD-334

Monitor and restart essential software that Medici2 depends on


      While uploading files to medici2 on dts1, Rui observed several times (on 2014-10-22, 2014-10-30, and 2014-11-26) that the submitted data never appeared in the RabbitMQ queues. The medici2 log showed that medici2 had lost its connection to RabbitMQ. After Rui restarted medici2, the connection was re-established and medici2 ran normally.

      Per the 2014-12-08 BD technical meeting, we need to monitor the essential services that Medici2 depends on, such as RabbitMQ and possibly MongoDB, to ensure that Medici2 is running as expected.
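
One possible starting point for such monitoring is a periodic TCP reachability check, sketched below in Python. The host names, the `SERVICES` table, and the function names are illustrative assumptions, not part of Medici2; 5672 and 27017 are the default RabbitMQ and MongoDB ports and may differ on dts1. A port check only shows that a service is accepting connections. It would not by itself detect the case reported here, where RabbitMQ was up but medici2 had lost its channel.

```python
import socket

# Hypothetical service table: names and hosts are assumptions for illustration.
# 5672 / 27017 are the default RabbitMQ (AMQP) and MongoDB ports.
SERVICES = {
    "rabbitmq": ("localhost", 5672),
    "mongodb": ("localhost", 27017),
}


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_services(services, probe=is_port_open):
    """Return the names of services whose port does not accept connections.

    `probe` is injectable so the check can be exercised without a network.
    """
    return [name for name, (host, port) in services.items()
            if not probe(host, port)]
```

Calling `check_services(SERVICES)` from a cron job or similar scheduler returns the list of unreachable services; what to do with that list (alerting, or restarting the service) is left to the caller.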

      An example excerpt from the medici2.log file showing the error messages is below; the entire file, which shows these errors and the restart on 2014-11-26, is attached.

      [info] application -

      {"flags":"","intermediateId":"54760033e4b0387e744139f8","host":"http://dts1.ncsa.illinois.edu:9000","datasetId":"","id":"54760033e4b0387e744139f8","fileSize":"198473","secretKey":"xxxxxx"}

      [ERROR] [11/26/2014 10:30:44.282] [application-akka.actor.default-dispatcher-76] [akka://application/user/$a] clean connection shutdown; reason: Attempt to use closed channel
      com.rabbitmq.client.AlreadyClosedException: clean connection shutdown; reason: Attempt to use closed channel
      at com.rabbitmq.client.impl.AMQChannel.ensureIsOpen(AMQChannel.java:190)
      at com.rabbitmq.client.impl.AMQChannel.transmit(AMQChannel.java:291)
      at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:634)
      at com.rabbitmq.client.impl.ChannelN.basicPublish(ChannelN.java:617)
      at services.SendingActor$$anonfun$receive$1.applyOrElse(RabbitmqPlugin.scala:224)
      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
      at akka.actor.ActorCell.invoke(ActorCell.scala:456)
      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
      at akka.dispatch.Mailbox.run(Mailbox.scala:219)
      at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
      at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

      [info] application -

      {"flags":"","intermediateId":"54760034e4b0387e744139fc","host":"http://dts1.ncsa.illinois.edu:9000","datasetId":"","id":"54760034e4b0387e744139fc","fileSize":"198473","secretKey":"xxxxxx"}

      [ERROR] [11/26/2014 10:30:44.918] [application-akka.actor.default-dispatcher-76] [akka://application/user/$a] clean connection shutdown; reason: Attempt to use closed channel
      com.rabbitmq.client.AlreadyClosedException: clean connection shutdown; reason: Attempt to use closed channel
      at com.rabbitmq.client.impl.AMQChannel.ensureIsOpen(AMQChannel.java:190)
      at com.rabbitmq.client.impl.AMQChannel.transmit(AMQChannel.java:291)
      ...
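
Since the lost connection shows up in medici2.log as the entries above, a monitor could also watch the log for them. The following is a minimal sketch in Python, assuming the log keeps the format shown in the excerpt (an `[ERROR] [MM/DD/YYYY HH:MM:SS.mmm]` header line followed by a line containing `com.rabbitmq.client.AlreadyClosedException`). The function and constant names are hypothetical.

```python
import re
from datetime import datetime

# Matches the Akka error header seen in the excerpt, e.g.
#   [ERROR] [11/26/2014 10:30:44.282] [thread] [actor] message
ERROR_HEADER = re.compile(
    r"\[ERROR\] \[(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{3})\]"
)
CLOSED_MARKER = "com.rabbitmq.client.AlreadyClosedException"


def closed_channel_errors(lines):
    """Return the timestamps of [ERROR] entries that are followed by an
    AlreadyClosedException line, in the order they appear."""
    hits = []
    last_ts = None  # timestamp of the most recent unmatched [ERROR] header
    for line in lines:
        match = ERROR_HEADER.search(line)
        if match:
            last_ts = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S.%f")
        elif CLOSED_MARKER in line and last_ts is not None:
            hits.append(last_ts)
            last_ts = None  # count each error entry once
    return hits
```

A monitor might call `closed_channel_errors(open("medici2.log"))` periodically and raise an alert, or trigger a medici2 restart, when new timestamps appear. The log path and the restart policy are assumptions that would need to match the dts1 deployment.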

              Assignee: Unassigned
              Reporter: Rui Liu (ruiliu)