Comment: Updated data items 6 and 7, and added "stopping extractor instance" in "Scale down" part

...

  1. an extractor is installed as a service on a VM, so when a VM starts, all the extractors that the VM contains as services will start automatically and successfully; if services do not fulfill all requirements, we might have to look into alternatives;
  2. the limiting resource when extractors process input data is CPU, not memory, disk I/O, or network I/O, so the design only scales for CPU usage;
  3. the system needs to support multiple OS types, including both Linux and Windows;
  4. the system uses RabbitMQ as the messaging technology.

...

  1. RabbitMQ queue lengths and the number of consumers for the queues;
    Can be obtained using RabbitMQ management API. The number of consumers can be used to verify that the action to scale up/down succeeded.
  2. for each queue, the corresponding extractor name;
    Currently hard coded in the extractor code, so that queue name == extractor name.
  3. for a given extractor, the list of running VMs where an instance of the extractor is running, and the list of suspended VMs where it was running;
    Running VM list: can be obtained using RabbitMQ management API, queue --> connections --> IP.
    Suspended VM list: when suspending a VM, update the mapping for the given extractor, remove the entry from the running VM list and add it to the suspended VM list.
    Also maintain the data of mapping from a running/suspended VM to the extractors that it contains. This is useful in the scaling up part.
  4. the number of vCPUs of the VMs;
    This info is fixed for a given OpenStack flavor. The flavor must be specified when starting a VM, and this data can be stored at that time.
  5. the load averages of the VMs;
    For Linux, can be obtained by executing a command ("uptime" or "cat /proc/loadavg") over ssh. Verified that with ssh connection multiplexing (SSH ControlMaster), we can get it in <1 second, usually 0.3 second. If needed, can use a separate thread to get this data, instead of in-line in the execution flow.
  6. for a given extractor type, the list of VM images where the extractor is available, and the service name to start another extractor instance, i.e., a pair of (VM image name, service name). The service name is needed only for running additional extractor instances, since the first instance of that extractor will be started automatically as a service.
    Also maintain the data of mapping from a given VM image to the extractors it contains. This is useful in the scaling up part.
    This is manual and static data. Can be stored in a config file, a MongoDB collection, or using other ways.
  7. the last time a request was processed by each VM, and in each queue.
    The VM part can be obtained using the RabbitMQ management API, /api/channels/: "idle_since" and "peer_host". Need to aggregate the channels that have the same peer_host IP, and skip the ones on the localhost. This info is used in the scaling down part for suspending a VM.
    The queue part can be obtained using the RabbitMQ management API, /api/queues: "idle_since". Used in the scaling down part for stopping extractor instances.
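
The channel aggregation described in item 7 could look roughly like the sketch below. It works on the already-decoded JSON list from /api/channels and does no network access. The function name, the timestamp format, and the fallback lookup of "peer_host" under "connection_details" are assumptions for illustration; check them against the output of the RabbitMQ version in use.

```python
from datetime import datetime

# Hosts treated as "the localhost" and skipped (assumed set).
LOCAL_HOSTS = {"127.0.0.1", "::1", "localhost"}
# Assumed format of the "idle_since" string; verify against real API output.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def last_activity_by_host(channels):
    """Aggregate channels by peer_host.

    Returns {peer_host: datetime or None}. The datetime is the most recent
    "idle_since" over all channels of that host; None means at least one
    channel has no "idle_since", i.e. the host is treated as busy right now.
    """
    result = {}
    for ch in channels:
        # Assumption: peer_host may be top-level or under "connection_details".
        host = ch.get("peer_host") or ch.get("connection_details", {}).get("peer_host")
        if host is None or host in LOCAL_HOSTS:
            continue
        idle = ch.get("idle_since")
        stamp = datetime.strptime(idle, TIME_FORMAT) if idle else None
        if host not in result:
            result[host] = stamp
        elif result[host] is None or stamp is None:
            result[host] = None  # any busy channel makes the whole host busy
        else:
            result[host] = max(result[host], stamp)
    return result
```

A host whose value is older than the configured idle period is a candidate for suspension; a host with None is not.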

In the above data, items 2, 4 and 6 are static (or near static), the others are dynamic, changing at run time.
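
As an illustration of the static data in item 6, the sketch below stores the (VM image name, service name) pairs per extractor and derives the reverse mapping from a VM image to the extractors it contains. All extractor, image, and service names are made up, and a plain dict stands in for the config file or MongoDB collection.

```python
from collections import defaultdict

# Hypothetical static config: extractor name -> list of (VM image name, service name).
EXTRACTOR_IMAGES = {
    "ocr": [("ubuntu-extractors-1", "ocr-extractor"),
            ("windows-extractors-1", "OcrExtractor")],
    "face": [("ubuntu-extractors-1", "face-extractor")],
}


def extractors_by_image(extractor_images):
    """Invert the config: VM image name -> sorted list of extractor names."""
    by_image = defaultdict(list)
    for extractor, pairs in extractor_images.items():
        for image, _service in pairs:
            by_image[image].append(extractor)
    return {image: sorted(names) for image, names in by_image.items()}
```

Deriving the reverse mapping from one source keeps the two views consistent, instead of maintaining them by hand.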

...

At the end of the above iterations, we could verify whether the expected increase in the number of extractors actually occurred, and print out the result.

Scaling down:

    1. Stop idle extractor instances:
      Find idle queues (no data / activity for a configurable period of time). For each such queue, find the running VMs and the number of extractor instances. If the number of extractor instances is > 1, stop all but the first instance.

    2. Suspend idle VMs.

Get the list of IPs of the running VMs. Iterate through them:

...
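
Step 1 of scaling down (stopping idle extractor instances) could be sketched as follows, working on the decoded JSON list from /api/queues. The function name and the timestamp format are assumptions. The sketch counts instances per queue using the "consumers" field; if the first instance has to be kept on every VM, the count would need to be done per VM instead.

```python
from datetime import datetime, timedelta

# Assumed format of the "idle_since" string; verify against real API output.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def extra_instances_to_stop(queues, now, idle_period):
    """Return {queue name: number of extractor instances to stop}.

    A queue qualifies when it has been idle for at least idle_period and has
    more than one consumer; one instance is always left running.
    """
    to_stop = {}
    for q in queues:
        idle = q.get("idle_since")
        if not idle:
            continue  # no "idle_since": the queue is treated as active
        idle_for = now - datetime.strptime(idle, TIME_FORMAT)
        consumers = q.get("consumers", 0)
        if idle_for >= idle_period and consumers > 1:
            to_stop[q["name"]] = consumers - 1
    return to_stop
```

For example, with an idle period of timedelta(minutes=30), a queue that has been idle for an hour with 3 consumers yields 2 instances to stop.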