Overview

Order of Operations

  1. MongoDB must be running first
  2. Optional - If you intend to use search:
    1. Start ElasticSearch
  3. Optional - If you intend to use extractors:
    1. Start RabbitMQ
    2. Start any extractors that you want
  4. Finally, start Clowder itself (see the command summary below)
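
Expressed as kubectl commands (all taken from the steps below), a complete startup might look like the following sketch. Note that the Kubernetes services (Step 3) must be created before any of the controllers, and the optional lines only apply if you want search and/or extractors:

    kubectl create -f services/
    kubectl create -f controllers/mongodb-controller.yaml
    kubectl create -f controllers/elasticsearch-controller.yaml    # optional: search
    kubectl create -f controllers/rabbitmq-controller.yaml         # optional: extractors
    kubectl create -f controllers/EXTRACTOR-1-controller.yaml      # optional: one per extractor
    kubectl create -f controllers/clowder-controller.yaml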

    Required Environment Variables

    Default / acceptable values are in parentheses.

    These values are already specified inside of the controllers/*-controller.yaml files.

     

    Clowder

    The following environment variables MUST BE SET in order to start Clowder:

    • SMTP_HOST: (smtp.ncsa.illinois.edu) The SMTP server that Clowder should use to send e-mail
    • CLOWDER_BRANCH: (CATS-CORE0) The branch of Clowder to run
    • CLOWDER_BUILD: (latestSuccessful) The build of Clowder to run
    • CLOWDER_UPDATE: (NO) Whether or not to push updates back upstream
    • MONGO_URI: (mongodb://[user[:password]@]MONGO_SERVICE_IP_ADDR:PORT/clowder) The URI that should be used to connect to MongoDB
    • RABBITMQ_URI: (amqp://guest:guest@RABBIT_SERVICE_IP_ADDR:PORT/%2F) The URI that should be used to connect to RabbitMQ
    • RABBITMQ_EXCHANGE: (clowder) The name of the RabbitMQ exchange that should be used
    • RABBITMQ_MGMT_PORT: (15672) The port that should be used to access the management web UI for RabbitMQ
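
    As a quick sanity check, you can confirm that these variables are defined in the Clowder controller specification (a sketch; the exact file name and set of variables in your checkout may differ):

    # List the required variables as they appear in the controller spec.
    grep -E 'SMTP_HOST|CLOWDER_BRANCH|CLOWDER_BUILD|CLOWDER_UPDATE|MONGO_URI|RABBITMQ_' controllers/clowder-controller.yaml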

    Extractors

    The following environment variables MUST BE SET in order to start the extractors:

    • RABBITMQ_URI: (amqp://guest:guest@RABBIT_SERVICE_IP_ADDR:PORT/%2F) The URI that should be used to connect to RabbitMQ
    • RABBITMQ_EXCHANGE: (clowder) The name of the RabbitMQ exchange that should be used
    • RABBITMQ_VHOST: (%2F) The virtual host (URI segment) of RabbitMQ
    • RABBITMQ_QUEUE: (varies) The queue that this extractor should watch for work to do
    • RABBITMQ_PORT_5672_TCP_ADDR: (SERVICE_IP_ADDR) The address of RabbitMQ's message bus
    • RABBITMQ_PORT_5672_TCP_PORT: (5672) The port that RabbitMQ is using for its message bus
    • RABBITMQ_PORT_15672_TCP_ADDR: (SERVICE_IP_ADDR) The address of RabbitMQ's management interface
    • RABBITMQ_PORT_15672_TCP_PORT: (15672) The port that RabbitMQ is using for its management interface

    Bootstrapping Clowder in Kubernetes

    Step 0: Clone Source Repository

    git clone https://github.com/nds-org/nds-labs
    cd nds-labs/
    git fetch --all
    git checkout lambert-dev

    TODO: Once this is pushed up to master, remove the "checkout" step.


    Step 1: Build or Pull Required Images

    Start up the NDSDEV container using the following command:

    . ./devtools/ndsdev/ndsdevctl run

    Once inside of NDSDEV, enter the services/ncsa/clowder folder:

    cd /nds/src/services/ncsa/clowder

    Pulling the Images from Docker Hub

    From services/ncsa/clowder, run the following command to pull all required images:

    make all

    You should now see Docker pulling down several images. Once it is complete, you can move on to spinning up your cluster.
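
    If you want to double-check that the pull succeeded, you can list the local images (illustrative only; the exact repository names and tags depend on the Makefile):

    # Adjust the filter if your image names differ.
    docker images | grep -iE 'clowder|mongo|rabbitmq|elastic'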

    Building the Images from Source

    WARNING: Building these images from source takes a VERY long time (~25 minutes). Pulling the images from Docker Hub is always preferred.

    From services/ncsa/clowder, execute the following commands to build the required images from source:

    make git-src
    make all-from-src 

    Step 2: Bring Up an Empty Cluster

    Set up an empty Kubernetes cluster, as described here: Cluster Setup: Development (Local)
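
    Before moving on, an optional sanity check is to confirm that the empty cluster is reachable:

    kubectl cluster-info
    kubectl get nodes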

    Step 3: Bring Up All Services

    ...

    This step MUST be done before the controllers can be started. Create the 3 necessary services using the following command:

    kubectl create -f services/
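
    You can confirm that the services were created and received cluster IPs with:

    # Expect one entry per file in services/ (names depend on those definitions).
    kubectl get services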

    Why must this be done first?

    Starting these services causes Kubernetes to define environment variables containing each service's IP address, port, and protocol. Any pod started after these services exist will have those variables injected into its environment. This allows us to set values in the controller specification(s) like this one, for example:

            - name: RABBITMQ_URI
              value: "amqp://guest:guest@$(CLOWDER_RABBITMQ_PORT_5672_TCP_ADDR):$(CLOWDER_RABBITMQ_PORT_5672_TCP_PORT)/%2F"

    Step 4: Bring Up Any Controllers (In Order)

    1.)  To bring up a MongoDB controller and pod, run the following command:

    kubectl create -f controllers/mongodb-controller.yaml

     

    2.)  (Optional) Run this command to start RabbitMQ:

    kubectl create -f controllers/rabbitmq-controller.yaml

     

    3.)  Run this command last to bring up Clowder itself:

    kubectl create -f controllers/clowder-controller.yaml
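
    After each create (or after all three), you can watch the pods come up; they should all eventually reach the Running state:

    kubectl get pods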

    Step 5: Clowder GUI

    Wait a minute or so for Clowder to start. Once it has started, you should be able to navigate your browser to port 30291 of your OpenStack VM's IP.

    Clowder's service specification defines a "NodePort" (port 30291) that exposes Clowder outside of the cluster on the VM's public IP.
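
    A quick way to check that the NodePort is serving traffic from the command line (replace the address with your VM's IP; this is just an example):

    # Should print an HTTP status code (e.g. 200 or a redirect) once Clowder is up.
    curl -s -o /dev/null -w '%{http_code}\n' http://YOUR_VM_IP:30291/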

    Optional Plugins

    Extractors

    After starting RabbitMQ, you can run commands similar to the following in order to start your extractors:

    kubectl create -f controllers/EXTRACTOR-1-controller.yaml
    kubectl create -f controllers/EXTRACTOR-2-controller.yaml
       .   .   .
    kubectl create -f controllers/EXTRACTOR-N-controller.yaml

     

    NOTE: Be sure to give Clowder up to 15 minutes to discover the change unless you plan on restarting the Clowder pod. You can verify that the extractors are registered with Clowder by navigating to CLOWDER_URL:PORT/api/status; the extractors should show up in their respective section after several minutes.
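
    For example, you can poll the status endpoint from the command line (CLOWDER_URL and PORT are placeholders for your own deployment, e.g. the VM IP and NodePort 30291):

    curl http://CLOWDER_URL:PORT/api/status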

    ElasticSearch (Text-based Search)

    WARNING: The "latest" v2.2.0 image of elasticsearch would not work with Clowder. We need to investigate the latest usable version. v1.3.9 worked without errors.

    kubectl create -f controllers/elasticsearch-controller.yaml

     

    To enable searching from the browser, you will need to install the "head" plugin for elasticsearch by running the following command:

    kubectl exec `kubectl get pods | grep lastic | awk '{print $1}'` -- /usr/share/elasticsearch/bin/plugin -i mobz/elasticsearch-head

    TERRA Tool Server

    More information to come...

    Stopping the Cluster

    To shut down the cluster, simply perform all steps above in reverse with delete instead of create.

    From the services/ncsa/clowder directory, execute the following commands to shut down the whole cluster:

    kubectl delete -f controllers/
    kubectl delete -f services/
    cd ../../..
    . ./cluster/k8s/localdev/kube-down-local.sh

    ...

    Tips and Tricks

    Scaling Extractors

    Use the following command to scale up or down an extractor on the fly:

    kubectl scale --current-replicas=1 --replicas=2 rc clowder-image-preview

    NOTE: --current-replicas is optional, but highly recommended when scaling down a controller.

    NOTE: --replicas=0 is an acceptable value, allowing you to start a controller without starting its pods right away.
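
    For example (a sketch using the clowder-image-preview controller named above; substitute your own controller name and replica counts):

    # Park the extractor with zero pods, then bring two pods up later.
    kubectl scale --replicas=0 rc clowder-image-preview
    kubectl scale --current-replicas=0 --replicas=2 rc clowder-image-preview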

    Hot-swapping Extractors

    Using the "Pseudo Pod Restart" technique requires a Replication Controller enforcing a particular pod configuration. Simply take the RC name as CONTROLLER_NAME and the current number of replicas as N and use the command below:

    kubectl scale --current-replicas=N --replicas=N rc CONTROLLER_NAME

    As you bring up and take down extractors, they will (every few minutes or so) automagically register / unregister themselves from Clowder. Based on the behavior, this is likely done via a polling process. 

    If you are trying to modify which extractors you are currently using, be sure to give Clowder up to 15 minutes to discover the change.

    NOTE: In order to avoid needlessly changing the current configuration, we recommend setting --replicas equal to --current-replicas, but this need not be true if you intend to scale the controller up anyway.