Overview
Order of Operations
- MongoDB must be running first
- Optional: if you intend to use search, start ElasticSearch
- Optional: if you intend to use extractors, start RabbitMQ
- Then, start Clowder itself
- Optional: finally, start any extractors that you want
Required Environment Variables
Default / acceptable values are given in parentheses.
These values are already specified inside the controllers/*-controller.yaml files.
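For reference, this is roughly what such an entry looks like inside a controller specification (a sketch only; the structure and values here are illustrative, not copied from the repository):

```yaml
# Sketch of the env section of a *-controller.yaml (illustrative values)
spec:
  template:
    spec:
      containers:
        - name: clowder
          env:
            - name: SMTP_HOST
              value: "smtp.ncsa.illinois.edu"
            - name: RABBITMQ_EXCHANGE
              value: "clowder"
```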
Clowder
The following environment variables MUST BE SET in order to start Clowder:
- SMTP_HOST: (smtp.ncsa.illinois.edu) The SMTP server that Clowder should use to send e-mail
- CLOWDER_BRANCH: (CATS-CORE0) Which branch should Clowder use to execute?
- CLOWDER_BUILD: (latestSuccessful) Which build should Clowder use to execute?
- CLOWDER_UPDATE: (NO) Whether or not to push updates back upstream
- MONGO_URI: (mongodb://[user[:password]@]MONGO_SERVICE_IP_ADDR:PORT/clowder) The URI that should be used to connect to MongoDB
- RABBITMQ_URI: (amqp://guest:guest@RABBIT_SERVICE_IP_ADDR:PORT/%2F) The URI that should be used to connect to RabbitMQ
- RABBITMQ_EXCHANGE: (clowder) The name of the RabbitMQ exchange that should be used
- RABBITMQ_MGMT_PORT: (15672) The port that should be used to access the management web UI for RabbitMQ
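As a concrete sketch of how the URI-style values fit together (the address and port below are placeholders, not real cluster values):

```shell
# Placeholder MongoDB service address/port -- substitute your cluster's values.
MONGO_SERVICE_IP_ADDR=10.0.0.10
MONGO_PORT=27017
# MONGO_URI uses the standard mongodb:// scheme; the user:password@ part is optional.
MONGO_URI="mongodb://${MONGO_SERVICE_IP_ADDR}:${MONGO_PORT}/clowder"
echo "$MONGO_URI"
```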
Extractors
The following environment variables MUST BE SET in order to start the extractors:
- RABBITMQ_URI: (amqp://guest:guest@RABBIT_SERVICE_IP_ADDR:PORT/%2F) The URI that should be used to connect to RabbitMQ
- RABBITMQ_EXCHANGE: (clowder) The name of the RabbitMQ exchange that should be used
- RABBITMQ_VHOST: (%2F) The virtual host (URI segment) of RabbitMQ
- RABBITMQ_QUEUE: (varies) The queue that this extractor should watch for work to do
- RABBITMQ_PORT_5672_TCP_ADDR: (SERVICE_IP_ADDR) The address of RabbitMQ's message bus
- RABBITMQ_PORT_5672_TCP_PORT: (5672) The port that RabbitMQ is using for its message bus
- RABBITMQ_PORT_15672_TCP_ADDR: (SERVICE_IP_ADDR) The address of RabbitMQ's management interface
- RABBITMQ_PORT_15672_TCP_PORT: (15672) The port that RabbitMQ is using for its management interface
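To see how these pieces combine, here is a sketch of assembling RABBITMQ_URI from the injected address and port (placeholder values; %2F is simply the default virtual host "/" URL-encoded):

```shell
# Placeholder values for the variables Kubernetes injects for the RabbitMQ service.
RABBITMQ_PORT_5672_TCP_ADDR=10.0.0.7
RABBITMQ_PORT_5672_TCP_PORT=5672
RABBITMQ_VHOST=%2F  # "/" URL-encoded
# Assemble the AMQP URI the extractors use to reach the message bus.
RABBITMQ_URI="amqp://guest:guest@${RABBITMQ_PORT_5672_TCP_ADDR}:${RABBITMQ_PORT_5672_TCP_PORT}/${RABBITMQ_VHOST}"
echo "$RABBITMQ_URI"
```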
Bootstrapping Clowder in Kubernetes
Step 0: Clone Source Repository
git clone https://github.com/nds-org/nds-labs
cd nds-labs/
git fetch --all
git checkout lambert-dev
TODO: Once this is pushed up to master, remove the "checkout" step.
Step 1: Build or Pull Required Images
Start up the NDSDEV container using the following command:
. ./devtools/ndsdev/ndsdevctl run
Once inside of NDSDEV, enter the services/ncsa/clowder folder:
cd /nds/src/services/ncsa/clowder
Pulling the Images from Docker Hub
From services/ncsa/clowder, run the following command to pull all required images:
make all
You should now see Docker pulling down several images. Once it is complete, you can move on to spinning up your cluster.
Building the Images from Source
WARNING: Building these images from source takes a very long time (~25 minutes). Pulling the images from Docker Hub is always preferred.
From services/ncsa/clowder, execute the following commands to build the required images from source:
make git-src
make all-from-src
Step 2: Bring Up an Empty Cluster
Set up an empty Kubernetes cluster, as described here: Cluster Setup: Development (Local)
Step 3: Bring Up All Services
...
This step MUST be done before the controllers can be started. Create the 3 necessary services using the following command:
kubectl create -f services/
Why must this be done first?
Starting these services creates a set of environment variables containing the IP, port, and protocol of each service, which are then injected into any controllers/pods that match the selector. Any pods started after these services will have these environment variables injected into their environment. This allows us to set values in the controller specification(s) like this one, for example:
- name: RABBITMQ_URI
value: "amqp://guest:guest@$(CLOWDER_RABBITMQ_PORT_5672_TCP_ADDR):$(CLOWDER_RABBITMQ_PORT_5672_TCP_PORT)/%2F"
Step 4: Bring Up Any Controllers (In Order)
1.) To bring up a MongoDB controller and pod, run the following command:
kubectl create -f controllers/phase-1/mongodb-controller.yaml
2.) (Optional) If you intend to use extractors, run this command to start RabbitMQ (see "Optional Plugins" below for starting the extractors themselves):
kubectl create -f controllers/phase-2-optional/rabbitmq-controller.yaml
3.) Run this command last to bring up Clowder itself:
kubectl create -f controllers/phase-3/clowder-controller.yaml
Step 5: Clowder GUI
Wait a minute or so for Clowder to start. Once it has started, you should be able to navigate your browser to port 30291 of your OpenStack VM's IP.
Clowder's controller specification describes a "NodePort" (port 30291) that exposes our cluster to the public internet.
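A NodePort service of that shape looks roughly like the following (a sketch under the assumption that Clowder listens on its default port 9000 inside the pod; this is not the actual file from the repository):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: clowder
spec:
  type: NodePort
  selector:
    name: clowder
  ports:
    - port: 9000        # Clowder's port inside the cluster (assumed default)
      nodePort: 30291   # exposed on every node's public IP
```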
Optional Plugins
Extractors
After starting RabbitMQ, you can run commands similar to the following in order to start your extractors:
kubectl create -f controllers/phase-2-optional/extractors/EXTRACTOR-1-controller.yaml
kubectl create -f controllers/phase-2-optional/extractors/EXTRACTOR-2-controller.yaml
...
kubectl create -f controllers/phase-2-optional/extractors/EXTRACTOR-N-controller.yaml
NOTE: Be sure to give Clowder up to 15 minutes to discover the change unless you plan on restarting the Clowder pod. You can verify that the extractors have registered with Clowder by navigating to CLOWDER_URL:PORT/api/status; they should show up in their respective section after several minutes.
ElasticSearch (Text-based Search)
WARNING: The "latest" v2.2.0 image of elasticsearch did not work with Clowder. We need to investigate the latest usable version; v1.3.9 worked without errors.
kubectl create -f controllers/elasticsearch-controller.yaml
To enable searching from the browser, you will need to install the "head" plugin for elasticsearch by running the following command:
kubectl exec `kubectl get pods | grep lastic | awk '{print $1}'` -- /usr/share/elasticsearch/bin/plugin -i mobz/elasticsearch-head
TERRA Tool Server
More information to come...
Stopping the Cluster
To shut down the cluster, simply perform all steps above in reverse with delete instead of create.
From services/ncsa/clowder, execute the following commands to shut down the whole cluster:
kubectl delete -f controllers/
kubectl delete -f services/
cd ../../..
. ./cluster/k8s/localdev/kube-down-local.sh
Tips and Tricks
Scaling Extractors
Use the following command to scale up or down an extractor on the fly:
...
NOTE: --replicas=0 is an acceptable value, allowing you to start a controller without starting its pods right away.
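Since the exact command is omitted above, here is a sketch of the usual kubectl scale invocation (the controller name below is an assumption; substitute the name shown by kubectl get rc):

```shell
# Hypothetical replication controller name and target replica count.
EXTRACTOR_RC=extractor-1-controller
REPLICAS=2
# The scale invocation has this shape (printed rather than executed here):
SCALE_CMD="kubectl scale rc ${EXTRACTOR_RC} --replicas=${REPLICAS}"
echo "$SCALE_CMD"
```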
Hot-swapping Extractors
As you bring up and take down extractors, they will (every few minutes or so) automagically register themselves with / unregister themselves from Clowder. Based on this behavior, registration is likely handled by a polling process.
...