Create an extractor-XXXX VM and deploy the OCR and NLP extractors found on dts1 (simpleocrextractor.py, languageextractor.py, simplesummaryextractor.py).
The conf files for the Upstart scripts can also be found on dts1. I think the best approach is to copy the extractors directly from the dts1 VM, since some of the configuration there is not in the original code.
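The copy step could be sketched roughly as below. This is only an illustration: the helper name `build_copy_commands` and both directory paths are hypothetical placeholders (the actual locations on dts1 are not stated in this ticket), and the extractors may well live in separate directories. The script only builds and prints `scp` commands so they can be reviewed before anything is run.

```python
import shlex

# Extractor scripts named in this ticket.
EXTRACTORS = [
    "simpleocrextractor.py",
    "languageextractor.py",
    "simplesummaryextractor.py",
]


def build_copy_commands(source_host, source_dir, dest_dir, files=EXTRACTORS):
    """Return one scp argument list per file; nothing is executed here."""
    commands = []
    for name in files:
        remote = f"{source_host}:{source_dir.rstrip('/')}/{name}"
        # -p preserves modification times and file modes of the originals.
        commands.append(["scp", "-p", remote, dest_dir])
    return commands


if __name__ == "__main__":
    # ASSUMPTION: both paths are placeholders, not the real locations on dts1.
    for cmd in build_copy_commands("dts1", "/path/to/extractors", "/path/to/extractors/"):
        print(shlex.join(cmd))
```

The same helper can be pointed at the Upstart conf files by passing their names as `files` and their directory as `source_dir`.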
The requirements for each of the extractors can be found here:
https://opensource.ncsa.illinois.edu/stash/projects/MMDB/repos/extractors-cv/browse/ocr/SimpleOCR
https://opensource.ncsa.illinois.edu/stash/projects/MMDB/repos/extractors-nlp/browse/SimpleLanguage
https://opensource.ncsa.illinois.edu/stash/projects/MMDB/repos/extractors-nlp/browse/SimpleSummary