This page tracks the refactoring of the existing extractors. The original wiki page, Hosted VMs, is still used for deployments.

As we figure out who is working on what, please start with the steps below for the extractor(s) you chose. They apply to every extractor in this list:

  1. Docker containers
  2. JSONLD
  3. Extractor info registration
  4. Use pyclowder (for python extractors)
  5. Add status messages to all extractors and fix level granularity
    1. Make status constants (DONE, ERROR)
    2. Arcgis multiprocessing extractor
  6. Register on on-demand execution queues
    1. Add the on-demand key binding to the configuration file: messageType = "*.file.text.plain", "extractors."+extractorName
  7. Standardize around python logging
    1. Figure out what to log and what format to follow
  8. Add logstash to docker compose
  9. Add sample input/output to git repository
  10. Add icon for tools catalog to git repository
  11. Add entry to Tools catalog, with icon
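
As a rough illustration of steps 5 to 7 above, here is a minimal Python sketch. It is not the agreed implementation and uses only the standard library. The names Status, binding_keys, make_logger and report are hypothetical, and the log format is a placeholder until step 7.1 is settled. Only the DONE and ERROR constants and the two key patterns come from the list above.

```python
import logging
from enum import Enum


class Status(Enum):
    """Status constants from step 5.1 (class name is hypothetical)."""
    DONE = "DONE"
    ERROR = "ERROR"


def binding_keys(extractor_name, message_type="*.file.text.plain"):
    """Return the regular key plus the on-demand key from step 6.1."""
    return [message_type, "extractors." + extractor_name]


def make_logger(extractor_name):
    """Create one logger per extractor; the format string is a placeholder."""
    logger = logging.getLogger(extractor_name)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def report(logger, status, message):
    """Log a status message, using ERROR level only for Status.ERROR."""
    level = logging.ERROR if status is Status.ERROR else logging.INFO
    logger.log(level, "%s: %s", status.value, message)
    return level
```

For example, binding_keys("ncsa.image.ocr") gives the two keys that the OCR extractor would bind to under this scheme, and report(make_logger("ncsa.image.ocr"), Status.DONE, "finished") writes one INFO line in the shared format.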

 


| ID (Extractor Name from config file, same as queue name) | Programming Language | Software | OS | Can be Dockerized? | Assigned To | Repo | Author |
|---|---|---|---|---|---|---|---|
| DEPLOYED | | | | | | | |
| ncsa.image.ocr | Python | Tesseract | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/ocr | |
| ncsa.cv.faces | Python | OpenCV | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/opencv | Liana |
| ncsa.cv.eyes | Python | OpenCV | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/opencv | Liana |
| ncsa.cv.closeups | Python | OpenCV | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/opencv | Liana |
| ncsa.cv.profiles | Python | OpenCV | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/opencv | Liana |
| ncsa.cellprofiler.fluorescentcomet | Python | pymedici (?) | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.fly | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.human | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.silvercomet | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.speckle | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.trackobject | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.tumor | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.yeast | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.image.sphog | Python | Matlab, mnist-sphog | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/handwritten/HandwrittenNumbers | |
| ncsa.image.caltech101 | | | | | | | |
| ncsa.bisque.histogram (notes: disabled) | Python | | Linux | | | | |
| ncsa.bisque.metadata (notes: disabled) | Python | | Linux | | | | |
| census-section-segmentor | Java | | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/census | Liana, Inna |
| ncsa.cv.river | Python | OpenCV (python), convert (from imagemagick), and Gdal | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/river | Liana |
| ncsa.geo.shpExtractor | Python | gdal | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-geo/browse | Jong Lee |
| ncsa.geo.tiffExtractor | Python | gdal | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-geo/browse | Jong Lee |
| ncsa.image.geotiff | Python | GDAL, Cython, numpy, pygeoprocessing | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-geotiff/browse | Rui, Mostafa Elag |
| ncsa.image.ponddetect | Python | Matlab | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-maps/browse/feature_detection | Marcus, Ankit |
| ncsa.image.humanpref | Python | Matlab | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-maps/browse/humanpref | Marcus, Ankit |
| ncsa.xml.greenindexroute, ncsa.csv.greenindexroute | Python | OpenCV | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-maps/browse/greenroute | Marcus |
| ncsa.image.knn_numerals | Python | OpenCV | Linux | | | | Marcus |
| ncsa.audio.speech2text | Java | CMU Sphinx, ffmpeg, sox | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/audio/speech2text | Marcus |
| ncsa.audio.preview | Python | | | | Inna | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/audio/preview | |
| ncsa.nlp.simplelanguage | Python | numpy | | | Inna | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/SimpleLanguage | Liana |
| ncsa.nlp.simplesummary | Python | Natural Language Toolkit (NLTK) for Python, NLTK Data or at least: nltk.corpus, nltk.stem.porter and nltk.tokenize.punkt. | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/SimpleSummary | Liana |
| ncsa.nlp.SNLPSentiment | Java | Stanford CoreNLP tool, java, maven | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/SNLP/SNLPSentimentExtractor | Liana, Marcus(?) |
| ncsa.nlp.wordtables | Python | requests, pika, win32com | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/WordTablesExtractor | Liana |
| siegfried | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-siegfried/browse | Gregory Jansen |
| ncsa.versus.image | Java | Versus | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-versus/browse | Kenton, Smruti |
| ncsa.image.preview (note: check if really deployed. there is an extractor in Hosted VMs list with a similar name.) | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/image/preview | Rob, Sandeep |
| ncsa.pdf.preview (note: check if really deployed. there is an extractor in Hosted VMs list with a similar name.) | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/pdf/preview | Rob |
| ncsa.video.preview (note: check if really deployed. there is an extractor in Hosted VMs list with a similar name.) | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/video/preview | Rob |
| NOT DEPLOYED | | | | | | | |
| ncsa.image.digitpy | Python | opencv | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/handwritten/SimpleDigitPython | |
| ... | | | | | | | |
| ncsa.cv.pdfimages | | pdfimages, from poppler-utils | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/poppler | |
| ncsa.cv.caltech101 | Python | Matlab and VLFeat | 64-bit Mac OS | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/vlfeat | |
dbpediaPython Natural Language Toolkit (NLTK) and rdflib.  
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
dbpedia/browse
/opencvLiana

ncsa.cv.eyes

PythonOpenCVLinuxInna
digestPython    https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
digest/browse
/opencv
Liana
 
ncsa.
cv.closeups
hpcPython 
OpenCV
 
Linux
 
Inna
 https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
hpc/browse
/opencvLiana
LSVAJava    

ncsa.cv.profiles

PythonOpenCVLinuxInna
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
lsva/browse
/opencv
Liana

ncsa.cellprofiler.fluorescentcomet

Pythonpymedici (question)
, Constantinos
LSVA integrated   
Windows
 https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-lsva-
cv
integrated/browse
/cellprofiler
Liana
ncsa.
cellprofiler.fly
movieslicePython 
Windows
   https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
movieslice/browse
/cellprofilerLiana
Sandeep
mri2meshPythonpymedici, subprocess, logging, os, numpy, shutil, zipfile  

ncsa.cellprofiler.human

Python Windows
 https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
mri/browse/
cellprofiler
mri2mesh
Lianancsa.cellprofiler.silvercomet
Marcus
msc-ChemCBCExtractorPythonrequests, 
pika, openpyxl, xlrd, pymongoLinux
Windows
 https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
msc/browse/
cellprofilerLiana

ncsa.cellprofiler.speckle

Python Windows
ChemCBCExtractorYan
msc-IsletExtractorPythonrequests, pika, openpyxl, xlrd, pymongoLinux 
 
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
msc/browse/
cellprofiler
IsletExtractor
Lianancsa.cellprofiler.trackobject
Yan
msc-MonitorExtractorPythonrequests, pika, openpyxl, xlrd, pymongoLinux 
Windows
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
msc/browse/
cellprofiler
MonitorExtractor
Liana
Yan
ncsa.
cellprofiler
msc.
tumor
dailymonitorPythonrequests, pika, openpyxl, xlrd, pymongo 
Windows
 
 
not usedhttps://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
msc/browse/
cellprofiler
OldMonitorExtractor
Lianancsa.cellprofiler.yeast
Ashwini
msc-PhenotypeExtractorPython

requests, pika, openpyxl, xlrd, pymongo

Linux 
Windows
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
msc/browse/
cellprofiler
PhenotypeExtractor
Liana
Yan
ncsa.
image
nlp.
sphog
SNLP
Python
Java 
Matlab, mnist-sphog
Stanford CoreNLP tool, java, maven 
Linux
  https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
nlp/browse/
handwritten
SNLP/
HandwrittenNumbers
SNLPExtractor
 
Liana
ncsa.
image.caltech101      ncsa.bisque.histogram (notes: disabled)Python Linux   ncsa.bisque.metadata (notes: disabled)Linux
nlp.tikaPython 
Tika project page, pymedici  
 
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/tikaLiana
person-detectorPython MATLAB, FFMPEG, requests and pika  
census-section-segmentorJava Linux
 https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-person-
cv
detector/browse/
census
python
Liana, Inna
Sandeep
ncsa.
cv.river PythonOpenCV (python), convert (from imagemagick), and Gdal
person-trackerPythonpython, MATLAB, FFMPEG requests and pika  
Linux
 https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
person-tracking/browse/
river
python
Liana
Sandeep
ncsa
terra.
geo.shpExtractor
plantcvPython
gdal

pika
requests
wheel

  
Linux
 https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
geo
plantcv/browse
Jong Lee
Yan
medici_PTM_thumbnailsJava   
ncsa.geo.tiffExtractorPythongdalLinux
 https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
geo
ptm/browse
Jong Lee

ncsa.image.ponddetect

PythonMatlabLinux
/PTMThumbnailExtractorConstantinos
medici_PTM_metadataJava    
Marcus
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
maps
ptm/browse
/feature_detectionMarcus, Ankit
/PTMMetadataExtractorConstantinos

Name not clear

PtmMetadata(?)

Java    
ncsa.image.humanprefPythonMatlabLinuxMarcus
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
maps
ptm/browse/
humanprefMarcus, Ankit

ncsa.xml.greenindexroute, ncsa.csv.greenindexroute

PythonOpenCVLinux
PTMMetadataConstantinos
medici_ptm_mapsJava    
Marcus
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
maps
ptm/browse/
greenroute
PTMMapsExtractor
Marcus
Constantinos
medici_ptm_3dJava  

ncsa.image.knn_numerals

PythonOpenCVLinuxMarcus Marcus

ncsa.audio.speech2text

JavaCMU Sphinx, ffmpeg, soxLinuxMarcus Marcusncsa.nlp.simplelanguagePythonnumpy
  https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
nlp
ptm/browse/
SimpleLanguage
PTM3DExtractor
Lianancsa.nlp.simplesummaryPython

Natural Language Toolkit (NLTK) for Python, NLTK Data or at least:

Constantinos
medici_images_ptmJava  
 nltk.corpus,nltk.stem.porter and nltk.tokenize.punkt.
  https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
nlp
ptm/browse/
SimpleSummaryLianancsa.nlp.SNLPSentimentJava Stanford CoreNLP tool, java, maven
ImagesPTMExtractorConstantinos

extractors-rabbitmq

(look like examples)

     https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
nlp
rabbitmq/browse
/SNLP/SNLPSentimentExtractorLiana, Marcus(?)ncsa.nlp.wordtablesPython requestspikawin32com
 
Name not clear extractors-seabird/Scala    https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS
/repos/extractors-nlp/browse/WordTablesExtractorLiana
/repos/extractors-seabird/browseLuigi
medici_3d_x3d (one of extractors-3d)Java
                     NOT DEPLOYED   
   
ncsa.image.digitpy (notes: not in the Wiki page)Pythonopencv
 
 
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
3d/browse/
handwritten/SimpleDigitPython
ObjJSONExtractor
 
Constantinos
medici_3d_obj_merger (one of extractors-3d)Java  
ncsa.cv.pdfimages (not in the wiki page) pdfimages, from poppler-utils
  https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
3d/browse/
poppler ncsa.cv.caltech101PythonMatlab and VLFeat 
OBJMergerExtractorConstantinos
medici_oni (one of extractors-3d)Java   
64-bit Mac OS 
 https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
cv
3d/browse/
vlfeat dbpediaPython Natural Language Toolkit (NLTK) and rdflib. 
OniExtractorConstantinos
medici_ply_obj (one of extractors-3d)Java    
Luigi Marini
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
dbpedia
3d/browse
Luigi Marini
/PlyObjExtractorConstantinos
medici_3d_metadata (one of extractors-3d) Java 
digestPython
   https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
digest
3d/browse/ThreeDMetadataExtractor
 
Constantinos
medici_x3d_html (one of extractors-3d) Java 
ncsa.image.geotiffPython

GDAL, Cython, numpy,
pygeoprocessing,
pika,
requests

LinuxRuihttps://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-geotiff/browseRui, Mostafa Elagncsa.hpcPython
   https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
hpc
3d/browse
 LSVAJava  
/X3DhtmlExtractorConstantinos
ncsa.arcgis.landsat7mosaicPythonArcGISWindowsNo
 
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
lsva
bd-cz/browse
 LSVA integrated   
/ndviextractorSmruti
ncsa.arcgis.floodplainPythonArcGISWindowsNohttps://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
lsva
bd-
integrated
cz/browse
 
/terex_floodplain/config.pySmruti
medici_bookJava 
ncsa.movieslicePython
   https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
movieslice
books/browse/BookPreviewExtractor
Sandeepmri2meshPythonpymedici, subprocess, logging, os, numpy, shutil, zipfile
Theerasit Issaranon
medici_image_pyramidJava    https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
mri
books/browse/
mri2meshMarcus
ImagePreviewPyramidExtractor-shebookTheerasit Issaranon
shebookJava 
msc-ChemCBCExtractorPython
   

https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-

msc

books/browse/

ChemCBCExtractorYanmsc-IsletExtractorPythonrequests, pika, openpyxl, xlrd, pymongo 

SheBookPreviewExtractor/src/BookPreviewExtractor

 

https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-

msc/browse/IsletExtractorYanmsc-MonitorExtractorPythonrequests, pika, openpyxl, xlrd, pymongo

books/browse/SheBookPreviewExtractor/src/bookpreviewextractor

Theerasit Issaranon
lsva-ceddJava    https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
msc
cedd/browse
/MonitorExtractor
Yan
Constantinos
ncsa.
msc.dailymonitor
cinemetricsPython
requests, pika, openpyxl, xlrd, pymongo
    https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
msc
cinemetrics/browse
/OldMonitorExtractor
Ashwini
Constantinos
ncsa.image.metadata
msc-PhenotypeExtractor
Python
requests, pika, openpyxl, xlrd, pymongo
    https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-
msc
core/browse/image/
PhenotypeExtractor
metadata
Yan
Max. Rob
ncsa.
nlp
debod.
SNLP
segmentor 
Java
  
Stanford CoreNLP tool, java, maven
  https://opensource.ncsa.illinois.edu/bitbucket/projects/
CATS
DEBOD/repos/extractors-
nlp
cellsegmentor/browse
/SNLP/SNLPExtractor
Liana
 
ncsa.
nlp
image.
tika
dmp
Python
  
Tika project page, pymedici
   

https://opensource.ncsa.illinois.edu/bitbucket/projects/

CATS

DEBOD/repos/extractors-

nlp

debod/browse

/tikaLianaperson-detectorPython

 MATLAB, FFMPEG, requests and pika  

https://opensource.ncsa.illinois.edu/bitbucket/projects/

CATS

DEBOD/repos/extractors-

person-detector

dmp/browse

/python

Sandeep
 
ncsa
.person-trackerPythonpython, MATLAB, FFMPEG requests and pika
.image.sphog.debod     https://opensource.ncsa.illinois.edu/bitbucket/projects/
CATS
DEBOD/repos/extractors-
person-tracking
handwrittendecimals/browse
/python
Sandeep
 
terra.plantcvPython

pika
requests
wheel

ncsa.image.iarp_remove_circle

     https://opensource.ncsa.illinois.edu/bitbucket/projects/
CATS
IARP/repos/
extractors-plantcv/browseYan
image_fetcher/browse/extractors/remove_circleMarcus
ncsa.cv.meangrey
medici_PTM_thumbnails     Constantinos   
     
      
https://opensource.ncsa.illinois.edu/bitbucket/projects/IARP/repos/image_fetcher/browse/extractors/mean_greyMarcus