This page tracks the refactoring of the existing extractors. The original wiki page, Hosted VMs, is still used for the deployments.

 

As we figure out who's working on what, please start with the following steps for the extractor(s) you chose:

Steps to take for every extractor in this list:

  1. Docker containers
  2. JSONLD
  3. Extractor info registration
  4. Use pyclowder (for Python extractors)
  5. Add status messages to all extractors and fix level granularity
    1. Make status constants (DONE, ERROR)
    2. Arcgis multiprocessing extractor
  6. Register on on-demand execution queues
    1. Add on-demand key binding to the configuration file: messageType = "*.file.text.plain", "extractors."+extractorName
  7. Standardize around python logging
    1. Figure out what to log and what format to follow
  8. Add logstash to docker compose
  9. Add sample input/output to git repository
  10. Add icon for tools catalog to git repository
  11. Add entry to Tools catalog, with icon
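
Steps 5 to 7 above could look roughly like the following in a Python extractor. This is a minimal sketch, not the pyclowder API: the names `StatusMessage`, `on_demand_bindings` and `configure_logging` are hypothetical, and the log line format is only an assumption, since the format to follow is still to be decided.

```python
import logging
from enum import Enum


class StatusMessage(Enum):
    """Status constants shared by all extractors (names taken from step 5)."""
    DONE = "DONE"
    ERROR = "ERROR"


def on_demand_bindings(extractor_name, message_types):
    """Return the routing keys an extractor would bind its queue to.

    Keeps the regular message-type keys and appends the on-demand key
    "extractors.<extractorName>" described in step 6.
    """
    return list(message_types) + ["extractors." + extractor_name]


def configure_logging(extractor_name, level=logging.INFO):
    """Set up a logger with one uniform (assumed) line format."""
    logger = logging.getLogger(extractor_name)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)-7s %(name)s: %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger


if __name__ == "__main__":
    name = "ncsa.image.ocr"
    log = configure_logging(name)
    log.info("binding keys: %s", on_demand_bindings(name, ["*.file.text.plain"]))
    log.info("status: %s", StatusMessage.DONE.value)
```

Keeping the constants and the binding helper in one shared module would let every extractor report status and register its on-demand key the same way.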

 

| ID (Extractor Name from config file, same as queue name) | Programming Language | Software | OS | Can be Dockerized? | Assigned To | Repo | Author |
DEPLOYED
| ncsa.image.ocr | Python | Tesseract | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/ocr | |
| ncsa.cv.faces | Python | OpenCV | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/opencv | Liana |
| ncsa.cv.eyes | Python | OpenCV | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/opencv | Liana |
| ncsa.cv.closeups | Python | OpenCV | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/opencv | Liana |
| ncsa.cv.profiles | Python | OpenCV | Linux | | Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/opencv | Liana |
| ncsa.cellprofiler.fluorescentcomet | Python | pymedici (?) | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.fly | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.human | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.silvercomet | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.speckle | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.trackobject | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.tumor | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.cellprofiler.yeast | Python | | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/cellprofiler | Liana |
| ncsa.image.sphog | Python | Matlab, mnist-sphog | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/handwritten/HandwrittenNumbers | |
| ncsa.image.caltech101 | | | | | | | |
| ncsa.bisque.histogram (notes: disabled) | Python | | Linux | | | | |
| ncsa.bisque.metadata (notes: disabled) | Python | | Linux | | | | |

| census-section-segmentor | Java | | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/census | Liana, Inna |
| ncsa.cv.river | Python | OpenCV (python), convert (from imagemagick), and Gdal | Linux | | Inna (maybe?) | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/river | Liana |
| ncsa.geo.shpExtractor | Python | gdal | Linux | | Inna | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-geo/browse | Jong Lee |
| ncsa.geo.tiffExtractor | Python | gdal | Linux | | Inna | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-geo/browse | Jong Lee |
| ncsa.image.geotiff | Python | GDAL, Cython, numpy, pygeoprocessing | Linux | | Inna, Rui | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-geotiff/browse | Rui, Mostafa Elag |
| ncsa.image.ponddetect | Python | Matlab | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-maps/browse/feature_detection | Marcus, Ankit |
| ncsa.image.humanpref | Python | Matlab | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-maps/browse/humanpref | Marcus, Ankit |
| ncsa.xml.greenindexroute, ncsa.csv.greenindexroute | Python | OpenCV | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-maps/browse/greenroute | Marcus |
| ncsa.image.knn_numerals | Python | OpenCV | Linux | | | | Marcus |
| ncsa.audio.speech2text | Java | CMU Sphinx, ffmpeg, sox | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/audio/speech2text | Marcus |
| ncsa.audio.preview | Python | | | | Inna | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/audio/preview | |
| ncsa.nlp.simplelanguage | Python | numpy | | | Inna | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/SimpleLanguage | Liana |
| ncsa.nlp.simplesummary | Python | Natural Language Toolkit (NLTK) for Python, NLTK Data or at least: nltk.corpus, nltk.stem.porter and nltk.tokenize.punkt | | | Liana | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/SimpleSummary | Liana |
| ncsa.nlp.SNLPSentiment | Java | Stanford CoreNLP tool, java, maven | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/SNLP/SNLPSentimentExtractor | Liana, Marcus(?) |
| ncsa.nlp.wordtables | Python | requests, pika, win32com | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/WordTablesExtractor | Liana |
| siegfried | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-siegfried/browse | Gregory Jansen |
| ncsa.versus.image | Java | Versus | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-versus/browse | Kenton, Smruti |
| ncsa.image.preview (note: check if really deployed. There is an extractor in the Hosted VMs list with a similar name.) | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/image/preview | Rob, Sandeep |
| ncsa.pdf.preview (note: check if really deployed. There is an extractor in the Hosted VMs list with a similar name.) | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/pdf/preview | Rob |
| ncsa.video.preview (note: check if really deployed. There is an extractor in the Hosted VMs list with a similar name.) | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/video/preview | Rob |
NOT DEPLOYED
| ncsa.image.digitpy | Python | opencv | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/handwritten/SimpleDigitPython | |
| ncsa.cv.pdfimages | | pdfimages, from poppler-utils | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/poppler | |
| ncsa.cv.caltech101 | Python | Matlab and VLFeat | 64-bit Mac OS | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cv/browse/vlfeat | |
| dbpedia | Python | Natural Language Toolkit (NLTK) and rdflib | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-dbpedia/browse | |
| digest | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-digest/browse | |
| ncsa.hpc | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-hpc/browse | |
| LSVA | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-lsva/browse | Liana, Constantinos |
| LSVA integrated | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-lsva-integrated/browse | |
| ncsa.movieslice | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-movieslice/browse | Sandeep |
| mri2mesh | Python | pymedici, subprocess, logging, os, numpy, shutil, zipfile | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-mri/browse/mri2mesh | Marcus |
| msc-ChemCBCExtractor | Python | requests, pika, openpyxl, xlrd, pymongo | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-msc/browse/ChemCBCExtractor | Yan |
| msc-IsletExtractor | Python | requests, pika, openpyxl, xlrd, pymongo | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-msc/browse/IsletExtractor | Yan |
| msc-MonitorExtractor | Python | requests, pika, openpyxl, xlrd, pymongo | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-msc/browse/MonitorExtractor | Yan |
| ncsa.msc.dailymonitor | Python | requests, pika, openpyxl, xlrd, pymongo | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-msc/browse/OldMonitorExtractor | Ashwini |
| msc-PhenotypeExtractor | Python | requests, pika, openpyxl, xlrd, pymongo | Linux | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-msc/browse/PhenotypeExtractor | Yan |
| ncsa.nlp.SNLP | Java | Stanford CoreNLP tool, java, maven | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/SNLP/SNLPExtractor | Liana |
| ncsa.nlp.tika | Python | Tika project page, pymedici | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-nlp/browse/tika | Liana |
| person-detector | Python | MATLAB, FFMPEG, requests and pika | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-person-detector/browse/python | Sandeep |
| ncsa.person-tracker | Python | python, MATLAB, FFMPEG, requests and pika | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-person-tracking/browse/python | Sandeep |
| terra.plantcv | Python | pika, requests, wheel | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-plantcv/browse | Yan |
| medici_PTM_thumbnails | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-ptm/browse/PTMThumbnailExtractor | Constantinos |
| medici_PTM_metadata | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-ptm/browse/PTMMetadataExtractor | Constantinos |
| Name not clear: PtmMetadata(?) | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-ptm/browse/PTMMetadata | Constantinos |
| medici_ptm_maps | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-ptm/browse/PTMMapsExtractor | Constantinos |
| medici_ptm_3d | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-ptm/browse/PTM3DExtractor | Constantinos |
| medici_images_ptm | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-ptm/browse/ImagesPTMExtractor | Constantinos |
| extractors-rabbitmq (looks like examples) | | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-rabbitmq/browse | |
| Name not clear: extractors-seabird/ | Scala | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-seabird/browse | Luigi |
| medici_3d_x3d (one of extractors-3d) | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-3d/browse/ObjJSONExtractor | Constantinos |
| medici_3d_obj_merger (one of extractors-3d) | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-3d/browse/OBJMergerExtractor | Constantinos |
| medici_oni (one of extractors-3d) | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-3d/browse/OniExtractor | Constantinos |
| medici_ply_obj (one of extractors-3d) | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-3d/browse/PlyObjExtractor | Constantinos |
| medici_3d_metadata (one of extractors-3d) | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-3d/browse/ThreeDMetadataExtractor | Constantinos |
| medici_x3d_html (one of extractors-3d) | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-3d/browse/X3DhtmlExtractor | Constantinos |
| ncsa.arcgis.landsat7mosaic | Python | ArcGIS | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-bd-cz/browse/ndviextractor | Smruti |
| ncsa.arcgis.floodplain | Python | ArcGIS | Windows | No | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-bd-cz/browse/terex_floodplain/config.py | Smruti |
| medici_book | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-books/browse/BookPreviewExtractor | Theerasit Issaranon |
| medici_image_pyramid | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-books/browse/ImagePreviewPyramidExtractor-shebook | Theerasit Issaranon |
| shebook | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-books/browse/SheBookPreviewExtractor/src/BookPreviewExtractor; https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-books/browse/SheBookPreviewExtractor/src/bookpreviewextractor | Theerasit Issaranon |
| lsva-cedd | Java | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cedd/browse | Constantinos |
| ncsa.cinemetrics | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-cinemetrics/browse | Constantinos |
| ncsa.image.metadata | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-core/browse/image/metadata | Max. Rob |
| ncsa.debod.segmentor | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/DEBOD/repos/extractors-cellsegmentor/browse | |
| ncsa.image.dmp | Python | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/DEBOD/repos/extractors-debod/browse; https://opensource.ncsa.illinois.edu/bitbucket/projects/DEBOD/repos/extractors-dmp/browse | |
| ncsa.image.sphog.debod | | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/DEBOD/repos/extractors-handwrittendecimals/browse | |
| ncsa.image.iarp_remove_circle | | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/IARP/repos/image_fetcher/browse/extractors/remove_circle | Marcus |
| ncsa.cv.meangrey | | | | | | https://opensource.ncsa.illinois.edu/bitbucket/projects/IARP/repos/image_fetcher/browse/extractors/mean_grey | Marcus |