Overview

Clowder currently offers a list of extractors that have been registered to the system. When many extractors are registered, this list can overwhelm users and make it difficult for them to find the extractor they need. Furthermore, the lack of categorization makes it difficult to discover new extractors or to suggest improvements to the maintainers of existing extractors.

Current behavior does not allow for an extractor to be "hidden", as the current admin-only list contains all extractors registered in Clowder.

Order and Options (independent of priorities)

First, we need to decide whether this Extractor Catalog should be separate from the current admin-facing UI for listing extractors. The main question is whether this new Extractor Catalog is a) an evolution of our current admin view, or b) a new view born from new objectives with a new focus. There are pros and cons to each approach.

Option A yields a head-start on some of the boilerplate work of setting up the view, but adds the additional work of locking that view down to make sure it is truly read-only for normal users while maintaining existing admin functionality.

Option B yields a more user-focused UI that should prove to be more easily testable than one that also contains all of the admin functionality, at the cost of some additional boilerplate work in setting up the new view.

Suggested Improvements

We are proposing an enhanced view or set of views for searching and discovering extractors within the catalog.

High-Level Requirements

First and foremost, we will need to expose this catalog to non-admin users.

Some organizational tools would be very helpful, such as grouping extractors into "toolboxes" with a simple string tag as an identifier. This identifier could be indicative of the use or function of the extractor, could indicate which group uses the extractor, or could even be completely arbitrary. Administrators should be allowed to choose who can create new toolboxes for further classification of extractors.
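To make the grouping idea concrete, here is a minimal sketch of a toolbox registry keyed by a plain string tag. It is not Clowder's implementation: the class name `LabelRegistry`, its methods, the case-insensitive normalization of tags, and the example extractor names are all assumptions made for illustration.

```python
class LabelRegistry:
    """Hypothetical in-memory mapping of string tags to extractor names."""

    def __init__(self, allowed_creators):
        # Assumption: administrators supply the set of users who may create tags.
        self.allowed_creators = set(allowed_creators)
        self._extractors_by_label = {}

    @staticmethod
    def _normalize(label):
        # Assumption: tags are compared case-insensitively, ignoring outer whitespace.
        key = label.strip().lower()
        if not key:
            raise ValueError("label must be a non-empty string")
        return key

    def create_label(self, user, label):
        if user not in self.allowed_creators:
            raise PermissionError(f"{user} may not create labels")
        self._extractors_by_label.setdefault(self._normalize(label), set())

    def assign(self, label, extractor_name):
        key = self._normalize(label)
        if key not in self._extractors_by_label:
            raise KeyError(f"unknown label: {key}")
        self._extractors_by_label[key].add(extractor_name)

    def extractors_for(self, label):
        # Returns an alphabetically sorted list; unknown tags yield an empty list.
        return sorted(self._extractors_by_label.get(self._normalize(label), set()))

    def labels_for(self, extractor_name):
        return sorted(
            key
            for key, names in self._extractors_by_label.items()
            if extractor_name in names
        )


# Usage with made-up user and extractor names.
registry = LabelRegistry(allowed_creators={"admin"})
registry.create_label("admin", "Imaging")
registry.assign("imaging", "example.image.preview")
registry.assign("IMAGING", "example.image.metadata")
```

Because the tag is only a string, one extractor can carry several tags at once, which matches the idea that a tag may describe function, owning group, or nothing in particular.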

The catalog should be sortable by usage, so that the extractors used most often by the user viewing the list appear first.
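One way to express this ordering is sketched below. It assumes job history is available as (user, extractor name) pairs, one per submitted job; that shape, the function name `sort_by_user_usage`, and the alphabetical tie-break are illustrative assumptions, not existing Clowder behavior.

```python
from collections import Counter


def sort_by_user_usage(extractor_names, job_history, user):
    """Order extractors by how often `user` has run them, most used first.

    job_history is a hypothetical iterable of (user, extractor_name) tuples.
    Ties, including extractors the user has never run, fall back to name order.
    """
    usage = Counter(name for who, name in job_history if who == user)
    return sorted(extractor_names, key=lambda name: (-usage[name], name))


# Usage with made-up data.
history = [("alice", "b"), ("alice", "b"), ("alice", "c"), ("bob", "a")]
ordered_for_alice = sort_by_user_usage(["a", "b", "c"], history, "alice")
```

Counting only the viewing user's jobs keeps the ordering personal; a system-wide popularity sort would simply drop the `who == user` filter.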

Furthermore, the catalog should be searchable/filterable by one or more of the following:

It would also be nice to have some additional methods to debug or collect information about an extractor. There should be a way to view the extractor logs, as well as a way to be notified when a new version of an extractor is released. Job history and metrics should be made available, along with a flag indicating whether this extractor is ready for public consumption.
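The readiness flag could drive what non-admin users see in the catalog. The sketch below assumes a boolean `public` field that defaults to hidden and an `is_admin` check supplied by the caller; both names, and the `CatalogEntry` type itself, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    # Assumption: an extractor stays hidden until someone marks it ready.
    public: bool = False


def visible_entries(entries, is_admin):
    """Admins see every entry; other users see only entries flagged public."""
    if is_admin:
        return list(entries)
    return [entry for entry in entries if entry.public]


# Usage with made-up extractor names.
catalog = [
    CatalogEntry("example.stable", public=True),
    CatalogEntry("example.experimental"),
]
names_for_user = [e.name for e in visible_entries(catalog, is_admin=False)]
```

Defaulting the flag to hidden is a conservative choice for this sketch: a newly registered extractor would not appear to regular users until it is explicitly published.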

Specific Action Items

Navigation and Privacy

Discovery and Search

Organization

Diagnostics, Maintenance, and Feedback

Changing Assumptions

Some assumptions within Clowder will be changed by carrying out the above directives, including but not limited to:

Mockups

For presentation to the user, I propose using the term "label" instead of "toolbox" or "tag" or "group".

This term is not currently overloaded within Clowder and, in my opinion, more accurately conveys that this is simply a string identifier used for organization.

Catalog View

Log Viewer

Rate & Comment

Extractor Details View

Label Management View

Create New Label

Assign Labels

Comments View

Rate & Comment

History & Metrics View