...

We are proposing an enhanced view or set of views for searching and discovering extractors within the catalog.

High-Level Requirements

First and foremost, we will need to expose this catalog to non-admin users.

Some organizational tools would be very helpful, such as grouping extractors into "toolboxes" identified by a simple string tag. This identifier could indicate the use or function of the extractor, indicate which group uses it, or be completely arbitrary. Administrators should be able to choose who can create new toolboxes for further classification of extractors.

The catalog should be sortable so that the extractors used most by the user viewing the list appear first.

Furthermore, the catalog should be searchable/filterable by one or more of the following:

  • a particular space
  • a particular file trigger
  • a particular metadata term
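
As a rough illustration of the filters listed above, here is a minimal Python sketch. The `Extractor` record and its field names (`spaces`, `file_triggers`, `metadata_terms`) are hypothetical and are not taken from Clowder's data model. It assumes each filter is optional and that supplied filters are combined with AND.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Extractor:
    # Hypothetical catalog record; all field names are assumptions.
    name: str
    spaces: Set[str] = field(default_factory=set)
    file_triggers: Set[str] = field(default_factory=set)
    metadata_terms: Set[str] = field(default_factory=set)


def filter_catalog(
    extractors: List[Extractor],
    space: Optional[str] = None,
    file_trigger: Optional[str] = None,
    metadata_term: Optional[str] = None,
) -> List[Extractor]:
    """Return the extractors matching every filter that was supplied."""
    results = []
    for ex in extractors:
        if space is not None and space not in ex.spaces:
            continue
        if file_trigger is not None and file_trigger not in ex.file_triggers:
            continue
        if metadata_term is not None and metadata_term not in ex.metadata_terms:
            continue
        results.append(ex)
    return results
```

Calling `filter_catalog(catalog)` with no filters returns the full list, so the same function could back both the unfiltered and filtered views.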

It would also be nice to have some additional methods to debug or collect information about an extractor. There should be a way to view the extractor logs, as well as a way to be notified when a new version of an extractor is released. Job history and metrics should be made available, along with a flag indicating whether this extractor is ready for public consumption.

Specific Action Items

Navigation and Privacy

...

  • Core set of metadata for Clowder to operate - allow admins to tag extractors as they currently can with files/datasets/collections
  • Core set of metadata on our instance's catalog - allow admins to tag extractors as they currently can with files/datasets/collections, inherit from above (indicate sources of tags, if possible)
  • Core set of metadata for a specific toolbox - allow admins to tag extractors as they currently can with files/datasets/collections, inherit from the two levels above (indicate sources of tags, if possible)
  • Sort my extractors by usage - list most recently/heavily used extractors first?
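
One possible way to model the tag inheritance described above, while still indicating where each tag came from, is sketched below in Python. The level names (`core`, `instance`, `toolbox`) and the `resolve_tags` helper are hypothetical. The sketch assumes that a tag defined at a broader level keeps that level as its source even if a narrower level repeats it.

```python
from typing import Dict, Iterable

# Hypothetical level names, ordered from broadest to narrowest.
LEVELS = ("core", "instance", "toolbox")


def resolve_tags(
    core: Iterable[str],
    instance: Iterable[str],
    toolbox: Iterable[str],
) -> Dict[str, str]:
    """Merge tags from all three levels, mapping each tag to its source level."""
    resolved: Dict[str, str] = {}
    for level, tags in zip(LEVELS, (core, instance, toolbox)):
        for tag in tags:
            # setdefault keeps the broadest level that defined the tag.
            resolved.setdefault(tag, level)
    return resolved
```

Returning a tag-to-source mapping lets the UI show inherited tags differently from tags set directly on the toolbox.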

Diagnostics, Maintenance, and Feedback

  • See the log files for an extractor - self-explanatory, linking to Graylog or similar may be enough
  • Users should be able to comment on and rate extractors - new widget+API for comments and ratings, new db collection(s)
  • I want to know the job history of an extractor within the catalog - can lean on existing APIs where possible
  • Stats on extractor page based on metrics - we are not yet collecting metrics such as CPU/MEM usage, so we will need to talk about how that will work, but we can show number of uses, top users, etc
  • Notifications with extractor version changes - how do we let users know that a newer version of an extractor has been deployed?
  • Flag extractor in dev, staging, prod - allow extractor maintainers to set a "development status" on their running extractor to let users know if it is ready to use
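
The usage figures mentioned above (number of uses, top users) could in principle be derived from job history alone. The Python sketch below assumes a hypothetical job record shaped as a dict with `extractor` and `user` keys; the real job history API may expose different fields.

```python
from collections import Counter
from typing import Dict, List


def usage_stats(jobs: List[Dict[str, str]], extractor: str, top_n: int = 3) -> dict:
    """Count how often an extractor ran and which users ran it most."""
    # Tally jobs per user, considering only the requested extractor.
    per_user = Counter(job["user"] for job in jobs if job["extractor"] == extractor)
    return {
        "uses": sum(per_user.values()),
        "top_users": per_user.most_common(top_n),
    }
```

The same per-user counts could also drive the "sort my extractors by usage" item, by ranking extractors on the viewing user's own job count.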

...