Option 1: Don't do it

Option 2: Point at test and production services

Option 3 (Preferred by all if the effort is not too great): Make spaces, services, and repositories 'test-aware' via a flag

Proposal:

The "Purpose" flag currently in use will be extended to support "Testing-Only" (the current value) and/or "Production". Repositories wishing to do both through one profile would add both values (i.e., as a JSON array).
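
A minimal sketch of how a profile's Purpose could be read under this proposal, accepting either the current single string or the proposed JSON array. The "Purpose" key name, the example profile layout, and the choice to treat a missing flag as "no declared purpose" are assumptions for illustration; only the two values "Testing-Only" and "Production" come from the proposal.

```python
import json

# Values named in the proposal. The "Purpose" key and profile layout are assumed.
KNOWN_PURPOSES = {"Testing-Only", "Production"}


def profile_purposes(profile):
    """Return the set of purposes a repository profile declares.

    Accepts a single string (current form) or a list of strings
    (proposed JSON array form for repositories that support both).
    """
    value = profile.get("Purpose")
    if value is None:
        # Assumption: a profile without the flag declares no purpose.
        return set()
    if isinstance(value, str):
        value = [value]
    purposes = set(value)
    unknown = purposes - KNOWN_PURPOSES
    if unknown:
        raise ValueError(f"unrecognized Purpose value(s): {sorted(unknown)}")
    return purposes


# Hypothetical profiles, one in each form.
test_only = json.loads('{"name": "repo-a", "Purpose": "Testing-Only"}')
both = json.loads('{"name": "repo-b", "Purpose": ["Testing-Only", "Production"]}')
```

Normalizing to a set in one place means later checks do not need to care which form a given profile uses.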

The Staging Area will be modified to have the user select test or production before the matchmaking stage so that only repositories that support the request will be listed. (Since we are treating this as a go/no-go decision, unlike other rules, which are only advisory, the proposal is to not list the repositories that can't support the request. Alternatively, we could gray them out or use some other visual cue to indicate that the request will definitely fail.)

Matchmaker will look at the Purpose on the request and in the profiles and determine the match. This will be implemented as a rule. Initially, the matchmaker engine will remove entries that don't meet this requirement. Going forward, we could extend the rule definition to allow rules to be mandatory, so that the Staging Area could show repositories that cannot process the request while still stopping people from sending such requests. 
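
The initial filtering behavior described above could look roughly like the following. This is an illustrative sketch, not the matchmaker's actual rule interface: the function names, the "Purpose" key, and the dictionary shapes for requests and profiles are all assumed.

```python
def as_purpose_set(value):
    """Normalize a Purpose value (string, list, or None) to a set."""
    if value is None:
        return set()
    return {value} if isinstance(value, str) else set(value)


def purpose_rule(request, profiles):
    """Keep only the profiles that support the request's Purpose.

    Mirrors the initial behavior in the proposal: non-matching
    repositories are removed from the list rather than flagged.
    """
    wanted = request["Purpose"]
    return [p for p in profiles if wanted in as_purpose_set(p.get("Purpose"))]


# Hypothetical profiles covering the three cases.
profiles = [
    {"name": "repo-a", "Purpose": "Testing-Only"},
    {"name": "repo-b", "Purpose": ["Testing-Only", "Production"]},
    {"name": "repo-c", "Purpose": "Production"},
]

production_matches = purpose_rule({"Purpose": "Production"}, profiles)
```

If rules later gain a mandatory/advisory distinction, the same predicate could return a per-repository pass/fail result instead of dropping entries, which would let the Staging Area show the failing repositories as disabled.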

When a user submits a pub request, it is processed as is currently done by services and repositories with the following changes that depend on the Purpose flag setting:

Services: The Purpose setting should be added to the summaries provided by the api/researchobjects, api/repositories/<repo_id>/researchobjects, and /api/repositories/<repo_id>/researchobjects/new endpoints, and/or a ?purpose=<x> query parameter should be added to these endpoints so that repositories can filter the list.
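
As a sketch of the query-parameter variant, a repository client could build the request URL as follows. The base URL is a placeholder and the helper name is hypothetical; the endpoint path and the ?purpose=<x> parameter are taken from the text above, and the parameter does not exist until the services implement it.

```python
from urllib.parse import quote, urlencode


def new_requests_url(base_url, repo_id, purpose=None):
    """Build the URL for a repository's list of new requests.

    When purpose is given, append the proposed ?purpose=<x> filter.
    """
    url = (
        f"{base_url.rstrip('/')}/api/repositories/"
        f"{quote(repo_id, safe='')}/researchobjects/new"
    )
    if purpose is not None:
        url += "?" + urlencode({"purpose": purpose})
    return url


# Placeholder host, not a real service instance.
example = new_requests_url("https://services.example.org/sead", "repo-a", "Testing-Only")
```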

Repositories: Repositories should retrieve and process the requests that match their profile "Purpose" setting. (Nominally all requests will have the right setting because Clowder won't send ones that don't match, but switching the profile setting could result in some that don't match, so a quick test is advisable.) The expectation is, as it is now, that test objects will be processed in a way that does not result in a persistent public identifier or ongoing storage of the data by the repository, and in a way that mimics the processing of real objects to some useful degree. How these are defined is up to the repository. (Our reference repository and IU Cloud will end up minting test DOIs that last for two weeks, running some form of cleanup code that removes data from the repository, and mimicking production objects by presenting a temporary landing page that is similar to or exactly the same as it would be for a production publication. Other repositories could choose not to mint a PID for test objects, create a real ID and mark it as invalid at some later date, perform internal processing but not display a landing page, etc. It is advisable that repositories send status messages back that indicate how processing of the test object did or did not mimic a production request.)

Services: When a success message is returned, the service layer will decide, based on the Purpose flag, whether to send FGDC metadata to the production or test DataOne instance.
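
The routing decision could be as small as a lookup keyed on the Purpose value. In this sketch the two URLs are placeholders (the proposal does not give the DataOne endpoints), and refusing to guess when the flag is missing or unrecognized is an assumed design choice rather than something the proposal specifies.

```python
# Placeholder endpoints; the real DataOne instance URLs are not given here.
DATAONE_TARGETS = {
    "Production": "https://dataone.example.org/production",
    "Testing-Only": "https://dataone.example.org/test",
}


def dataone_target(request):
    """Pick the DataOne instance that should receive the FGDC metadata."""
    purpose = request.get("Purpose")
    if purpose not in DATAONE_TARGETS:
        # Assumption: fail rather than default, so that test metadata
        # can never reach the production instance by accident.
        raise ValueError(f"cannot route request with Purpose {purpose!r}")
    return DATAONE_TARGETS[purpose]
```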

Services: As a default, the service layer will remove test objects after 2 weeks. We could consider making this period repository- and/or identifier-specific, requiring repositories to send an 'invalid/deleted' status message, and/or having the service layer mark the request as obsolete (rather than deleting it, as is currently done for test objects after 2 weeks). The current implementation, which (I think) triggers off the use of a temporary DOI (recognized by its shoulder), would instead trigger generically off the Purpose flag having a test value.
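
A sketch of a cleanup check that keys on the Purpose flag instead of the DOI shoulder. The "submitted" timestamp field is hypothetical, and measuring the two weeks from submission is an assumption, since the text does not say when the period starts.

```python
from datetime import datetime, timedelta, timezone

# Default retention period for test objects, per the proposal.
TEST_RETENTION = timedelta(weeks=2)


def is_expired_test_request(request, now):
    """Return True if a test request is past its retention period.

    Triggers on the Purpose flag, not on the identifier, so it also
    covers repositories that never mint a temporary DOI.
    """
    if request.get("Purpose") != "Testing-Only":
        return False
    # Assumption: the period is measured from a "submitted" timestamp.
    return now - request["submitted"] >= TEST_RETENTION


now = datetime(2024, 3, 20, tzinfo=timezone.utc)
old_test = {"Purpose": "Testing-Only", "submitted": datetime(2024, 3, 1, tzinfo=timezone.utc)}
recent_test = {"Purpose": "Testing-Only", "submitted": datetime(2024, 3, 15, tzinfo=timezone.utc)}
old_production = {"Purpose": "Production", "submitted": datetime(2024, 3, 1, tzinfo=timezone.utc)}
```

Making the period repository-specific would amount to replacing the constant with a per-profile lookup.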

Spaces: Spaces should visually indicate test instances, using the value of the Purpose flag to do so. It should be clear wherever the returned PID is shown that it is temporary and should not be used as a long-term ID. (E.g. SEAD 1.5 marks test requests with a different color and shows a "Temporary DOI" flag next to the ID on test objects in the space's Published Data page.)

Services: Currently, if the Purpose is test, the API will allow a DELETE on a pub request to succeed even after a repository has started processing the request (non-test requests cannot be deleted once the repository has responded with an initial status message). As a start, we can leave this as is. If, in the future, we want test object requests that cannot be deleted (e.g. by a user in the staging area) once a repo has started working on them, we could adjust this (add another flag). Until then, repositories should be robust against errors (e.g. 404) when they try to submit status for test objects that may have been deleted.
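
On the repository side, tolerating a deleted test request might look like the sketch below. The `post` callable stands in for whatever HTTP client the repository uses and is assumed to raise `urllib.error.HTTPError` on a non-2xx response; the function name and the request fields are hypothetical.

```python
import urllib.error


def send_status(post, request, status):
    """Submit a status message, tolerating deleted test requests.

    Returns True if the status was accepted, and False if the request
    was a test request that no longer exists (HTTP 404). Any other
    failure is re-raised for the caller to handle.
    """
    try:
        post(request["id"], status)
        return True
    except urllib.error.HTTPError as err:
        if err.code == 404 and request.get("Purpose") == "Testing-Only":
            # The request was probably deleted while being processed;
            # the caller can stop work and clean up.
            return False
        raise
```

A 404 on a production request is still raised, since those cannot be deleted once processing has started and a missing one would indicate a real problem.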

Repository Specific Changes:

The Reference Repository can work in this mode already (e.g. SEAD Internal, UM-ARC, NDS test), and we will just need to update their profiles to include the appropriate Purpose values to make them test only, production, or both. One useful update would be to automatically sync the configuration with the profile. Right now, whether to generate test or production DOIs is a config option; if the test flag differs, the code generates a warning log entry and stops, and the curator processing the request has to change the config to match to go forward. (This is basically a safety check to avoid creating the wrong type of ID, since we currently do not filter requests coming from the space with an incorrect Purpose flag.) The reference repository landing page could also adopt, based on the flag value, the color/warning notice that exists in 1.5 to show that the DOI is temporary. It would also make sense to update the ref repo library to positively test for the production flag as well (versus just checking the presence/absence of the test value).

IU Cloud: Since the IU Cloud repository uses the ref repo library, it has the basic capability to switch between test and production DOI generation. If I understand correctly, however, it is currently configured as two separate web apps (at http://seadva.d2i.indiana.edu:8081/landing-page/home.html and http://seadva-test.d2i.indiana.edu/landing-page/home.html) that in turn rely on profiles on the production and test instances of the services. (Note that the seadva-test profile incorrectly identifies the production IU Cloud as the repository URL rather than its own URL.) The IU Cloud search/listing capabilities depend on live API requests to the particular service instance, and the automated publishing and clean-up scripts that are running against the test services do not yet look at the Purpose flag.

Interaction with republishing/revision capabilities: