...

  • ncsa.archival.disk: Moves the file from one specially-designated folder on disk to another (requires access to Clowder's files on disk)
  • ncsa.archival.s3: Changes the Storage Class of an object stored in S3

The two options cannot currently be mixed, meaning that if Clowder uses DiskByteStorage then you must use the Disk archiver.

If neither of the above two extractors fits your use case, pyclowder can be used to quickly create a new archival extractor that fits your needs.

Process Overview

When a file is first uploaded, it is placed into a temp folder and a record for it is created in the DB with the state CREATED.

At this point, users can start associating metadata with the new file, even though the actual file bytes are not yet available through Clowder's API.

Clowder then begins transferring the file bytes to the configured ByteStorage driver.

Once the bytes are completely uploaded into Clowder and done transferring to the data store, the file is marked as PROCESSED.

At this point, users can access the file bytes via the Clowder API and UI, and are able to download the file as normal.

If the admin has configured the archival feature (see above), then the user is also offered a button to Archive the file.
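The file states on this page can be read as a small state machine. The following is a minimal sketch of that reading, not Clowder's actual implementation: the names FileStatus, TRANSITIONS, advance, and can_download are hypothetical, and only the state names CREATED, PROCESSED, and ARCHIVED (the last one is covered in the sections that follow) come from this page.

```python
from enum import Enum


class FileStatus(Enum):
    CREATED = "CREATED"      # record exists, bytes are still being transferred
    PROCESSED = "PROCESSED"  # bytes are in the data store and can be downloaded
    ARCHIVED = "ARCHIVED"    # bytes were moved away by the archival extractor


# Allowed transitions, as described on this page (hypothetical table).
TRANSITIONS = {
    FileStatus.CREATED: {FileStatus.PROCESSED},
    FileStatus.PROCESSED: {FileStatus.ARCHIVED},
    FileStatus.ARCHIVED: {FileStatus.PROCESSED},
}


def advance(current: FileStatus, target: FileStatus) -> FileStatus:
    """Return target if the transition is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def can_download(status: FileStatus) -> bool:
    """Only PROCESSED files have their bytes available for download."""
    return status is FileStatus.PROCESSED
```

In this sketch a CREATED file cannot be archived directly, which matches the order of events described above: the Archive button is only offered once the file is PROCESSED.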

On Archive

If a user chooses to Archive the file, then it is sent to the configured archival extractor with a parameter of operation=archive.

The extractor performs whatever operation it deems as "archiving" - for example, copying to a network file system.

Finally, the file is marked as ARCHIVED, and (if configured) the user is given the option to Unarchive the file.

If the user attempts to download an ARCHIVED file, they should be presented with a prompt to ask the admin to unarchive it.

On Unarchive

If a user chooses to Unarchive a file, then it is sent to the configured archival extractor with a parameter of operation=unarchive.

The extractor performs the inverse of whatever operation it previously defined as "archiving", bringing the file bytes back to where Clowder can access them for download.
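To make the archive/unarchive pairing concrete, here is a minimal sketch of a disk-style handler that moves a file between two folders based on the operation parameter. It is illustrative only and is not the code of ncsa.archival.disk: the function handle_operation and its arguments are hypothetical, and it assumes an archive request carries operation=archive while an unarchive request carries operation=unarchive. A real extractor built with pyclowder would call logic like this from its message handler.

```python
import shutil
from pathlib import Path


def handle_operation(operation: str, filename: str,
                     active_dir: Path, archive_dir: Path) -> Path:
    """Move a file between two folders and return its new location.

    "archive" moves the file out of the folder Clowder reads from;
    "unarchive" performs the inverse move.
    """
    if operation == "archive":
        source, target = active_dir / filename, archive_dir / filename
    elif operation == "unarchive":
        source, target = archive_dir / filename, active_dir / filename
    else:
        raise ValueError(f"unknown operation: {operation!r}")

    if not source.is_file():
        raise FileNotFoundError(source)

    # Create the destination folder on first use, then move the bytes.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

Because unarchive is the exact inverse of archive here, running the two operations back to back leaves the file where it started.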

Finally, the file is marked as PROCESSED again. The user should once more be given the option to Archive the file, and requests to download the file bytes should succeed.

Automatic File Archival

If configured (see above), Clowder can automatically archive files of sufficient size after a predetermined period of inactivity. 

By default, files that are over 1MB and have not been downloaded in the last 90 days will be automatically archived.

Both the file size and the inactivity period can be configured according to your preferences.
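The default rule above can be sketched as a simple predicate. This is an illustration of the rule as stated, not Clowder's implementation: is_archive_candidate and the constant names are hypothetical, "1MB" is assumed to mean 1,000,000 bytes, and the sketch takes the last download time as given without deciding how a never-downloaded file should be treated.

```python
from datetime import datetime, timedelta

MIN_SIZE_BYTES = 1_000_000            # assumed meaning of "1MB"
INACTIVITY_PERIOD = timedelta(days=90)


def is_archive_candidate(size_bytes: int,
                         last_downloaded: datetime,
                         now: datetime,
                         min_size_bytes: int = MIN_SIZE_BYTES,
                         inactivity: timedelta = INACTIVITY_PERIOD) -> bool:
    """True if the file is over the size limit and has been idle long enough."""
    is_large = size_bytes > min_size_bytes
    is_idle = (now - last_downloaded) > inactivity
    return is_large and is_idle
```

Both thresholds are keyword arguments so that the configurable file size and inactivity period map directly onto the call.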

Configuration Options / Defaults for Clowder

...

NOTE: on MacOSX, you may need to run the extractor with the --net=host option to connect to RabbitMQ
