This page builds on Gluster Alternatives and Cloud Provider Alternatives, applying the Whole Tale requirements.

Requirements

Use case: A user creates a new tale based on an existing read-only dataset. The notebook uses the data to produce new outputs. The result is published as a new tale (ideally the notebook is part of the permanent data for the tale, captured in the workspace).

From the Whole Tale project:

  • Shared home directory
    • When capturing a tale, want an exact copy of the home directory at an instant in time
    • Relates to provenance, capturing current state, can be published
  • Fast
  • POSIX?
  • Versioning:
    • Conceptually similar to object stores – when you modify a file, you create a new version while potentially maintaining the old one
    • Relates to reproducibility, allowing pointers to immutable versions of data
  • Mountable anywhere
    • sshfs
  • Notifications: may be implemented in FUSE
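
The versioning and point-in-time capture requirements above can be illustrated with a small sketch. This is a toy in-memory model, not a proposal for the actual Whole Tale implementation: the class name VersionedStore and its methods are hypothetical, and SHA-256 content hashes are assumed as version identifiers purely for illustration.

```python
import hashlib


class VersionedStore:
    """Toy in-memory store (hypothetical): each write adds a new immutable version."""

    def __init__(self):
        self._blobs = {}    # version id (content hash) -> bytes
        self._history = {}  # path -> list of version ids, oldest first

    def write(self, path, data):
        """Store data as a new version of path and return its version id."""
        version = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(version, data)
        self._history.setdefault(path, []).append(version)
        return version

    def read(self, path, version=None):
        """Return the latest contents of path, or a specific older version."""
        versions = self._history[path]
        if version is None:
            version = versions[-1]
        elif version not in versions:
            raise KeyError(f"{path} has no version {version}")
        return self._blobs[version]

    def snapshot(self):
        """Capture the current version id of every path (a point-in-time view)."""
        return {path: versions[-1] for path, versions in self._history.items()}


store = VersionedStore()
v1 = store.write("home/notebook.ipynb", b"first draft")
captured = store.snapshot()           # state recorded when the tale is captured
v2 = store.write("home/notebook.ipynb", b"second draft")

latest = store.read("home/notebook.ipynb")
pinned = store.read("home/notebook.ipynb", captured["home/notebook.ipynb"])
```

Here the snapshot keeps pointing at the first draft even after the file is modified, which is the property the reproducibility requirement asks for.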

Options

  • minio
  • NFS
  • GlusterFS
  • Btrfs/ZFS
    • DataExpLab using ZFS for testing
  • TACC Corral (GPFS)



Block v object storage


Notes

POSIX

Object storage

  • Limited access functions (PUT/GET/DELETE/POST/HEAD)
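
To show how narrow this interface is compared with POSIX, here is a minimal sketch of an object store as a key-to-bytes map. The class ObjectStore and its method names are hypothetical and only mirror the verbs listed above; real services expose them over HTTP and differ in detail.

```python
class ObjectStore:
    """Toy in-memory object store (hypothetical): whole objects addressed by key."""

    def __init__(self):
        self._objects = {}  # key -> bytes

    def put(self, key, data):
        """Create or replace the whole object; there is no partial update."""
        self._objects[key] = bytes(data)

    def get(self, key):
        """Return the whole object; raises KeyError if the key is missing."""
        return self._objects[key]

    def head(self, key):
        """Return metadata only, without the object body."""
        return {"size": len(self._objects[key])}

    def delete(self, key):
        """Remove the object; deleting a missing key is a no-op in this sketch."""
        self._objects.pop(key, None)


bucket = ObjectStore()
bucket.put("tale/output.csv", b"a,b\n1,2\n")
meta = bucket.head("tale/output.csv")
body = bucket.get("tale/output.csv")
bucket.delete("tale/output.csv")
still_there = "tale/output.csv" in bucket._objects
```

Note what is absent: no seek, append, rename, or directory operations. Changing one byte means putting the whole object again, which is why a POSIX layer on top of an object store usually needs a translation layer such as FUSE.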

Flocker:

Ceph

  • Used by SDSC OpenStack
  • Gluster v Ceph
    • Ceph= object store
    • Gluster = scale-out NAS and object store
    • Both scale out linearly
  • More Ceph v Gluster
    • Gluster performs better at higher scales
    • Majority of OpenStack implementations use Ceph
    • Gluster is classic file-serving, second-tier storage
    • Gluster = file storage with object capabilities; Ceph = object storage with block/file capabilities

Rook

minio

NFS

  • Single point of failure
  • NFSv2 and NFSv3 have host-based authentication (1). Access control through host and file/directory permissions only.
  • NFSv4 has improved security via Kerberos and ACLs
  • NFS Ganesha (user level NFS server)

GlusterFS

  • Parallel network file storage system
  • Good for large static files; immutable files
  • Bad for lots of small files; can result in split brain
  • More complex backup/restore
  • Performance degradation under certain load scenarios
  • Hard to administer (see Nebula)
  • Network authentication, POSIX ACLs
  • Version 3.7 supports NFSv4 and pNFS


Btrfs

Lustre

Other
