...
- Shared home directory
  - When capturing a tale, we want an exact copy of the home directory at an instant in time
  - Relates to provenance: capturing the current state so it can be published
  - Fast
  - POSIX?
- Versioning (sketched after this list):
  - Conceptually similar to object stores: modifying a file creates a new version while potentially retaining the old one
  - Relates to reproducibility, allowing pointers to immutable versions of data
- Mountable anywhere
  - sshfs
- Notifications: may be implemented via FUSE (see the second sketch after this list)
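A minimal sketch of the capture/versioning idea above: each capture copies the home directory into a new immutable, timestamped version, so a published tale can hold a stable pointer to exactly the state it was captured from. All names here are hypothetical illustrations, not project code.

```python
import hashlib
import shutil
import time
from pathlib import Path

def capture_home(home: Path, store: Path) -> Path:
    """Copy `home` into a new immutable, timestamped version directory.

    Earlier versions are never modified (the object-store model: a
    modification creates a new version rather than mutating the old one),
    so a published tale can keep a stable pointer to the captured state.
    """
    version = time.strftime("%Y%m%dT%H%M%S")
    dst = store / version
    shutil.copytree(home, dst, symlinks=True)

    # Record a checksum over the tree so the version can be verified later.
    digest = hashlib.sha256()
    for path in sorted(p for p in dst.rglob("*") if p.is_file()):
        digest.update(path.read_bytes())
    (store / f"{version}.sha256").write_text(digest.hexdigest())
    return dst

# e.g. capture_home(Path("/home/user"), Path("/tales/user/versions"))
```

A full copy obviously works against the "fast" requirement; a real implementation would more likely take a filesystem-level snapshot (e.g. `zfs snapshot pool/home@capture`), which is one reason ZFS recurs in the options below.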
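And a sketch of the notification idea: a FUSE passthrough filesystem sees every create and write and can fire a callback. This assumes the fusepy Python binding, which the notes don't actually name; error handling and most operations are omitted.

```python
import os
import sys
from fuse import FUSE, Operations  # fusepy

class NotifyFS(Operations):
    """Passthrough filesystem that reports create/write events."""

    def __init__(self, root, on_event):
        self.root = root
        self.on_event = on_event

    def _full(self, path):
        return os.path.join(self.root, path.lstrip("/"))

    def getattr(self, path, fh=None):
        st = os.lstat(self._full(path))
        return {k: getattr(st, k) for k in (
            "st_atime", "st_ctime", "st_gid", "st_mode",
            "st_mtime", "st_nlink", "st_size", "st_uid")}

    def readdir(self, path, fh):
        return [".", ".."] + os.listdir(self._full(path))

    def open(self, path, flags):
        return os.open(self._full(path), flags)

    def create(self, path, mode, fi=None):
        self.on_event("create", path)
        return os.open(self._full(path), os.O_WRONLY | os.O_CREAT, mode)

    def write(self, path, data, offset, fh):
        self.on_event("write", path)
        os.lseek(fh, offset, os.SEEK_SET)
        return os.write(fh, data)

    def read(self, path, size, offset, fh):
        os.lseek(fh, offset, os.SEEK_SET)
        return os.read(fh, size)

    def release(self, path, fh):
        return os.close(fh)

if __name__ == "__main__":
    # usage: python notifyfs.py <backing-dir> <mountpoint>
    FUSE(NotifyFS(sys.argv[1], lambda op, p: print(op, p)),
         sys.argv[2], foreground=True)
```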
...
Storage options
...
NCSA
- GPFS shared over NFS
  - Storage condo: Managed by storage team
  - ROGER
  - ADS
    - NCSA/Tech Services/Library
    - $96/TB/year
    - https://www.library.illinois.edu/rds/active-data-storage-overview/
- ZFS shared over NFS
  - Santiago: Managed by ISDA
  - ITS backup: Managed by ITS
  - DXL: Managed by DXL
- OpenStack/Cinder (XFS exported via GlusterFS)
  - Managed by Nebula team
For container-based storage, a common model is to create NFS or GlusterFS file-server VMs in Nebula that export Cinder volumes. GlusterFS has proven to perform poorly in this role, particularly for Docker images (untarring image layers is slow).
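A hedged sketch of that pattern using the openstacksdk cloud layer; the cloud name, volume size, and server name are placeholders, and the actual Nebula workflow may differ:

```python
import openstack

conn = openstack.connect(cloud="nebula")  # clouds.yaml entry (placeholder)

# Create a Cinder volume and attach it to an existing file-server VM.
volume = conn.create_volume(size=500, name="wt-home-0", wait=True)
server = conn.get_server("nfs-server-0")  # hypothetical file-server VM
conn.attach_volume(server, volume, wait=True)

# On the VM itself (outside this script): make an XFS filesystem on the
# attached device, mount it, and export it over NFS or add it as a
# GlusterFS brick. Clients mount that export, not the Cinder volume.
```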
SDSC
- SDSC Cloud/OpenStack
  - Ceph via Cinder
  - Swift
- Comet: Lustre shared over NFS
  - Data Oasis?
  - Project storage
  - NFS mounted storage
- Hotel vs. condo storage models:
  - http://research-it.ucsd.edu/_files/idi_vmware-cloudcompute-storage_posters_part2.pdf
TACC
- Rodeo/OpenStack?
- Wrangler: Lustre shared over NFS
- GlusterFS
- Btrfs/ZFS
  - DataExpLab is using ZFS for testing
- TACC Corral (GPFS)
Notes
Ceph/CephFS
- Used by SDSC OpenStack (and by 50+% of OpenStack survey respondents)
- CephFS offers POSIX semantics
- Gluster vs. Ceph:
  - Ceph = object store
  - Gluster = scale-out NAS and object store
  - Both scale out linearly
- More on Ceph vs. Gluster:
  - Gluster performs better at higher scales
  - The majority of OpenStack implementations use Ceph
  - Gluster is classic file serving, second-tier storage
  - Gluster = file storage with object capabilities; Ceph = object storage with block/file capabilities
...