Mike, Ben, Kevin, Charles, Craig
Status:
- Kevin 25%
- Done with CCSN-MRI-SIMS
- Now on other projects (non-NDS)
- Interested in Kubernetes deployment in OpenStack
- Charles 25%
- Normal schedule ahead
- Ben 50%
- Mike 25%
Current priorities:
- NCSA Industry conference demo on Wednesday
- Workbench > Spark integration
- Zeppelin > Spark via Livy
- Spark > MongoDB using the NBI dataset
- SC17
- Workbench > HPC integration
- TERRA-REF image stitching on ROGER (TORQUE/PBS)
- Jupyter > Agave API > ROGER
- Jupyter > Agave API > Comet (Singularity)?
- Comet access
- XSEDE may be easiest
- Options:
- Deployment in commercial cloud (multi-node cluster with storage)
- No one is happy with the Kubernetes deployment process on OpenStack
- kubeadm is promising
- Shared storage
- https://www.minio.io/?
- Review of OpenShift Origin
- Kubernetes, Swarm, Mesos have more traction
- Singularity and Shifter
- Security
- Authentication and authorization
- OAuth support in Workbench
- LDAP or OAuth scopes for authorization
- Q. What does it mean to share authentication/authorization with:
- HPC cluster such as ROGER (NCSA LDAP)
- Spark cluster? (Kerberos or nothing, what about Livy)
- Container and filesystem users/permissions
- Mapping UID/GIDs into running containers
- Q. Root escalation in Docker
- Data sharing and permissions
- Controlling access to data on the filesystem, but also in active databases such as the NBI MongoDB instance
- Same problem with Spark (pulling data from Mongo into Spark)
- More sophisticated network configuration
- Both for Kubernetes
- Also with the cloud provider (e.g., OpenStack project)
- Monitoring/LMA
- Addons https://kubernetes.io/docs/concepts/cluster-administration/addons/
- Prometheus?
- Nagios/NRPE is a stopgap
- Production maintenance
- Upgrading the beta instance
- Redeploying the beta?
- /var/lib/docker volume scaling
- Fixing inactiveTimeout for inactive accounts
- ETK/earthcube instances
- Cloud9 wily (and other Wily) upgrades; use Cloud9 all
- Process for removing old specs if in use.
- Clowder/Workbench plugin?
- Evaluate FRDR
- Other
- Drop-in UI nonsense
- Bower bug 410
- Deploy tools using configmap?
- Deploy tools using SMTP and standalone etcd
- /var/lib/docker and kubelet mount issues (may need depends on)
- Why are we using XFS for /media/storage?
Notes:
- Discussion of security in Spark
- Kevin: focused on network access control
- Getting off of the demo treadmill
- Need to really understand OpenShift
- Security model
- Application/deployment model
- Easy Kubernetes deploy
- OpenStack (Nebula/SDSC)
- HA?
- Secure? Networking, TLS?
- Then in AWS, Azure (Big Data Hubs), GCE
- Security is the biggest thing for now
- Globus authentication
- TERRA-REF Use Case:
- Auth into Workbench: ideally this would SSO with Clowder/BETYdb/ROGER – same user/password. In the end, this is LDAP/NCSA Identity
- Container and filesystem permissions
- RunAs me
- Write files as me in my project
- PAM/SSSD in container
- Restrict access to some data to some users
- See sample data, but not full set
- SciServer – uses ACLs; data is only mounted in the container if you are authorized
- One cluster: Workbench + Extractors
- Extractors need read-write access to core filesystems
- Users can have RO to core filesystems (shared data)
- TERRA has users directory on ROGER that I can mount via NFS
- Replace GlusterFS
- SSO via OAuth: need to do the work
- Authorization: where it all gets hairy
- Need ACLs
- Handle UID/GID
- Max needs to be able to run extractors
- Today, he needs to ssh into master
- Pile of extractor yaml files in admin repo
- "terraref" namespaces
- Extractors nodeSelector – nodes have RW access to the core data
- Hardcoded UIDs in the container to run as filesystem owners
- Force the "RunAsX" model of OpenShift
- Kubernetes RBAC
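Sketch for the extractor/user permission discussion: one way the "run as filesystem owner", nodeSelector, and read-only vs. read-write mount ideas might be expressed in a Kubernetes pod spec. This is only an illustration under assumptions. The UID/GID values, the `data-access` node label, the image names, the mount path, and the use of a hostPath volume are all hypothetical placeholders, not values from the actual deployment.

```python
import json


def pod_manifest(name, image, uid, gid, read_only, node_selector=None):
    """Build a Pod manifest (plain dict) that runs as a fixed UID/GID.

    All concrete values passed in are hypothetical; only the field names
    (securityContext, runAsUser, fsGroup, nodeSelector, volumeMounts,
    readOnly) are standard Kubernetes pod spec fields.
    """
    spec = {
        # Run the container processes as a specific filesystem owner.
        "securityContext": {"runAsUser": uid, "fsGroup": gid},
        "containers": [
            {
                "name": name,
                "image": image,
                "volumeMounts": [
                    {
                        "name": "core-data",
                        "mountPath": "/data",  # hypothetical mount point
                        "readOnly": read_only,
                    }
                ],
            }
        ],
        # Assumption: core data is exposed on the node as a host path.
        "volumes": [
            {"name": "core-data", "hostPath": {"path": "/media/storage"}}
        ],
    }
    if node_selector:
        # Pin the pod to nodes labeled as having the required data access.
        spec["nodeSelector"] = node_selector
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": "terraref"},
        "spec": spec,
    }


# Extractor: read-write, scheduled only on nodes with RW access (label is hypothetical).
extractor = pod_manifest(
    "extractor", "example/extractor", 1234, 1234,
    read_only=False, node_selector={"data-access": "rw"},
)
# User notebook: same data, mounted read-only, no node restriction.
notebook = pod_manifest("notebook", "example/notebook", 5678, 5678, read_only=True)

if __name__ == "__main__":
    print(json.dumps(extractor, indent=2))
```

The printed JSON can be passed to `kubectl apply -f -` in principle, but who is allowed to create such pods in the namespace is the separate RBAC question noted above.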
Sprint 34 priorities:
- SC17 demo
- Production issues
- OpenShift eval
- Easy Kubernetes install on OpenStack?
- OAuth: things to do
- LDAP authentication?
- Change ApiServer
- Authorization model
- TERRA-REF use case (who has access to what data)
- Who gets to control that (admin role)