Mike, Ben, Kevin, Charles, Craig


Status:

  • Kevin 25%
    • Done with CCSN-MRI-SIMS
    • Now on other projects (non-NDS)
    • Interested in Kubernetes deployment in OpenStack
  • Charles 25%
    • Normal schedule ahead
  • Ben 50%
  • Mike 25%

Current priorities:

  • NCSA Industry conference demo on Wednesday
    • Workbench > Spark integration
      • Zeppelin > Spark via Livy
      • Spark > MongoDB using the NBI dataset
  • SC17
    • Workbench > HPC integration
      • TERRA-REF image stitching on ROGER (TORQUE/PBS)
      • Jupyter > Agave API > ROGER
      • Jupyter > Agave API > Comet (Singularity)?
    • Comet access
      • XSEDE may be easiest
  • Options:
    • Deployment in commercial cloud (multi-node cluster with storage)
      • No one is happy with the Kubernetes deployment process on OpenStack
      • kubeadm is promising
      • Shared storage
      • https://www.minio.io/?
    • Review of OpenShift Origin
      • Kubernetes, Swarm, Mesos have more traction
      • Singularity and Shifter
    • Security
      • Authentication and authorization
        • OAuth support in Workbench
        • LDAP or OAuth scopes for authorization
        • Q. What does it mean to share authentication/authorization with:
          • An HPC cluster such as ROGER (NCSA LDAP)?
          • A Spark cluster? (Kerberos or nothing; what about Livy?)
      • Container and filesystem users/permissions
        • Mapping UID/GIDs into running containers
        • Q. Root escalation in Docker
      • Data sharing and permissions
        • Controlling access to data on the filesystem and also in active databases such as the NBI MongoDB instance
        • Same problem with Spark (pulling data from Mongo into Spark)
      • More sophisticated network configuration
        • Both for Kubernetes
        • Also with the cloud provider (e.g., OpenStack project)
    • Monitoring/LMA
    • Production maintenance
      • Upgrading the beta instance
      • Redeploying the beta?
      • /var/lib/docker volume scaling
      • Fixing inactiveTimeout for inactive accounts
      • ETK/earthcube instances
      • Cloud9 wily (and other Wily) upgrades; use Cloud9 all
      • Process for removing old specs if in use.
    • Clowder/Workbench plugin?
    • Evaluate FRDR
    • Other
      • Drop-in UI nonsense
      • Bower bug 410
      • Deploy tools using configmap?
      • Deploy tools using SMTP and standalone etcd
      • /var/lib/docker and kubelet mount issues (may need depends on)
      • Why are we using XFS for /media/storage?

Notes:

  • Discussion of security in Spark
    • Kevin: focused on network access control
  • Getting off of the demo treadmill
  • Need to really understand OpenShift
    • Security model
    • Application/deployment model
  • Easy Kubernetes deploy
    • OpenStack (Nebula/SDSC)
    • HA?
    • Secure? Networking, TLS?
    • Then in AWS, Azure (Big Data Hubs), GCE
  • Security is the biggest thing for now
    • Globus authentication
  • TERRA-REF Use Case:
    • Auth into Workbench: ideally this would be SSO with Clowder/BETYdb/ROGER (same username/password). In the end, this is LDAP/NCSA Identity
    • Container and filesystem permissions
      • RunAs me
      • Write files as me in my project
      • PAM/SSSD in container
    • Restrict access to some data to some users
      • See sample data, but not full set
      • SciServer: uses ACLs, and data is only mounted in the container if you are authorized
    • One cluster: Workbench + Extractors
      • Extractors need read-write access to core filesystems
      • Users can have RO to core filesystems (shared data)
    • TERRA has users directory on ROGER that I can mount via NFS
      • Replace GlusterFS 
    • SSO via OAuth: need to do the work
    • Authorization: where it all gets hairy
      • Need ACLs
      • Handle UID/GID
    • Max needs to be able to run extractors
      • Today, he needs to ssh into master
        • Pile of extractor yaml files in admin repo
        • "terraref" namespaces
        • Extractors nodeSelector – nodes have RW access to the core data
      • UIDs hardcoded into the container to run as filesystem owners
        • Force the "RunAsX" model of OpenShift
      • Kubernetes RBAC
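
Sketch of the "RunAs me" idea from the notes above, for discussion only. It builds an extractor pod manifest that runs as a given UID/GID and is pinned to nodes with RW access to the core data. Assumptions (not from the meeting): the function name, the image name, the UID/GID values, and the node label key/value are all hypothetical. The "terraref" namespace comes from the notes; runAsUser, fsGroup, and nodeSelector are standard Kubernetes pod spec fields.

```python
import json


def extractor_pod(name, image, uid, gid, label_key, label_value):
    """Build a pod manifest that runs as a specific filesystem owner.

    All argument values are supplied by the caller; nothing here is
    taken from the real admin repo.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": "terraref"},
        "spec": {
            # Schedule only on nodes carrying this label (e.g. nodes
            # that have RW access to the core data).
            "nodeSelector": {label_key: label_value},
            # Pod-level security context: container processes run as
            # `uid`; `gid` is applied to mounted volumes via fsGroup.
            "securityContext": {"runAsUser": uid, "fsGroup": gid},
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    pod = extractor_pod(
        name="example-extractor",
        image="example/extractor:latest",
        uid=1001,
        gid=1001,
        label_key="example.org/core-data",
        label_value="rw",
    )
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pod, indent=2))
```

Generating the manifest from the user's own UID/GID, rather than keeping hardcoded UIDs in each yaml file, is one possible way to approach the "RunAsX" model; it does not by itself address ACLs or SSO.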


Sprint 34 priorities:

  • SC17 demo
  • Production issues
  • OpenShift eval
  • Easy Kubernetes install on OpenStack?
  • OAuth: things to do
    • LDAP authentication?
    • Change ApiServer
  • Authorization model
    • TERRA-REF use case (who has access to what data)
    • Who gets to control that (admin role)
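
A possible starting point for the authorization model discussion, as a minimal sketch only. It assumes a simple in-memory ACL in which only admins may grant access and a user's container gets only the datasets that user is authorized for (the "see sample data, but not full set" case). The class name, method names, dataset paths, and user names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataAcl:
    """Hypothetical ACL: dataset path -> {user: "ro" | "rw"}."""

    admins: set = field(default_factory=set)
    grants: dict = field(default_factory=dict)

    def grant(self, actor, user, dataset, mode):
        # Only users holding the admin role may change who sees what.
        if actor not in self.admins:
            raise PermissionError(f"{actor} is not an admin")
        if mode not in ("ro", "rw"):
            raise ValueError(f"unknown mode: {mode}")
        self.grants.setdefault(dataset, {})[user] = mode

    def mounts_for(self, user):
        # Datasets the user has no grant for are omitted entirely,
        # so they would never be mounted into the user's container.
        return {
            dataset: users[user]
            for dataset, users in sorted(self.grants.items())
            if user in users
        }


if __name__ == "__main__":
    acl = DataAcl(admins={"admin"})
    acl.grant("admin", "alice", "/data/sample", "ro")
    acl.grant("admin", "extractor", "/data/full", "rw")
    print(acl.mounts_for("alice"))
```

A real implementation would presumably back this with LDAP groups or OAuth scopes rather than an in-memory dict; that choice is still open.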

