Task
Resolution: Fixed
Normal
None
None
NDS Sprint 28
See also https://opensource.ncsa.illinois.edu/confluence/display/NDS/ThinkChicago
We need to gather some information to characterize potential workloads for the ThinkChicago hackathon. Unfortunately, we don't have much to go on. The organizers have given us a list of prompts and datasets (sent via email). Their current thinking is to provide web-based IDEs so that users can respond to the open-ended prompts. We need some sense of how much memory and CPU to allocate per user and per container to avoid the problems encountered with PI4.
This is an open-ended task, but here are some thoughts:
- Collect information about each of the datasets listed in the prompts. See if we can download them (they should be in the City of Chicago data portal, https://data.cityofchicago.org/). Document basic characteristics in the Wiki or in this ticket (for each dataset: size, format, etc.).
- Think through one or two prompts and how you might address them. Ideally, prototype something (very open ended) using the existing Cloud9 containers in Labs Workbench for the largest dataset. A few things to think through: Does it make sense to use Labs Workbench? Do you have enough resources (CPU/RAM) to do what you need? Does the container have sufficient resources?
Feel free to adjust task as needed.
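One possible starting point for the dataset documentation step is sketched below. It assumes a dataset has already been downloaded from the portal as a CSV file; the function name `describe_csv` and the reported fields are illustrative, not an agreed format. It records file size, format, and row/column counts, and uses `tracemalloc` to get a rough peak-memory figure for naively loading the whole file. That figure covers Python-level allocations only, so treat it as a lower bound on what a real analysis in a container might need.

```python
import csv
import os
import tracemalloc


def describe_csv(path):
    """Return basic characteristics of a CSV file plus a rough peak-memory figure.

    `path` is a local file that was downloaded beforehand (hypothetical input).
    """
    size_bytes = os.path.getsize(path)

    # Track Python-level allocations while loading the whole file into memory,
    # the way a naive first-pass analysis would.
    tracemalloc.start()
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # first row is assumed to be the header
        rows = list(reader)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "file": os.path.basename(path),
        "format": os.path.splitext(path)[1].lstrip(".").lower(),
        "size_bytes": size_bytes,
        "rows": len(rows),
        "columns": len(header),
        "column_names": header,
        "peak_python_bytes": peak_bytes,
    }
```

Calling `describe_csv` on each downloaded file and pasting the resulting dictionaries into the Wiki or this ticket would cover the "how big is it, format, etc." item. Comparing `peak_python_bytes` for the largest dataset against the container's memory limit gives a first hint of whether the default allocation is enough.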