Building-Cyber-Infrastructure-for-Researchers

Mentors: Abraham Matta (matta@bu.edu), Ali Raza (araza@bu.edu)

Team Members: Tian Chen (ct970808@bu.edu), Donovan Jones (jonesde@bu.edu), Komal Kango (komalk@bu.edu), Jing Song (jingsong@bu.edu), Kristi Perreault (kristip@bu.edu)

1. Vision and Goals Of The Project

The system will be a cloud-based infrastructure that runs code written in Python or R over specified data sets to create and compare models that produce ecological forecasts. High-level goals for this project include:

Providing a web service with a simple user experience such that researchers can submit code and periodically run it on data sets.
Developing reliable infrastructure on unreliable nodes using a Kubernetes cluster.
Focusing on Function as a Service (FaaS) with OpenWhisk as a proof of concept.
Providing a user interface that allows for comparisons between multiple models on the same data set along with comparisons of models using periodic data sets in order to determine model accuracy.
- Due to unforeseen circumstances, this goal has been modified to allow for a simple response from OpenWhisk. Anything delivered further is viewed as a stretch goal.

2. Users/Personas of the Project

The system will be deployed by system administrators and used by the end-users in the earth science department of BU. It targets end-users, specifically ecological researchers. It does not target:

Non-ecological Researchers
Advanced users with complex requirements beyond the scope of the project.

3. Scope and Features of the Project

UI: an easy-to-use web interface for users

System Administrator: developers
- Approve project proposals from project leads
- Assign users as project leads
- Management of all existing projects and users in the system
- Access to information about the current VM cluster*
Project Leads: usually professors
- Approve requests from users to join projects (request received through email)
- Add or remove team members
- View the working progress of team members*
Team Members: usually graduate students
- Could be assigned as project leads/administrators
Front-end features for all users
- User registration and login
- Allows changing password
- Allows search for specific projects and requests to join
- Management of all previous logs of computations*
- Allows submission of new computation code
- Code is submitted by users through a built-in code editor
- Includes a "submit" button to start executing the code
- Standardized visualization of output from running the code (with an extra z-axis in the plot showing the confidence of the prediction)*
- Comparison of different models over time on the same set of data*
- Accuracy measurement of the prediction model by comparing it with real-time updated data*
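The accuracy comparison in the last two items could be sketched roughly as follows. This is only an illustration: it assumes accuracy is measured as root-mean-square error (RMSE) against observed values, and the function names, model names, and numbers are hypothetical rather than taken from the project.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def rank_models(forecasts, observed):
    """Return (model_name, rmse) pairs sorted from most to least accurate."""
    scores = {name: rmse(values, observed) for name, values in forecasts.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical forecasts from two models over the same three observations.
observed = [10.0, 12.0, 14.0]
forecasts = {
    "model_a": [10.0, 12.0, 14.0],
    "model_b": [11.0, 13.0, 15.0],
}
ranking = rank_models(forecasts, observed)
```

Re-running `rank_models` each time new observations arrive would give the comparison over time; other error metrics could be swapped in for `rmse` without changing the ranking logic.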

Unreliable Nodes:

Utilize virtual machines in the MOC (Mass Open Cloud), along with Chameleon and GENI nodes, as the unreliable nodes on which to build reliable infrastructure
Capability for infrastructure to "loan out" these nodes to services or applications as needed*
Monitor the availability of these nodes, including up/down time and proximity to data stores*
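Availability monitoring of this kind might be recorded as below. This is a minimal sketch under assumptions: the `NodeStatus` class, its fields, and the node name are hypothetical, and the health checks themselves (how a node is probed) are left out.

```python
from dataclasses import dataclass, field


@dataclass
class NodeStatus:
    """Records health-check results for one node (hypothetical structure)."""
    name: str
    checks: list = field(default_factory=list)  # True = up, False = down

    def record(self, is_up):
        """Store the outcome of one health check."""
        self.checks.append(bool(is_up))

    def uptime_fraction(self):
        """Fraction of recorded checks in which the node was up."""
        if not self.checks:
            return 0.0
        return sum(self.checks) / len(self.checks)

    def is_up(self):
        """A node counts as up if its most recent check succeeded."""
        return bool(self.checks) and self.checks[-1]


# Four hypothetical health checks: up, up, down, up.
node = NodeStatus("chameleon-1")
for result in (True, True, False, True):
    node.record(result)
```

Here `node.uptime_fraction()` summarizes the up/down history, and `node.is_up()` reports only the latest check.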

Orchestration with Kubernetes:

Allows for a consistent layer over which anything can be deployed. For this project, the focus is Function as a Service (FaaS) with OpenWhisk
Determine where the code from the researchers will run depending on where it is stored*
Deploy Kubernetes Cluster on the Mass Open Cloud
Provide a view of available nodes, including locations to data stores*

Compute with OpenWhisk:

Run OpenWhisk on Kubernetes cluster as first service offering for the Cyber Infrastructure platform
Commands are sent to OpenWhisk, which runs on a cluster chosen by Kubernetes based on node availability and location
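The idea of choosing where to run based on availability and location can be illustrated with a small selection function. This is a sketch only: the node records, the `distance` cost, and all names are assumptions, and in a Kubernetes deployment this kind of preference is more typically expressed through node labels and scheduling rules than through application code.

```python
def choose_node(nodes, data_location):
    """Pick the available node with the smallest distance to the data store.

    `nodes` is a list of dicts with hypothetical keys "name", "available",
    and "distance" (a mapping from data store location to a cost such as
    latency). Returns None when no available node knows the location.
    """
    candidates = [
        n for n in nodes
        if n["available"] and data_location in n["distance"]
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda n: n["distance"][data_location])
    return best["name"]


# Hypothetical nodes and data store locations.
nodes = [
    {"name": "moc-vm-1", "available": True,
     "distance": {"store-east": 5, "store-west": 40}},
    {"name": "chameleon-1", "available": False,
     "distance": {"store-east": 1, "store-west": 30}},
    {"name": "geni-1", "available": True,
     "distance": {"store-east": 20, "store-west": 10}},
]
```

Note that `chameleon-1` is closest to both stores but is skipped because it is marked unavailable.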

Database Management:

User information stored and managed in MongoDB
Store output of computation in DynamoDB (Dynamo allows computation configuration)

Security: provide secure storage of user data and computation output
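One common way to store user credentials securely is to keep only a salted hash of the password. The sketch below uses PBKDF2 from the Python standard library; the field names, role value, and iteration count are assumptions, not the project's schema. Writing the resulting document to MongoDB (for example with a driver call such as pymongo's `insert_one`) is left out.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=200_000):
    """Check a password against a stored salt and digest in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected_digest)


def make_user_document(email, password, role="team_member"):
    """Build a user record as a dict (field names are assumptions)."""
    salt, digest = hash_password(password)
    return {
        "email": email,
        "role": role,
        "salt": salt.hex(),
        "password_hash": digest.hex(),
    }


# Hypothetical user; the plain-text password is never stored.
user = make_user_document("researcher@example.edu", "correct horse")
```

At login, `verify_password` recomputes the digest from the stored salt and compares it with the stored hash.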

* Note: Due to unforeseen circumstances in Spring of 2020, any starred items are now considered stretch goals of the project

4. Solution Concept

The main issue the team is attempting to solve is the unreliability of Chameleon and GENI nodes for running researchers' code. Chameleon and GENI are used because of their low cost, but the trade-off is low availability. An infrastructure layer built over these nodes, with a Kubernetes cluster orchestrating their use, will let researchers rely on this system to compute and store their data without overpaying. As a proof of concept, the team will build this infrastructure layer with ecological researchers at BU in mind, and will first attempt to install OpenWhisk on the Kubernetes cluster to test the Function as a Service avenue. In addition, a basic UI will be provided to allow researchers to submit code and compare data models, and to allow system admins to manage access requests.

5. Acceptance Criteria

The minimum acceptance criterion is an infrastructure service running OpenWhisk in a Kubernetes cluster, deployed to the MOC. The Kubernetes cluster orchestrates where the code runs based on node availability and proximity to where the data is stored. In addition, a basic UI is provided for users to submit code and for system admins to monitor access requests. Stretch goals are:

Preview of the node locations and availability surfaced through the UI
Optimization of node location and data store proximity
More robust user experience

6a. Release Planning - Proposed

For full release plans, please visit the team’s project space: https://tree.taiga.io/project/mosayyebzadeh-building-cyber-infrastructure-for-researchers/timeline

Release #1 (due week 4)

Project goals determined and understood
Front end framework determined
New project created & outlined
- User submission
- User registration/login
- Results display
- Admin system

Release #2 (due week 6)

UI Component
- User able to register for an account with email
- User able to login
- User data stored in MongoDB database
Backend/Cloud Component
- Install OpenWhisk on a cluster
- Stand up a function to run on the cluster

Release #3 (due week 8)

UI Component
- User can upload code in R via container link/code link
- Admin portal created with ability to manage users
Backend/Cloud Component
- Ability to add and remove the unreliable Chameleon nodes

Release #4 (due week 10)

Submission portal
- User can upload code in Python
Results display
- Data visualization results compared to different models
Admin system
- View system health
O&S
- Distribute code
- Monitor VM availability

Release #5 (due week 12)

Results display
- Trigger set by users
- Ability to view previous results
- Real time data visualization

Release #6 (due week 14) - TBD

Release #7 (due week 16) - TBD

6b. Release Planning - Actual

Release #1 (due week 2)

Project goals determined and understood
Front end framework determined
New project created & outlined
- User submission
- User registration/login
- Results display
- Admin system

Release #2 (due week 4)

UI Component
- User login & registration
- Dashboard
Backend/Cloud Component
- Access to MOC and OpenWhisk
- OpenWhisk running on one VM in MOC
- Code standards & team working agreement

Release #3 (due week 6)

UI Component
- Code submission page in UI
Backend/Cloud Component
- OpenWhisk API mock call from UI
- OpenWhisk on Kubernetes work started

Release #4 (due week 8)

UI Component
- Users can request to join projects (started)
- Email request sent to project admin to confirm or deny access request
- Response from OpenWhisk displayed to user
- User Hierarchy (started)
Backend/Cloud Component
- Simple "HelloWorld" OpenWhisk API call is made from the UI
- OpenWhisk executes code and returns simple response to user
- Kubernetes cluster created
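For context, a "HelloWorld" action of the kind mentioned in this release could look like the following. OpenWhisk Python actions expose a `main` function that takes a dict of parameters and returns a dict; the `name` parameter and the response key are assumptions, and this is not necessarily the team's code.

```python
def main(args):
    """Entry point that OpenWhisk calls for a Python action.

    `args` is a dict of invocation parameters; the return value must be a
    JSON-serializable dict.
    """
    name = args.get("name", "World")
    return {"greeting": f"Hello {name}!"}
```

Invoked with no parameters, the action falls back to the default name.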

Release #5 (due week 10)

UI Component
- Users can request to join projects (finished)
- Email request sent to project admin to confirm or deny access request
- User Hierarchy (finished)
- Data visualization in UI
- User can select variables to plot from result data
Backend/Cloud Component
- Complex (Matrix Multiply) OpenWhisk API call is made from the UI
- OpenWhisk executes code and returns results to user
- OpenWhisk installed on Kubernetes cluster
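A matrix multiply action along the lines of this release might be written as below. It is a pure-Python sketch using the same `main(args)` convention for OpenWhisk Python actions; the parameter names `a` and `b` and the error handling are assumptions.

```python
def multiply(a, b):
    """Multiply two matrices given as lists of rows."""
    if not a or not b or any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions must match")
    # zip(*b) iterates over the columns of b.
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def main(args):
    """OpenWhisk-style entry point; parameter names "a" and "b" are assumptions."""
    try:
        return {"result": multiply(args["a"], args["b"])}
    except (KeyError, ValueError) as err:
        return {"error": str(err)}
```

Returning an `error` key instead of raising keeps the response a plain JSON object that the UI can display either way.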

Release #6 (due week 12) - TBD

6c. Overall project - Actual and future work

We built a Kubernetes cluster using kind and deployed OpenWhisk on it, and built a web server with Flask that uses the OpenWhisk API to interact with OpenWhisk. The prototype only supports creating, updating, and invoking an action.
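The create, update, and invoke operations map onto the OpenWhisk REST API roughly as sketched here. The code only builds the requests and does not send them; the host name, credentials, and helper names are hypothetical, and a kind-based deployment may use a different scheme, port, or certificate setup.

```python
import base64
import json
import urllib.request


def _request(api_host, auth_key, method, path, payload):
    """Build (but do not send) an authenticated JSON request to OpenWhisk."""
    token = base64.b64encode(auth_key.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{api_host}/api/v1/namespaces/{path}",
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


def put_action(api_host, auth_key, name, code, namespace="_", kind="python:3"):
    """Create the action, or update it if it already exists (overwrite=true)."""
    body = {
        "namespace": namespace,
        "name": name,
        "exec": {"kind": kind, "code": code},
    }
    path = f"{namespace}/actions/{name}?overwrite=true"
    return _request(api_host, auth_key, "PUT", path, body)


def invoke_action(api_host, auth_key, name, params, namespace="_"):
    """Invoke the action and wait for its result (blocking=true&result=true)."""
    path = f"{namespace}/actions/{name}?blocking=true&result=true"
    return _request(api_host, auth_key, "POST", path, params)


# Hypothetical host and credentials; sending would use urllib.request.urlopen(req).
req = invoke_action("openwhisk.example.org", "user:secret", "hello", {"name": "BU"})
```

A Flask route could call `put_action` with the code submitted in the UI and then `invoke_action` to fetch the response shown to the user.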

7. Sprint Presentations

Sprint 1

Presentation Slides

Sprint 2

Sprint 3

Dynamo Paper Presentation

Presentation Slides

Sprint 4

Sprint 5

Final Demo

8. Setup Manual

The setup manual has instructions for setting up the system on any cloud environment. [link]

9. Related link

Previous project [link]
