Commit

Updated GCP readme, AWS readme and added links to main readme -AO
Olvera-Morales committed Oct 1, 2024
1 parent 3a5308d commit 04f13e2
Showing 3 changed files with 9 additions and 9 deletions.
6 changes: 3 additions & 3 deletions AWS/README.md
@@ -51,14 +51,14 @@ This module is geared towards beginners and does not require prior knowledge on

## Before Starting

Included is a tutorial in the form of Jupyter notebooks. The main purpose of the tutorial is to help beginners without much coding experience to familiarize themselves with basic fundamental concepts within machine learning using health data (COVID dataset). It is also meant to be extended to other kinds of structured data. The tutorial walks through step by step the process of creating a Decision Tree and interpreting it. This module intends to provide an intuitive understanding of how machine learning model performance is evaluated. In order to get to this module from the AWS Cloud, you will need to have access to an AWS account, this module is located within Amazon SageMaker. For more technical information about AWS please click on the following link: [NIH Cloud Lab README](https://github.com/STRIDES/NIHCloudLabAWS)
Included is a tutorial in the form of Jupyter notebooks. The main purpose of the tutorial is to help beginners without much coding experience to familiarize themselves with basic fundamental concepts within machine learning using health data (COVID dataset). It is also meant to be extended to other kinds of structured data. The tutorial walks through step by step the process of creating a Decision Tree and interpreting it. This module intends to provide an intuitive understanding of how machine learning model performance is evaluated. In order to get to this module from the AWS Cloud, you will need to have access to an AWS account, this module is located within Amazon SageMaker. For more technical information about AWS please click [this link.](https://github.com/STRIDES/NIHCloudLabAWS)


## **Getting Started**

**1)** Follow the steps highlighted [here](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAWSSagemakerNotebooks.md) to create a new fully managed notebook in Amazon SageMaker. Follow steps and be especially careful to enable idle shutdown as highlighted. For this module, in [step 4](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAWSSagemakerNotebooks.md#:~:text=Give%20a%20name%20to%20your%20notebook.%20Choose%20a%20notebook%20instance%20type%20based%20on%20needs%2C%20Amazon%20Linux%202%20as%20platform%20identifier%2C%20volume.%20Optional%2C%20create%20idle%2Dshut%20by%20selecting%20create%20new%20lifecycle%20configuration%20and%20copy%20and%20paste%20idle%2Dshutdown.sh%20and%20create%20configuration.%20Then%20click%20Create%20notebook%20instance%3A) in the "Notebook instance type" tab, select ml.m5.xlarge from the dropdown box. Select conda_python3 kernel in [step 8](https://github.com/NIGMS/NIGMS-Sandbox/blob/AWS%26GCP/docs/HowToCreateAWSSagemakerNotebooks.md#:~:text=Select%20a%20notebook%20and%20then%20kernel%3A).
**1)** Follow the steps highlighted [here](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAWSSagemakerNotebooks.md) to create a new user-managed notebook in Amazon SageMaker. Follow steps and be especially careful to enable idle shutdown as highlighted. For this module, in [step 4](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAWSSagemakerNotebooks.md#:~:text=Give%20a%20name%20to%20your%20notebook.%20Choose%20a%20notebook%20instance%20type%20based%20on%20needs%2C%20Amazon%20Linux%202%20as%20platform%20identifier%2C%20volume.%20Optional%2C%20create%20idle%2Dshut%20by%20selecting%20create%20new%20lifecycle%20configuration%20and%20copy%20and%20paste%20idle%2Dshutdown.sh%20and%20create%20configuration.%20Then%20click%20Create%20notebook%20instance%3A) in the "Notebook instance type" tab, select ml.m5.xlarge from the dropdown box. Select conda_python3 kernel in [step 8](https://github.com/NIGMS/NIGMS-Sandbox/blob/AWS%26GCP/docs/HowToCreateAWSSagemakerNotebooks.md#:~:text=Select%20a%20notebook%20and%20then%20kernel%3A).

**2)** Now you will need to download the tutorial files from GitHub. The easiest way to do this would be to clone the repository from NIGMS into your SageMaker notebook. To clone this repository, use the `Git` command `git clone https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology.git` in the terminal as illustrated in [step 9](https://github.com/NIGMS/NIGMS-Sandbox/blob/SageMakerTutorial/docs/HowToCreateAWSSagemakerNotebooks.md#:~:text=To%20clone%20in,see%20the%20repo). Please make sure you only enter the link for the repository that you want to clone. There are other bioinformatics related learning modules available in the [NIGMS Repository](https://github.com/NIGMS). This will download our tutorial files into a folder called `Introduction-to-Data-Science-for-Biology`.
**2)** Now you will need to download the tutorial files from GitHub. The easiest way to do this would be to clone the repository from NIGMS into your Amazon SageMaker notebook. To clone this repository, use the `Git` command `git clone https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology.git` in the terminal as illustrated in [step 9](https://github.com/NIGMS/NIGMS-Sandbox/blob/SageMakerTutorial/docs/HowToCreateAWSSagemakerNotebooks.md#:~:text=To%20clone%20in,see%20the%20repo). Please make sure you only enter the link for the repository that you want to clone. There are other bioinformatics related learning modules available in the [NIGMS Repository](https://github.com/NIGMS). This will download our tutorial files into a folder called `Introduction-to-Data-Science-for-Biology`.
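
As a rough illustration of the clone step, the sketch below wraps the `git clone` command from the README in two small shell functions. The function names `repo_dir_name` and `clone_if_missing` are hypothetical helpers introduced here for illustration; they are not part of the module or the README. The sketch assumes a POSIX shell with `git` available in the notebook terminal.

```shell
# Repository URL taken from the README text.
REPO_URL="https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology.git"

# Hypothetical helper: print the folder name that `git clone` creates by default,
# i.e. the last URL segment with the ".git" suffix removed.
repo_dir_name() {
  basename "$1" .git
}

# Hypothetical helper: clone the repository only if its folder is not already
# present in the current directory, so re-running the step does not fail.
clone_if_missing() {
  url="$1"
  dir="$(repo_dir_name "$url")"
  if [ -d "$dir" ]; then
    echo "Folder '$dir' already exists; skipping clone."
  else
    git clone "$url"
  fi
}

# Usage (run from the notebook terminal):
# clone_if_missing "$REPO_URL"
```

Checking for the folder first is only a convenience: `git clone` refuses to clone into an existing non-empty directory, so the guard turns that error into a short message.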

**IMPORTANT NOTE**

10 changes: 5 additions & 5 deletions README_GCP.md → Google Cloud/README_GCP.md
@@ -1,4 +1,4 @@
## Introduction to Machine Learning for COVID Predictions
## Introduction to Machine Learning for COVID Predictions for Google Cloud
---------------------------------

## Contents
@@ -51,23 +51,23 @@ This module is geared towards beginners and does not require prior knowledge on

## Before Starting

Included is a tutorial in the form of Jupyter notebooks. The main purpose of the tutorial is to help beginners without much coding experience to familiarize themselves with basic fundamental concepts within machine learning using health data (COVID dataset). It is also meant to be extended to other kinds of structured data. The tutorial walks through step by step the process of creating a Decision Tree and interpreting it. This module intends to provide an intuitive understanding of how machine learning model performance is evaluated. In order to get to this module from the Google Cloud Platform, you will need to have access to a Google Cloud Platform account, this module is located within Vertex AI Workbench. For more technical information about Google Cloud Platform please click on the following link: [NIH Cloud Lab README](https://github.com/STRIDES/NIHCloudLabGCP)
Included is a tutorial in the form of Jupyter notebooks. The main purpose of the tutorial is to help beginners without much coding experience to familiarize themselves with basic fundamental concepts within machine learning using health data (COVID dataset). It is also meant to be extended to other kinds of structured data. The tutorial walks through step by step the process of creating a Decision Tree and interpreting it. This module intends to provide an intuitive understanding of how machine learning model performance is evaluated. In order to get to this module from the Google Cloud Platform, you will need to have access to a Google Cloud Platform account, this module is located within Vertex AI Workbench. For more technical information about Google Cloud Platform please click on [this link.](https://github.com/STRIDES/NIHCloudLabGCP)

## **Getting Started**

**1)** Please click on the link for steps to open your GCP project: [How to open your GCP Project](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/open_project_intramural.md).

**2)** Follow the steps highlighted [here](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md) to create a new user-managed notebook in Vertex AI. Follow steps 1-8 and be especially careful to enable idle shutdown as highlighted in step 7. For this module you should select Debian 11 and Python 3 in the Environment tab in step 5. In step 6 in the Machine type tab, select n1-standard-4 from the dropdown box.
**2)** Follow the steps highlighted [here](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md) to create a new instance notebook in Vertex AI. Follow steps 1-8 and be especially careful to enable idle shutdown as highlighted in [step 7](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md#:~:text=On%20the%20same%20page%2C%20click%20Enable%20Idle%20Shutdown%20and%20specify%20the%20idle%20minutes%20for%20shutdown.%20This%20means%2C%20if%20you%20close%20your%20browser%20and%20walk%20away%20without%20stopping%20your%20instance%2C%20it%20will%20shutdown%20automatically%20after%20this%20many%20minutes.%20We%20recommend%2030%20minutes.). For this module you should select Debian 11 and Python 3 in the Environment tab in [step 5](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md#:~:text=On%20the%20Environment%20tab%2C). In [step 6](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md#:~:text=GPU%20use.-,Under%20Machine%20type,-select%20your%20desired) in the Machine type tab, select n1-standard-4 from the dropdown box.

**3)** Now you will need to download the tutorial files from GitHub. The easiest way to do this would be to clone the repository from NIGMS into your Vertex AI notebook. This can be done by using the `Git` menu in JupyterLab, and selecting the clone option. To clone this repository, use the Git command `git clone https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology.git` in the dropdown menu option in Jupyter notebook. Please make sure you only enter the link for the repository that you want to clone. There are other bioinformatics related learning modules available in the [NIGMS Repository](https://github.com/NIGMS). This will download our tutorial files into a folder called `Introduction-to-Data-Science-for-Biology`.

**IMPORTANT NOTE**

Make sure that after you are done with the module, close the tab that appeared when you clicked **OPEN JUPYTERLAB**, then check the box next to the name of the notebook you created in step 3. Then click on **STOP** at the top of the Workbench menu. Wait and make sure that the icon next to your notebook is grayed out.
Make sure that after you are done with the module, close the tab that appeared when you clicked **OPEN JUPYTERLAB**, then check the box next to the name of the notebook you created in [step 3](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md#:~:text=Click%20Create%20New-,Select,-Advanced%20Options%20at). Then click on **STOP** at the top of the Workbench menu. Wait and make sure that the icon next to your notebook is grayed out.

## **Software Requirements**

Software requirements are satisfied by using a pre-made Google Cloud Platform environment Workbench Notebook. The notebook environment used is named **"Python 3 with Intel® MKL"** ; and it is listed during Step 3 for accessing our module. In addition all package requirements are installed by following the instructions Step 1 of the notebook **"Intro to Machine Learning Decision Trees".**
Software requirements are satisfied by using a pre-made Google Cloud Platform environment Workbench Notebook. The notebook environment used is named **"Python 3 with Intel® MKL"** ; and it is listed during Step 3 for accessing our module. Software requirements are described in notebook **"Intro to Machine Learning Decision Trees"** step 1.


## **Architecture Design**
2 changes: 1 addition & 1 deletion README.md
@@ -17,7 +17,7 @@

This module is geared towards beginners and does not require prior knowledge on a specific scientific discipline. The module is divided into three Jupyter notebooks as outlined at the beginning of this document. In addition to the notebooks mentioned, there are videos containing brief explanations about basic concepts in machine learning and what the code does in each step of the notebook. Below is an outline of the videos contained in each notebook with their respective links. These videos are already attached to the notebook.

This module offers two computing pathways: AWS (Amazon Web Services) or GCP (Google Cloud Platform). Users can choose their preferred cloud service to run the Jupyter notebooks, ensuring flexibility and accessibility based on their existing infrastructure or familiarity. Detailed instructions for setting up and using either AWS or GCP for this module are provided within their corresponding folders within this repository.
This module offers two computing pathways: [AWS (Amazon Web Services)](https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology/tree/AWS%26GCP/AWS) or [GCP (Google Cloud Platform)](https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology/tree/AWS%26GCP/Google%20Cloud). Users can choose their preferred cloud service to run the Jupyter notebooks, ensuring flexibility and accessibility based on their existing infrastructure or familiarity. Detailed instructions for setting up and using either AWS or GCP for this module are provided within their corresponding folders within this repository.

### 1- Introduction To Machine Learning: Decision Trees (10 video clips)
