
commit deploy
AbhishekSingh1180 committed Feb 25, 2024
1 parent 9cfc6b9 commit da01dc5
Showing 7 changed files with 118 additions and 16 deletions.
16 changes: 15 additions & 1 deletion .github/workflows/deploy.yaml
@@ -13,6 +13,9 @@ jobs:
env:
GCP_PROJECT_SECRET: ${{ secrets.PROJECT_SECRET }}
GCS_SECRET: ${{ secrets.GCS_SECRET }}
GCS_IAM_SECRET: ${{ secrets.GCS_IAM_SECRET }}
SNOWFLAKE_SECRET: ${{ secrets.SNOWFLAKE_SECRET }}
BQ_SECRET: ${{ secrets.BQ_SECRET }}

steps:
- name: Checkout code
@@ -23,5 +26,16 @@ jobs:
with:
credentials_json: ${{ secrets.GCP_SA_KEY }}

- name: Install SnowSQL
run: |
curl -O "https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/linux_x86_64/snowsql-1.2.31-linux_x86_64.bash"
SNOWSQL_DEST=~/bin SNOWSQL_LOGIN_SHELL=~/.profile bash snowsql-1.2.31-linux_x86_64.bash
~/bin/snowsql --version
mkdir -p ~/.snowsql
IFS=',' read -r SNF_ACCOUNT SNF_USERNAME SNF_PASSWORD <<< "$SNOWFLAKE_SECRET"
echo "[connections.awesome]
accountname = $SNF_ACCOUNT
username = $SNF_USERNAME
password = $SNF_PASSWORD" > ~/.snowsql/config
- name: Execute deployment script
run: bash resources/Deployment/deployment.sh $GCP_PROJECT_SECRET $GCS_SECRET $GCS_IAM_SECRET $BQ_SECRET
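For context, the workflow expects each secret to be a single comma-separated string. A hedged sketch of their shapes (illustrative values only, not the real secrets):

# Hypothetical secret layouts, matching the read calls in deployment.sh
PROJECT_SECRET='my-gcp-project,asia-south1'
GCS_SECRET='my-sensex-bucket,snowflake-sink,archive'
GCS_IAM_SECRET='SnowflakeGCSWriter,storage.objects.create,storage.objects.get,storage.objects.list'
SNOWFLAKE_SECRET='xy12345,deploy_user,********'
BQ_SECRET='sensex_dataset,sensex_prices'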
2 changes: 1 addition & 1 deletion README.md
@@ -6,4 +6,4 @@

![Diagram](https://github.com/AbhishekSingh1180/sensex-data-analysis/blob/main/Diagram/sensex-data-analysis.png)

Leveraged GitHub Actions for streamlined data extraction from the Yahoo Finance API, with Snowflake and SnowSQL handling storage and transformation of the Sensex data.
68 changes: 63 additions & 5 deletions resources/Deployment/deployment.sh
@@ -1,15 +1,73 @@
#!/bin/bash

FOLDER_PATH='resources/Storage/*'
BQ_SCHEMA_PATH='resources/Storage/BigQuery/bq_schema.json'

# Set project ID and region from Project Secrets
IFS=',' read -r PROJECT_NAME REGION <<< "$1"

# Set GCS Variables from GCS Secrets
IFS=',' read -r BUCKET_NAME GCS_SINK_FOLDER GCS_ARCHIVE_FOLDER <<< "$2"
# Set GCS IAM policy variables from GCS IAM Secrets
IFS=',' read -r CUSTOM_ROLE PERMISSION_1 PERMISSION_2 PERMISSION_3 <<< "$3"
# Set BQ Variables from BQ Secrets
IFS=',' read -r DATASET_NAME TABLE_NAME <<< "$4"
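The script assumes each secret splits cleanly on commas. A small guard like the following (not part of the original script; a sketch using the variable names from the reads above) would fail fast on a malformed secret:

# Sketch: abort early if any required field came through empty
for v in PROJECT_NAME REGION BUCKET_NAME GCS_SINK_FOLDER GCS_ARCHIVE_FOLDER CUSTOM_ROLE DATASET_NAME TABLE_NAME; do
    [ -z "${!v}" ] && { echo "Missing value for $v" >&2; exit 1; }
done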

#----------------------------------------------------------------------------------------------------------

# STEP 1 : Setup GCS bucket

# Create GCS bucket
gcloud storage buckets create gs://$BUCKET_NAME \
    --project=$PROJECT_NAME \
    --location=$REGION \
    --public-access-prevention \
    --uniform-bucket-level-access

# Initialize the sink and archive folders in the bucket
echo "$(date) - SINK" | gcloud storage cp - gs://$BUCKET_NAME/$GCS_SINK_FOLDER/init.txt
echo "$(date) - ARCHIVE" | gcloud storage cp - gs://$BUCKET_NAME/$GCS_ARCHIVE_FOLDER/init.txt

#----------------------------------------------------------------------------------------------------------

# STEP 2 : Setup storage integration for the Snowflake out stage to GCS for file transfer

EXECUTE_SQL_SCRIPT="execute.sql"

echo "CREATE STORAGE INTEGRATION GCS_STORAGE_INT /
TYPE = EXTERNAL_STAGE /
STORAGE_PROVIDER = 'GCS' /
ENABLED = TRUE /
STORAGE_ALLOWED_LOCATIONS = ('gcs://$BUCKET_NAME/$GCS_SINK_FOLDER/');"
>> $EXECUTE_SQL_SCRIPT

echo "CREATE STAGE FINANCE_DB.DW_APPL.SENSEX_DATA_STAGE_OUT /
URL = 'gcs://$BUCKET_NAME/$GCS_SINK_FOLDER/' /
STORAGE_INTEGRATION = GCS_STORAGE_INT /
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);"
>> $EXECUTE_SQL_SCRIPT

~/bin/snowsql --config ~/.snowsql/config --connection awesome -w "COMPUTE_WH" -f $EXECUTE_SQL_SCRIPT
rm -f $EXECUTE_SQL_SCRIPT

# Retrieve service account
SNF_SERVICE_ACCOUNT=$(~/bin/snowsql --config ~/.snowsql/config --connection awesome -w "COMPUTE_WH" \
    -q "DESC STORAGE INTEGRATION GCS_STORAGE_INT" \
    -o output_format=csv -o header=false | awk 'NR==7' | cut -d',' -f3 | tr -d '"')
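awk 'NR==7' pins the service account to a fixed row of the DESC output, which breaks if Snowflake reorders properties. A more robust sketch filters on the STORAGE_GCP_SERVICE_ACCOUNT property name instead (same CSV column layout assumed):

SNF_SERVICE_ACCOUNT=$(~/bin/snowsql --config ~/.snowsql/config --connection awesome -w "COMPUTE_WH" \
    -q "DESC STORAGE INTEGRATION GCS_STORAGE_INT" -o output_format=csv -o header=false \
    | grep STORAGE_GCP_SERVICE_ACCOUNT | cut -d',' -f3 | tr -d '"')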

gcloud iam roles create $CUSTOM_ROLE \
    --project=$PROJECT_NAME \
    --title="Custom Snowflake GCS Writer" \
    --description="Custom role with minimal permissions for Snowflake to load data into GCS" \
    --permissions=$PERMISSION_1,$PERMISSION_2,$PERMISSION_3

# Grant the custom role to the Snowflake service account on the bucket
gcloud storage buckets add-iam-policy-binding gs://$BUCKET_NAME \
    --member=serviceAccount:$SNF_SERVICE_ACCOUNT \
    --role=projects/$PROJECT_NAME/roles/$CUSTOM_ROLE --project=$PROJECT_NAME
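To confirm the binding took effect, a hedged one-liner (not in the commit):

# Sketch: inspect the bucket's IAM policy for the Snowflake member
gcloud storage buckets get-iam-policy gs://$BUCKET_NAME --format=json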

#----------------------------------------------------------------------------------------------------------

# STEP 3 : Setup BigQuery dataset and table
bq --location=$REGION mk -d $PROJECT_NAME:$DATASET_NAME
bq --location=$REGION mk -t $PROJECT_NAME:$DATASET_NAME.$TABLE_NAME $BQ_SCHEMA_PATH
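A sanity check on the freshly created table (a sketch, not part of the commit):

# Sketch: print the table schema BigQuery actually registered
bq show --schema --format=prettyjson $PROJECT_NAME:$DATASET_NAME.$TABLE_NAME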

# Copy setup files to the GCS bucket
gcloud storage cp -r $FOLDER_PATH gs://$BUCKET_NAME/
#----------------------------------------------------------------------------------------------------------
10 changes: 10 additions & 0 deletions resources/Storage/BigQuery/bq_schema.json
@@ -0,0 +1,10 @@
[
  { "name": "index",     "type": "STRING",    "mode": "REQUIRED" },
  { "name": "symbol",    "type": "STRING",    "mode": "REQUIRED" },
  { "name": "timestamp", "type": "TIMESTAMP", "mode": "REQUIRED" },
  { "name": "open",      "type": "FLOAT",     "mode": "REQUIRED" },
  { "name": "close",     "type": "FLOAT",     "mode": "REQUIRED" },
  { "name": "high",      "type": "FLOAT",     "mode": "REQUIRED" },
  { "name": "low",       "type": "FLOAT",     "mode": "REQUIRED" },
  { "name": "volume",    "type": "INTEGER",   "mode": "REQUIRED" }
]
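This schema drives the bq mk call in deployment.sh; the same table would accept a direct CSV load. A sketch, with an illustrative wildcard path inside the sink folder:

# Sketch: load sink CSVs straight into the table created above
bq load --source_format=CSV --skip_leading_rows=1 \
    $PROJECT_NAME:$DATASET_NAME.$TABLE_NAME \
    "gs://$BUCKET_NAME/$GCS_SINK_FOLDER/*.csv"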
10 changes: 10 additions & 0 deletions resources/Storage/Dataflow/dataflow_bqschema.json
@@ -0,0 +1,10 @@
{ "BigQuery Schema": [
{ "name": "index", "type": "STRING" },
{ "name": "symbol", "type": "STRING" },
{ "name": "timestamp", "type": "TIMESTAMP" },
{ "name": "open", "type": "FLOAT" },
{ "name": "close", "type": "FLOAT" },
{ "name": "high", "type": "FLOAT" },
{ "name": "low", "type": "FLOAT" },
{ "name": "volume", "type": "INTEGER" }
] }
19 changes: 19 additions & 0 deletions resources/Storage/Dataflow/transform.js
@@ -0,0 +1,19 @@
function transform(line) {

    // Skip the CSV header row
    if (line.trim().toLowerCase().startsWith('index')) {
        return;
    }
    var values = line.split(',');
    var obj = {};
    obj.INDEX = values[0];
    obj.SYMBOL = values[1];
    obj.TIMESTAMP = values[2];
    obj.OPEN = parseFloat(values[3]);
    obj.CLOSE = parseFloat(values[4]);
    obj.HIGH = parseFloat(values[5]);
    obj.LOW = parseFloat(values[6]);
    obj.VOLUME = parseInt(values[7], 10); // explicit radix avoids octal parsing surprises

    return JSON.stringify(obj);
}
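dataflow_bqschema.json and transform.js match the input contract of Google's GCS_Text_to_BigQuery Dataflow template (a JSON wrapper keyed by "BigQuery Schema" plus a named UDF). The commit doesn't show the job launch, so the wiring below is an assumption, and the in-bucket paths are illustrative:

# Sketch: run the classic GCS_Text_to_BigQuery template against the sink folder
gcloud dataflow jobs run sensex-csv-to-bq \
    --region=$REGION \
    --gcs-location=gs://dataflow-templates-$REGION/latest/GCS_Text_to_BigQuery \
    --parameters=javascriptTextTransformGcsPath=gs://$BUCKET_NAME/Dataflow/transform.js,\
JSONPath=gs://$BUCKET_NAME/Dataflow/dataflow_bqschema.json,\
javascriptTextTransformFunctionName=transform,\
inputFilePattern=gs://$BUCKET_NAME/$GCS_SINK_FOLDER/*.csv,\
outputTable=$PROJECT_NAME:$DATASET_NAME.$TABLE_NAME,\
bigQueryLoadingTemporaryDirectory=gs://$BUCKET_NAME/tmp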
9 changes: 0 additions & 9 deletions resources/Storage/snowflake-sink/apcha.csv

This file was deleted.
