Notes
Goals and objectives
airports dataset
1. data transformation
--> add new columns with extra information (e.g. 'City population', 'airport size')
--> add timestamp as new column
airport-comments dataset
1. data transformation
--> add timestamp as new column
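A minimal sketch of the "add timestamp as new column" step shared by both datasets. It uses only the standard library so it runs without extra packages; with pandas the same step would be a single column assignment. The sample rows, the column name "timestamp", and the helper add_timestamp_column are hypothetical, not taken from the real datasets.

```python
import csv
import io
from datetime import datetime, timezone


def add_timestamp_column(rows, column="timestamp", now=None):
    """Return copies of the row dicts with one extra timestamp column."""
    # One shared value per run, so every row of a load carries the same stamp.
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return [{**row, column: stamp} for row in rows]


# Hypothetical sample standing in for the airports dataset.
sample_csv = "name,city\nChangi,Singapore\nHeathrow,London\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))

# A fixed time keeps the example reproducible; omit `now` to use the current UTC time.
fixed_now = datetime(2023, 6, 28, tzinfo=timezone.utc)
stamped = add_timestamp_column(rows, now=fixed_now)
print(stamped[0])
```

The original rows are left untouched, so the same helper could be reused for the airport-comments dataset.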
Conda + VS Code
--> https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
--> https://code.visualstudio.com/docs/python/environments
--> https://datumorphism.leima.is/til/programming/python/python-anaconda-install-requirements/
--> https://linuxhint.com/conda-install-requirements-txt/
ex: conda install pandas
--> https://stackoverflow.com/questions/51042589/conda-version-pip-install-r-requirements-txt-target-lib
potential orchestration tools explored:
1. Apache Airflow
- Airflow with Astronomer
2. Mage.ai
https://docs.astronomer.io/learn/get-started-with-airflow
28/6/2023 - Experimented with Airflow on Astronomer using a newly developed DAG.
Reference on Issues Resolved
1. TypeError: QueryJob._blocking_poll() got an unexpected keyword argument 'retry'
- Resolved with --> https://stackoverflow.com/questions/74673148/bigquery-client-using-python-timeout-and-polling-issues
- pip3 install google-cloud-bigquery==3.0.1
2. gspread.exceptions.APIError: {'code': 400, 'message': 'Invalid value at \'data.values[0]\' (type.googleapis.com/google.protobuf.ListValue),
- Resolved by changing the call in gsheet.py to --> worksheet.update('A1', csv_data)
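The error in issue 2 points at data.values[0] not being a list, which suggests the payload was not shaped as a list of rows. A minimal sketch, assuming csv_data is meant to be a list of lists (one inner list per row). The helper csv_text_to_rows and the sample CSV text are hypothetical, and no gspread call is actually made here.

```python
import csv
import io


def csv_text_to_rows(csv_text):
    """Parse CSV text into a list of rows, each row a list of cell strings."""
    return [row for row in csv.reader(io.StringIO(csv_text))]


# Hypothetical CSV content standing in for the real export.
csv_text = 'airport,comment\nSIN,clean terminals\nLHR,"long, slow queues"\n'
csv_data = csv_text_to_rows(csv_text)
print(csv_data)

# With an authenticated gspread worksheet (not created in this sketch),
# the call noted above would then be:
#     worksheet.update('A1', csv_data)
```

Newer gspread releases may expect a different argument order for update, so it is worth checking the installed version before copying the call.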