A Professional Data Engineer enables data-driven decision making by collecting, transforming, and publishing data. A Data Engineer should be able to design, build, operationalize, secure, and monitor data processing systems with a particular emphasis on security and compliance; scalability and efficiency; reliability and fidelity; and flexibility and portability. A Data Engineer should also be able to leverage, deploy, and continuously train pre-existing machine learning models.
The GCP Professional Data Engineer exam assesses your ability to:
- Design data processing systems
- Build and operationalize data processing systems
- Operationalize machine learning models
- Ensure solution quality
The Theory chapter presents my personal notes, which summarize everything one needs to know to prepare for this exam/certification. Then, in the Practice section, you'll find all the capstone projects I've completed to master the different GCP components.
- All Services
- Selection of the appropriate Storage Technology
- Build a Storage System & Operations
- Design Data Pipelines
- Data Processing Solutions (to be continued)
- Build an Infrastructure & Operations
- Security & Compliance (to be continued)
- DB - Reliability, Scalability & Availability
- Flexibility & Portability
- ML Pipelines
- Choosing the Appropriate Infrastructure
- Measure, Monitor & Troubleshoot ML
- Prebuilt ML Models as a Service
- Hadoop & Differences with GCP components
The ML part is not developed here since I'm already familiar with the Data Science concepts. In any case, you can refer to the Machine Learning cheatsheets for Stanford's CS 229 and a local backup here
- will be updated later
- Qwiklabs
- GCP Essentials - Official links - Local working dir
- Baseline: Data, ML, AI - Official links - Local working dir
- Data Engineering - Official links - Local working dir
- Engineer Data in Google Cloud - Official links - Local working dir
- GCP solutions