The project is focused on parallelising pre-processing, measuring and machine learning in the cloud, as well as the evaluation and analysis of the cloud performance.
Updated Jul 6, 2022 - Jupyter Notebook
Digital Innovation One - GCP Dataproc Challenge. The challenge consists of processing data with GCP's Dataproc product. The job counts the words in a book and reports how many times each word appears in it.
Logistic regression modeling of swing state voter turnout to support U.S. political campaign proposals
Monte Carlo simulations with PySpark on GCP Cloud Dataproc clusters
GCP Dataproc MapReduce sample with PySpark
Repository for following the Udemy course "Airflow2.0 De 0 a Héroe", from the academy "Datapath".