K-means clustering algorithm using MapReduce.
Updated Sep 14, 2024 - Python
Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University
Mastering Data Science
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real-world tasks…
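The map and reduce roles described above can be illustrated with a word count. This is only a minimal single-process sketch: `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical names chosen for illustration, and the in-memory grouping step stands in for the shuffle that a real framework such as Hadoop performs across machines.

```python
from collections import defaultdict

def map_fn(_key, line):
    """Map: emit an intermediate (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: merge all intermediate values that share the same key."""
    return word, sum(counts)

def run_mapreduce(records, map_fn, reduce_fn):
    """Single-process stand-in for the framework: map, group by key, reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

# Input records are (line_number, line_text) key/value pairs.
lines = ["the quick brown fox", "the lazy dog", "The fox"]
word_counts = run_mapreduce(enumerate(lines), map_fn, reduce_fn)
print(word_counts)
```

Because the reduce function only ever sees the values for one key, a framework is free to run many reducers in parallel, one per group of keys.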
This repository develops a basic search engine that uses Hadoop's MapReduce framework to index and process large text corpora efficiently. The dataset is a 5.2 GB subset of the English Wikipedia dump. The project focuses on implementing a naive search algorithm to address challenges in information.
BigData Workshop - Python MapReduce for word frequency analysis on varied datasets.
Modified from big-data-europe/docker-hadoop
MapReduce programs written in Java with minimal complexity.
Big Data analysis project using MapReduce in Python to process movie ratings. Includes scripts for aggregating ratings and identifying the most rated movies, demonstrating data analysis on a large scale.
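A rating aggregation of this kind might look like the sketch below. It is not the repository's code: the `(user_id, movie_id, rating)` record layout, the sample data, and the function names are all assumptions made for illustration, and the grouping loop stands in for the framework's shuffle phase.

```python
from collections import defaultdict

# Hypothetical input: (user_id, movie_id, rating) tuples.
ratings = [
    (1, "m1", 4.0),
    (2, "m1", 5.0),
    (1, "m2", 3.0),
    (3, "m1", 3.0),
    (2, "m3", 2.0),
    (3, "m2", 5.0),
]

def map_rating(record):
    """Map: key each rating by its movie so ratings can be grouped per movie."""
    _user_id, movie_id, rating = record
    yield movie_id, rating

def reduce_ratings(movie_id, values):
    """Reduce: return (number of ratings, average rating) for one movie."""
    return movie_id, (len(values), sum(values) / len(values))

# Group intermediate values by key (the shuffle step in a real framework).
groups = defaultdict(list)
for record in ratings:
    for movie_id, rating in map_rating(record):
        groups[movie_id].append(rating)

stats = dict(reduce_ratings(m, v) for m, v in groups.items())

# The most rated movie is the one with the highest rating count.
most_rated = max(stats, key=lambda m: stats[m][0])
print(stats, most_rated)
```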
Computes PageRank on the WDC data using MapReduce.
Pulled 10 GB of Yelp Business data through the terminal via the Kaggle API. The data was then pushed to an AWS S3 bucket for storage and analyzed on an Elastic MapReduce cluster in a Jupyter Notebook using PySpark.
Project on MapReduce for the Μ111 - Big Data Management course, NKUA, Spring 2023.
Big data training material
Average age of male and female passengers who died on the Titanic, computed with MapReduce in Python.
Sum and count of odd and even numbers using a MapReduce program in Python.
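One possible shape for such a program is sketched below, assuming the input is a plain list of integers. The names `map_parity` and `reduce_parity` are hypothetical, and everything runs in one process rather than on a cluster.

```python
from collections import defaultdict

def map_parity(n):
    """Map: key each number by its parity."""
    yield ("even" if n % 2 == 0 else "odd"), n

def reduce_parity(key, values):
    """Reduce: compute the sum and count of all numbers with this parity."""
    return key, {"sum": sum(values), "count": len(values)}

numbers = [1, 2, 3, 4, 5, 6, 7]

# Group mapped values by key, standing in for the shuffle phase.
groups = defaultdict(list)
for n in numbers:
    for key, value in map_parity(n):
        groups[key].append(value)

parity_stats = dict(reduce_parity(k, v) for k, v in groups.items())
print(parity_stats)
```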
During this lab, I worked on big data by doing some small exercises with map and reduce.
During this lab, I worked on big data by doing some harder exercises with map and reduce.
Apache Hadoop docker image | Running Python MapReduce