Big Data Coursework

The objective of this project is to build a search engine that takes keywords entered by the user and lists the top ten matching results from a database of articles. The project is implemented with Spark Map-Reduce.

Instructions

Note that this setup is intended for testing only. Before you start, make sure Hadoop is already installed on your machine.

Store the document file at $INPUT_DOC_PATH;
Run mvn install;
Run hadoop jar $PATH_TO_UoG-BD-MR-1.0-SNAPSHOT.jar MapReduce.DataPreprocess $INPUT_DOC_PATH $PREPROCESS_OUTDIR;
Then run hadoop jar $PATH_TO_UoG-BD-MR-1.0-SNAPSHOT.jar MapReduce.PageRanking $PREPROCESS_OUTDIR $PAGERANKING_OUTDIR;
If everything works, the result will be stored in $PAGERANKING_OUTDIR.
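
The steps above can be sketched as a small shell function. This is only an illustration, not part of the repository: the function name run_pipeline and its four positional arguments (jar path, input document path, preprocessing output directory, ranking output directory) are assumptions made here. It also assumes that mvn and hadoop are on the PATH and that it is run from the project root.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not part of the repository): builds the project,
# then runs the two jobs named in the instructions, in order.
#   $1  path to the built jar
#   $2  input document path
#   $3  output directory for the preprocessing job
#   $4  output directory for the ranking job
run_pipeline() {
  if [ "$#" -ne 4 ]; then
    echo "usage: run_pipeline JAR INPUT_DOC_PATH PREPROCESS_OUTDIR PAGERANKING_OUTDIR" >&2
    return 2
  fi
  local jar="$1" input="$2" preprocess_out="$3" ranking_out="$4"

  # Build the jar with Maven.
  mvn install

  # Step 1: preprocess the raw documents.
  hadoop jar "$jar" MapReduce.DataPreprocess "$input" "$preprocess_out"

  # Step 2: rank, using the preprocessing output as input.
  hadoop jar "$jar" MapReduce.PageRanking "$preprocess_out" "$ranking_out"
}
```

Once the function is defined, a call might look like run_pipeline target/UoG-BD-MR-1.0-SNAPSHOT.jar "$INPUT_DOC_PATH" "$PREPROCESS_OUTDIR" "$PAGERANKING_OUTDIR". The target/ location is an assumption based on Maven's default build directory. Because the function only runs the second job after the first command returns successfully (set -e), a failed preprocessing step stops the pipeline early.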

SuprajaKalva/big_data_assessed_exercise
