In this project, we take IPL data from 2008 to 2016 and analyze it using Hadoop concepts, implemented in Hive.

nandhakumarss/IPL-Data-Analysis-Using-HIVE

IPL-Data-Analysis-Using-HIVE

IPL DATA ANALYSIS USING HIVE

ABSTRACT

The Indian Premier League (IPL), also known as TATA IPL for sponsorship reasons, is a men's T20 franchise cricket league in India. It is contested annually by ten teams based in seven Indian cities and three Indian states. The league was founded by the Board of Control for Cricket in India (BCCI) in 2007. Brijesh Patel is the incumbent chairman of the IPL. It is usually held in summer across India, between March and May, and has an exclusive window in the ICC Future Tours Programme. The IPL is the most-attended cricket league in the world, and in 2014 it ranked sixth by average attendance among all sports leagues. In 2010, the IPL became the first sporting event in the world to be broadcast live on YouTube. From its inaugural season in 2008 through the recently concluded 2022 season, there have been various winners, with the Chennai and Mumbai franchises winning the title multiple times. In this project, we take IPL data from 2008 to 2016 and analyze it using Hadoop concepts, implemented in Hive and Pig.

ABOUT THE DATA

Two cricket data files containing Indian Premier League data from 2008 to 2016 are used as the data source:

1. matches.csv – details about each match played
2. deliveries.csv – details of every delivery, consolidated across all matches
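As a rough illustration of how the two files could be exposed to Hive, the sketch below declares one table per file and loads the CSVs into them. The table names, the column names and their order, and the `/path/to/...` locations are all assumptions made for this example (the column layout is modeled on a commonly distributed public IPL dataset), so they should be checked against the actual file headers. `OpenCSVSerde` is used so that quoted fields containing commas are parsed correctly; it reads every column as a string, so numeric values are cast at query time.

```sql
-- Assumed layout of matches.csv; adjust names and order to the real header.
CREATE TABLE IF NOT EXISTS matches (
  id STRING, season STRING, city STRING, match_date STRING,
  team1 STRING, team2 STRING, toss_winner STRING, toss_decision STRING,
  result STRING, dl_applied STRING, winner STRING,
  win_by_runs STRING, win_by_wickets STRING, player_of_match STRING,
  venue STRING, umpire1 STRING, umpire2 STRING, umpire3 STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
TBLPROPERTIES ('skip.header.line.count' = '1');  -- ignore the CSV header row

-- Assumed layout of deliveries.csv; adjust names and order to the real header.
CREATE TABLE IF NOT EXISTS deliveries (
  match_id STRING, inning STRING, batting_team STRING, bowling_team STRING,
  over_no STRING, ball_no STRING, batsman STRING, non_striker STRING,
  bowler STRING, is_super_over STRING, wide_runs STRING, bye_runs STRING,
  legbye_runs STRING, noball_runs STRING, penalty_runs STRING,
  batsman_runs STRING, extra_runs STRING, total_runs STRING,
  player_dismissed STRING, dismissal_kind STRING, fielder STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
TBLPROPERTIES ('skip.header.line.count' = '1');

-- Placeholder paths; point these at the local copies of the two files.
LOAD DATA LOCAL INPATH '/path/to/matches.csv' OVERWRITE INTO TABLE matches;
LOAD DATA LOCAL INPATH '/path/to/deliveries.csv' OVERWRITE INTO TABLE deliveries;
```

A quick `SELECT COUNT(*) FROM matches;` after loading is a simple way to confirm that the header row was skipped and the rows were parsed.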

These files are extracted and loaded into Hive. The data is then processed, transformed, and analyzed to find the winner of each season and the top 5 batsmen by runs scored, both in each season and across all seasons.
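The analysis described above might look roughly like the following queries. This is a minimal sketch, not the project's actual scripts: it assumes Hive tables named `matches` (with string columns `id`, `season`, `winner`) and `deliveries` (with string columns `match_id`, `batsman`, `batsman_runs`), and it assumes that the final of each season is the match with the highest `id` in that season. If the data does not satisfy that assumption, the ordering column in the first query needs to change.

```sql
-- Season winners: take the winner of the last match (assumed to be the final)
-- in each season.
SELECT season, winner
FROM (
  SELECT season, winner,
         ROW_NUMBER() OVER (PARTITION BY season
                            ORDER BY CAST(id AS INT) DESC) AS rn
  FROM matches
) ranked
WHERE rn = 1
ORDER BY season;

-- Top 5 batsmen by runs in each season: total runs per (season, batsman),
-- then rank the batsmen within each season.
SELECT season, batsman, runs_scored
FROM (
  SELECT season, batsman, runs_scored,
         ROW_NUMBER() OVER (PARTITION BY season
                            ORDER BY runs_scored DESC) AS rn
  FROM (
    SELECT m.season AS season,
           d.batsman AS batsman,
           SUM(CAST(d.batsman_runs AS INT)) AS runs_scored
    FROM deliveries d
    JOIN matches m ON d.match_id = m.id
    GROUP BY m.season, d.batsman
  ) season_totals
) ranked
WHERE rn <= 5
ORDER BY season, runs_scored DESC;

-- Top 5 batsmen by runs across all seasons.
SELECT batsman,
       SUM(CAST(batsman_runs AS INT)) AS runs_scored
FROM deliveries
GROUP BY batsman
ORDER BY runs_scored DESC
LIMIT 5;
```

`ROW_NUMBER()` breaks ties arbitrarily, so two batsmen with equal totals at the fifth position will not both appear; `RANK()` could be used instead if ties should be kept.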

AIM:

To find information about particular players and to identify the winner of each IPL edition from 2008 to 2016, by analysing the two given datasets using Hive on Hadoop.
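For the player-information part of the aim, a per-player lookup could be sketched as below. It assumes the same hypothetical `matches` and `deliveries` tables and column names (`id`, `season`, `match_id`, `batsman`, `batsman_runs`) as an illustration only, and `'PLAYER NAME'` is a placeholder to be replaced with a name spelled exactly as it appears in deliveries.csv.

```sql
-- Season-by-season batting summary for one player.
SELECT m.season AS season,
       COUNT(DISTINCT d.match_id) AS matches_batted,      -- matches in which the player faced a ball
       SUM(CAST(d.batsman_runs AS INT)) AS runs_scored    -- runs credited to the batsman
FROM deliveries d
JOIN matches m ON d.match_id = m.id
WHERE d.batsman = 'PLAYER NAME'   -- placeholder, not a real value
GROUP BY m.season
ORDER BY season;
```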
