Analysing the Boston Data for Airbnb. The accompanying blog post is here: https://medium.com/@aj455/boston-airbnb-data-analysis-43c38b1cfae5
This project was created as part of Udacity's Data Scientist nanodegree. Here I have analyzed the Boston Airbnb Open Data following the CRISP-DM methodology. Airbnb data for other cities has the same format, so the same approach and code can be applied to the Airbnb dataset of any other city.
The business questions I have tried to answer in this project are as follows:
- What are the most common listing prices on Airbnb?
- What is the relation between price and property type?
- Which room types in each neighbourhood have high prices?
- What are the top 5 amenities?
The data is the Boston Airbnb dataset from Kaggle: https://www.kaggle.com/airbnb/boston. The following Airbnb activity is included in this Boston dataset:
- Listings, including full descriptions and average review score
- Reviews, including unique id for each reviewer and detailed comments
- Calendar, including listing id and the price and availability for that day
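Before prices can be analyzed they usually need to be converted to numbers. The following is a minimal sketch, assuming the price column is stored as text such as "$1,250.00" and that missing prices should become NaN. The helper name parse_price and the sample rows are made up for illustration and are not taken from the notebook.

```python
import pandas as pd


def parse_price(prices: pd.Series) -> pd.Series:
    """Convert price strings such as "$1,250.00" to floats (hypothetical helper)."""
    # Drop the currency symbol and thousands separators, then parse.
    cleaned = prices.astype(str).str.replace(r"[$,]", "", regex=True)
    # Anything that still is not a number (e.g. a missing value) becomes NaN.
    return pd.to_numeric(cleaned, errors="coerce")


# Toy rows standing in for the calendar file; the values are invented.
calendar = pd.DataFrame({
    "listing_id": [1, 1, 2],
    "date": ["2017-01-01", "2017-01-02", "2017-01-01"],
    "available": ["t", "f", "t"],
    "price": ["$65.00", None, "$1,250.00"],
})
calendar["price"] = parse_price(calendar["price"])
print(calendar)
```

With the real data, the same function could be applied to the price column after loading a file with pd.read_csv.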
The following libraries are used:
- NumPy
- pandas
- Matplotlib (pyplot)
- Seaborn
- scikit-learn
The code requires Python 3 and Jupyter Notebook.
The main findings are:
- Most listings are priced in the 50-200 USD range, with the highest price being 4,000 USD.
- Property types like bed and breakfast are the cheapest, and shared rooms are generally the cheapest room type.
- Entire homes/apartments are the most expensive room type in each neighbourhood, and South Boston Waterfront is the most expensive neighbourhood at 306 USD.
- The top 5 amenities are Wireless Internet, Heating, Kitchen, Essentials, and Smoke Detector.
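Numbers like the ones above can be computed with a few pandas aggregations. The sketch below runs on a tiny invented table, not on the real data, and is not necessarily how the notebook does it. It assumes the listings file has the columns neighbourhood_cleansed, property_type, room_type, price (already numeric) and amenities, that amenities are stored as text like {TV,"Wireless Internet",Kitchen}, and that amenity names contain no commas.

```python
from collections import Counter

import pandas as pd

# Toy listings with invented values; the column names are assumptions.
listings = pd.DataFrame({
    "neighbourhood_cleansed": ["Back Bay", "Back Bay", "Roxbury", "Roxbury"],
    "property_type": ["Apartment", "House", "Apartment", "House"],
    "room_type": ["Entire home/apt", "Private room",
                  "Entire home/apt", "Shared room"],
    "price": [250.0, 120.0, 150.0, 40.0],
    "amenities": [
        '{TV,"Wireless Internet",Kitchen,Heating}',
        '{"Wireless Internet",Heating,Essentials}',
        '{"Wireless Internet",Kitchen,"Smoke Detector"}',
        '{Heating,Essentials}',
    ],
})

# 1. Most common prices: count listings per price bin.
bins = [0, 50, 100, 150, 200, 300]
price_counts = pd.cut(listings["price"], bins=bins).value_counts().sort_index()

# 2. Price vs. property type: median price per type, cheapest first.
median_by_property = (
    listings.groupby("property_type")["price"].median().sort_values()
)

# 3. Price per neighbourhood and room type: mean price in each cell.
mean_by_area_room = listings.pivot_table(
    index="neighbourhood_cleansed",
    columns="room_type",
    values="price",
    aggfunc="mean",
)


# 4. Top amenities: split each amenities string and count the names.
def split_amenities(raw: str) -> list:
    """Turn '{TV,"Wireless Internet"}' into ['TV', 'Wireless Internet']."""
    items = raw.strip("{}").split(",")
    return [item.strip().strip('"') for item in items if item.strip()]


amenity_counts = Counter(
    name for raw in listings["amenities"] for name in split_amenities(raw)
)
top_amenities = amenity_counts.most_common(5)

print(price_counts, median_by_property, mean_by_area_room, top_amenities,
      sep="\n\n")
```

The median is used for the property-type comparison because a few very expensive listings can pull the mean up; either statistic could be swapped in depending on the question.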
The notebook was developed in Google Colab, but it can also be run locally using the jupyter notebook command.
Thanks to Kaggle and Airbnb for the dataset and Udacity for the course.