This project involves an in-depth analysis of Airbnb listings in New York City. The dataset contains various attributes such as listing ID, name, host ID, host name, neighborhood group, neighborhood, latitude, longitude, room type, price, minimum nights, number of reviews, last review, and reviews per month. The main objective is to uncover insights related to revenue generation by neighborhood and room type.
- Python
- Pandas
- Matplotlib
- Seaborn
- Loaded the Airbnb dataset into a Pandas DataFrame.
- Explored the dataset to understand its structure and the type of data it contains.
- Calculated the number of Airbnb listings in each of the five Neighborhood Groups (Manhattan, Brooklyn, Queens, Bronx, Staten Island).
- Computed the percentage distribution of listings across the Neighborhood Groups.
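- Created a new field named "Revenue," calculated as the product of the "Price" and "Number_Of_Reviews" columns. Because review counts stand in for bookings, this is a proxy for revenue rather than actual earnings.
- Plotted a bar chart to show which Neighborhood Group has the highest average revenue.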
- Filtered the dataset to include only listings from Manhattan, Brooklyn, and Queens.
- Identified the top 3 revenue-generating neighborhoods in each of these three Neighborhood Groups.
- Further filtered the dataset to include only the top 3 neighborhoods in each of the three main Neighborhood Groups.
- Identified the top average revenue-generating room type in each of these neighborhoods and visualized this using a bar chart.
- Manhattan has the highest number of Airbnb listings and also generates the highest average revenue.
- Entire homes/apartments are generally the top revenue-generating room types across most neighborhoods.
- Among the neighborhoods analyzed, Williamsburg in Brooklyn and Harlem in Manhattan are the top revenue generators.
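
The analysis steps listed above (counting listings, deriving "Revenue," and ranking neighborhoods and room types) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code. The rows are made-up sample values. The column names `Neighborhood_Group`, `Neighborhood`, and `Room_Type` are assumptions; only "Price" and "Number_Of_Reviews" are named in this README. The sketch also assumes that "top 3" neighborhoods are ranked by total revenue.

```python
import pandas as pd

# Made-up sample rows standing in for the real dataset (illustrative values only).
# Column names other than "Price" and "Number_Of_Reviews" are assumptions.
listings = pd.DataFrame({
    "Neighborhood_Group": ["Manhattan", "Manhattan", "Manhattan", "Brooklyn",
                           "Brooklyn", "Queens", "Bronx", "Staten Island"],
    "Neighborhood": ["Harlem", "Harlem", "Midtown", "Williamsburg",
                     "Williamsburg", "Astoria", "Fordham", "St. George"],
    "Room_Type": ["Entire home/apt", "Private room", "Entire home/apt",
                  "Entire home/apt", "Private room", "Private room",
                  "Private room", "Entire home/apt"],
    "Price": [200, 80, 300, 150, 70, 60, 50, 100],
    "Number_Of_Reviews": [30, 20, 5, 30, 10, 15, 4, 2],
})

# Number of listings per Neighborhood Group and their percentage share.
counts = listings["Neighborhood_Group"].value_counts()
shares = listings["Neighborhood_Group"].value_counts(normalize=True).mul(100).round(1)

# Revenue proxy: price multiplied by number of reviews.
listings["Revenue"] = listings["Price"] * listings["Number_Of_Reviews"]

# Average revenue per Neighborhood Group, highest first.
avg_revenue_by_group = (
    listings.groupby("Neighborhood_Group")["Revenue"].mean().sort_values(ascending=False)
)

# Keep only the three main groups.
main_groups = ["Manhattan", "Brooklyn", "Queens"]
main = listings[listings["Neighborhood_Group"].isin(main_groups)]

# Top 3 neighborhoods per group, ranked by total revenue (an assumption).
neighborhood_revenue = (
    main.groupby(["Neighborhood_Group", "Neighborhood"])["Revenue"].sum().reset_index()
)
top_neighborhoods = (
    neighborhood_revenue.sort_values("Revenue", ascending=False)
    .groupby("Neighborhood_Group")
    .head(3)
)

# Restrict listings to those top neighborhoods; merging on both keys avoids
# mixing up neighborhoods that share a name across groups.
top_listings = main.merge(
    top_neighborhoods[["Neighborhood_Group", "Neighborhood"]],
    on=["Neighborhood_Group", "Neighborhood"],
)

# Room type with the highest average revenue in each top neighborhood.
room_type_revenue = (
    top_listings.groupby(["Neighborhood", "Room_Type"])["Revenue"].mean().reset_index()
)
best_room_type = room_type_revenue.loc[
    room_type_revenue.groupby("Neighborhood")["Revenue"].idxmax()
]

print(counts, shares, avg_revenue_by_group, best_room_type, sep="\n\n")
```

With Matplotlib installed, a call such as `avg_revenue_by_group.plot(kind="bar")` would give a bar chart of average revenue by Neighborhood Group.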
This analysis can be extended with time-series data to understand seasonal trends, and with additional data such as property amenities, host ratings, and guest reviews for a more comprehensive picture.
Christine Baxter
Data for this project was sourced from Airbnb listings in New York City.
The project leveraged Git and GitHub for version control, ensuring a systematic and collaborative approach to code development. The version control strategy is outlined below:
- Initialization: Created an initial `develop` branch as the base for development.
- Feature Branches: For each major stage of the project (Initial Setup, Data Loading, Exploratory Data Analysis), a dedicated feature branch was created off of the `develop` branch.
- Commit Changes: After the completion of each stage, all changes were committed to the respective feature branch, encapsulating the progress in a version-controlled manner.
- Code Reviews and Merges: The feature branch was then merged into the `develop` branch post-review. This ensures that the `develop` branch always contains the most recent, stable version of the project.
- Creating New Feature Branches: A new feature branch was created from the updated `develop` branch for the next stage of the project.
- Final Merge: Upon completion of all stages, the `develop` branch was merged into the `main` branch, signifying the conclusion of the project.
- Comprehensive comments were added at each stage for clarity and future reference.
By adhering to this workflow, the project maintained a high level of code integrity, streamlined collaboration, and enabled seamless tracking of project milestones.