forked from Karveandhan/rental-price
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path05-results.Rmd
305 lines (239 loc) · 12.6 KB
/
05-results.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
# Results
## Top 5 Cities with high avg rental prices (2020)
```{r, fig.width=20, fig.height=8}
library(dplyr)
library(tidyr)
library(ggplot2)
data <- read.csv("data.csv",header = TRUE, sep=",")
data$mean_2020 <- rowMeans(data[,c('X2020.01','X2020.02','X2020.03','X2020.04','X2020.05','X2020.06','X2020.07','X2020.08','X2020.09','X2020.10','X2020.11','X2020.12')], na.rm=TRUE)
data %>%
filter(mean_2020 >= 2400L & mean_2020 <= 3040L) %>%
ggplot() +
aes(x = RegionName, weight = mean_2020) +
geom_bar(fill = "#112446") +
labs(
x = "Cities",
y = "Rental price",
title = "Top 5 Cities with high avg rental prices (2020)"
) +
theme_minimal(base_size = 30)
```
## Top 5 Cities with high avg rental prices (2021)
```{r, fig.width=20, fig.height=8}
data$mean_2021 <- rowMeans(data[,c('X2021.01','X2021.02','X2021.03','X2021.04','X2021.05','X2021.06','X2021.07','X2021.08','X2021.09','X2021.10')], na.rm=TRUE)
data %>%
filter(mean_2021 >= 2499L) %>%
ggplot() +
aes(x = RegionName, weight = mean_2021) +
geom_bar(fill = "#112446") +
labs(
x = "Cities",
y = "Rental price",
title = "Top 5 Cities with high avg rental prices (2021)"
)+
theme_minimal(base_size = 30)
```
When both plots are being compared, we observe that the top 5 rental price of both the years have been the same city. While we observe
only a marginal increase or decrease in Boston, New York, San Francisco, and San Jose, we observe a substantial increase in the rental price of the city Ventura, CA
## Histogram of NY Rental prices
```{r, fig.width=20, fig.height=8}
ny_data<-data %>%
filter(RegionName=='New York, NY')
ny_data<-as.data.frame(t(ny_data))
ny_data<-as.data.frame(ny_data[-c(1, 2, 3),])
ny_data<-as.data.frame(as.numeric(unlist(ny_data)))
ggplot(ny_data) +
aes(x = `as.numeric(unlist(ny_data))`) +
geom_histogram(bins = 20L, fill = "#112446") +
labs(x = "Rental Price", y = "Frequency", title = "Histogram of NY Rental prices (using monthwise mean)") +
theme_minimal(base_size = 30)
```
The above graph depicts the spread of the NY Rental prices data. It is clear that most of the data lies between the price 2600 and 2750, which tells us that the NY rental market is not at all cheap, though there are few exeptions.
## The change in NY Rental Prices for the same months (Comparing Year 2019 with 2020)
```{r, fig.width=20, fig.height=8}
ny_data1<-data %>%
filter(RegionName=='New York, NY')
ny_data1<-as.data.frame(t(ny_data1))
ny_data1<-as.data.frame(ny_data1[c(64:75),])
ny_data1<-as.data.frame(as.numeric(unlist(ny_data1)))
names(ny_data1)[names(ny_data1) == 'as.numeric(unlist(ny_data1))'] <- 'Price'
ny_data2<-data %>%
filter(RegionName=='New York, NY')
ny_data2<-as.data.frame(t(ny_data2))
ny_data2<-as.data.frame(ny_data2[c(76:87),])
ny_data2<-as.data.frame(as.numeric(unlist(ny_data2)))
names(ny_data2)[names(ny_data2) == 'as.numeric(unlist(ny_data2))'] <- 'Price'
ny_data3<-ny_data2-ny_data1
monlab=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec")
ny_data3$label=factor(monlab, levels = monlab)
ny_data3 <- mutate(ny_data3, condition=if_else(Price>0, "Increase", "Decrease"))
ny_data3 <- mutate(ny_data3, loc="New York")
ggplot(data=ny_data3,aes(x=label,y=Price, fill=condition))+
labs(x = "Months", y = "Price")+
geom_col(position = "identity")+
theme(axis.text.x=element_text(angle=45, hjust=1))+
xlab("Months (Comparing 2019 and 2020)")+
ylab("Price Increase or Decrease in $")
```
While the usual trend remains increasing as years increase, there are some strange results obtained when comparing the year 2019 and 2020. Comparing the change in price for the same month with its previous year, we observe that the rental price have substantially decreased. This trend has never been observed in any of the previous years. The cause of this impact to most likely due to covid19.But during the recovery stages of COVID19 (November, December), we could observe that the reduction in prices have started decreasing.
## The change in NY Rental Prices for the same months (Comparing Year 2020 with 2021)
```{r, fig.width=20, fig.height=8}
ny_data1<-data %>%
filter(RegionName=='New York, NY')
ny_data1<-as.data.frame(t(ny_data1))
ny_data1<-as.data.frame(ny_data1[c(76:85),])
ny_data1<-as.data.frame(as.numeric(unlist(ny_data1)))
names(ny_data1)[names(ny_data1) == 'as.numeric(unlist(ny_data1))'] <- 'Price'
ny_data2<-data %>%
filter(RegionName=='New York, NY')
ny_data2<-as.data.frame(t(ny_data2))
ny_data2<-as.data.frame(ny_data2[c(88:97),])
ny_data2<-as.data.frame(as.numeric(unlist(ny_data2)))
names(ny_data2)[names(ny_data2) == 'as.numeric(unlist(ny_data2))'] <- 'Price'
ny_data3<-ny_data2-ny_data1
monlab=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct")
ny_data3$label=factor(monlab, levels = monlab)
ny_data3 <- mutate(ny_data3, condition=if_else(Price>0, "Increase", "Decrease"))
ny_data3 <- mutate(ny_data3, loc="New York")
ggplot(data=ny_data3,aes(x=label,y=Price, fill=condition))+
labs(x = "Months", y = "Price")+
geom_col(position = "identity")+
theme(axis.text.x=element_text(angle=45, hjust=1))+
xlab("Months (Comparing 2020 and 2021)")+
ylab("Price Increase or Decrease in $")
```
When compared to the year 2020, we see the rental prices have started increasing. While the pandemic has made a huge negative impact on rental prices in the year 2020, it is clearly visible that the prices are marching back to normal and the trend seems to be carried for every next month as well. So, New York is marching towards the pre-pandemic normal.
## Comparing the impact of covid on rental pricing within top 5 cities (New York and Boston)
```{r, fig.width=20, fig.height=8}
bo_data1<-data %>%
filter(RegionName=='Boston, MA')
bo_data1<-as.data.frame(t(bo_data1))
bo_data1<-as.data.frame(bo_data1[c(64:75),])
bo_data1<-as.data.frame(as.numeric(unlist(bo_data1)))
names(bo_data1)[names(bo_data1) == 'as.numeric(unlist(bo_data1))'] <- 'Price'
bo_data2<-data %>%
filter(RegionName=='Boston, MA')
bo_data2<-as.data.frame(t(bo_data2))
bo_data2<-as.data.frame(bo_data2[c(76:87),])
bo_data2<-as.data.frame(as.numeric(unlist(bo_data2)))
names(bo_data2)[names(bo_data2) == 'as.numeric(unlist(bo_data2))'] <- 'Price'
bo_dat3<-bo_data2-bo_data1
monlab=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec")
bo_dat3$label=factor(monlab, levels = monlab)
bo_dat3 <- mutate(bo_dat3, condition=if_else(Price>0, "Increase", "Decrease"))
bo_dat3 <- mutate(bo_dat3, loc="Boston")
ny_data1<-data %>%
filter(RegionName=='New York, NY')
ny_data1<-as.data.frame(t(ny_data1))
ny_data1<-as.data.frame(ny_data1[c(64:75),])
ny_data1<-as.data.frame(as.numeric(unlist(ny_data1)))
names(ny_data1)[names(ny_data1) == 'as.numeric(unlist(ny_data1))'] <- 'Price'
ny_data2<-data %>%
filter(RegionName=='New York, NY')
ny_data2<-as.data.frame(t(ny_data2))
ny_data2<-as.data.frame(ny_data2[c(76:87),])
ny_data2<-as.data.frame(as.numeric(unlist(ny_data2)))
names(ny_data2)[names(ny_data2) == 'as.numeric(unlist(ny_data2))'] <- 'Price'
ny_data3<-ny_data2-ny_data1
monlab=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec")
ny_data3$label=factor(monlab, levels = monlab)
ny_data3 <- mutate(ny_data3, condition=if_else(Price>0, "Increase", "Decrease"))
ny_data3 <- mutate(ny_data3, loc="New York")
ggplot()+
labs(x = "Months", y = "Price")+
geom_col(data=bo_dat3,aes(x=label,y=Price, fill=loc),position = "identity",alpha=0.8)+
geom_col(data=ny_data3,aes(x=label,y=Price, fill=loc),position = "identity",alpha=0.5)+
theme(axis.text.x=element_text(angle=45, hjust=1))+
xlab("Months (Comparing 2019 and 2020)")+
ylab("Price Increase or Decrease in $")
```
```{r, fig.width=20, fig.height=8}
bo_data1<-data %>%
filter(RegionName=='Boston, MA')
bo_data1<-as.data.frame(t(bo_data1))
bo_data1<-as.data.frame(bo_data1[c(76:85),])
bo_data1<-as.data.frame(as.numeric(unlist(bo_data1)))
names(bo_data1)[names(bo_data1) == 'as.numeric(unlist(bo_data1))'] <- 'Price'
bo_data2<-data %>%
filter(RegionName=='Boston, MA')
bo_data2<-as.data.frame(t(bo_data2))
bo_data2<-as.data.frame(bo_data2[c(88:97),])
bo_data2<-as.data.frame(as.numeric(unlist(bo_data2)))
names(bo_data2)[names(bo_data2) == 'as.numeric(unlist(bo_data2))'] <- 'Price'
bo_dat3<-bo_data2-bo_data1
monlab=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct")
bo_dat3$label=factor(monlab, levels = monlab)
bo_dat3 <- mutate(bo_dat3, condition=if_else(Price>0, "Increase", "Decrease"))
bo_dat3 <- mutate(bo_dat3, loc="Boston")
ny_data1<-data %>%
filter(RegionName=='New York, NY')
ny_data1<-as.data.frame(t(ny_data1))
ny_data1<-as.data.frame(ny_data1[c(76:85),])
ny_data1<-as.data.frame(as.numeric(unlist(ny_data1)))
names(ny_data1)[names(ny_data1) == 'as.numeric(unlist(ny_data1))'] <- 'Price'
ny_data2<-data %>%
filter(RegionName=='New York, NY')
ny_data2<-as.data.frame(t(ny_data2))
ny_data2<-as.data.frame(ny_data2[c(88:97),])
ny_data2<-as.data.frame(as.numeric(unlist(ny_data2)))
names(ny_data2)[names(ny_data2) == 'as.numeric(unlist(ny_data2))'] <- 'Price'
ny_data3<-ny_data2-ny_data1
monlab=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct")
ny_data3$label=factor(monlab, levels = monlab)
ny_data3 <- mutate(ny_data3, condition=if_else(Price>0, "Increase", "Decrease"))
ny_data3 <- mutate(ny_data3, loc="New York")
ggplot()+
labs(x = "Months", y = "Price")+
geom_col(data=bo_dat3,aes(x=label,y=Price, fill=loc),position = "identity",alpha=0.8)+
geom_col(data=ny_data3,aes(x=label,y=Price, fill=loc),position = "identity",alpha=0.5)+
theme(axis.text.x=element_text(angle=45, hjust=1))+
xlab("Months (Comparing 2020 and 2021)")+
ylab("Price Increase or Decrease in $")
```
We observe that the due to covid, there has been a drop in rental pricing for Boston as is for New York. But in New York, there has been a tremendous decrease rapidly when compared to Boston. The decrease began for Boston during May 2020, while it started to decrease in New York by March 2020. And the decrease value for New York is almost more than twice than the drop at Boston. During the road to recovery,Boston's rental price stated increasing by May 2021, while for New York it started by June 2021. Though New York took time to reach this point, now the rental price have increased beyond Boston during the month of October 2021. So, New York has almost reached back to it's pre-covid situation.
## NY Rental Price (2014 - 2021)
```{r, fig.width=20, fig.height=8}
ny_data0<-data %>%
filter(RegionName=='New York, NY')
ny_data0<-as.data.frame(t(ny_data0))
library(tibble)
ny_data0 <- tibble::rownames_to_column(ny_data0, "VALUE")
ny_data0<-as.data.frame(ny_data0[-c(1, 2, 3),])
ny_data0<-head(ny_data0,-2)
ny_data0$V1<-as.numeric(unlist(ny_data0$V1))
library(ggplot2)
ggplot(ny_data0) +
aes(x = VALUE, y = V1) +
geom_boxplot(shape = "circle", fill = "#112446") +
labs(x = "2014 to 2021",
y = "Rental Price", title = "NY Rental Price (2014 - 2021)") +
theme_minimal()
```
From the above graph, we can clearly see the drop of the rental prices in New York during the pandemic time. And it is also clear that it is again taking a correction and coping up with the old prices.
## Pre-Pandemic Analysis($ Inc/Dec YoY for all the months):
```{r}
ny_data1<-data %>%
filter(RegionName=='New York, NY')
ny_data1<-as.data.frame(t(ny_data1))
ny_data1<-as.data.frame(ny_data1[c(52:63),])
ny_data1<-as.data.frame(as.numeric(unlist(ny_data1)))
names(ny_data1)[names(ny_data1) == 'as.numeric(unlist(ny_data1))'] <- 'Price'
ny_data2<-data %>%
filter(RegionName=='New York, NY')
ny_data2<-as.data.frame(t(ny_data2))
ny_data2<-as.data.frame(ny_data2[c(64:75),])
ny_data2<-as.data.frame(as.numeric(unlist(ny_data2)))
names(ny_data2)[names(ny_data2) == 'as.numeric(unlist(ny_data2))'] <- 'Price'
ny_data3<-ny_data2-ny_data1
monlab=c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec")
ny_data3$label=factor(monlab, levels = monlab)
ny_data3 <- mutate(ny_data3, condition=if_else(Price>0, "Increase", "Decrease"))
ny_data3 <- mutate(ny_data3, loc="New York")
ggplot(data=ny_data3,aes(x=label,y=Price, fill=condition))+
labs(x = "Months", y = "Price")+
geom_col(position = "identity")+
theme(axis.text.x=element_text(angle=45, hjust=1))+
xlab("Months (Comparing 2018 and 2019)")+
ylab("Price Increase or Decrease in $")
```
When we compared the prices of New York apartment in 2018 and 2019, it is very clear that there has never been a fall in price for the same month in the consecutive year before the pandemic. And it is the pandemic that has caused the rental prices at New York to fall down during the year 2020. Also, the peak time during which the rental prices hike up is predominant during July (Fall) and the increase in prices within a year are less when it reaches winter (December).