-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathrun_analysis.Rmd
89 lines (61 loc) · 2.38 KB
/
run_analysis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
---
title: "run_analysis"
author: "shradheya"
date: "3/31/2020"
output: html_document
---
# Starting with loading "dplyr.
```{r}
#Load library
library(dplyr)
```
# Loading data.
```{r}
#Reading train set using read.table function because it is a tabular data.
train_x <- read.table("getdata_projectfiles_UCI HAR Dataset/UCI HAR Dataset/train/X_train.txt")
train_y <- read.table("getdata_projectfiles_UCI HAR Dataset/UCI HAR Dataset/train/y_train.txt")
train_sub <- read.table("getdata_projectfiles_UCI HAR Dataset/UCI HAR Dataset/train/subject_train.txt")
#Reading test set using read.table function because it is a tabular data.
test_x <- read.table("getdata_projectfiles_UCI HAR Dataset/UCI HAR Dataset/test/X_test.txt")
test_y <- read.table("getdata_projectfiles_UCI HAR Dataset/UCI HAR Dataset/test/y_test.txt")
test_sub <- read.table("getdata_projectfiles_UCI HAR Dataset/UCI HAR Dataset/test/subject_test.txt")
# Reading activity lables
active_lable <- read.table("getdata_projectfiles_UCI HAR Dataset/UCI HAR Dataset/activity_labels.txt")
# Reading features
features <- read.table("getdata_projectfiles_UCI HAR Dataset/UCI HAR Dataset/features.txt")
```
# Merging train, test and, subject data [train_x with test_x], [train_y with test_y] and [train_sub with test_sub].
```{r}
# Merging x
merge_x <- rbind(train_x, test_x)
# Merging y
merge_y <- rbind(train_y, test_y)
# Merging subjects
merge_sub <- rbind(train_sub, test_sub)
```
# Selecting all the "mean" and "std" from "features"
```{r}
features_selected <- features[grep("mean|std",features[,2]),]
merge_x <- merge_x[,features_selected[,1]]
```
# Nameing activities in the merged data by descriptive activity name.
```{r}
colnames(merge_y) <- "label"
merge_y$activity <- factor(merge_y$label, labels = as.character(active_lable[,2]))
activity <- merge_y$activity
```
# Labeling the data with descriptive variable names.
```{r}
colnames(merge_x) <- features[features_selected[,1],2]
```
# From the data set in above step, creates a second, independent tidy data set with the average of each variable for each activity and each subject.
```{r}
colnames(merge_sub) <- "subject"
combine <- cbind(merge_x, activity, merge_sub)
temp <- group_by(combine,activity, subject)
final <- summarize_all(temp,funs(mean))
```
# Writing tidy data "tidy_data.txt"
```{r}
write.table(final, file = "./tidy_data.txt", row.names = FALSE, col.names = TRUE)
```