<!DOCTYPE HTML>
<!--
Miniport by HTML5 UP
html5up.net | @ajlkn
Free for personal and commercial use under the CCA 3.0 license (html5up.net/license)
-->
<html lang="en">
<head>
<title>PH Twitter Fake News Analysis</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no" />
<link rel="stylesheet" href="assets/css/main.css" />
<link rel="icon" type="image/x-icon" href="images/CS132-LOGO.png">
<style type="text/css">
/* Make the embedded gist full-width and centered */
.gist {
margin-left: auto;
margin-right: auto;
width: 100% !important;
height: 500px;
}
/* Fix the size of the gist's content box; overflow stays visible */
.gist-data {
overflow-y: visible;
width: 100%;
height: 500px;
overflow-x: visible;
}
</style>
</head>
<body class="is-preload">
<!-- Nav -->
<nav id="nav">
<ul class="container">
<li><a href="index.html#methods">Back To Main</a></li>
</ul>
</nav>
<!-- Home -->
<article id="top" class="wrapper style1">
<div class="container">
<div class="row">
<h1><strong>Data Exploration</strong></h1>
</div>
<div class="row">
<p>We analyzed our data primarily with <code>pandas</code>,
a widely used Python library for data analysis. The
<span style = "color:orange;font-weight: 600;">Google Colab</span> notebook on this page
shows how we used <code>pandas</code> in our code.
</p>
</div>
</div>
</article>
<!-- Overview -->
<article id="overview" class="wrapper style2" >
<div class="container">
<footer>
<p>Following the data science pipeline, we searched through
<a href = "https://twitter.com/home" style = "text-decoration: none; font-weight: 600;">Twitter</a>
and gathered the information we need.
</p>
<a href="https://docs.google.com/spreadsheets/d/1MHcIRTNmKmb6DngndRu0EYCh4d4RrkaeGGLXeOc9lug/edit?usp=sharing" class="button large scrolly">Let's take a look at our data</a>
<br><br>
</footer>
<div class="row aln-center">
<script src="https://gist.github.com/aritako/6c756e6d8200087c8fc91480f33b26e3.js" ></script>
</div>
<br><br><br>
<div class="row aln-center">
<p>If the Colab notebook does not appear, <a href = "https://colab.research.google.com/gist/aritako/6c756e6d8200087c8fc91480f33b26e3/group-36-data-exploration.ipynb" style = "text-decoration: none; font-weight: 600;">open it in Google Colab</a>.</p>
</div>
<div class="row aln-center">
<img src = 'images/bar_graph.jpg' alt = "Bar graph">
</div>
<br>
<div class="row aln-center">
<img src = 'images/line_graph.JPG' alt = "Line graph">
</div>
</div>
</article>
<!--
<article id="overview" class="wrapper style2">
<header>
<h2>Preprocessing</h2>
<p>Before we can derive valuable information from our collected data,
we must preprocess it to make it suitable for analysis.
Note that these sections expound upon the decisions we made while exploring the data
in our Colab notebook.
</p>
</header>
<div class="row aln-center">
<div class="col-4 col-6-medium col-12-small" id ="prob-form-box">
<section class="box style1" style = "height:100%">
<h3>Null Columns</h3>
<h3 class = "prob-form">
First, we identified the columns that contained empty cells (shown in the output). We then dropped these columns, since we could not obtain any notable data from them.
</h3>
<br>
<h3>Imputing Null Values</h3>
<h3 class = "prob-form">
We first identified which columns our missing values belonged to: “Location”, “Rating”, and “Remarks”. We filled the empty cells in “Location” and “Remarks” with “N/A”, since a sample may legitimately have no value for these columns. Since only 2 values were missing from the “Rating” column, we manually went back to those samples and filled in the appropriate values.
</h3>
</section>
</div>
</div>
<div class="row aln-center">
<div class="col-4 col-6-medium col-12-small" id ="prob-form-box">
<section class="box style1" style = "height:100%">
<h3>Hypothesis</h3>
<h3 class = "prob-form">
The dis/misinformative tweets were <span class = "gold">most likely to claim the strong government opposition against communism</span> during the FEM regime,
which led to the supposed “Golden era.”
</h3>
</section>
</div>
</div>
<div class="row aln-center">
<div class="col-4 col-6-medium col-12-small" id ="prob-form-box">
<section class="box style1" style = "height:100%">
<h3>Null Hypothesis</h3>
<h3 class = "prob-form">
The dis/misinformative tweets were <span class = "gold">equally likely to claim various reasons </span>that led to the supposed “Golden era” during the FEM regime.
</h3>
</section>
</div>
</div>
<div class="row aln-center">
<div class="col-4 col-6-medium col-12-small" id ="prob-form-box">
<section class="box style1" style = "height:100%">
<h3>Solution</h3>
<h3 class = "prob-form">
Collect dis/misinformative tweets, identify the main reason behind each tweet's claim, then tally and rank the reasons by frequency.
</h3>
</section>
</div>
</div>
</article>
-->
<!-- Team -->
<article id="team" class="wrapper style4">
<div class="container medium">
<header>
<h2>We'd like to hear from you.</h2>
<p>You can add more information about the team members here.</p>
</header>
<div class="row">
<div class="col-12">
<form method="post" action="#">
<div class="row">
<div class="col-6 col-12-small">
<input type="text" name="name" id="name" placeholder="Name" />
</div>
<div class="col-6 col-12-small">
<input type="email" name="email" id="email" placeholder="Email" />
</div>
<div class="col-12">
<input type="text" name="subject" id="subject" placeholder="Subject" />
</div>
<div class="col-12">
<textarea name="message" id="message" placeholder="Message"></textarea>
</div>
<div class="col-12">
<ul class="actions">
<li><input type="submit" value="Send Message" /></li>
<li><input type="reset" value="Clear Form" class="alt" /></li>
</ul>
</div>
</div>
</form>
</div>
<div class="col-12">
<hr />
<h3>Find me on ...</h3>
<ul class="social">
<li><a href="#" class="icon brands fa-twitter"><span class="label">Twitter</span></a></li>
<li><a href="#" class="icon brands fa-facebook-f"><span class="label">Facebook</span></a></li>
<li><a href="#" class="icon brands fa-dribbble"><span class="label">Dribbble</span></a></li>
<li><a href="#" class="icon brands fa-linkedin-in"><span class="label">LinkedIn</span></a></li>
<li><a href="#" class="icon brands fa-tumblr"><span class="label">Tumblr</span></a></li>
<li><a href="#" class="icon brands fa-google-plus"><span class="label">Google+</span></a></li>
<li><a href="#" class="icon brands fa-github"><span class="label">GitHub</span></a></li>
<!--
<li><a href="#" class="icon solid fa-rss"><span>RSS</span></a></li>
<li><a href="#" class="icon brands fa-instagram"><span>Instagram</span></a></li>
<li><a href="#" class="icon brands fa-foursquare"><span>Foursquare</span></a></li>
<li><a href="#" class="icon brands fa-skype"><span>Skype</span></a></li>
<li><a href="#" class="icon brands fa-soundcloud"><span>Soundcloud</span></a></li>
<li><a href="#" class="icon brands fa-youtube"><span>YouTube</span></a></li>
<li><a href="#" class="icon brands fa-blogger"><span>Blogger</span></a></li>
<li><a href="#" class="icon brands fa-flickr"><span>Flickr</span></a></li>
<li><a href="#" class="icon brands fa-vimeo"><span>Vimeo</span></a></li>
-->
</ul>
<hr />
</div>
</div>
<footer>
<ul id="copyright">
<li>© Untitled. All rights reserved.</li><li>Design: <a href="http://html5up.net">HTML5 UP</a></li>
</ul>
</footer>
</div>
</article>
<!-- Scripts -->
<script src="assets/js/jquery.min.js"></script>
<script src="assets/js/jquery.scrolly.min.js"></script>
<script src="assets/js/browser.min.js"></script>
<script src="assets/js/breakpoints.min.js"></script>
<script src="assets/js/util.js"></script>
<script src="assets/js/main.js"></script>
</body>
</html>