---
title: "Final Project Example"
output: html_document
date: "2024-06-11"
---
```{r}
library(tidyverse)
```
## Final Project Example
- Think about what kind of data you’d like to work with
- Pick a topic you are interested in! This will help you with motivation and you will also have more background knowledge to start with!
  - E.g. Davon’s chess analysis
- [Look for datasets that are about this topic](https://datatrail-jhu.github.io/open-datasets/)
- In each dataset, see what variables are there and what questions you might ask about them. Refer to the [Practice in Stats chapter](https://datatrail-jhu.github.io/DataTrail/in-practice-using-stats.html) to guide you through initial explorations. Starting with tables and distributions is a great idea.
- If a dataset is too difficult to work with, don’t hesitate to look for a different one!
- Go to your [final_project instructions](https://datatrail-jhu.github.io/DataTrail/github-and-final-data-project.html) and follow them to start digging into the dataset you are interested in.
- Message instructors for help when you are feeling stuck!!!
## Frog dataset example
We found this dataset on Kaggle by searching for frogs:
https://www.kaggle.com/datasets/ishandutta/amphibians-data-set?resource=download
We practiced importing these data.
```{r}
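# The file is semicolon-delimited; skip = 1 drops the first line
# so that the second line is used for the column names.
frogs_df <- read_delim("data/dataset.csv", delim = ";",
                       escape_double = FALSE, trim_ws = TRUE,
                       skip = 1)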
```
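After importing, it can help to take a quick look at what was read in before brainstorming questions. The chunk below is a minimal sketch that assumes the import chunk above ran and created `frogs_df`; it only uses `glimpse()` from dplyr and `problems()` from readr, both loaded with the tidyverse.

```{r}
# Column names, column types, and the first few values of each column
dplyr::glimpse(frogs_df)

# Any rows readr had trouble parsing (an empty table means no problems)
readr::problems(frogs_df)
```

If a column has an unexpected type or the names look wrong, revisit the `delim` and `skip` arguments in the import chunk.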
## Question Ideas
Browse the information about the dataset and the variables available to you to start brainstorming question ideas!
We discussed question ideas for this dataset:
- Does vegetation type interact with which species hang out there? Do different species hang out in different vegetation areas?
- Are there species differences by type of water reservoir?
- Which highway has more frogs?
- How are frog populations affected by the presence of fishing?
- Frog size vs. environment
- Color vs. relationships?
## Start off with tables and distributions to explore!
```{r}
frogs_df %>%
  dplyr::select(VR, `Green frogs`, `Brown frogs`, `Common toad`) %>%
  knitr::kable()
```
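The table above lists every row, so a distribution is a natural next look. The chunk below is a rough sketch, not the only way to do this. It assumes `VR` is the vegetation variable and that `Green frogs` records whether green frogs were found (check the dataset description on Kaggle to confirm both). It counts rows for each combination and then draws the same counts as a bar chart.

```{r}
# Count rows for each combination of VR and Green frogs
# (assumption: VR is a vegetation category, Green frogs is presence/absence)
veg_counts <- frogs_df %>%
  dplyr::count(VR, `Green frogs`)

veg_counts %>%
  knitr::kable()

# Bar chart of the same distribution; factor() treats the codes as categories
ggplot(frogs_df, aes(x = factor(VR), fill = factor(`Green frogs`))) +
  geom_bar(position = "dodge") +
  labs(x = "VR", y = "Number of rows", fill = "Green frogs")
```

Swapping `Green frogs` for `Brown frogs` or `Common toad` gives the same view for the other species columns.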
## Next steps:
Now each person in class should think of two topics they are interested in analyzing.
For one of those topics, find two potential datasets. We are here to help you search!