update
Edouard-Legoupil committed Nov 13, 2023
1 parent 7509420 commit 34866c2
Showing 19 changed files with 2,663 additions and 100 deletions.
61 changes: 60 additions & 1 deletion docs/learn/01.Reproducibility.Rmd
@@ -421,6 +421,61 @@ Gradual automation
.img[![Eat the cake](img/Group.png)]
]

---

## The Reproducible Research Manifesto

1. For every result, **keep track** of how it was produced

2. **Avoid manual data manipulation** steps

3. **Archive** the exact versions of all external programs used

4. **Version control** all custom scripts

5. **Record all intermediate results**, when possible in standardized formats

6. For analyses that include randomness, **note underlying random seeds**

7. Always **store raw data** behind plots

8. Generate hierarchical analysis output, allowing layers of increasing detail to be inspected

9. Connect **textual statements** to underlying results

10. Provide **public access** to scripts, runs, and results
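
Points 3 and 6 of this list can be illustrated with a minimal base R sketch. The seed value and object names are arbitrary choices made for this example, not part of the manifesto itself.

```r
# Point 6: fix the random seed so a "random" draw can be reproduced exactly.
set.seed(2023)                 # arbitrary seed, chosen for illustration
first_draw <- rnorm(5)

set.seed(2023)                 # same seed again
second_draw <- rnorm(5)

identical(first_draw, second_draw)  # TRUE

# Point 3: record the R version and loaded packages next to the results.
r_version <- R.version.string
session_record <- sessionInfo()
```

Saving `session_record` together with the outputs is one simple way to keep a trace of the software versions used; dedicated tools exist for stricter archiving.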

---

## From "click" to "script"

.pull-left[

Using the right combination of packages, you can integrate all necessary data analysis steps into **scripts**:


* Data management (clean, recode, merge, reshape)

* Data analysis (tests, regression, multivariate analysis, etc.)

* Data visualisation (plot, map, graph...)

* Writing up results (report and presentation generation)
]

.pull-right[

![](img/analysis.png)

More on reproducible analysis [here](https://ropensci.github.io/reproducibility-guide/sections/introduction/)

]
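
As a rough illustration of the four steps listed on this slide, here is a minimal base R sketch. It uses the built-in `mtcars` dataset; the object names and the output file name are assumptions made for this example.

```r
# 1. Data management: start from a built-in dataset and recode a variable
cars_data <- mtcars
cars_data$transmission <- factor(cars_data$am, levels = c(0, 1),
                                 labels = c("automatic", "manual"))

# 2. Data analysis: a simple linear regression
model <- lm(mpg ~ wt + transmission, data = cars_data)

# 3. Data visualisation: write the plot to a file instead of exporting by hand
plot_file <- file.path(tempdir(), "mpg_vs_weight.pdf")  # hypothetical file name
pdf(plot_file)
plot(cars_data$wt, cars_data$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
dev.off()

# 4. Writing up: pull numbers from the model rather than typing them
sentence <- sprintf("Each additional 1000 lbs changes mpg by %.2f.",
                    coef(model)[["wt"]])
```

Because every step lives in the script, rerunning it after a data update regenerates the model, the plot, and the sentence without any manual clicks.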

---

## Save time and minimize room for errors...

![](img/exhausted.png)

---
class: inverse, center, middle
@@ -434,5 +489,9 @@ class: inverse, center, middle

[Post feedback here](https://github.com/unhcRverse/unhcrverse/issues/new?assignees=&labels=enhancement&projects=&template=comment_prex_1_reproducibility.md&title=%5Blearn%5D)


---

## Let's install R & RStudio...



275 changes: 272 additions & 3 deletions docs/learn/01.Reproducibility.html

Large diffs are not rendered by default.

117 changes: 117 additions & 0 deletions docs/learn/02.Tidyverse.Rmd
@@ -69,6 +69,101 @@ https://towardsdatascience.com/the-stages-of-learning-data-science-3cc8be181f54

See Video - https://www.youtube.com/watch?v=hpMc6TgT34I

---

## Essential Concept: Objects & Data elements

- `Vectors` are a core data structure in R, and are created with `c()`. Elements in a vector must be of the same type.

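- `Data.frame`: a table in which each column is a vector; different columns can hold different types

- `Matrix`: a two-dimensional structure in which, unlike a data frame, all elements share a single type (often numeric)

- `List`: a container whose elements can be of any type and size, so you can mix and match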

- `Factors` are a special class that R uses for categorical variables, which also allows for value labeling and ordering.

Reference link on [Manipulating data](http://www.cookbook-r.com/Manipulating_data/)
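
The five structures above can be created side by side in a short base R sketch. All object names and values are made up for illustration.

```r
# Vector: all elements share one type
ages <- c(23, 35, 41)
mixed <- c(1, "two", 3)        # mixing types coerces everything to character
class(mixed)                   # "character"

# Data frame: columns are vectors that may differ in type
people <- data.frame(name = c("a", "b", "c"), age = ages)

# Matrix: two dimensions, a single type
grid <- matrix(1:6, nrow = 2)
dim(grid)                      # 2 3

# List: elements of any type and size
record <- list(id = 1, tags = c("x", "y"), table = people)

# Factor: categorical values with a defined level order
status <- factor(c("low", "high", "low"), levels = c("low", "high"))
levels(status)                 # "low" "high"
```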


---

## Vector Example

```{r eval=FALSE}
numbers <- c(23, 13, 5, 7, 31)
names <- c("mohammed", "hussein", "ali")
# Elements are indexed starting at 1 and are accessed with `[]` notation.
numbers[1] # 23
names[1] # "mohammed"
```
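
Beyond picking a single element, `[]` also accepts ranges, negative indices, and logical conditions. The following sketch reuses the `numbers` vector from the slide; the derived object names are assumptions for this example.

```r
numbers <- c(23, 13, 5, 7, 31)

second_and_third <- numbers[2:3]        # 13 5
without_first <- numbers[-1]            # 13 5 7 31
above_ten <- numbers[numbers > 10]      # 23 13 31

# Arithmetic is applied element by element
doubled <- numbers * 2                  # 46 26 10 14 62
length(numbers)                         # 5
```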

---

## PART 2: Data frames

[Data frames](http://www.r-tutor.com/r-introduction/data-frame)

```{r eval=FALSE}
books <- data.frame(
  title = c("harry potter", "war and peace", "lord of the rings"),
  author = c("rowling", "tolstoy", "tolkien"),
  num_pages = c(350, 875, 500) # numeric, so page counts can be used in calculations
)
# You can access columns of a data frame with `$`.
books$title # c("harry potter", "war and peace", "lord of the rings")
books$author[1] # "rowling"
# You can also create new columns with `$`.
books$num_bought_today <- c(10, 5, 8)
books$num_bought_yesterday <- c(18, 13, 20)
books$total_num_bought <- books$num_bought_today + books$num_bought_yesterday
```
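
Rows can be selected and sorted with `[rows, columns]` indexing. This sketch rebuilds a small `books` data frame so that it runs on its own; page counts are stored as numbers here, which is an assumption of the example.

```r
books <- data.frame(
  title = c("harry potter", "war and peace", "lord of the rings"),
  num_pages = c(350, 875, 500),
  stringsAsFactors = FALSE
)

# Keep only the rows that satisfy a condition (all columns are kept)
long_books <- books[books$num_pages > 400, ]

# Titles sorted from shortest to longest book
titles_by_length <- books$title[order(books$num_pages)]

nrow(books)  # 3
ncol(books)  # 2
```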

---

## Load a Data Frame

``` {r eval=FALSE}
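# CASchools ships with the AER package (install.packages("AER") if needed)
data("CASchools", package = "AER")
mydata <- CASchools
# alternatively, load a data set from csv and assign it to an object called 'mydata'
# mydata <- read.csv("unhcr_mass_comm_db_merged_20140612.csv")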
# first few rows of the dataset
head(mydata)
# last few rows
tail(mydata)
# variable names
colnames(mydata)
# pop-up view of entire data set (uncomment to run)
# View(mydata)
```
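
To try `read.csv()` without any external file, the sketch below first writes a tiny CSV to a temporary location and then reads it back. The column names and values are invented for illustration.

```r
# Create a small CSV file in a temporary location (illustrative data)
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(id = 1:3, country = c("Jordan", "Kenya", "Peru")),
  csv_path,
  row.names = FALSE
)

# Read it back and inspect it
mydata <- read.csv(csv_path, stringsAsFactors = FALSE)
head(mydata, 2)
colnames(mydata)
```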

---

## Initial Exploration

```{r eval=FALSE}
# dimension of the data frame
dim(mydata)
# structure of the data frame and of all its variables
# this includes the class (type) of each variable, e.g. factor or numeric
str(mydata)
# summary statistics for every variable (including means for numeric ones)
summary(mydata)
```
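
A first exploration often also checks variable types and missing values. Here is a minimal sketch on the built-in `airquality` dataset, used as a stand-in for `mydata`.

```r
# Number of rows and columns
data_dim <- dim(airquality)

# Class (type) of each variable
variable_classes <- sapply(airquality, class)

# Number of missing values per column
missing_per_column <- colSums(is.na(airquality))

# Frequency table for a categorical-like variable
records_per_month <- table(airquality$Month)
```

Columns with many missing values, as flagged by `missing_per_column`, usually deserve a closer look before any analysis.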


---
class: inverse, center, middle
@@ -377,6 +472,18 @@ class: inverse, center, middle
You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
What does it take to create a graphic? Data, axes, geometric objects, etc.


## Decomposition of a Graphic

The **"grammar of graphics"** is a [conceptual description](https://ramnathv.github.io/pycon2014-r/visualize/ggplot2.html) of all potential graphs. It can be summarized as follows:

```
- plot ::= coord scale+ facet? layer+
- layer ::= data mapping stat geom position?
```
![](img/ggplot2-anatomy-all-annotated.png)
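
The grammar above maps fairly directly onto `ggplot2` calls. The sketch below spells out each component explicitly on the built-in `mtcars` dataset; most of these arguments are defaults that are normally left out, and the axis labels are choices made for this example.

```r
library(ggplot2)

p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +  # data + mapping
  geom_point(stat = "identity", position = "identity") +      # layer: geom, stat, position
  scale_x_continuous(name = "Weight (1000 lbs)") +            # scale
  scale_y_continuous(name = "Miles per gallon") +             # scale
  coord_cartesian() +                                         # coord
  facet_wrap(~ cyl)                                           # facet
```

Printing `p` draws one scatter plot panel per value of `cyl`.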

---

## Structure of `ggplot2`
@@ -1387,4 +1494,14 @@ class: inverse, center, middle
- [A ggplot2 tutorial for beautiful plotting in R](https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/) and [ggplot Wizardry Hands-On](https://z3tt.github.io/OutlierConf2021/) by Cedric Scherer
- ggplot2 workshop [Part1](https://www.youtube.com/watch?v=h29g21z0a68)/[Part2](https://www.youtube.com/watch?v=0m4yywqNPVY) by Thomas Lin Pedersen (one of the main maintainers of ggplot2)


---

# Exercise

Check the latest #TidyTuesday submissions on refugees:

https://twitter.com/search?q=%20%23TidyTuesday%20refugees&src=typed_query

Select a chart and try to reproduce it!

497 changes: 461 additions & 36 deletions docs/learn/02.Tidyverse.html

Large diffs are not rendered by default.

63 changes: 52 additions & 11 deletions docs/learn/03.Functions.Rmd
@@ -50,6 +50,9 @@ Collaboration

## Transition from Scripts to Functions


.pull-left[

Script Example

```{r}
@@ -59,6 +62,11 @@ mean_value <- mean(data)
print(mean_value)
```


]

.pull-right[



Function Example
@@ -76,15 +84,23 @@ mean_value <- calculate_mean(data)
print(mean_value)
```

]

---

## Key Differences


.pull-left[

* Functions encapsulate code into reusable units.
* Parameters allow customization of function behavior.

Creating Functions in R

]

.pull-right[

Function Structure

```{r}
@@ -95,11 +111,27 @@ function_name <- function(parameter1, parameter2, ...) {
return(result)
}
```
]
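
To make the role of parameters concrete, here is a minimal sketch of a function with default values. The function name `summarise_vector` and its arguments are hypothetical, introduced only for this example.

```r
# Hypothetical helper: rounded mean of a numeric vector
summarise_vector <- function(x, digits = 1, na.rm = TRUE) {
  result <- round(mean(x, na.rm = na.rm), digits = digits)
  return(result)
}

# Defaults are used when a parameter is not supplied
default_result <- summarise_vector(c(1, 2, 4, NA))             # 2.3

# Parameters customise the behaviour without changing the function body
precise_result <- summarise_vector(c(1, 2, 4, NA), digits = 3) # 2.333
```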

---

## Example: Custom Function


.pull-left[

1. Takes an input

2. Apply a transformation

3. Generates an output


]

.pull-right[


```{r}
# Custom function to calculate the median of a numeric vector
@@ -108,7 +140,7 @@ calculate_median <- function(data) {
return(median_value)
}
```

]
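
The input, transformation, output pattern can be sketched with a second, hypothetical function. `rescale_to_unit` is not part of the course material; it is introduced here only to show the three steps in order.

```r
# Hypothetical function: rescale a numeric vector to the 0-1 range
rescale_to_unit <- function(x) {
  # 1. Takes an input (and checks that it is usable)
  stopifnot(is.numeric(x))
  value_range <- range(x, na.rm = TRUE)

  # 2. Applies a transformation
  #    (note: a constant vector would give NaN, since the range is zero)
  scaled <- (x - value_range[1]) / (value_range[2] - value_range[1])

  # 3. Generates an output
  return(scaled)
}

scaled_values <- rescale_to_unit(c(10, 15, 20))  # 0.0 0.5 1.0
```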

---

@@ -124,20 +156,23 @@ print(median_value)

---

## Import Data
## Import Data


- Data import: [`readr`](https://readr.tidyverse.org/), [`readxl`](https://readxl.tidyverse.org/), etc.
from Excel - [`readxl`](https://readxl.tidyverse.org/),


---
from Text: [`readr`](https://readr.tidyverse.org/), etc.
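
As a small illustration of text import with `readr`, the sketch below writes a temporary CSV and reads it back. The file content is invented, and the Excel call is shown only as a comment because it needs a real workbook path.

```r
library(readr)

# Write a small CSV to a temporary file (illustrative data)
csv_path <- tempfile(fileext = ".csv")
write_csv(data.frame(id = 1:3, score = c(2.5, 3.0, 4.5)), csv_path)

# Read it back as a tibble; column types are guessed from the content
survey <- read_csv(csv_path, show_col_types = FALSE)

# For Excel files, readxl offers a similar function, for example:
# survey_xlsx <- readxl::read_excel("path/to/file.xlsx", sheet = 1)
```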


## Start with Documentation
---

Why Document Functions?
## Why Document Functions?

* Documentation provides information on how to use a function.

* Helps other users (including your future self) understand the purpose and usage of the function.

* Standard practice for sharing code.
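
One common way to document R functions is with `roxygen2` comments placed directly above the function. The sketch below is an assumption about how a function like `calculate_mean` could be documented; the `na.rm` parameter is hypothetical and added only to show a second `@param` entry.

```r
#' Calculate the mean of a numeric vector
#'
#' @param data A numeric vector.
#' @param na.rm Logical; should missing values be removed first?
#'
#' @return A single numeric value.
#'
#' @examples
#' calculate_mean(c(1, 2, 3))
#'
#' @export
calculate_mean <- function(data, na.rm = TRUE) {
  mean_value <- mean(data, na.rm = na.rm)
  return(mean_value)
}
```

Inside a package, these comments are turned into help pages, so that `?calculate_mean` shows the purpose, parameters, and examples.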

R Documentation
@@ -259,9 +294,9 @@ Example with Custom Parameters

---

## Conclusion: Key Takeaways
## Key Takeaways

* Functions are essential for code reusability and modularity.
* Functions are essential for code re-usability and modularity.

* Transition from scripts to functions for better code organization.

@@ -288,7 +323,7 @@ class: inverse, center, middle

---

## Reference:
# Reference

- [Build Advanced Charts in R](https://wd3.myworkday.com/unhcr/learning/course/efee8410b2b410017044140f1aeb0001?type=9882927d138b100019b928e75843018d)

@@ -300,4 +335,10 @@ class: inverse, center, middle

- [R for Data Science](https://r4ds.had.co.nz/),

- [Advanced R Programming](https://adv-r.hadley.nz/)
- [Advanced R Programming](https://adv-r.hadley.nz/)

---

# Exercise

Use the chart you previously created and turn it into a function with parameters.
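
As a possible starting point, here is a minimal sketch of a chart wrapped in a function. The function name `plot_scatter`, its parameters, and the use of `mtcars` are assumptions for illustration; your own chart will need different geoms and parameters.

```r
library(ggplot2)

# Hypothetical wrapper: scatter plot of two columns given by name
plot_scatter <- function(data, x, y, title = NULL) {
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point() +
    labs(title = title, x = x, y = y)
}

p <- plot_scatter(mtcars, x = "wt", y = "mpg",
                  title = "Fuel efficiency by weight")
```

Passing column names as strings and looking them up with `.data[[ ]]` keeps the function reusable across datasets.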
365 changes: 355 additions & 10 deletions docs/learn/03.Functions.html

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/learn/04.SurveyToolbox.Rmd
@@ -285,10 +285,10 @@ App Maturity: __beta-version__

---

## How to finalise and maintain such Toolbox
## How to finalise and maintain such Toolbox...
.pull-left[

The regular Standard Multi-Tier IT Standard Support model...
Applying the standard multi-tier IT support model...

* __Tier 4__: Code Review & Quality Assurance (_Contracted Company with frame agreement_)

22 changes: 19 additions & 3 deletions docs/learn/04.SurveyToolbox.html

Large diffs are not rendered by default.
