-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathinstructors.html
110 lines (110 loc) · 9.17 KB
/
instructors.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<title>RNAseq Using R Programming with R</title>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
<link rel="stylesheet" type="text/css" href="css/swc.css" />
<meta charset="UTF-8" />
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body class="lesson">
<div class="container card">
<div class="banner" align="right" style="margin-top: 5px">
<a href="http://monash.edu/bioinformatics" title="Monash Bioinformatics Platform">
<div style="margin-right: 5px;" class="label swc-blue-bg">Monash Bioinformatics Platform</div>
</a>
</div>
<article>
<div class="row">
<div class="col-md-10 col-md-offset-1">
<a href="index.html"><h1 class="title">Programming with R</h1></a>
<h2 class="subtitle">Instructor’s Guide</h2>
<h2 id="legend">Legend</h2>
<p>We are using a dataset with records on inflammation from patients following an arthritis treatment. With it we explain <code>R</code> data structure, basic data manipulation and plotting, writing functions and loops.</p>
<h2 id="overall">Overall</h2>
<p>This lesson is written as an introduction to R, but its real purpose is to introduce the single most important idea in programming: how to solve problems by building functions, each of which can fit in a programmer’s working memory. In order to teach that, we must teach people a little about the mechanics of manipulating data with lists and file I/O so that their functions can do things they actually care about. Our teaching order tries to show practical uses of every idea as soon as it is introduced; instructors should resist the temptation to explain the “other 90%” of the language as well.</p>
<p>The secondary goal of this lesson is to give them a usable mental model of how programs run (what computer science educators call a <a href="reference.html#notional-machine">notional machine</a> so that they can debug things when they go wrong. In particular, they must understand how function call stacks work.</p>
<p>The final example asks them to build a command-line tool that works with the Unix pipe-and-filter model. We do this because it is a useful skill and because it helps learners see that the software they use isn’t magical. Tools like <code>grep</code> might be more sophisticated than the programs our learners can write at this point in their careers, but it’s crucial they realize this is a difference of scale rather than kind.</p>
<p>The <code>R</code> novice inflammation contains a lot of material to cover. Remember this lesson does not spend a lot of time on data types, data structure, etc. It is also on par with the similar lesson on Python. The objective is to explain modular programming with the concepts of functions, loops, flow control, and defensive programming (i.e. SWC best practices). Supplementary material is available for R specifics (<a href="01-supp-addressing-data.html">Addressing Data</a>, <a href="01-supp-data-structures.html">Data Types and Structure</a>, <a href="01-supp-factors.html">Understanding Factors</a>, <a href="01-supp-intro-rstudio.html">Introduction to RStudio</a>, <a href="01-supp-read-write-csv.html">Reading and Writing .csv</a>, <a href="03-supp-loops-in-depth.html">Loops in R</a>, <a href="06-best-practices-R.html">Best Practices for Using R and Designing Programs</a>, <a href="07-knitr-R.html">Dynamic Reports with knitr</a>, <a href="08-making-packages-R.html">Making Packages in R</a>).</p>
<p>A typical, half-day, lesson would use the first three lessons:</p>
<ol style="list-style-type: decimal">
<li><a href="01-starting-with-data.html">Analyzing Patient Data</a></li>
<li><a href="02-func-R.html">Creating Functions</a></li>
<li><a href="03-loops-R.html">Analyzing Multiple Data Sets</a></li>
</ol>
<p>An additional half-day could add the next two lessons:</p>
<ol start="4" style="list-style-type: decimal">
<li><a href="04-cond.html">Making choices</a></li>
<li><a href="05-cmdline.html">Command-Line Programs</a></li>
</ol>
<p>Time-permitting, you can fit in one of these shorter lessons that cover bigger picture ideas like best practices for organizing code, reproducible research, and creating packages:</p>
<ol start="6" style="list-style-type: decimal">
<li><a href="06-best-practices-R.html">Best practices for using R and designing programs</a></li>
<li><a href="07-knitr-R.html">Dynamic reports with knitr</a></li>
<li><a href="08-making-packages-R.html">Making packages in R</a></li>
</ol>
<h2 id="analyzing-patient-data"><a href="01-starting-with-data.html">Analyzing Patient Data</a></h2>
<ul>
<li><p>Check learners are reading files from the correct location (set working directory); remind them of the shell lesson</p></li>
<li><p>Provide shortcut for the assignment operator (<code><-</code>) (RStudio: Alt+- on Windows/Linux; Option+- on Mac)</p></li>
</ul>
<pre class="sourceCode r"><code class="sourceCode r">dat <-<span class="st"> </span><span class="kw">read.csv</span>(<span class="st">"data/inflammation-01.csv"</span>, <span class="dt">header =</span> <span class="ot">FALSE</span>)</code></pre>
<pre class="output"><code>Warning in file(file, "rt"): cannot open file 'data/inflammation-01.csv':
No such file or directory
</code></pre>
<pre class="output"><code>Error in file(file, "rt"): cannot open the connection
</code></pre>
<pre class="sourceCode r"><code class="sourceCode r">animal <-<span class="st"> </span><span class="kw">c</span>(<span class="st">"m"</span>, <span class="st">"o"</span>, <span class="st">"n"</span>, <span class="st">"k"</span>, <span class="st">"e"</span>, <span class="st">"y"</span>)
<span class="co"># Challenge - Slicing (subsetting data)</span>
animal[<span class="dv">4</span>:<span class="dv">1</span>] <span class="co"># first 4 characters in reverse order</span></code></pre>
<pre class="output"><code>[1] "k" "n" "o" "m"
</code></pre>
<pre class="sourceCode r"><code class="sourceCode r">animal[-<span class="dv">1</span>] <span class="co"># remove first character</span></code></pre>
<pre class="output"><code>[1] "o" "n" "k" "e" "y"
</code></pre>
<pre class="sourceCode r"><code class="sourceCode r">animal[-<span class="dv">4</span>] <span class="co"># remove fourth character</span></code></pre>
<pre class="output"><code>[1] "m" "o" "n" "e" "y"
</code></pre>
<pre class="sourceCode r"><code class="sourceCode r">animal[-<span class="dv">1</span>:-<span class="dv">4</span>] <span class="co"># remove first to fourth characters</span></code></pre>
<pre class="output"><code>[1] "e" "y"
</code></pre>
<pre class="sourceCode r"><code class="sourceCode r">animal[<span class="kw">c</span>(<span class="dv">5</span>, <span class="dv">2</span>, <span class="dv">3</span>)] <span class="co"># new character vector</span></code></pre>
<pre class="output"><code>[1] "e" "o" "n"
</code></pre>
<pre class="sourceCode r"><code class="sourceCode r"><span class="co"># Challenge - Subsetting data</span>
<span class="kw">max</span>(dat[<span class="dv">5</span>, <span class="dv">3</span>:<span class="dv">7</span>])</code></pre>
<pre class="output"><code>Error in eval(expr, envir, enclos): object 'dat' not found
</code></pre>
<pre class="sourceCode r"><code class="sourceCode r">sd_day_inflammation <-<span class="st"> </span><span class="kw">apply</span>(dat, <span class="dv">2</span>, sd)
<span class="kw">plot</span>(sd_day_inflammation)</code></pre>
<h2 id="addressing-data"><a href="01-supp-addressing-data.html">Addressing Data</a></h2>
<ul>
<li>Note that the data frame <code>dat</code> is not the same set of data as in other lessons</li>
</ul>
<h2 id="data-types-and-structure"><a href="01-supp-data-structures.html">Data Types and Structure</a></h2>
<ul>
<li>Lesson on data types and structures</li>
</ul>
<h2 id="understanding-factors"><a href="01-supp-factors.html">Understanding Factors</a></h2>
<h2 id="introduction-to-rstudio"><a href="01-supp-intro-rstudio.html">Introduction to RStudio</a></h2>
<h2 id="reading-and-writing-.csv"><a href="01-supp-read-write-csv.html">Reading and Writing .csv</a></h2>
</div>
</div>
</article>
<div class="footer">
<a class="label swc-blue-bg" href="https://github.com/MonashBioinformaticsPlatform/RNAseq-DE-analysis-with-R">Source</a>
</div>
</div>
<!-- Javascript placed at the end of the document so the pages load faster -->
<script src="http://code.jquery.com/jquery-1.9.1.js"></script>
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
</body>
</html>