* is a wildcard. It matches zero or more characters, so *.pdb matches ethane.pdb, propane.pdb, and so on. On the other hand, p*.pdb only matches pentane.pdb and propane.pdb, because the ‘p’ at the front only matches itself.
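The matching rule can be tried out in a throwaway directory. The following is a minimal sketch, not part of the lesson: the scratch directory and the variable names all_pdb and p_pdb are assumptions, and only a subset of .pdb file names is created.

```shell
# Scratch directory so the demo does not touch real data (assumption).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Empty files are enough: wildcards match on names, not contents.
touch ethane.pdb methane.pdb pentane.pdb propane.pdb

# The shell expands each pattern into matching names before echo runs.
all_pdb=$(echo *.pdb)    # every name ending in .pdb
p_pdb=$(echo p*.pdb)     # only names that also start with 'p'

echo "$all_pdb"
echo "$p_pdb"
```

Because the shell does the expansion, echo never sees the ‘*’; it receives the list of matching names as ordinary arguments.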
@@ -117,19 +117,20 @@
Wildcards
Here’s what happens when we run wc -l *.pdb > lengths.txt. The shell starts by telling the computer to create a new process to run the wc program. Since we’ve provided some filenames as parameters, wc reads from them instead of from standard input. And since we’ve used > to redirect output to a file, the shell connects the process’s standard output to that file.
If we run wc -l *.pdb | sort -n instead, the shell creates two processes (one for each command in the pipe) so that wc and sort run simultaneously. The standard output of wc is fed directly to the standard input of sort; since there’s no redirection with >, sort’s output goes to the screen. And if we run wc -l *.pdb | sort -n | head -1, we get three processes with data flowing from the files, through wc to sort, and from sort through head to the screen.
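The redirect and the three-stage pipe can be compared side by side. This is a sketch under stated assumptions: the scratch directory, the two file names short.pdb and long.pdb, and the variable shortest are made up for illustration, and head -n 1 is used as the portable spelling of head -1.

```shell
# Scratch directory with two files of known length (hypothetical names).
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
printf 'ATOM 1\n' > short.pdb
printf 'ATOM 1\nATOM 2\nATOM 3\n' > long.pdb

# One process: wc reads the named files and > sends its output to a file.
wc -l *.pdb > lengths.txt

# Three processes joined by two pipes: wc -> sort -> head.
# The output of the last stage is captured here so it can be reused.
shortest=$(wc -l *.pdb | sort -n | head -n 1)
echo "$shortest"
```

The first command prints nothing because its output went into lengths.txt; the pipeline leaves no intermediate file behind.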
This simple idea is why Unix has been so successful. Instead of creating enormous programs that try to do many different things, Unix programmers focus on creating lots of simple tools that each do one job well, and that work well with each other. This programming model is called “pipes and filters”. We’ve already seen pipes; a filter is a program like wc or sort that transforms a stream of input into a stream of output. Almost all of the standard Unix tools can work this way: unless told to do otherwise, they read from standard input, do something with what they’ve read, and write to standard output.
The key is that any program that reads lines of text from standard input and writes lines of text to standard output can be combined with every other program that behaves this way as well. You can and should write your programs this way so that you and other people can put those programs into pipes to multiply their power.
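As an illustration of writing a program this way, here is a minimal sketch of a home-made filter. The function name drop_blank is hypothetical; it simply wraps grep so that it reads lines from standard input and writes the non-empty ones to standard output.

```shell
# Hypothetical filter: copy standard input to standard output,
# leaving out empty lines. It takes no file arguments on purpose.
drop_blank() {
    grep -v '^$'
}

# Because it only uses standard input and output, it can sit
# anywhere in a pipe next to the standard tools.
cleaned=$(printf 'pentane\n\nethane\n' | drop_blank | sort)
echo "$cleaned"
```

Nothing in drop_blank knows about sort, and nothing in sort knows about drop_blank; the pipe is the only connection between them.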
-
Nelle’s Pipeline: Checking Files
+
Nelle’s Pipeline: Checking Files
Nelle has run her samples through the assay machines and created 1520 files in the north-pacific-gyre/2012-07-03 directory described earlier. As a quick sanity check, starting from her home directory, Nelle types:
$ cd north-pacific-gyre/2012-07-03
$ wc -l *.txt
@@ -161,7 +162,7 @@
Nelle’s Pipeline: Checking Files
Sure enough, when she checks the log on her laptop, there’s no depth recorded for either of those samples. Since it’s too late to get the information any other way, she must exclude those two files from her analysis. She could just delete them using rm, but there are actually some analyses she might do later where depth doesn’t matter, so instead, she’ll just be careful later on to select files using the wildcard expression *[AB].txt. As always, the ‘*’ matches any number of characters; the expression [AB] matches either an ‘A’ or a ‘B’, so this matches all the valid data files she has.
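The effect of *[AB].txt can be checked on a handful of stand-in files. In this sketch the scratch directory, the file names s1A.txt, s1B.txt and s2Z.txt, and the variable valid are all assumptions chosen for illustration; they are not Nelle’s real file names.

```shell
# Scratch directory with stand-in data files (hypothetical names).
gyre_dir=$(mktemp -d)
cd "$gyre_dir" || exit 1
touch s1A.txt s1B.txt s2Z.txt

# [AB] matches exactly one character, which must be 'A' or 'B',
# so the file whose name ends in Z.txt is left out.
valid=$(echo *[AB].txt)
echo "$valid"
```

The excluded file is still on disk, so it remains available for any later analysis that does not depend on depth.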
-
What does sort -n
do?
+What does sort -n
do?
If we run sort on this file:
@@ -187,7 +188,7 @@
What does sort -n
-
What does <
mean?
+What does <
mean?
What is the difference between:
@@ -198,7 +199,7 @@
What does <
-
What does >>
mean?
+What does >>
mean?
What is the difference between:
@@ -210,7 +211,7 @@
What does >>
-
Piping commands together
+Piping commands together
In our current directory, we want to find the 3 files which have the fewest lines. Which command listed below would work?
@@ -224,7 +225,7 @@
Piping commands together
-
Why does uniq
only remove adjacent duplicates?
+Why does uniq
only remove adjacent duplicates?
The command uniq removes adjacent duplicated lines from its input. For example, if a file salmon.txt contains:
@@ -244,7 +245,7 @@
Why does uniq
o
-
Pipe reading comprehension
+Pipe reading comprehension
A file called animals.txt contains the following data:
@@ -262,7 +263,7 @@
Pipe reading comprehension
-
Pipe construction
+Pipe construction
The command:
diff --git a/04-loop.html b/04-loop.html
index 893bb5d02..71e564c53 100644
--- a/04-loop.html
+++ b/04-loop.html
@@ -30,7 +30,7 @@
The Unix Shell
Loops
-
Learning Objectives
+Learning Objectives