ls
, mv
, cp
, rm
(-rf *
), cat
, chmod
/chgrp
/chown
and
everyone's favorite, touch
.
"I'm in the phone book! I'm somebody now!" - Navin Johnson (The Jerk)
Typically in Linux we are scripting and otherwise moving around files. The file system under the covers may be one of any number of supported formats, including:
Each has its strengths and weaknesses. While Linux tends to treat the ext* file systems as preferred, it can write to a lot of file systems and can read even more.
As mentioned before, the biggest differences between Linux and Windows is that the Linux
environments do not have a concept of "drive letters." Instead everything is "mounted" under a
single hierarchy that starts at the "root directory" or /
:
\drios{Windows}
\index{*@\texttt{/} (root directory)}\index{files and directories!root (\texttt{/})}
\drcap{Listing of the root directory}
~ $ ls /
bin dev home lib64 mnt Other run sys var
boot Docs initrd.img lost+found Music proc sbin tmp vmlinuz
cdrom etc lib media opt root srv usr
The root file system may be backed by a disk device, memory or even the network. It will have one or more directories under it. Multiple physical drives and network locations can be "mounted" virtually anywhere, under any directory or subdirectory in the hierarchy.
Note: Dynamically mounted devices like USB drives and DVDs are often mounted automatically
under either a /mnt
or /media
directory.
As we've already seen, the command to list the contents of a directory is
ls
:
\drfnd{ls}{list directory contents}
\drcap{Listing directory contents}
~ $ ls
Desktop Documents Downloads Music Pictures Public Templates Videos
Remember, "UNIX" environments think of files that start with a .
as "hidden." If you want to see
all these
"dotfiles",
you can use ls -a
, in this case on an average "home" directory:
\index{files and directories!hidden (dotfiles)}
\drcap{Listing a home directory showing hidden "dotfiles"}
~ $ ls -a
. .config .gconf .mozilla Templates
.. .dbus .gnome2 Music Videos
.bash_history Desktop .gnome2_private Pictures .xsession-errors
.bash_logout .dmrc .ICEauthority .profile
.cache Documents .linuxmint Public
.cinnamon Downloads .local .ssh
Wow! That's a lot of dotfiles!
If you want to see some details for each file, use ls -l
:
\drcap{Detailed listing of home directory}
~ $ ls -l
total 32
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Desktop
drwxr-xr-x 3 myuser mygroup 4096 Dec 13 18:22 Documents
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Downloads
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Music
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Pictures
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Public
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Templates
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Videos
And of course parameters can be combined, as with the two above:
\drcap{Detailed listing of home directory with "dotfiles"}
~ $ ls -al
total 112
drwxr-xr-x 21 myuser mygroup 4096 Dec 13 18:19 .
drwxr-xr-x 6 root root 4096 Dec 13 14:24 ..
-rw------- 1 myuser mygroup 287 Dec 13 18:19 .bash_history
-rw-r--r-- 1 myuser mygroup 220 Dec 13 14:24 .bash_logout
drwx------ 5 myuser mygroup 4096 Dec 13 18:18 .cache
drwxr-xr-x 3 myuser mygroup 4096 Dec 13 18:18 .cinnamon
drwxr-xr-x 12 myuser mygroup 4096 Dec 13 18:18 .config
drwx------ 3 myuser mygroup 4096 Dec 13 18:18 .dbus
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Desktop
-rw------- 1 myuser mygroup 29 Dec 13 18:18 .dmrc
drwxr-xr-x 3 myuser mygroup 4096 Dec 13 18:22 Documents
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Downloads
drwx------ 3 myuser mygroup 4096 Dec 13 18:18 .gconf
drwx------ 3 myuser mygroup 4096 Dec 13 18:18 .gnome2
drwx------ 2 myuser mygroup 4096 Dec 13 18:18 .gnome2_private
-rw------- 1 myuser mygroup 668 Dec 13 18:18 .ICEauthority
drwxr-xr-x 3 myuser mygroup 4096 Dec 13 18:18 .linuxmint
drwxr-xr-x 3 myuser mygroup 4096 Dec 13 18:18 .local
drwxr-xr-x 4 myuser mygroup 4096 Dec 13 18:18 .mozilla
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Music
drwxr-xr-x 2 myuser mygroup 4096 Dec 13 18:18 Pictures
...and so on...
In bash
and many Linux commands in general, there are old, "short" (terse) parameter names, like
ls -a
, and newer, longer, descriptive parameter names like ls --all
that mean the same thing.
It is typically good to use the shorter version during interactive sessions and testing, but I
prefer long parameter names in scripts, because when I come back and look at it in two years, I may
not remember what rm -rf *
means (in the "UNIX" world it means you're toast if you run it by
mistake), thus rm --recursive --force *
seems a bit more "intuitive."
\drfnd{rm}{remove file}
The behind you save in the future by describing things well today may well be your own. - me
The older style parameters are typically preceded by a single hyphen or "switch" character:
\drcap{Short parameter}
~ $ ls -r
Some commands support parameters with no "switch" character at all, as with xvf
(eXtract,
Verbose, input File name) in the following tar
example:
\drcmd{tar}{archive files}
\drcap{Alternate short parameter syntax}
~ $ tar xvf backup.tar
The newer "GNU-style" parameters are preceded by two hyphens and usually are quite "verbose":
\drcap{Long parameters}
~ $ ls --recursive --almost-all --ignore-backups
Again, it is highly recommended that you take the time to use the GNU-style parameters in scripts as self-documenting code.
If we suspect the file is a text file, we can echo it to the console with the
cat
(concatenate) command:
\drtxt{cat}
\index{files and directories!display (\texttt{cat} command)}
\drcap{\texttt{cat} command}
~ $ cat installrdp
#!/bin/bash
sudo apt-get -y install git
cd ~
git clone git://github.com/FreeRDP/FreeRDP.git
cd FreeRDP
sudo apt-get -y install build-essential git-core cmake libssl-dev libx11-dev lib
xext-dev libxinerama-dev \
libxcursor-dev libxdamage-dev libxv-dev libxkbfile-dev libasound2-dev libcups2
-dev libxml2 libxml2-dev \
libxrandr-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev libxi-dev
libgstreamer-plugins-base1.0-dev
sudo apt-get -y install libavutil-dev libavcodec-dev
sudo apt-get -y install libcunit1-dev libdirectfb-dev xmlto doxygen libxtst-dev
cmake -DCMAKE_BUILD_TYPE=Debug -DWITH_SSE2=ON .
make
sudo make install
sudo echo "/usr/local/lib/freerdp" > /etc/ld.so.conf.d/freerdp.conf
sudo echo "/usr/local/lib64/freerdp" >> /etc/ld.so.conf.d/freerdp.conf
sudo echo "/usr/local/lib" >> /etc/ld.so.conf.d/freerdp.conf
sudo ldconfig
which xfreerdp
xfreerdp --version
In this example when we cat installrdp
we can determine it is a bash
shell script (because the
"shebang" is pointing to bash
\drshl{bash}) that looks to install and configure
FreeRDP on a Debian-style system:
\drnet{xfreerdp}
\index{*@\texttt{#"!} (shebang)}
-
apt-get
- Debian-style package manager. \drpkg{apt-get} -
git clone
- cloning package from GitHub. \drcmd{git}{distributed version control} -
cmake
andmake
- configuring and building software from source. \drcmd{cmake}{configure makefiles} \drcmd{make}{build by recipes}
A better way to display a longer file is to use the less
command (which is a derivative of the original more
, hence the
name). less
is a paginator, where the Space
, Page Down
or down arrow keys scroll down and the
Page Up
or up arrow keys scrolls up. Q
quits.
\drtxt{less}
\drtxt{more}
\index{files and directories!paginate!less@\texttt{less} command}
\index{files and directories!paginate!more@\texttt{more} command}
\index{pagination!\texttt{less} command}
\index{pagination!\texttt{more} command}
Note: The vi
search (/
, ?
, n
and p
) and navigation (G
, 0
) keys work within less
,
too. In general less
is a great lightweight way to motor around in a text file without editing it.
\dreds{vi}
We can also look at just the end or tail of a file (often the most interesting when looking at
log files and troubleshooting a current problem) with the tail
command. The next example shows the last 10 lines of the kernel dmesg
log:
\drtxt{tail}
\index{files and directories!display (\texttt{tail} command)}
\index{files and directories!special!varlogdmesg@\texttt{/var/log/dmesg}}
\index{varlogdmesg@\texttt{/var/log/dmesg} (kernel log)}
\drcap{\texttt{tail} command}
/var/log $ tail dmesg
[ 3.913318] Bluetooth: BNEP socket layer initialized
[ 3.914888] Bluetooth: RFCOMM TTY layer initialized
[ 3.914895] Bluetooth: RFCOMM socket layer initialized
[ 3.914900] Bluetooth: RFCOMM ver 1.11
[ 3.935772] init: failsafe main process (732) killed by TERM signal
[ 4.046700] init: cups main process (896) killed by HUP signal
[ 4.046710] init: cups main process ended, respawning
[ 4.186239] init: samba-ad-dc main process (919) terminated with status 1
[ 4.328999] r8169 0000:02:00.0 eth0: link down
[ 4.329037] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
To show a specific number of lines use the -n
parameter with tail
:
\drcap{Display last 15 lines of a file with \texttt{tail -n}}
/var/log $ tail -n 15 dmesg
[ 3.899169] Bluetooth: HCI socket layer initialized
[ 3.899170] Bluetooth: L2CAP socket layer initialized
[ 3.899179] Bluetooth: SCO socket layer initialized
[ 3.913306] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 3.913309] Bluetooth: BNEP filters: protocol multicast
[ 3.913318] Bluetooth: BNEP socket layer initialized
[ 3.914888] Bluetooth: RFCOMM TTY layer initialized
[ 3.914895] Bluetooth: RFCOMM socket layer initialized
[ 3.914900] Bluetooth: RFCOMM ver 1.11
[ 3.935772] init: failsafe main process (732) killed by TERM signal
[ 4.046700] init: cups main process (896) killed by HUP signal
[ 4.046710] init: cups main process ended, respawning
[ 4.186239] init: samba-ad-dc main process (919) terminated with status 1
[ 4.328999] r8169 0000:02:00.0 eth0: link down
[ 4.329037] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
You can also use tail
to with the -f
parameter to follow an open file and continuously display
any new output at the end, which is useful for monitoring log files in real time:
\drcap{\texttt{tail -f} command}
/var/log $ tail -f syslog
Dec 13 19:23:40 MtLindsey dhclient: DHCPACK of 192.168.0.8 from 192.168.0.1
Dec 13 19:23:40 MtLindsey dhclient: bound to 192.168.0.8 -- renewal in 1423 seco
nds.
Dec 13 19:23:40 MtLindsey NetworkManager[960]: <info> (eth0): DHCPv4 state chang
ed renew -> renew
Dec 13 19:23:40 MtLindsey NetworkManager[960]: <info> address 192.168.0.8
Dec 13 19:23:40 MtLindsey NetworkManager[960]: <info> prefix 24 (255.255.255.0
)
Dec 13 19:23:40 MtLindsey NetworkManager[960]: <info> gateway 192.168.0.1
Dec 13 19:23:40 MtLindsey NetworkManager[960]: <info> nameserver '97.64.168.12
'
Dec 13 19:23:40 MtLindsey NetworkManager[960]: <info> nameserver '192.119.194.
131'
Dec 13 19:23:40 MtLindsey dbus[689]: [system] Activating service name='org.freed
esktop.nm_dispatcher' (using servicehelper)
Dec 13 19:23:40 MtLindsey dbus[689]: [system] Successfully activated service 'or
g.freedesktop.nm_dispatcher'
Use Ctrl-C
to cancel following the file.
If we know nothing about a file, we can use the file
command
to help us guess:
\drfnd{file}{detect file type}
\drcap{\texttt{file} command}
~ $ file installrdp
installrdp: Bourne-Again shell script, ASCII text executable
That's straightforward enough! The file
command isn't always 100% accurate, but it is pretty good
and uses an interesting set of heuristics and a text file "database" of "magic" number
definitions to define how it figures out what type of file it is
examining.
Remember: File extensions have no real meaning per se in Linux (although some are used
especially for media and document formats), so a file name with no extension like installrdp
is
perfectly valid. Hence the utility of the file
command.
Let's say we have three files, and want to display the contents of one of them. We know we can do
that with cat
:
\drtxt{cat}
\drcap{Show contents of one file}
~ $ cd Invoices/
~/Invoices $ ls
ElevatorTrucks FarmCombines FarmTractors
~/Invoices $ cat ElevatorTrucks
Truck brakes 200
Truck tires 400
Truck tires 400
Truck tires 400
Truck winch 100
But what if we wanted to process all the lines in all the files in alphabetical order? Just directing the files into a program won't do it, because the file names will be sorted by the shell and the lines will be processed in file name order, not the ultimate sorted order of all the file contents.
\drcap{Show contents of all three files}
~/Invoices $ cat *
Truck brakes 200
Truck tires 400
Truck tires 400
Truck tires 400
Truck winch 100
Combine motor 1500
Combine brakes 400
Combine tires 2500
Tractor motor 1000
Tractor brakes 300
Tractor tires 2000
The sort
command to the rescue! We will see that the sort
command can be used to not just sort files, but also to merge them and remove duplicates.
\drtxt{sort}
\index{sorting!\texttt{sort} command}
\drcap{\texttt{sort} command}
~/Invoices $ sort *
Combine brakes 400
Combine motor 1500
Combine tires 2500
Tractor brakes 300
Tractor motor 1000
Tractor tires 2000
Truck brakes 200
Truck tires 400
Truck tires 400
Truck tires 400
Truck winch 100
What if we want to sort by the parts column? Well, it is the second "key" field delimited by whitespace, so:
\drcap{Sort by the second "key" column}
~/Invoices $ sort -k 2 *
Truck brakes 200
Tractor brakes 300
Combine brakes 400
Tractor motor 1000
Combine motor 1500
Tractor tires 2000
Combine tires 2500
Truck tires 400
Truck tires 400
Truck tires 400
Truck winch 100
What about by the third column, the amount?
\drcap{Sort by the third column}
~/Invoices $ sort -k 3 *
Truck winch 100
Tractor motor 1000
Combine motor 1500
Truck brakes 200
Tractor tires 2000
Combine tires 2500
Tractor brakes 300
Combine brakes 400
Truck tires 400
Truck tires 400
Truck tires 400
That's not what we expected because it is sorting numbers alphabetically. Let's fix that by telling it to sort numerically:
\drcap{Sort by third column, numerically}
~/Invoices $ sort -k 3 -n *
Truck winch 100
Truck brakes 200
Tractor brakes 300
Combine brakes 400
Truck tires 400
Truck tires 400
Truck tires 400
Tractor motor 1000
Combine motor 1500
Tractor tires 2000
Combine tires 2500
Maybe we care about the top three most expensive items. We haven't talked about pipes
yet, but
check this out:
\index{*@\texttt{"|} (pipe)}\index{files and directories!redirection}
\drcap{Top three most expensive items}
~ $ sort -k 3 -n * | tail -n 3
Combine motor 1500
Tractor tires 2000
Combine tires 2500
Finally, what if we want only unique rows?
\drcap{Sort and show only unique rows}
~/Invoices $ sort -k 3 -n -u *
Truck winch 100
Truck brakes 200
Tractor brakes 300
Truck tires 400
Tractor motor 1000
Combine motor 1500
Tractor tires 2000
Combine tires 2500
Just to reinforce long parameters, the last example is equivalent to:
\drcap{Sort unique rows using long parameter names}
~/Invoices $ sort --key 3 --numeric-sort --unique *
Truck winch 100
Truck brakes 200
Tractor brakes 300
Truck tires 400
Tractor motor 1000
Combine motor 1500
Tractor tires 2000
Combine tires 2500
If you read that command in a script file, there would be little confusion as to what it was doing.
We can copy, move (or rename - same thing) and delete files and directories. To copy, simply use
the cp
command:
\drfnd{cp}{copy}
\drcap{\texttt{cp} command}
~ $ cp diary.txt diary.bak
You can copy entire directories recursively:
\drcap{Copying directories recursively}
~ $ cp -r thisdir thatdir
Or, if we want to be self-documenting in a script, we can use those long parameter names again:
\drcap{\texttt{cp} command with long parameter names}
~ $ cp --recursive thisdir thatdir
To move use mv
:
\drfnd{mv}{move files}
\drcap{\texttt{mv} command}
~ $ mv thismonth.log lastmonth.log
Note: There is no semantic difference between "move" and "rename." However, there are some
really cool renaming scenarios that the rename
command can
take care of beyond mv
, like renaming all file extensions from .htm
to .html
.
\drfnd{rename}{rename file}
To delete or remove a file you use rm
:
\drfnd{rm}{remove file}
\drcap{\texttt{rm} command}
~ $ rm desktop.ini
Pro Tip: There is no "Are you sure?" prompt when removing a single file specified with no wildcards, or even all files with a wildcard, and there is no "Recycle Bin" or "Trash Can" when working from the command prompt, so BE CAREFUL!
The following scenario can happen way too often, even to experienced system administrators.
Note the accidental space between *
and .bak
on the rm
command:
\drcap{Oops!}
~ $ cd MyDissertation/
~/MyDissertation $ ls
Bibliography.bak Bibliography.doc Dissertation.bak Dissertation.doc
~/MyDissertation $ rm * .bak
rm: cannot remove ‘.bak’: No such file or directory
~/MyDissertation $ ls
~/MyDissertation $
So, in order, our hapless user:
-
Changed to directory
MyDissertation
. -
Listed the directory contents with
ls
, saw the combination of.doc
and.bak
files. -
Decided to delete the
.bak
files withrm
, but accidentally typed in a space between the wildcard*
and the.bak
. Note ominous warning message. -
Presto!
ls
shows everything is gone, not just the backup files! The user's priorities just got rearranged as they go hunting for another copy of their dissertation.
So be careful out there! This is an example where tab completion can be an extra error check. Many
times I use command history in these cases by changing the ls
to look for just the files I want
to delete:
\drcap{First make sure we are dealing with the right files}
~ $ ls *.bak
Citations.bak Dissertation.bak
Then I use the "up arrow" to bring back the ls
command and change ls
to rm
before running it.
Safer that way.
We just learned how to make a file disappear. We can also make a file magically appear, just by
touch
:
\drfnd{touch}{change modified date or create file}
\drcap{\texttt{touch} command}
~ $ touch NewEmptyDissertation.doc
~ $ ls -l
total 0
-rw-rwxr--+ 1 myuser mygroup 0 Oct 19 14:12 NewEmptyDissertation.doc
Notice the newly created file is zero bytes long.
Interestingly enough, we can also use touch
just to update the "last modified date" of an
existing file, as you can see in time change in the following listing after running touch
on the
same file again:
\drfnd{touch}{change modified date or create file}
\drcap{A second \texttt{touch}}
~ $ touch NewEmptyDissertation.doc
~ $ ls -l
total 0
-rw-rwxr--+ 1 myuser mygroup 0 Oct 19 14:14 NewEmptyDissertation.doc
It can be useful (but also distressing from a forensics point of view) to set the last modified
date of a file to a specific date and time, which touch
also allows you to do, in this case to
the night before Christmas:
\drcap{Set file modified date to a specific date and time}
~ $ touch -t 201412242300 NewEmptyDissertation.doc
~ $ ls -l
total 0
-rw-rwxr--+ 1 myuser mygroup 0 Dec 24 2014 NewEmptyDissertation.doc
To make a directory you use mkdir
:
\drfnd{mkdir}{make directory}
\drcap{\texttt{mkdir} command}
~ $ cd Foo
~/Foo $ ls -l
total 4
-rw-r--r-- 1 myuser mygroup 0 Dec 14 05:49 a
-rw-r--r-- 1 myuser mygroup 0 Dec 14 05:49 b
-rw-r--r-- 1 myuser mygroup 0 Dec 14 05:49 c
drwxr-xr-x 2 myuser mygroup 4096 Dec 14 05:49 d
~/Foo $ mkdir Bar
~/Foo $ ls -l
total 8
-rw-r--r-- 1 myuser mygroup 0 Dec 14 05:49 a
-rw-r--r-- 1 myuser mygroup 0 Dec 14 05:49 b
drwxr-xr-x 2 myuser mygroup 4096 Dec 14 14:49 Bar
-rw-r--r-- 1 myuser mygroup 0 Dec 14 05:49 c
drwxr-xr-x 2 myuser mygroup 4096 Dec 14 05:49 d
Typically you need to create all intervening directories before creating a "child" directory:
\drcap{\texttt{mkdir} error}
~ $ mkdir Xyzzy/Something
mkdir: cannot create directory ‘Xyzzy/Something’: No such file or directory
But of course you can override that behavior:
\drcap{Make multiple intervening directories at once}
~/Foo $ mkdir --parents Xyzzy/Something
~/Foo $ ls
a b Bar c d Xyzzy
~/Foo $ ls Xyzzy
Something
Ever notice that "life" is an anagram for "file"? Spooky, eh?
Given that the UNIX-style file systems are hierarchical in nature they are similar to navigate as
with CMD.EXE
. The biggest difference is the absense of drive letters and the direction of the
slashes.
\drshl{CMD.EXE}
To change directories, simply use cd
much like in Windows:
\drfnd{cd}{change directory}
\drcap{\texttt{cd} command}
~ $ cd /etc
~ $ pwd
/etc
pwd
simply prints the working (current) directory. If
whoami
tells you who you are, pwd
tells you where you are.
\drfnd{pwd}{print working directory}
In Linux, users can have "home" directories (similar to Windows profiles), typically located under
/home/<username>
for normal users, and /root
for the "root" (admin) id. To change to a user's
"home" directory, simply use cd
with no parameters:
\drfnd{cd}{change directory}
\index{files and directories!home (\texttt{~{}})}
\drcap{Change to home directory}
/etc $ cd
~ $ pwd
/home/myuser
The tilde (~
) character is an alias for the current user's home directory. The following example
is equivalent to above:
\index{*@\texttt{~{}} (home directory)}
\index{files and directories!home (\texttt{~{}})}
\drcap{Alternative way to change to home directory}
/etc $ cd ~
~ $ pwd
/home/myuser
More useful is that the tilde can be combined with a user name to specify the home directory of another user:
\drcap{Change to the home directory of another user}
~ # cd ~myuser
myuser # pwd
/home/myuser
Note: The above assumes you have permissions to cd
into /home/myuser
. See the upcoming
section on file permissions for more info.
\drfnd{cd}{change directory}
In addition, you need to know the difference between "absolute" and "relative" paths:
-
Absolute path - always "goes through" or specifies the "root" (
/
) directory, e.g. regardless of the current working directory,cd /etc
will change it to/etc
. \index{*@\texttt{/} (root directory)} \index{files and directories!absolute path} \index{files and directories!root (\texttt{/})} -
Relative path - does not specify the root directory, and expects to start the navigation at the current directory with all path components traversed from there, e.g.,
cd Dissertations
changes the current directory to a subdirectory calledDissertations
. \index{files and directories!relative path}
Windows inherited the concept of .
for the current directory and ..
for the parent directory
directly from UNIX. Consider the following examples that combine all of the above about relative
paths and see if it makes sense:
\index{@\texttt{.} (current directory)}
\index{@\texttt{..} (parent directory)}
\index{files and directories!current directory (\texttt{.})}
\index{files and directories!parent directory (\texttt{..})}
\drcap{Relative paths exercise}
~/Foo $ ls
~/Foo $ mkdir Bar Baz
~/Foo $ ls
Bar Baz
~/Foo $ cd Bar
~/Foo/Bar $ touch a b c
~/Foo/Bar $ ls
a b c
~/Foo/Bar $ cd ../Baz
~/Foo/Baz $ touch d e f
~/Foo/Baz $ ls
d e f
~/Foo/Baz $ ls ..
Bar Baz
~/Foo/Baz $ ls ../Bar
a b c
Did you notice how both mkdir
and touch
allow for specifying multiple directory and file names
in the same command?
\drfnd{mkdir}{make directory}
\drfnd{touch}{change modified date or create file}
Most "UNIX" file systems come with a set of nine permissions that can be thought of as a "grid" of 3x3 showing "who has what?" The "who" is known as "UGO": \index{files and directories!permissions}
-
User - the user that is the "owner" of the file or directory.
-
Group - the group that is the "owner" of the file or directory.
-
Other - everyone else.
The "what" is:
-
Read
-
Write
-
Execute - for files, for directories this means "navigate" or "list contents".
The combination of "who has what?" is usually shown in detailed directory listings by a set of ten
characters, with the first one determining whether an entry is a directory (d
) or a file (-
):
\drbsd{FreeBSD}
\drcap{Another \texttt{ls -l} example, this time on FreeBSD}
% ls -l /etc
total 1876
drwxr-xr-x 2 root wheel 512 Jan 15 2009 X11
-rw-r--r-- 1 root wheel 0 Sep 3 2013 aliases
-rw-r--r-- 1 root wheel 16384 Sep 3 2013 aliases.db
-rw-r--r-- 1 root wheel 210 May 6 2009 amd.map
-r--r--r-- 1 root wheel 233 Feb 15 2007 amd.map.snap
-rw-r--r-- 1 root wheel 1234 May 6 2009 apmd.conf
-rw-r--r-- 1 root wheel 231 May 6 2009 auth.conf
drwxr-xr-x 2 root wheel 512 May 6 2009 bluetooth
-rw-r--r-- 1 root wheel 737 Mar 19 2015 crontab
-rw-r--r-- 1 root wheel 108 May 6 2009 csh.cshrc
-rw-r--r-- 1 root wheel 617 Apr 15 2009 csh.login
-rw-r--r-- 1 root wheel 110 May 6 2009 csh.logout
-rw-r--r-- 1 root wheel 565 May 6 2009 ddb.conf
drwxr-xr-x 2 root wheel 512 May 6 2009 defaults
-rw-r--r-- 1 root wheel 9779 May 6 2009 devd.conf
-rw-r--r-- 1 root wheel 2071 May 6 2009 devfs.conf
-rw-r--r-- 1 root wheel 267 May 6 2009 dhclient.conf
-rw-r--r-- 1 root wheel 5704 May 6 2009 disktab
-rw-rw-r-- 1 root operator 0 Nov 3 2005 dumpdates
drwxr-xr-x 6 root staff 512 Nov 12 2014 fail2ban
-rw-r--r-- 1 root wheel 142 May 6 2009 fbtab
...and so on...
In the above, for example, we can see that the user root
owns the file aliases
while the
wheel
group is the primary group for it. root
can both read and write the file (rw-
) while
any user in the wheel
group can only read it (r--
). Any other id will also have read access
(r--
).
Similarly we see that defaults
is a directory (d
) that can be read, written (new files created)
and listed by root
(rwx
), and read and listed by the group wheel
and all other ids (r-xr-x
).
Back on Linux, if we look in /etc/init.d
where many services store their startup scripts we see:
\index{files and directories!special!etcinitd@\texttt{/etc/init.d/}}
\drcap{Listing the \texttt{/etc/init.d} directory}
~ $ ls -l /etc/init.d
total 276
-rwxr-xr-x 1 root root 2243 Apr 3 2014 acpid
-rwxr-xr-x 1 root root 2014 Feb 19 2014 anacron
-rwxr-xr-x 1 root root 4596 Apr 24 2015 apparmor
-rwxr-xr-x 1 root root 2401 Dec 30 2013 avahi-daemon
-rwxr-xr-x 1 root root 1322 Mar 30 2014 binfmt-support
-rwxr-xr-x 1 root root 4474 Sep 4 2014 bluetooth
-rwxr-xr-x 1 root root 2125 Mar 13 2014 brltty
-rwxr-xr-x 1 root root 4651 Apr 9 2014 casper
-rwxr-xr-x 1 root root 425 Jun 26 09:11 cinnamon
-rwxr-xr-x 1 root root 1919 Jan 18 2011 console-setup
-rwxr-xr-x 1 root root 2489 May 6 2012 cpufrequtils
lrwxrwxrwx 1 root root 21 Sep 7 04:00 cron -> /lib/init/upstart-job
-rwxr-xr-x 1 root root 938 Nov 1 2013 cryptdisks
-rwxr-xr-x 1 root root 896 Nov 1 2013 cryptdisks-early
-rwxr-xr-x 1 root root 3184 Apr 3 2014 cups
-rwxr-xr-x 1 root root 1961 Apr 7 2014 cups-browsed
-rwxr-xr-x 1 root root 2813 Nov 25 2014 dbus
-rwxr-xr-x 1 root root 1217 Mar 7 2013 dns-clean
lrwxrwxrwx 1 root root 21 Sep 7 04:00 friendly-recovery -> /lib/init/upstart-
job
-rwxr-xr-x 1 root root 1105 May 13 2015 grub-common
...and so on...
In this case all the scripts are readable, writable and executable (rwx
) by the root
user, and
readable and executable by the root
group and all other users (r-xr-x
). Later on I will explain
linked files (those that start with an l
instead of a -
in the detailed listing above).
To change the owning user of a file or directory (assuming you have permissions to do so), use
the chown
command:
\drprm{chown}
\drcap{Change file ownership}
# ls -l
total 4
-rwxr--r-- 1 root root 17 Oct 20 10:07 foo
# chown git foo
# ls -l
total 4
-rwxr--r-- 1 git root 17 Oct 20 10:07 foo
To change the primary group, use the chgrp
command:
\drprm{chgrp}
\drcap{\texttt{chgrp} command}
# chgrp git foo
# ls -l
total 4
-rwxr--r-- 1 git git 17 Oct 20 10:07 foo
To change the various permissions or mode bits, you use the
chmod
command. It uses mnemonics of "ugo" for user, group and
"other," respectively. It also uses mnemonics of "rwx" for read, write and execute, and +
to add
a permission and -
to remove it. For example, to add the execute permission for the group and
remove read permission for "other":
\drprm{chmod}
\drcap{\texttt{chmod} command}
# chmod g+x,o-r foo
# ls -l
total 4
-rwxr-x--- 1 git git 17 Oct 20 10:07 foo
Pro Tip: To look like an old-hand UNIX hacker, you can also convert any set of "rwx" permissions into an octal number from 0 (no permissions) to 7 (all permissions). It helps to think of the three permissions as "binary places":
r
= 2
^2
^= 4
w
= 2
^1
^= 2
x
= 2
^0
^= 1
-
= 0
Some examples:
---
= 0 + 0 + 0 = 0
r--
= 2
^2
^+ 0 + 0 = 4
r-x
= 2
^2
^+ 0 + 2
^0
^= 5
rw-
= 2
^2
^+ 2
^1
^+ 0 = 6
rwx
= 2
^2
^+ 2
^1
^+ 2
^0
^= 7
Now to use octal with chmod
, we think of the overall result we want for a file. For example, if
we want the foo
file to be readable, writable and executable by both its owning user and group,
and not accessible at all by anyone else, we could use:
\drcap{\texttt{chmod} with lots of typing}
# chmod u+rwx,g+rwx,o- foo
# ls -l
total 4
-rwxrwx--- 1 git git 17 Oct 20 10:07 foo
Or we could simply convert those permissions into octal in our head and:
\drcap{\texttt{chmod} with octal like a boss}
# chmod 770 foo
# ls -l
total 4
-rwxrwx--- 1 git git 17 Oct 20 10:07 foo
Now you know the answer to that "How will we ever use octal in real life?" question you asked in school!
For a script or executable file to be allowed to run, it must be marked as executable for one of the user, group or other entries. The following should be insightful:
\drcap{Marking a file as executable}
# echo "echo Hello world" > foo
# ls -l
total 4
-rw-r--r-- 1 root root 17 Oct 20 10:07 foo
# ./foo
-bash: ./foo: Permission denied
# chmod u+x foo
# ls -l
total 4
-rwxr--r-- 1 root root 17 Oct 20 10:07 foo
# ./foo
Hello world
In the Windows world, we are used to compressing and sending directories around as .zip
files. In
Linux you can also deal with .zip
files, although they don't tend to be the most common, using the
zip
and unzip
commands:
\drcom{unzip}
\drcom{zip}
\index{compression!zipfiles@\texttt{.zip} files}
\index{files and directories!zipfiles@\texttt{.zip} files}
\drcap{\texttt{zip} command}
~ $ cd Foo
~/Foo $ touch a b c
~/Foo $ mkdir d
~/Foo $ touch d/e
~/Foo $ cd ..
~ $ zip -r Foo Foo
adding: Foo/ (stored 0%)
adding: Foo/c (stored 0%)
adding: Foo/b (stored 0%)
adding: Foo/d/ (stored 0%)
adding: Foo/d/e (stored 0%)
adding: Foo/a (stored 0%)
~ $ ls -l Foo.zip
-rw-r--r-- 1 myuser mygroup 854 Dec 14 15:31 Foo.zip
\drcap{\texttt{unzip} command}
~ $ unzip Foo
Archive: Foo.zip
replace Foo/c? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
extracting: Foo/c
extracting: Foo/b
extracting: Foo/d/e
extracting: Foo/a
Not too exciting, but you get the drift. There is typically support for other compression
algorithms, such as the gzip
,
bzip2
and 7z
(7-zip)
commands.
\drcom{7z}
\drcom{bzip2}
\drcom{gzip}
However, the "native" way to "archive" a directory's contents in "UNIX" is with
tar
, which is so old that tar
stands for "tape archive." Its
purpose is to take virtually any directory structure and create a single output "stream" or file of
it. That is then typically ran through a compression command and the result is called a "tarball":
\drcmd{tar}{archive files}
\drcap{Creating a tarball}
~ $ tar cvf Foo.tar Foo/*
Foo/a
Foo/b
Foo/c
Foo/d/
Foo/d/e
~ $ ls -l Foo.tar
-rw-r--r-- 1 myuser mygroup 10240 Dec 19 07:52 Foo.tar
~ $ gzip Foo.tar
~ $ ls -l Foo.tar.gz
-rw-r--r-- 1 myuser mygroup 193 Dec 19 07:52 Foo.tar.gz
In the tar
command above, the parameters are c
(create a new archive), v
(turn on "verbose"
output) and f
followed by the file name of the new .tar
file.
\index{files and directories!tararchivefiles@\texttt{.tar} archive files}
Note: tar
supports POSIX-style parameters (-c
), GNU-style (--create
), as well as the
older style (c
with no hyphens at all), as shown in these examples. So both of the following are
also equivalent to the above:
\drcap{\texttt{tar} parameter styles}
~ $ tar -c -v -f Foo.tar Foo/*
~ $ tar --create --verbose --file=Foo.tar Foo/*
The use of compression commands along with tar
is so prevalent that they've been built into tar
itself now as optional parameters:
\drcap{One-step tarball}
~ $ tar cvzf Foo.tgz Foo
Foo/
Foo/c
Foo/b
Foo/d/
Foo/d/e
Foo/a
~ $ ls -l Foo.tgz
-rw-r--r-- 1 myuser mygroup 197 Dec 19 07:54 Foo.tgz
In this case the z
parameter says to use gzip
compression, and the .tgz
file suffix means
basically "tarred and gzipped", or the equivalent to .tar.gz
in the first example.
\drcom{gzip}
\index{compression!tgzfiles@\texttt{.tgz} files}
\index{files and directories!tgzarchivefiles@\texttt{.tgz} archive files}
tar
is used to both create and read .tar
files. So to extract something like the above, you can
change the create (c
) parameter to extract (x
), like this:
\drcap{Extracting a tarball}
~ $ tar xvf Foo.tgz
Foo/
Foo/c
Foo/b
Foo/d/
Foo/d/e
Foo/a
In Windows there are "shortcuts," which are simply special files that the OS knows to interpret as "go open this other file over there." There are also "hard links" that allow for different directory entries in the same file system to point to the same physical file.
UNIX file systems also have both these types of links (which isn't surprising, given that Microsoft
got the ideas from UNIX). Both are created with the ln
command.
A "soft link" is equivalent to a Windows shortcut, and can point to a file or a directory, and can
point to anything on any mounted file system:
\drfnd{ln}{link}
\index{files and directories!soft links}
\drcap{Soft links example}
~ $ ls -l
total 4
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 a
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 b
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 c
drwxr-xr-x 2 myuser mygroup 4096 Oct 24 16:00 d
~ $ cd d
~ $ pwd
/tmp/foo/d
~ $ cd ..
~ $ ln -s a MyThesis.doc
~ $ ln -s d Dee
~ $ ls -l
total 4
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 a
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 b
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 c
drwxr-xr-x 2 myuser mygroup 4096 Oct 24 16:00 d
lrwxrwxrwx 1 myuser mygroup 1 Oct 24 16:40 Dee -> d
lrwxrwxrwx 1 myuser mygroup 1 Oct 24 16:40 MyThesis.doc -> a
~ $ cd Dee
~ $ pwd
/tmp/foo/Dee
The things to notice about this example:
-
The
-s
parameter indicates "create a soft link." -
Instead of a
-
ord
, a soft link is shown in als
listing asl
regardless of whether the target is a file or directory. This is because a soft link doesn't "know" what the target is - it is just a file in a directory pointing to another location. What that location is will be determined after the link is traversed.
A "hard link" is a bit different. It can only be made between files and the two files must be on the same file system. That is because hard links are actually directory entries (as opposed to files in directories) that point to the same "inode" on disk. From within a single directory it is impossible to tell if there are other directories with pointers to the same files (inodes) on disk. \index{files and directories!hard links} \index{files and directories!inodes}
\drcap{Hard links example}
~ $ ls
a b c d Dee MyThesis.doc
~ $ ln b B
~ $ cd d
~ $ ln ../b .
~ $ ls -l
total 0
-rw-r--r-- 3 myuser mygroup 0 Oct 24 15:53 b
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:54 e
~ $ cd ..
~ $ ls -l
total 4
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 a
-rw-r--r-- 3 myuser mygroup 0 Oct 24 15:53 b
-rw-r--r-- 3 myuser mygroup 0 Oct 24 15:53 B
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 c
drwxr-xr-x 2 myuser mygroup 4096 Oct 24 16:49 d
lrwxrwxrwx 1 myuser mygroup 1 Oct 24 16:40 Dee -> d
lrwxrwxrwx 1 myuser mygroup 1 Oct 24 16:40 MyThesis.doc -> a
The "net net" of all the above is that now b
, B
and d/b
all point to exactly the same inode,
or disk location, i.e., the exact same physical file.
So what can possibly go wrong with links? With soft links the answer is easy - the "remote" location being pointed to goes away or is renamed:
\drcap{Broken soft links example}
~ $ ls -l
total 4
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 a
-rw-r--r-- 3 myuser mygroup 0 Oct 24 15:53 b
-rw-r--r-- 3 myuser mygroup 0 Oct 24 15:53 B
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 c
drwxr-xr-x 2 myuser mygroup 4096 Oct 24 16:49 d
lrwxrwxrwx 1 myuser mygroup 1 Oct 24 16:40 Dee -> d
lrwxrwxrwx 1 myuser mygroup 1 Oct 24 16:40 MyThesis.doc -> a
~ $ rm a
~ $ ls -l
total 4
-rw-r--r-- 3 myuser mygroup 0 Oct 24 15:53 b
-rw-r--r-- 3 myuser mygroup 0 Oct 24 15:53 B
-rw-r--r-- 1 myuser mygroup 0 Oct 24 15:53 c
drwxr-xr-x 2 myuser mygroup 4096 Oct 24 16:49 d
lrwxrwxrwx 1 myuser mygroup 1 Oct 24 16:40 Dee -> d
lrwxrwxrwx 1 myuser mygroup 1 Oct 24 16:40 MyThesis.doc -> a
~ $ cat MyThesis.doc
cat: MyThesis.doc: No such file or directory
So even though the soft link MyThesis.doc
is still in the directory, the actual underlying file
a
is now gone, and trying to access it via the soft link leads to the somewhat confusing "No such
file or directory" error message (splutter "I can see it! It's right there!")
With hard links, it isn't so much a problem because of the nature of the beast. Since each hard link is a directory (metadata) entry pointing to an inode, deleting one simply deletes that directory entry. As long as the file has other hard links pointing to it, it "exists." Only when the last remaining hard link is removed has it been "deleted." Let's play:
\drcap{Many hard links, one inode}
~ $ echo "This is b." > b
~ $ cat b
This is b.
~ $ cat B
This is b.
~ $ cat d/b
This is b.
So, that makes sense. We created an original file b
by placing "This is b." in it, and then
created two hard links to it, B
and d/b
. We see that it has the same contents no matter how we
access it, because it is pointing to the same inode.
Can you guess how many rm
commands it will take to delete the file containing "This is b."?
\drfnd{rm}{remove file}
\drcap{Deleting a file with many hard links}
~ $ rm b
~ $ cat b
cat: b: No such file or directory
~ $ cat B
This is b.
~ $ cat d/b
This is b.
~ $ rm B
~ $ cat d/b
This is b.
~ $ rm d/b
Ultimately, it takes a rm
for every hard link to permanently delete a file.
With all this talk that a hard link can only be on the same file system, how do you know whether
two directories are on the same file system? In Windows it's easy - that's exactly what the drive
letters are telling you. But in Linux, where everything is "mounted" under a single hierarchy
starting at /
, how do I know that /var/something
and var/or/other
are on the same file system?
\index{*@\texttt{/} (root directory)}
\index{files and directories!root (\texttt{/})}
There are multiple ways to tell, actually. The easiest is with the
df
command:
\drfnd{df}{display file system disk space}
\drcap{\texttt{df} command}
~ $ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/mint--vg-root 118647068 28847464 83749608 26% /
none 4 0 4 0% /sys/fs/cgroup
udev 1965068 4 1965064 1% /dev
tmpfs 396216 1568 394648 1% /run
none 5120 0 5120 0% /run/lock
none 1981068 840 1980228 1% /run/shm
none 102400 24 102376 1% /run/user
/dev/sda1 240972 50153 178378 22% /boot
The ones of interest are the /dev
entries, and we see that everything mounted under /
is on one
file system, except for whatever happens to be on the file system mounted under /boot
. So outside
of /boot
, on this system we could hard link away to our heart's content.
df
is a good command to see the disk space utilization of each file system. If you want to see the
space used by a directory and its subdirectories, use du
:
\drfnd{du}{disk use by directory}
\drcap{\texttt{du} command}
/tmp $ du
4 ./icedteaplugin-mdm-EMmQCt
4 ./ssh-IaP0RC1l4XCL
4 ./hsperfdata_mdm
4 ./.ICE-unix
4 ./VSCode Crashes
4 ./.X11-unix
8 ./mintUpdate
4 ./orbit-myuser
4 ./pulse-PKdhtXMmr18n
84 .
The default size unit for du
is 1,024 bytes, but that can be changed. So in the above, /tmp
and
its children are taking 84KB of disk space.
Note: - It is (barely) beyond the scope of this book to cover the
mount
command. I wanted to, really bad, but with all the
different file systems and device types and all the options for both it can get so complex so fast
that I decided not to. Maybe if you ask, real nice...
\drsys{mount}{mount file system}
So, both hard and soft links can have some interesting side effects if you think about them. For one, if you are backing things up, then you may get duplicates in your backup set. In fact, with hard links you will, by definition, unless the backup software is very smart and doing things like de-duplication. \index{files and directories!hard links} \index{files and directories!soft links}
But even with soft links if everything just blindly followed them you could also get duplicates
where you didn't want them, or even circular references. Also, the pointers in the soft link files
are not evaluated until a command references them. Note that the following is perfectly legal with
soft links, but may not give the results you expect - think about the current working directory
shown by pwd
in the following, and the effects of the relative paths as the sample
progresses:
\drfnd{pwd}{print working directory}
\drcap{Soft links and relative paths}
~ $ cd Foo
~/Foo $ rm -rf *
~/Foo $ cd ..
~ $ cd Foo
~/Foo $ pwd
/home/myuser/Foo
~/Foo $ rm -rf *
~/Foo $ mkdir d
~/Foo $ touch a b c d/e
~/Foo $ ln -s . d/f
~/Foo $ ls d/f
e f
~/Foo $ ln -s .. d/g
~/Foo $ ls d/g
a b c d
Many commands that deal with files and file systems, like find
, have parameters specifically
telling the command whether to follow soft links or not (by default, find
does not - see the next
chapter for more).
\drfnd{find}{find files}
Most people think of diff
as a tool only programmers find
useful, but that is short-sighted. The whole purpose of diff
is to show differences between
files. For example, I backed up this document (which is a text file) before starting this section,
then typed this introduction to diff
. This is what diff
showed after I added the new paragraph:
\drfnd{diff}{show differences between files}
\index{files and directories!compare files (\texttt{diff} command)}
\drcap{\texttt{diff} example}
~ $ diff Step02.bak Step02.md
1285a1286,1291
> Most people think of [`diff`](http://linux.die.net/man/1/diff) as a tool
> only programmers find useful, but that is short-sighted. The whole purpose
> of `diff` is to show differences between files. For example, I backed up
> this document (which is a text file) before starting this chapter, then
> typed this introduction to `diff`. This is what `diff` shows:
In other words, the "arrows" are pointing to the "new" file (by convention the file specified on the left is the "old" file and the file on the right is the "new" file), showing five lines were inserted, starting at line 1285. Pretty meta, but not real exciting.
Let's look at something else, say a configuration file for an application. We have an original file,
orig.conf
:
\drcap{\texttt{orig.conf} file}
~ $ cat orig.conf
FOO=1
SOME=THINGS
STAY=THE
SAME=ALWAYS
BAR=Xyzzy
Then we have a new file, new.conf
:
\drcap{\texttt{new.conf} file}
~ $ cat new.conf
FOO=2
SOME=THINGS
STAY=THE
SAME=ALWAYS
Now if we diff
them:
\drcap{Using \texttt{diff} on config files}
~ $ diff orig.conf new.conf
1c1
< FOO=1
---
> FOO=2
7d6
< BAR=Xyzzy
Now we can more easily see that line #1 changed (1c1
) from FOO=1
on the "left" file to FOO=2
on the "right," and that line #7 was deleted (7d6
) from the "left" file to form the "right."
Again, not too interesting, but imagine that both files were thousands of lines long, and there
were only a few changes, and you were trying to detect and recover an accidentally-deleted line.
Now you can see why diff
can be handy, as long as you keep around a prior version either in a
backup file or version control system to compare against.
diff
is your friend. It really comes into play with a version control system like
git
, but again, that is beyond the scope of this book.
\drcmd{git}{distributed version control}