Filters in Linux are commands that process text streams and produce output that can be used in further processing. They are an integral part of shell scripting and command-line operations, especially when working with pipelines (|), where the output of one command is used as input for another.
This page will cover some of the most commonly used filters in Linux, including cat, grep, sort, uniq, cut, tr, and sed.
The cat command reads files sequentially, writing their contents to the standard output. It's often used to display the contents of files or to concatenate multiple files into one.
- Display a File: cat filename
- Concatenate Multiple Files: cat file1 file2 > combinedfile
- Display Line Numbers: cat -n filename
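The commands above can be tried end to end; the file names here are invented for illustration:

```shell
# Create two small sample files (hypothetical names and contents).
printf 'alpha\nbeta\n' > file1
printf 'gamma\n' > file2

# Concatenate them into one file, then display it with line numbers.
cat file1 file2 > combinedfile
cat -n combinedfile
```

The final command prints the three lines numbered 1 through 3.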
The grep command searches through text using patterns (regular expressions). It's commonly used to find specific lines in a file that match a pattern.
- Search for a Pattern in a File: grep "pattern" filename
- Search Recursively in a Directory: grep -r "pattern" directory/
- Show Line Numbers: grep -n "pattern" filename
- Ignore Case: grep -i "pattern" filename
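A small sketch showing case-sensitive versus case-insensitive matching (the log contents are made up):

```shell
# Sample log file (contents invented for this example).
printf 'Error: disk full\ninfo: all good\nERROR: timeout\n' > app.log

grep -n "Error" app.log    # case-sensitive: matches only line 1
grep -in "error" app.log   # case-insensitive: matches lines 1 and 3
```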
The sort command sorts lines in a file alphabetically or numerically.
- Sort a File Alphabetically: sort filename
- Sort Numerically: sort -n filename
- Sort in Reverse Order: sort -r filename
- Sort by a Specific Field: sort -k 2 filename
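The difference between lexical and numeric sorting is easy to miss; this sketch (with invented file names) makes it visible:

```shell
# Numbers chosen so lexical and numeric order differ.
printf '10\n2\n33\n' > nums.txt

sort nums.txt      # lexical order: 10, 2, 33 ("1" sorts before "2")
sort -n nums.txt   # numeric order: 2, 10, 33
sort -rn nums.txt  # numeric order, reversed: 33, 10, 2

# Sorting by the second whitespace-separated field:
printf 'bob 3\nann 1\n' > scores.txt
sort -k 2 -n scores.txt   # "ann 1" comes first
```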
The uniq command filters out repeated lines in a file. Because it only detects duplicates that are adjacent, it is almost always used in conjunction with sort.
- Remove Duplicate Lines: sort filename | uniq
- Count Occurrences of Lines: sort filename | uniq -c
- Only Print Duplicate Lines: sort filename | uniq -d
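A runnable sketch of all three forms, using a throwaway sample file:

```shell
# Unsorted input with a repeated line (sample data).
printf 'apple\nbanana\napple\napple\n' > fruit.txt

sort fruit.txt | uniq      # apple, banana
sort fruit.txt | uniq -c   # 3 apple, 1 banana
sort fruit.txt | uniq -d   # apple (only the duplicated line)
```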
The cut command extracts sections from each line of input, typically used to extract columns from text files or command outputs.
- Extract the First Column: cut -d ' ' -f 1 filename
Here, -d ' ' specifies the delimiter (a space in this case) and -f 1 specifies the first field (column).
- Extract Multiple Columns: cut -d ',' -f 1,3 filename
This example extracts the first and third columns from a CSV file.
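A quick sketch with a made-up CSV file (names, ages, cities are all invented):

```shell
# A small CSV file with three fields per line.
printf 'alice,30,paris\nbob,25,rome\n' > people.csv

cut -d ',' -f 1 people.csv    # names only
cut -d ',' -f 1,3 people.csv  # names and cities
```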
The tr command is used for translating or deleting characters from text input. Unlike most of the filters on this page, tr reads only from standard input, so it is typically used in a pipeline.
- Convert Lowercase to Uppercase: tr 'a-z' 'A-Z'
- Delete Specific Characters: tr -d 'a'
This removes all instances of the letter a from the input.
- Replace a Character: tr ' ' '_'
This replaces all spaces with underscores.
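Since tr reads from standard input, each example below pipes a short sample string into it:

```shell
echo 'hello world' | tr 'a-z' 'A-Z'   # HELLO WORLD
echo 'banana' | tr -d 'a'             # bnn
echo 'one two three' | tr ' ' '_'     # one_two_three
```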
The sed command is a powerful stream editor for filtering and transforming text.
- Substitute a Word: sed 's/oldword/newword/' filename
Here, s indicates substitution, oldword is the word to be replaced, and newword is the replacement word.
- Substitute Globally on a Line: sed 's/oldword/newword/g' filename
The g at the end makes the substitution global, replacing all instances of oldword on each line.
- Delete a Line Containing a Pattern: sed '/pattern/d' filename
- Print Only Matching Lines: sed -n '/pattern/p' filename
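All four forms can be exercised against one small file; the contents below are made up to trigger each command:

```shell
# Sample file: the first line has two "old"s, the last contains "pattern".
printf 'old cat, old dog\nkeep this line\nbad pattern here\n' > notes.txt

sed 's/old/new/' notes.txt     # replaces only the first "old" per line
sed 's/old/new/g' notes.txt    # replaces every "old" on each line
sed '/pattern/d' notes.txt     # drops the line containing "pattern"
sed -n '/keep/p' notes.txt     # prints only the line containing "keep"
```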
Filters can be combined using pipes (|) to perform more complex text processing tasks.
Imagine you have a log file and you want to extract all IP addresses, remove duplicates, and sort them:
grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' logfile | sort | uniq
Here, grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' extracts all IP addresses, sort sorts them, and uniq removes duplicates.
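To try the pipeline without a real log, a fabricated logfile works just as well (the addresses and request lines are invented):

```shell
# A fabricated access log so the pipeline can be run end to end.
printf '10.0.0.1 GET /\n10.0.0.2 GET /\n10.0.0.1 POST /\n' > logfile

# Extract, sort, and deduplicate the IP addresses: prints two addresses.
grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' logfile | sort | uniq
```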
Filters are essential tools in the Linux command-line environment, enabling powerful and flexible text processing. By mastering these commands, you can manipulate and transform data streams efficiently, making them invaluable for system administration, scripting, and data analysis tasks.