Filters

Filters in Linux are commands that process text streams and produce output that can be used in further processing. They are an integral part of shell scripting and command-line operations, especially when working with pipelines (|), where the output of one command is used as input for another.

This page will cover some of the most commonly used filters in Linux, including cat, grep, sort, uniq, cut, tr, and sed.

cat: Concatenate and Display Files

The cat command reads files sequentially, writing their contents to the standard output. It’s often used to display the contents of files or to concatenate multiple files into one.

Basic Usage

  • Display a File:

    cat filename
  • Concatenate Multiple Files:

    cat file1 file2 > combinedfile
  • Display Line Numbers:

    cat -n filename
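
As a quick, self-contained sketch (the file name fruits.txt is made up for illustration), the following creates a small file and then displays it with line numbers:

    printf 'apple\nbanana\ncherry\n' > fruits.txt
    cat -n fruits.txt
    # prints:
    #      1  apple
    #      2  banana
    #      3  cherry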

grep: Search Text Using Patterns

The grep command searches through text using patterns (regular expressions). It’s commonly used to find specific lines in a file that match a pattern.

Basic Usage

  • Search for a Pattern in a File:

    grep "pattern" filename
  • Search Recursively in a Directory:

    grep -r "pattern" directory/
  • Show Line Numbers:

    grep -n "pattern" filename
  • Ignore Case:

    grep -i "pattern" filename
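
As a minimal sketch (access.log and its contents are hypothetical), the following finds the request that returned a 404 and prints its line number:

    printf 'GET /index.html 200\nGET /missing 404\nPOST /login 200\n' > access.log
    grep -n "404" access.log
    # prints: 2:GET /missing 404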

sort: Sort Text

The sort command sorts lines of text. By default it compares lines character by character; the -n option compares them as numbers instead.

Basic Usage

  • Sort a File Alphabetically:

    sort filename
  • Sort Numerically:

    sort -n filename
  • Sort in Reverse Order:

    sort -r filename
  • Sort by a Specific Field:

    sort -k 2 filename
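
As a short sketch (scores.txt is a made-up file with a name and a number on each line), the following sorts numerically by the second field:

    printf 'carol 90\nalice 75\nbob 82\n' > scores.txt
    sort -n -k 2 scores.txt
    # prints:
    # alice 75
    # bob 82
    # carol 90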

uniq: Report or Omit Repeated Lines

The uniq command filters out repeated lines in its input. Because it only removes duplicates that are adjacent, it is usually combined with sort so that identical lines end up next to each other.

Basic Usage

  • Remove Duplicate Lines:

    sort filename | uniq
  • Count Occurrences of Lines:

    sort filename | uniq -c
  • Only Print Duplicate Lines:

    sort filename | uniq -d
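
A quick sketch with a hypothetical words.txt shows why sort comes first: it groups identical lines together so uniq can count them.

    printf 'apple\nbanana\napple\napple\n' > words.txt
    sort words.txt | uniq -c
    # prints:
    #       3 apple
    #       1 banana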

cut: Remove Sections from Each Line

The cut command extracts sections from each line of input, typically used to extract columns from text files or command outputs.

Basic Usage

  • Extract the First Column:

    cut -d ' ' -f 1 filename
    • -d ' ' specifies the delimiter (a space in this case).
    • -f 1 specifies the first field (column).
  • Extract Multiple Columns:

    cut -d ',' -f 1,3 filename

    This example extracts the first and third columns from a CSV file.
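
As a small sketch (people.csv is hypothetical), the following pulls the name and role columns out of a comma-separated file:

    printf 'alice,30,engineer\nbob,25,designer\n' > people.csv
    cut -d ',' -f 1,3 people.csv
    # prints:
    # alice,engineer
    # bob,designer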

tr: Translate or Delete Characters

The tr command translates or deletes characters. Unlike the other filters on this page, it reads only from standard input, so it is usually fed input through a pipe or redirection.

Basic Usage

  • Convert Lowercase to Uppercase:

    tr 'a-z' 'A-Z'
  • Delete Specific Characters:

    tr -d 'a'

    This removes all instances of the letter a from the input.

  • Replace a Character:

    tr ' ' '_'

    This replaces all spaces with underscores.
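
A couple of quick sketches, piping the output of echo into tr:

    echo "hello world" | tr 'a-z' 'A-Z'   # prints HELLO WORLD
    echo "hello world" | tr ' ' '_'       # prints hello_world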

sed: Stream Editor

The sed command is a powerful stream editor for filtering and transforming text.

Basic Usage

  • Substitute a Word:

    sed 's/oldword/newword/' filename
    • s indicates substitution.
    • oldword is the word to be replaced.
    • newword is the replacement word.
  • Substitute Globally on a Line:

    sed 's/oldword/newword/g' filename

    The g at the end makes the substitution global, replacing all instances of oldword on each line.

  • Delete a Line Containing a Pattern:

    sed '/pattern/d' filename
  • Print Only Matching Lines:

    sed -n '/pattern/p' filename
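
As a brief sketch (animals.txt is made up for illustration), the first command replaces the first "the" on each line, and the second prints only the lines that match a pattern:

    printf 'the cat sat\nthe dog ran\n' > animals.txt
    sed 's/the/a/' animals.txt
    # prints:
    # a cat sat
    # a dog ran
    sed -n '/dog/p' animals.txt
    # prints: the dog ran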

Combining Filters

Filters can be combined using pipes (|) to perform more complex text processing tasks.

Example: Extracting and Sorting Data

Imagine you have a log file and you want to extract all IP addresses, remove duplicates, and sort them:

grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' logfile | sort | uniq
  • grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' extracts the IPv4 addresses; -o prints only the matching text and -E enables extended regular expressions.
  • sort sorts the IP addresses.
  • uniq removes duplicates.
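
Building on the same idea (logfile is still hypothetical), adding uniq -c and a final numeric reverse sort reports how often each address appears, most frequent first:

    grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' logfile | sort | uniq -c | sort -nr
    # uniq -c prefixes each address with its count; sort -nr orders by count, highest first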

Conclusion

Filters are essential tools in the Linux command-line environment, enabling powerful and flexible text processing. By mastering these commands, you can manipulate and transform data streams efficiently, making them invaluable for system administration, scripting, and data analysis tasks.


Next: Common Text Processing Tools

Previous: Other Editors: Nano, Emacs