In Linux, almost everything is text: configuration files, logs, scripts, and even the output of system commands. That's why being able to effectively process text data through the command line is a skill that can greatly speed up your workflow and make you more confident in system administration and task automation.
This article collects frequently used commands for analyzing, filtering, comparing, and transforming text data in Linux. Each command comes with explanations and examples — save this as a handy cheat sheet!
Viewing file contents
cat -n file1
Displays the contents of a file with line numbers on the left — useful for debugging and referencing.
cat example.txt | awk 'NR%2==1'
Filters and displays only the odd-numbered lines of a file — useful for parsing logs or working with templates.
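As a quick test, printf can generate sample input on the fly (the lines here are placeholders):
printf 'one\ntwo\nthree\nfour\nfive\n' | awk 'NR%2==1'
# prints lines 1, 3 and 5: one, three, five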
Extracting specific columns
echo a b c | awk '{print $1,$3}'
Prints the first and third fields of the line (space-separated by default) — helps extract the needed elements from structured data.
echo a b c | awk '{print $1}'
Prints only the first field of the line — the simplest form of field extraction.
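awk's -F option sets a custom field separator, which makes the same approach work for delimited files; a sketch using the colon-separated layout of /etc/passwd:
awk -F: '{print $1, $7}' /etc/passwd
# prints each account name together with its login shell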
Comparing files for differences
comm -3 file1 file2
Displays the differing lines between two files, excluding common lines — convenient for syncing configuration files.
comm -1 file1 file2
Suppresses lines that appear only in file1 — the output shows lines unique to file2 plus the lines common to both files.
comm -2 file1 file2
Suppresses lines that appear only in file2 — the output shows lines unique to file1 plus the lines common to both files.
sdiff file1 file2
Displays a side-by-side line-by-line comparison of two files — great for spotting exact differences.
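Note that comm expects both inputs to be sorted; if they are not, process substitution (available in bash and zsh) can sort them on the fly:
comm -3 <(sort file1) <(sort file2)
# compares the sorted versions without creating temporary files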
Searching within text files
grep '[0-9]' /var/log/messages
Finds lines that contain at least one digit — useful for quickly identifying entries with IDs, codes, etc.
grep '^Aug' /var/log/messages
Filters lines that begin with "Aug" — handy when working with logs by date.
grep Aug /var/log/messages
Displays all lines that contain "Aug" anywhere — a quick keyword search.
grep -R Aug /var/log/*
Recursively searches all files in the directory for the string "Aug" — powerful for scanning logs or configs.
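These searches combine well with other options; for example, -i makes the match case-insensitive and -c counts matching lines instead of printing them (a sketch, assuming the log is readable):
grep -ic 'aug' /var/log/messages
# number of lines containing "aug" in any letter case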
Merging files line by line
paste -d '+' file1 file2
Joins lines from two files using the "+" symbol as a delimiter — creates a compact data view.
paste file1 file2
Combines corresponding lines from two files side by side — useful for comparing data.
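Changing the delimiter to a comma turns the same merge into a simple CSV-style output (file names are placeholders):
paste -d ',' file1 file2 > merged.csv
# writes the combined lines to merged.csv instead of the terminal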
Editing text with sed
sed 's/string1/string2/g' example.txt
Performs a global replacement of string1 with string2 throughout the file — basic pattern substitution.
sed '/^ *#/d; /^$/d' example.txt
Deletes empty lines and comment lines (lines starting with #, optionally indented) — useful for cleaning up config files.
sed '/^$/d' example.txt
Deletes only empty lines — condenses the text without affecting content.
sed -e '1d' example.txt
Deletes the first line of the file — can be used to skip headers.
sed -n '/string1/p' example.txt
Displays only lines that contain string1 — a quick way to extract by pattern.
sed -e 's/string//g' example.txt
Removes all occurrences of string from the file — used for stripping out keywords or noise.
sed -e 's/ *$//' example.txt
Removes trailing spaces from the end of lines — helpful for cleanup before sharing files.
sed -n '5p;5q' example.txt
Prints only the fifth line of the file and then stops reading — an efficient way to grab a single line.
sed -n '2,5p' example.txt
Prints lines 2 through 5 inclusive — useful when analyzing a portion of the file.
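Ranges can also be delimited by patterns instead of line numbers; for example, printing everything between two marker lines (BEGIN and END are hypothetical markers):
sed -n '/BEGIN/,/END/p' example.txt
# prints each block from a line matching BEGIN through the next line matching END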
sed -e 's/00*/0/g' example.txt
Replaces any sequence of zeros with a single zero — great for normalizing numeric data.
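By default sed only writes the result to standard output; once an expression has been tested, GNU sed's -i option applies it to the file in place (here keeping a .bak backup copy):
sed -i.bak 's/string1/string2/g' example.txt
# modifies example.txt directly and saves the original as example.txt.bak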
Sorting and filtering lines
sort file1 file2
Sorts the combined contents of two files line by line (alphabetically by default; add -n for numeric order) — a basic analysis operation.
sort file1 file2 | uniq
Removes duplicate lines after sorting — cleans up repeated data.
sort file1 file2 | uniq -u
Displays only unique lines — helps find entries that occur just once.
sort file1 file2 | uniq -d
Shows only repeated lines — useful for identifying duplicates.
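A common extension is counting repetitions: uniq -c prefixes each line with the number of times it occurs, which can then be ranked numerically (a sketch over a single placeholder file):
sort file1 | uniq -c | sort -rn | head
# shows the ten most frequent lines together with their counts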
Changing character case
echo 'word' | tr '[:lower:]' '[:upper:]'
Converts all characters to uppercase — useful for data normalization.
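tr can also delete or squeeze characters; for instance, stripping all digits from the input:
echo 'build 2024 release 7' | tr -d '[:digit:]'
# removes every digit, leaving the rest of the text intact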
These commands can be used individually or combined into powerful data processing pipelines. For example, you can read a file, filter lines by keywords, sort and remove duplicates — all in a single terminal command!
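As a sketch of such a pipeline (the log path and the field position are assumptions that depend on the actual log format):
grep 'error' /var/log/messages | awk '{print $5}' | sort | uniq -c | sort -rn
# extracts the fifth field of every line containing "error", then counts and ranks the repeats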