
Introduction to comm (Combines the functionality of diff and cmp)
comm is a command-line utility in Linux that covers ground between the diff
and cmp commands. It compares two sorted files line by line and reports which
lines are unique to the first file, unique to the second file, or common to
both. comm is a useful tool for finding differences or similarities between
two files and is commonly used in scripting and automation tasks.
The comm command is written in the C programming language and is available as
part of the GNU Core Utilities package. It is open-source software and is
distributed under the GNU General Public License (GPL).
Official page of comm (Combines the functionality of diff and cmp):
https://www.gnu.org/software/coreutils/manual/html_node/comm-invocation.html
Installation
comm is part of the GNU Core Utilities package, which is pre-installed on
most Linux distributions. If it is not available, you can install the package
with your distribution's package manager:
Ubuntu/Debian
sudo apt-get install coreutils
CentOS/RHEL
sudo yum install coreutils
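Before installing anything, it may be worth checking whether comm is already on the PATH. The following is a minimal sketch; the variable name comm_path is chosen here for illustration only.

```shell
# Look up comm on the PATH; command -v prints its location if it is found.
if comm_path=$(command -v comm); then
    echo "comm found at: $comm_path"
else
    echo "comm not found; install the coreutils package" >&2
fi
```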
Usage and Examples
The basic syntax of the comm command is:
comm [OPTION]... FILE1 FILE2
Here are some examples of how to use the comm command:
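Example 1: Compare two sorted files and display all three columns
comm file1.txt file2.txt
This command compares the two sorted files file1.txt and file2.txt. With no
options, it prints three tab-separated columns: lines unique to file1.txt
(column 1), lines unique to file2.txt (column 2), and lines common to both
files (column 3). To display only the common lines, use
comm -12 file1.txt file2.txt.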
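To make the column layout concrete, here is a small sketch that runs comm without options on two tiny files. The file contents and the workdir variable are made up for illustration, and LC_ALL=C is set so that the sort order is plain byte order.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
export LC_ALL=C   # byte-order collation, so "sorted" is unambiguous

# Hypothetical sample data, already sorted.
printf '%s\n' apple banana cherry > "$workdir/file1.txt"
printf '%s\n' banana cherry date  > "$workdir/file2.txt"

# No options: all three columns are printed.
# Column 2 lines are indented by one tab, column 3 lines by two tabs.
three_columns=$(comm "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$three_columns"

rm -r "$workdir"
```

With this data, apple appears unindented (only in file1.txt), date appears behind one tab (only in file2.txt), and banana and cherry appear behind two tabs (common to both).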
Example 2: Compare two sorted files and display unique lines
comm -23 file1.txt file2.txt
This command compares the two sorted files file1.txt and file2.txt and
displays the lines that are unique to file1.txt. The -23 option combines -2,
which suppresses the lines unique to file2.txt, and -3, which suppresses the
lines common to both files.
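As a sketch of how the option digits map to columns, the snippet below extracts the lines unique to each file in turn. The sample contents and variable names are illustrative assumptions, not part of the original example.

```shell
workdir=$(mktemp -d)
export LC_ALL=C   # keep sort order and comm's comparison consistent

# Hypothetical sample data, already sorted.
printf '%s\n' apple banana cherry > "$workdir/file1.txt"
printf '%s\n' banana cherry date  > "$workdir/file2.txt"

# -23 hides columns 2 and 3, leaving lines found only in file1.txt.
only_in_file1=$(comm -23 "$workdir/file1.txt" "$workdir/file2.txt")
# -13 hides columns 1 and 3, leaving lines found only in file2.txt.
only_in_file2=$(comm -13 "$workdir/file1.txt" "$workdir/file2.txt")

echo "Only in file1.txt: $only_in_file1"
echo "Only in file2.txt: $only_in_file2"

rm -r "$workdir"
```

Because the suppressed columns also drop their leading tabs, each result is a plain list that can be passed to other tools directly.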
Example 3: Compare two sorted files and display lines that are different
comm -3 file1.txt file2.txt
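This command compares the two sorted files file1.txt and file2.txt and
displays the lines that appear in only one of the two files. The -3 option
suppresses column 3 (the common lines), so lines unique to file1.txt remain
in column 1 and lines unique to file2.txt remain in column 2, indented by one
tab.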
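The output of comm -3 still uses a leading tab to mark lines that come from the second file. If a flat list is wanted, one possible approach is to strip that tab afterwards; the sketch below uses awk for this and assumes made-up sample data.

```shell
workdir=$(mktemp -d)
export LC_ALL=C

# Hypothetical sample data, already sorted.
printf '%s\n' apple banana cherry > "$workdir/file1.txt"
printf '%s\n' banana cherry date  > "$workdir/file2.txt"

# Raw output: file2-only lines keep one leading tab.
raw_diff=$(comm -3 "$workdir/file1.txt" "$workdir/file2.txt")

# Strip a single leading tab to get a flat list of differing lines.
flat_diff=$(comm -3 "$workdir/file1.txt" "$workdir/file2.txt" |
    awk '{ sub(/^\t/, ""); print }')

printf '%s\n' "$flat_diff"

rm -r "$workdir"
```

Note that the flat list no longer records which file each line came from, so this is only suitable when that information is not needed.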
Similar Commands and Benefits
There are several other commands and tools available in Linux that serve a
similar purpose to comm. Some of them include:
diff
diff is a command-line utility that compares two files line by line and
displays the differences between them. Unlike comm, diff does not require the
files to be sorted and provides a more detailed output of the differences.
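The sorted-input requirement of comm can be checked explicitly. The sketch below assumes GNU comm, whose --check-order option makes the command fail on unsorted input; other implementations may not support it. The sample data and variable names are illustrative.

```shell
workdir=$(mktemp -d)
export LC_ALL=C

# Hypothetical unsorted sample data.
printf '%s\n' cherry apple banana > "$workdir/file1.txt"
printf '%s\n' banana date cherry  > "$workdir/file2.txt"

# --check-order is a GNU extension: comm exits non-zero on unsorted input.
if comm --check-order "$workdir/file1.txt" "$workdir/file2.txt" >/dev/null 2>&1; then
    input_state="sorted"
else
    input_state="unsorted"
fi
echo "Input is $input_state"

# Sort each file first, then compare the sorted copies.
sort "$workdir/file1.txt" > "$workdir/file1.sorted"
sort "$workdir/file2.txt" > "$workdir/file2.sorted"
common_lines=$(comm -12 "$workdir/file1.sorted" "$workdir/file2.sorted")
printf '%s\n' "$common_lines"

rm -r "$workdir"
```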
cmp
cmp is a command-line utility that compares two files byte by byte and
displays the byte offset and line number of the first difference. It is
useful for comparing binary files, or for checking whether two files are
identical when a line-by-line comparison is not needed.
The benefits of using comm over diff or cmp include:
- Efficiency: comm reads two sorted files in a single pass, so it can handle
  large pre-sorted files with little memory and without the line-matching
  work that diff performs.
- Simplicity: comm provides a simple and concise output that shows only the
  common, unique, or different lines between the files.
- Flexibility: comm offers various options to customize the output and
  suppress specific columns, making it suitable for different use cases.
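As one illustration of the flexibility point, GNU comm accepts an --output-delimiter option that replaces the tab between columns. The sketch below assumes GNU coreutils and uses made-up sample data; a comma is chosen only to make the column positions visible.

```shell
workdir=$(mktemp -d)
export LC_ALL=C

# Hypothetical sample data, already sorted.
printf '%s\n' apple banana cherry > "$workdir/file1.txt"
printf '%s\n' banana cherry date  > "$workdir/file2.txt"

# --output-delimiter (GNU extension) replaces each column-separating tab.
delimited=$(comm --output-delimiter=, "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$delimited"

rm -r "$workdir"
```

A line with no leading comma is unique to the first file, one comma marks a line unique to the second file, and two commas mark a common line.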
Script Examples
Here are three script examples that demonstrate the usage of comm in
automation:
Script 1: Find common lines between two files
#!/bin/bash
file1="file1.txt"
file2="file2.txt"
common_lines=$(comm -12 <(sort "$file1") <(sort "$file2"))
echo "Common lines between $file1 and $file2:"
echo "$common_lines"
This script sorts file1.txt and file2.txt on the fly using process
substitution, compares them, and displays the lines that are common to both
files.
Script 2: Find unique lines in file1.txt
#!/bin/bash
file1="file1.txt"
file2="file2.txt"
unique_lines=$(comm -23 <(sort "$file1") <(sort "$file2"))
echo "Unique lines in $file1:"
echo "$unique_lines"
This script compares the two files file1.txt and file2.txt and displays the
lines that are unique to file1.txt.
Script 3: Compare two files and output differences
#!/bin/bash
file1="file1.txt"
file2="file2.txt"
diff_lines=$(comm -3 <(sort "$file1") <(sort "$file2"))
echo "Lines that are different between $file1 and $file2:"
echo "$diff_lines"
This script compares the two files file1.txt and file2.txt and displays the
lines that appear in only one of them. Lines unique to file2.txt are indented
by one tab in the output.
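The three scripts can also be combined into a single summary. The following is a rough sketch, not a definitive implementation: compare_summary is a hypothetical helper name, the sample files are made up, and temporary sorted copies are used instead of process substitution so that the snippet also runs in a plain POSIX shell.

```shell
export LC_ALL=C   # keep sort order and comm's comparison consistent

# compare_summary is a hypothetical helper used only for this sketch.
# It counts lines unique to each file and lines common to both.
compare_summary() {
    tmpdir=$(mktemp -d)
    sort "$1" > "$tmpdir/a"
    sort "$2" > "$tmpdir/b"
    only_first=$(comm -23 "$tmpdir/a" "$tmpdir/b" | awk 'END { print NR }')
    only_second=$(comm -13 "$tmpdir/a" "$tmpdir/b" | awk 'END { print NR }')
    in_both=$(comm -12 "$tmpdir/a" "$tmpdir/b" | awk 'END { print NR }')
    rm -r "$tmpdir"
    echo "only in first: $only_first, only in second: $only_second, in both: $in_both"
}

# Usage with hypothetical unsorted sample data.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$workdir/file1.txt"
printf '%s\n' date banana cherry  > "$workdir/file2.txt"
compare_summary "$workdir/file1.txt" "$workdir/file2.txt"
rm -r "$workdir"
```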
List of comm Command and Options
Command/Option | Description |
---|---|
comm | The main comm command that compares two sorted files and displays the common, unique, or different lines. |
-1 | Suppress the output of lines unique to file1. |
-2 | Suppress the output of lines unique to file2. |
-3 | Suppress the output of common lines. |
-12 | Suppress the output of lines unique to file1 and lines unique to file2. |
-23 | Suppress the output of common lines and lines unique to file2. |
Conclusion
The comm command in Linux is a powerful tool for comparing two sorted files
and finding common, unique, or different lines. It is widely used in
scripting and automation tasks to identify differences or similarities
between files. Its simplicity, efficiency, and flexibility make it a good
choice over diff or cmp when the inputs are line-oriented and can be sorted.
Whether you are a developer, system administrator, or data analyst, comm can
help you streamline your file comparison tasks.