Introduction to comm (Compares two sorted files line by line)

comm is a command-line utility on Linux and other Unix-like systems that
compares two sorted files line by line. It reports which lines appear only in
the first file, only in the second file, or in both. Although it is often
mentioned alongside diff and cmp, it does not replicate them: comm performs a
set-style comparison of sorted lines, which makes it a useful tool in
scripting and automation tasks.

The comm command is written in C and is available as part of the GNU Core
Utilities package. It is open-source software distributed under the GNU
General Public License (GPL), version 3 or later.

Official documentation for comm:
https://www.gnu.org/software/coreutils/manual/html_node/comm-invocation.html

Installation

comm is part of the GNU Core Utilities package, which is pre-installed on
virtually every Linux distribution. If it is missing from your system, you
can install the package with your distribution's package manager:

Ubuntu/Debian

sudo apt-get install coreutils

CentOS/RHEL

sudo yum install coreutils

Usage and Examples

The basic syntax of the comm command is:

comm [OPTION]... FILE1 FILE2

Here are some examples of how to use the comm command:

Example 1: Compare two sorted files and display all three columns

comm file1.txt file2.txt

This command compares the two sorted files file1.txt and file2.txt and
prints three tab-separated columns: lines unique to file1.txt, lines unique
to file2.txt, and lines common to both files.
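
To make the three-column layout concrete, here is a minimal sketch using two
small sample files. The file contents and the workdir variable are made up
for illustration, and LC_ALL=C is set on the assumption that sort and comm
should agree on a simple byte-wise collation order.

```shell
# Hypothetical sample data; names and contents are illustrative only.
export LC_ALL=C                       # same collation for sort and comm
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1.txt"   # already sorted
printf '%s\n' banana cherry date  > "$workdir/file2.txt"   # already sorted

# No options: column 1 = only in file1, column 2 = only in file2,
# column 3 = in both. Columns are separated by tab characters.
result=$(comm "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

With this data, apple appears with no indentation (column 1), date is
indented by one tab (column 2), and banana and cherry are indented by two
tabs (column 3).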

Example 2: Compare two sorted files and display lines unique to the first file

comm -23 file1.txt file2.txt

This command compares the two sorted files file1.txt and file2.txt and
displays the lines that are unique to file1.txt. The -23 option combines -2,
which suppresses lines unique to file2.txt, and -3, which suppresses lines
common to both files.
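
As a rough illustration of how the suppression flags combine, the sketch
below runs -23 and its mirror image -13 on the same kind of sample data. The
file contents and variable names are assumptions chosen for the example.

```shell
# Hypothetical sample data; names and contents are illustrative only.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1.txt"
printf '%s\n' banana cherry date  > "$workdir/file2.txt"

# -23 hides columns 2 and 3, leaving lines found only in file1.
only_in_file1=$(comm -23 "$workdir/file1.txt" "$workdir/file2.txt")

# -13 hides columns 1 and 3, leaving lines found only in file2.
only_in_file2=$(comm -13 "$workdir/file1.txt" "$workdir/file2.txt")

printf 'only in file1: %s\n' "$only_in_file1"
printf 'only in file2: %s\n' "$only_in_file2"

rm -rf "$workdir"
```

When only one column remains, comm prints it without any leading tabs, so
the result can be used directly as a plain list.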

Example 3: Compare two sorted files and display lines that are different

comm -3 file1.txt file2.txt

This command compares the two sorted files file1.txt and file2.txt and
displays the lines that appear in only one of them. The -3 option suppresses
the third column (common lines), leaving lines unique to file1.txt in the
first column and lines unique to file2.txt in the second column, indented by
a tab.
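
The two-column output of -3 can be flattened into a single list when the
source file of each line does not matter. The following is only a sketch
with made-up sample data; it assumes the lines themselves contain no tab
characters, since tr would remove those as well.

```shell
# Hypothetical sample data; names and contents are illustrative only.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1.txt"
printf '%s\n' banana cherry date  > "$workdir/file2.txt"

# Columns 1 and 2 remain; lines from file2 keep one leading tab.
differing=$(comm -3 "$workdir/file1.txt" "$workdir/file2.txt")

# Remove the column indentation to get a flat list of differing lines.
# Caveat: this also deletes any tabs inside the lines themselves.
flat=$(printf '%s\n' "$differing" | tr -d '\t')
printf '%s\n' "$flat"

rm -rf "$workdir"
```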

Similar Commands and Benefits

There are several other commands and tools available in Linux that serve a
similar purpose to comm. Some of them include:

diff

diff is a command-line utility that compares two files line by
line and displays the differences between them. Unlike
comm, diff does not require the files to be sorted
and provides a more detailed output of the differences.

cmp

cmp is a command-line utility that compares two files byte by
byte and displays the first byte and line number where the files differ. It
is useful for comparing binary files or large files where line-by-line
comparison is not practical.
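
For contrast with comm, the sketch below runs cmp and diff on two small
sample files. The file contents, the changes.txt scratch file, and the
variable names are assumptions made for the example.

```shell
# Hypothetical sample data; names and contents are illustrative only.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1.txt"
printf '%s\n' banana cherry date  > "$workdir/file2.txt"

# cmp -s prints nothing and answers only through its exit status:
# 0 if the files are identical, 1 if they differ.
if cmp -s "$workdir/file1.txt" "$workdir/file2.txt"; then
    cmp_result="identical"
else
    cmp_result="different"
fi

# diff exits with status 1 when the files differ, so "|| true" keeps the
# script going (it would also mask a real diff error in this sketch).
diff "$workdir/file1.txt" "$workdir/file2.txt" > "$workdir/changes.txt" || true

# Count the lines diff marks as removed (<) or added (>).
diff_changes=$(grep -c '^[<>]' "$workdir/changes.txt")

printf 'cmp says: %s\n' "$cmp_result"
printf 'diff reports %s changed lines\n' "$diff_changes"

rm -rf "$workdir"
```

cmp only says whether the files differ, diff describes how to turn one file
into the other, and comm classifies each line by which file it belongs to.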

The benefits of using comm instead of diff or cmp for set-style comparisons
include:

  • Efficiency: because both inputs are already sorted, comm can process them
    in a single pass, which scales well to large files.
  • Simplicity: comm provides a simple and concise output that
    classifies each line as unique to one file or common to both.
  • Flexibility: comm offers options to suppress any of its three output
    columns, making it suitable for different use cases.

Script Examples

Here are three script examples that demonstrate the usage of comm
in automation:

Script 1: Find common lines between two files

#!/bin/bash

file1="file1.txt"
file2="file2.txt"

common_lines=$(comm -12 <(sort "$file1") <(sort "$file2"))

echo "Common lines between $file1 and $file2:"
echo "$common_lines"

This script sorts file1.txt and file2.txt on the fly using bash process
substitution, then displays the lines that are common to both files.
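
Process substitution, the <(...) syntax, is a bash feature and is not
available in plain POSIX sh. A possible variant for such shells sorts each
input into a temporary file first. This is only a sketch: the sample data,
the workdir variable, and the .sorted file names are illustrative
assumptions.

```shell
#!/bin/sh
# Hypothetical sample data; names and contents are illustrative only.
export LC_ALL=C                       # same collation for sort and comm
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$workdir/file1.txt"   # unsorted input
printf '%s\n' date cherry banana  > "$workdir/file2.txt"   # unsorted input

# comm expects sorted input, so sort each file into a temporary copy.
sort "$workdir/file1.txt" > "$workdir/file1.sorted"
sort "$workdir/file2.txt" > "$workdir/file2.sorted"

# -12 hides columns 1 and 2, leaving only the lines present in both files.
common_lines=$(comm -12 "$workdir/file1.sorted" "$workdir/file2.sorted")
printf '%s\n' "$common_lines"

rm -rf "$workdir"
```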

Script 2: Find unique lines in file1.txt

#!/bin/bash

file1="file1.txt"
file2="file2.txt"

unique_lines=$(comm -23 <(sort "$file1") <(sort "$file2"))

echo "Unique lines in $file1:"
echo "$unique_lines"

This script compares the two files file1.txt and
file2.txt and displays the lines that are unique to
file1.txt.

Script 3: Compare two files and output differences

#!/bin/bash

file1="file1.txt"
file2="file2.txt"

diff_lines=$(comm -3 <(sort "$file1") <(sort "$file2"))

echo "Lines that are different between $file1 and $file2:"
echo "$diff_lines"

This script compares the two files file1.txt and file2.txt and displays the
lines that appear in only one of them. Lines unique to file2.txt are
indented by a tab in the output.

List of comm Options

With no options, comm compares two sorted files and prints three columns:
lines unique to file1, lines unique to file2, and lines common to both.

Option   Description
-1       Suppress column 1 (lines unique to file1).
-2       Suppress column 2 (lines unique to file2).
-3       Suppress column 3 (lines common to both files).
-12      Suppress columns 1 and 2; only common lines are shown.
-23      Suppress columns 2 and 3; only lines unique to file1 are shown.
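
Because each two-flag combination leaves a single plain column, the options
above combine naturally with wc -l to produce counts. The sketch below is
one way to do this; the sample data and variable names are assumptions, and
tr -d ' ' is there only because some wc implementations pad their output
with spaces.

```shell
# Hypothetical sample data; names and contents are illustrative only.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1.txt"
printf '%s\n' banana cherry date  > "$workdir/file2.txt"

# Count the lines in each category by keeping one column at a time.
only1=$(comm -23 "$workdir/file1.txt" "$workdir/file2.txt" | wc -l | tr -d ' ')
only2=$(comm -13 "$workdir/file1.txt" "$workdir/file2.txt" | wc -l | tr -d ' ')
both=$(comm -12 "$workdir/file1.txt" "$workdir/file2.txt" | wc -l | tr -d ' ')

printf 'only in file1: %s\n' "$only1"
printf 'only in file2: %s\n' "$only2"
printf 'in both files: %s\n' "$both"

rm -rf "$workdir"
```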

Conclusion

The comm command in Linux is a compact tool for comparing two sorted files
and finding the lines that are common to both or unique to either one. It is
widely used in scripting and automation tasks to identify differences or
similarities between files. Its simplicity, efficiency, and flexibility make
it a good fit for set-style comparisons of sorted data, while diff and cmp
remain better suited to detailed change listings and byte-level checks.
Whether you are a developer, system administrator, or data analyst, comm can
help you streamline your file comparison tasks.


