Python filecmp Module

The filecmp module in Python provides functions to compare files and directories. It can be used to check if files or directories are identical, and it also supports recursive comparison of directories.

Table of Contents

  1. Introduction
  2. Key Classes and Functions
    • cmp
    • cmpfiles
    • dircmp
  3. Examples
    • Comparing Two Files
    • Comparing Files in Two Directories
    • Directory Comparison with dircmp
  4. Real-World Use Case
  5. Conclusion

Introduction

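The filecmp module compares files and directories in one of two modes. A shallow comparison (the default) looks at each file's os.stat() signature: file type, size, and modification time. If the signatures are identical, the files are taken to be equal without reading them. A deep comparison (shallow=False) always compares the file contents, which is slower but does not depend on metadata.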

Key Classes and Functions

cmp

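Compares two files and returns True if they appear equal, False otherwise. The comparison is shallow by default; pass shallow=False to compare the contents.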

import filecmp

result = filecmp.cmp('file1.txt', 'file2.txt')
print(result)  # True or False
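To make the difference between the default shallow mode and shallow=False concrete, here is a minimal sketch. It assumes two temporary files (the names a.txt and b.txt are made up for illustration) that have the same size and are given the same modification time, but hold different bytes.

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.txt")  # hypothetical file names
    path_b = os.path.join(tmp, "b.txt")

    # Same length, different contents
    with open(path_a, "w") as f:
        f.write("hello")
    with open(path_b, "w") as f:
        f.write("world")

    # Force identical modification times so the os.stat() signatures match
    stamp_ns = 1_700_000_000 * 10**9
    os.utime(path_a, ns=(stamp_ns, stamp_ns))
    os.utime(path_b, ns=(stamp_ns, stamp_ns))

    shallow_result = filecmp.cmp(path_a, path_b)              # signatures only
    deep_result = filecmp.cmp(path_a, path_b, shallow=False)  # reads contents

print(f"shallow: {shallow_result}")  # True
print(f"deep:    {deep_result}")     # False
```

The shallow check reports the files as equal because type, size, and modification time all agree. When a false positive like this would matter, shallow=False is the safer choice.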

cmpfiles

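Compares the files listed in common_files across two directories. It returns three lists of file names: files that match, files that differ, and files that could not be compared (for example, because a file is missing from one directory or cannot be read).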

import filecmp

dir1 = 'dir1'
dir2 = 'dir2'
common_files = ['file1.txt', 'file2.txt']
match, mismatch, errors = filecmp.cmpfiles(dir1, dir2, common_files)
print(f"Match: {match}")
print(f"Mismatch: {mismatch}")
print(f"Errors: {errors}")
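The three returned lists are easier to read with a concrete case. The sketch below builds two temporary directories (the directory and file names are made up for illustration) with one identical file, one changed file, and one file that exists only on the left side.

```python
import filecmp
import os
import tempfile

def write(path, text):
    """Small helper for this example: create a text file."""
    with open(path, "w") as f:
        f.write(text)

with tempfile.TemporaryDirectory() as tmp:
    left = os.path.join(tmp, "left")    # hypothetical directory names
    right = os.path.join(tmp, "right")
    os.mkdir(left)
    os.mkdir(right)

    write(os.path.join(left, "same.txt"), "alpha")
    write(os.path.join(right, "same.txt"), "alpha")
    write(os.path.join(left, "changed.txt"), "short")
    write(os.path.join(right, "changed.txt"), "a longer body")
    write(os.path.join(left, "missing.txt"), "only on the left")

    names = ["same.txt", "changed.txt", "missing.txt"]
    match, mismatch, errors = filecmp.cmpfiles(left, right, names, shallow=False)

print(f"Match: {match}")        # ['same.txt']
print(f"Mismatch: {mismatch}")  # ['changed.txt']
print(f"Errors: {errors}")      # ['missing.txt']
```

A name lands in the errors list rather than raising an exception, so it is worth checking that list before treating a comparison as clean.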

dircmp

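Compares two directories. Attributes such as left_only, right_only, common_files, and diff_files are computed lazily on first access, and report() prints a summary of the top-level comparison only.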

import filecmp

d = filecmp.dircmp('dir1', 'dir2')
d.report()
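dircmp also accepts an ignore argument, a list of exact names (not wildcard patterns) to skip. The sketch below assumes a hypothetical build folder that exists only on one side. Passing ignore replaces the default list, so the example extends filecmp.DEFAULT_IGNORES rather than dropping it.

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    left = os.path.join(tmp, "left")    # hypothetical directory names
    right = os.path.join(tmp, "right")
    os.makedirs(os.path.join(left, "build"))  # exists only on the left
    os.makedirs(right)
    for folder in (left, right):
        with open(os.path.join(folder, "main.py"), "w") as f:
            f.write("print('hi')\n")

    plain = filecmp.dircmp(left, right)
    filtered = filecmp.dircmp(
        left, right, ignore=filecmp.DEFAULT_IGNORES + ["build"]
    )

    # Attributes are computed lazily, so read them while the directories exist
    plain_left_only = plain.left_only
    filtered_left_only = filtered.left_only

print(plain_left_only)     # ['build']
print(filtered_left_only)  # []
```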

Examples

Comparing Two Files

import filecmp

# Compare two files
file1 = 'file1.txt'
file2 = 'file2.txt'

if filecmp.cmp(file1, file2, shallow=False):
    print(f"{file1} and {file2} are identical")
else:
    print(f"{file1} and {file2} are different")

Comparing Files in Two Directories

import filecmp

dir1 = 'dir1'
dir2 = 'dir2'
common_files = ['file1.txt', 'file2.txt']

match, mismatch, errors = filecmp.cmpfiles(dir1, dir2, common_files)

print(f"Match: {match}")
print(f"Mismatch: {mismatch}")
print(f"Errors: {errors}")

Directory Comparison with dircmp

import filecmp

# Create a dircmp object
d = filecmp.dircmp('dir1', 'dir2')

# Print a comparison report
d.report()

# Access attributes of the dircmp object
print(f"Common files: {d.common_files}")
print(f"Only in dir1: {d.left_only}")   # files and subdirectories
print(f"Only in dir2: {d.right_only}")  # files and subdirectories
print(f"Common directories: {d.common_dirs}")
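The attributes above describe only the top level of each directory. To look deeper, the subdirs attribute maps each common subdirectory name to its own dircmp object. The sketch below uses a hypothetical helper, iter_diff_files, to walk that mapping and collect differing files at any depth. The temporary directory layout is made up for illustration.

```python
import filecmp
import os
import tempfile

def iter_diff_files(dcmp, prefix=""):
    """Yield relative paths of files present on both sides that compare unequal."""
    for name in dcmp.diff_files:
        yield os.path.join(prefix, name)
    for sub_name, sub_dcmp in dcmp.subdirs.items():
        yield from iter_diff_files(sub_dcmp, os.path.join(prefix, sub_name))

def write(path, text):
    """Small helper for this example: create a text file."""
    with open(path, "w") as f:
        f.write(text)

with tempfile.TemporaryDirectory() as tmp:
    left = os.path.join(tmp, "left")
    right = os.path.join(tmp, "right")
    os.makedirs(os.path.join(left, "sub"))
    os.makedirs(os.path.join(right, "sub"))

    write(os.path.join(left, "top.txt"), "same")
    write(os.path.join(right, "top.txt"), "same")
    write(os.path.join(left, "sub", "deep.txt"), "old")
    write(os.path.join(right, "sub", "deep.txt"), "new and longer")

    changed = list(iter_diff_files(filecmp.dircmp(left, right)))

print(changed)  # one entry: sub/deep.txt (with the platform's path separator)
```

Note that diff_files relies on dircmp's shallow comparison by default, so this walk inherits the same metadata-based shortcut.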

Real-World Use Case

Synchronizing Directories

import filecmp
import os
import shutil

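def sync_directories(dir1, dir2):
    """Make dir2 mirror dir1. Destructive: entries found only in dir2 are deleted."""
    d = filecmp.dircmp(dir1, dir2)

    # Copy entries that exist only in dir1 (files and whole subdirectories)
    for name in d.left_only:
        src = os.path.join(dir1, name)
        dst = os.path.join(dir2, name)
        if os.path.isdir(src):
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)

    # Overwrite files that exist in both directories but differ
    for name in d.diff_files:
        shutil.copy2(os.path.join(dir1, name), os.path.join(dir2, name))

    # Delete entries from dir2 that are not in dir1
    for name in d.right_only:
        path = os.path.join(dir2, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)

    # Recursively sync common directories
    for common_dir in d.common_dirs:
        sync_directories(os.path.join(dir1, common_dir), os.path.join(dir2, common_dir))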

# Synchronize two directories
sync_directories('dir1', 'dir2')
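After a sync like the one above, it can be useful to confirm that the two trees really match. The following is a minimal sketch, not a definitive implementation: trees_match is a hypothetical helper that combines dircmp (for the directory listing) with cmpfiles(..., shallow=False) so that common files are compared by content. The source and backup directory names are made up for illustration.

```python
import filecmp
import os
import tempfile

def trees_match(dir1, dir2):
    """Return True if both trees hold the same entries with identical file contents."""
    d = filecmp.dircmp(dir1, dir2)
    # Any entry on one side only, or of mismatched type, means the trees differ
    if d.left_only or d.right_only or d.common_funny:
        return False
    # Re-check common files by content instead of relying on stat signatures
    _, mismatch, errors = filecmp.cmpfiles(dir1, dir2, d.common_files, shallow=False)
    if mismatch or errors:
        return False
    return all(
        trees_match(os.path.join(dir1, name), os.path.join(dir2, name))
        for name in d.common_dirs
    )

def write(path, text):
    """Small helper for this example: create a text file."""
    with open(path, "w") as f:
        f.write(text)

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source")
    backup = os.path.join(tmp, "backup")
    for root in (source, backup):
        os.makedirs(os.path.join(root, "docs"))
        write(os.path.join(root, "docs", "note.txt"), "v1")

    identical_before = trees_match(source, backup)

    # Change one nested file in the backup
    write(os.path.join(backup, "docs", "note.txt"), "v2 with edits")
    identical_after = trees_match(source, backup)

print(identical_before)  # True
print(identical_after)   # False
```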

Conclusion

The filecmp module in Python provides an efficient way to compare files and directories. It can be used for various purposes, such as verifying backups, synchronizing directories, and checking for changes in files.
