Python glob Module

The glob module in Python provides a convenient way to search for files matching a specified pattern. It uses Unix shell-style wildcards for pattern matching, making it easy to locate files and directories.

Table of Contents

  1. Introduction
  2. Key Functions
    • glob
    • iglob
  3. Wildcards
    • *
    • ?
    • []
  4. Examples
    • Basic Usage
    • Recursive Search
    • Using iglob for Memory Efficiency
  5. Real-World Use Case
  6. Conclusion

Introduction

The glob module expands a pathname pattern into the existing paths that match it, using Unix shell-style wildcards. Unlike a full shell, it performs no tilde (~) or shell variable expansion. This makes it well suited to file manipulation tasks in which you need to locate files and directories by name pattern.

Key Functions

glob

Returns a list of paths matching a pathname pattern.

import glob

# Get all .txt files in the current directory
files = glob.glob('*.txt')
print(files)  # ['file1.txt', 'file2.txt']
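
The list returned by glob() comes back in arbitrary order, so the sample output above is illustrative rather than guaranteed. The following is a minimal sketch, assuming a throwaway temporary directory and made-up file names, of sorting the result when a stable order matters. The helper name list_txt_files is hypothetical.

```python
import glob
import os
import tempfile

def list_txt_files(directory):
    """Return the .txt file names in `directory`, sorted for a stable order."""
    # glob.escape() protects any wildcard characters in the directory name itself
    pattern = os.path.join(glob.escape(directory), '*.txt')
    # glob.glob() returns paths in arbitrary order, so sort explicitly
    return sorted(os.path.basename(path) for path in glob.glob(pattern))

# Demo in a temporary directory (file names are hypothetical)
with tempfile.TemporaryDirectory() as tmp:
    for name in ('file2.txt', 'file1.txt', 'notes.md'):
        open(os.path.join(tmp, name), 'w').close()
    txt_files = list_txt_files(tmp)

print(txt_files)  # ['file1.txt', 'file2.txt']
```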

iglob

Returns an iterator which yields the same values as glob() without storing them all simultaneously.

import glob

# Get all .txt files in the current directory using an iterator
for file in glob.iglob('*.txt'):
    print(file)
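
Because iglob() is lazy, a caller can stop early without collecting every match. The sketch below, which assumes a temporary directory with five made-up files and a hypothetical helper named first_matches, takes only the first two results.

```python
import glob
import itertools
import os
import tempfile

def first_matches(pattern, limit):
    """Return at most `limit` paths matching `pattern`."""
    # islice stops pulling from the iterator once `limit` items are taken
    return list(itertools.islice(glob.iglob(pattern), limit))

# Demo in a temporary directory (file names are hypothetical)
with tempfile.TemporaryDirectory() as tmp:
    for i in range(5):
        open(os.path.join(tmp, f'file{i}.txt'), 'w').close()
    sample = first_matches(os.path.join(glob.escape(tmp), '*.txt'), 2)

print(len(sample))  # 2
```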

Wildcards

*

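Matches zero or more characters within a single path component. It does not match the path separator, and it does not match file names that begin with a dot unless the pattern itself begins with a dot.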

import glob

# Match all .txt files
files = glob.glob('*.txt')
print(files)  # ['file1.txt', 'file2.txt']
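
Two limits of * are easy to miss: it skips names that start with a dot, and it does not descend into subdirectories. A small sketch, assuming a temporary directory with hypothetical file names, shows both.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Layout (hypothetical): visible.txt, .hidden.txt, sub/nested.txt
    os.mkdir(os.path.join(tmp, 'sub'))
    for name in ('visible.txt', '.hidden.txt', os.path.join('sub', 'nested.txt')):
        open(os.path.join(tmp, name), 'w').close()

    base = glob.escape(tmp)
    # '*' ignores dot-files and does not cross into 'sub'
    star = sorted(os.path.basename(p)
                  for p in glob.glob(os.path.join(base, '*.txt')))
    # A pattern that itself starts with a dot does match dot-files
    dot_star = sorted(os.path.basename(p)
                      for p in glob.glob(os.path.join(base, '.*.txt')))

print(star)      # ['visible.txt']
print(dot_star)  # ['.hidden.txt']
```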

?

Matches exactly one character.

import glob

# Match all .txt files with a single character prefix
files = glob.glob('?.txt')
print(files)  # ['a.txt', 'b.txt']
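
Each ? stands for exactly one character, so the number of question marks fixes the length of the matched part. The sketch below uses made-up log file names in a temporary directory to illustrate this.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names with zero, one, and two characters after 'log'
    for name in ('log.txt', 'log1.txt', 'log2.txt', 'log10.txt'):
        open(os.path.join(tmp, name), 'w').close()

    base = glob.escape(tmp)
    # One '?' requires exactly one character after 'log'
    one_char = sorted(os.path.basename(p)
                      for p in glob.glob(os.path.join(base, 'log?.txt')))
    # Two '?' require exactly two characters
    two_chars = sorted(os.path.basename(p)
                       for p in glob.glob(os.path.join(base, 'log??.txt')))

print(one_char)   # ['log1.txt', 'log2.txt']
print(two_chars)  # ['log10.txt']
```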

[]

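Matches any one of the enclosed characters. A range such as [a-z] or [0-9] is allowed, and [!...] matches any one character that is not listed.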

import glob

# Match all .txt files starting with either a or b
files = glob.glob('[ab]*.txt')
print(files)  # ['a.txt', 'b.txt', 'ab.txt']
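
Brackets also accept ranges and negation, and a file name that contains literal brackets needs glob.escape() to be matched as written. The following sketch assumes a temporary directory and hypothetical file names.

```python
import glob
import os
import tempfile

def names(directory, pattern):
    """Return sorted base names in `directory` that match `pattern`."""
    full = os.path.join(glob.escape(directory), pattern)
    return sorted(os.path.basename(p) for p in glob.glob(full))

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical files, including one with literal brackets in its name
    for name in ('a.txt', 'b.txt', 'c.txt', '1.txt', 'data[1].txt'):
        open(os.path.join(tmp, name), 'w').close()

    in_range = names(tmp, '[a-b].txt')       # range: a or b
    negated = names(tmp, '[!a-b].txt')       # any single character except a or b
    unescaped = names(tmp, 'data[1].txt')    # '[1]' is read as a character class
    escaped = names(tmp, glob.escape('data[1].txt'))  # brackets taken literally

print(in_range)   # ['a.txt', 'b.txt']
print(negated)    # ['1.txt', 'c.txt']
print(unescaped)  # []
print(escaped)    # ['data[1].txt']
```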

Examples

Basic Usage

import glob

# Get all Python files in the current directory
python_files = glob.glob('*.py')
print(python_files)  # ['script1.py', 'script2.py']

Recursive Search

import glob

# Get all .txt files in the current directory and subdirectories
files = glob.glob('**/*.txt', recursive=True)
print(files)  # ['dir1/file1.txt', 'dir2/file2.txt', 'dir1/dir3/file3.txt']
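
With recursive=True, ** matches zero or more directories, so files in the starting directory are included as well. Without recursive=True, ** behaves like a plain *. The sketch below compares the two on a hypothetical directory tree built in a temporary directory.

```python
import glob
import os
import tempfile

def relative_matches(root, recursive):
    """Return sorted matches of **/*.txt under `root`, relative to `root`."""
    pattern = os.path.join(glob.escape(root), '**', '*.txt')
    return sorted(
        os.path.relpath(p, root).replace(os.sep, '/')  # normalize separators
        for p in glob.glob(pattern, recursive=recursive)
    )

with tempfile.TemporaryDirectory() as tmp:
    # Layout (hypothetical): top.txt, dir1/file1.txt, dir1/dir3/file3.txt
    os.makedirs(os.path.join(tmp, 'dir1', 'dir3'))
    for name in ('top.txt',
                 os.path.join('dir1', 'file1.txt'),
                 os.path.join('dir1', 'dir3', 'file3.txt')):
        open(os.path.join(tmp, name), 'w').close()

    recursive_hits = relative_matches(tmp, recursive=True)
    plain_hits = relative_matches(tmp, recursive=False)

print(recursive_hits)  # ['dir1/dir3/file3.txt', 'dir1/file1.txt', 'top.txt']
print(plain_hits)      # ['dir1/file1.txt']
```

On very large directory trees a recursive pattern can take a long time, since every subdirectory has to be visited.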

Using iglob for Memory Efficiency

import glob

# Use iglob to iterate over matching files without loading them all at once
for file in glob.iglob('**/*.py', recursive=True):
    print(file)
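
An iterator pairs naturally with aggregations that never need the full list. As a sketch, the hypothetical helper total_size below sums file sizes straight from iglob(); the directory layout and file contents are made up for the demo.

```python
import glob
import os
import tempfile

def total_size(pattern):
    """Sum the sizes in bytes of all regular files matching `pattern`."""
    # The generator expression consumes iglob() one path at a time
    return sum(
        os.path.getsize(path)
        for path in glob.iglob(pattern, recursive=True)
        if os.path.isfile(path)
    )

with tempfile.TemporaryDirectory() as tmp:
    # Layout (hypothetical): a.py (10 bytes), pkg/b.py (5 bytes)
    os.mkdir(os.path.join(tmp, 'pkg'))
    with open(os.path.join(tmp, 'a.py'), 'wb') as f:
        f.write(b'x' * 10)
    with open(os.path.join(tmp, 'pkg', 'b.py'), 'wb') as f:
        f.write(b'y' * 5)

    size = total_size(os.path.join(glob.escape(tmp), '**', '*.py'))

print(size)  # 15
```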

Real-World Use Case

Finding and Processing Log Files

import glob

# Get all log files in the logs directory and its subdirectories
log_files = glob.glob('logs/**/*.log', recursive=True)

# Process each log file
for log_file in log_files:
    with open(log_file, 'r') as f:
        content = f.read()
        # Perform some processing on the content;
        # here, print the first 100 characters of each log file
        print(f"Processing {log_file}: {content[:100]}...")
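
For large logs, reading each file line by line avoids holding whole files in memory. The variant below is a sketch under stated assumptions: the 'ERROR' marker, the UTF-8 encoding, the helper name count_error_lines, and the sample log contents are all hypothetical.

```python
import glob
import os
import tempfile

def count_error_lines(log_dir):
    """Map each .log file under `log_dir` to its number of lines containing 'ERROR'."""
    counts = {}
    pattern = os.path.join(glob.escape(log_dir), '**', '*.log')
    for log_file in glob.iglob(pattern, recursive=True):
        # errors='replace' keeps the loop going if a file has undecodable bytes
        with open(log_file, 'r', encoding='utf-8', errors='replace') as f:
            key = os.path.relpath(log_file, log_dir).replace(os.sep, '/')
            # Iterating over the file object reads one line at a time
            counts[key] = sum(1 for line in f if 'ERROR' in line)
    return counts

with tempfile.TemporaryDirectory() as tmp:
    # Sample logs (hypothetical contents)
    os.mkdir(os.path.join(tmp, 'db'))
    with open(os.path.join(tmp, 'app.log'), 'w', encoding='utf-8') as f:
        f.write('INFO start\nERROR boom\nERROR again\n')
    with open(os.path.join(tmp, 'db', 'query.log'), 'w', encoding='utf-8') as f:
        f.write('INFO ok\n')

    error_counts = count_error_lines(tmp)

print(error_counts == {'app.log': 2, 'db/query.log': 0})  # True
```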

Conclusion

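The glob module provides a compact way to find files and directories by name pattern. glob() returns the matches as a list, iglob() yields them lazily, and the ** wildcard with recursive=True extends a search into subdirectories. Together these cover most file-discovery tasks in scripts, including those that touch large numbers of files.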
