The glob module in Python provides a convenient way to search for files matching a specified pattern. It uses Unix shell-style wildcards for pattern matching, making it easy to locate files and directories.
Table of Contents
- Introduction
- Key Functions
  - glob
  - iglob
- Wildcards
  - *
  - ?
  - []
- Examples
  - Basic Usage
  - Recursive Search
  - Using iglob for Memory Efficiency
- Real-World Use Case
- Conclusion
- References
Introduction
The glob module allows for pattern matching and file path expansion using Unix shell-style wildcards. It simplifies the process of locating files and directories that match a specific pattern, making it useful for file manipulation tasks.
Key Functions
glob
Returns a list of paths matching a pathname pattern.
import glob
# Get all .txt files in the current directory
files = glob.glob('*.txt')
print(files) # ['file1.txt', 'file2.txt']
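The order of the list returned by glob.glob() depends on the file system, so it helps to sort the result when order matters. The sketch below is one way to check this behaviour in a throwaway directory; the helper name list_matches and the file names are made up for illustration, and glob.escape() is used so that any special characters in the directory path are not treated as wildcards.

```python
import glob
import os
import tempfile


def list_matches(directory, pattern):
    """Return the sorted base names in `directory` that match `pattern`.

    `list_matches` is a hypothetical helper, not part of the glob module.
    """
    # Escape the directory so only `pattern` is interpreted as a wildcard.
    full_pattern = os.path.join(glob.escape(directory), pattern)
    return sorted(os.path.basename(path) for path in glob.glob(full_pattern))


with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files to search through.
    for name in ("file2.txt", "file1.txt", "notes.md"):
        open(os.path.join(tmp, name), "w").close()

    print(list_matches(tmp, "*.txt"))  # ['file1.txt', 'file2.txt']
    print(list_matches(tmp, "*.csv"))  # [] (no match gives an empty list)
```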
iglob
Returns an iterator which yields the same values as glob() without storing them all simultaneously.
import glob
# Get all .txt files in the current directory using an iterator
for file in glob.iglob('*.txt'):
    print(file)
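Because iglob() returns an iterator, it combines naturally with tools such as itertools.islice when only a few matches are needed. The following is a minimal sketch under that assumption; first_matches is a hypothetical helper name and the files are created only for the demonstration.

```python
import glob
import itertools
import os
import tempfile


def first_matches(pattern, limit):
    """Return at most `limit` paths matching `pattern`.

    `first_matches` is a hypothetical helper. islice stops pulling
    results from the iglob iterator once `limit` paths are collected.
    """
    return list(itertools.islice(glob.iglob(pattern), limit))


with tempfile.TemporaryDirectory() as tmp:
    # Create five empty .txt files.
    for i in range(5):
        open(os.path.join(tmp, f"file{i}.txt"), "w").close()

    pattern = os.path.join(glob.escape(tmp), "*.txt")
    print(len(first_matches(pattern, 2)))   # 2
    print(len(first_matches(pattern, 10)))  # 5
```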
Wildcards
*
Matches zero or more characters.
import glob
# Match all .txt files
files = glob.glob('*.txt')
print(files) # ['file1.txt', 'file2.txt']
?
Matches exactly one character.
import glob
# Match .txt files whose name before the extension is exactly one character
files = glob.glob('?.txt')
print(files) # ['a.txt', 'b.txt']
[]
Matches any one of the enclosed characters.
import glob
# Match all .txt files starting with either a or b
files = glob.glob('[ab]*.txt')
print(files) # ['a.txt', 'b.txt', 'ab.txt']
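Two details of these wildcards are easy to miss: brackets also accept ranges such as [a-b] and negation such as [!a], and by default * and ? do not match a leading dot, so hidden files are skipped unless the pattern itself starts with a dot. The sketch below illustrates both in a temporary directory; match_names and the file names are illustrative assumptions.

```python
import glob
import os
import tempfile


def match_names(directory, pattern):
    """Return the sorted base names in `directory` that match `pattern`.

    `match_names` is a hypothetical helper used only for this demo.
    """
    full_pattern = os.path.join(glob.escape(directory), pattern)
    return sorted(os.path.basename(path) for path in glob.glob(full_pattern))


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "c.txt", "1.txt", ".hidden.txt"):
        open(os.path.join(tmp, name), "w").close()

    # Character range: a or b.
    print(match_names(tmp, "[a-b].txt"))  # ['a.txt', 'b.txt']
    # Negation: any single character except a.
    print(match_names(tmp, "[!a].txt"))   # ['1.txt', 'b.txt', 'c.txt']
    # The * wildcard skips names that start with a dot.
    print(match_names(tmp, "*.txt"))      # ['1.txt', 'a.txt', 'b.txt', 'c.txt']
    # A pattern that starts with a dot does match hidden files.
    print(match_names(tmp, ".*.txt"))     # ['.hidden.txt']
```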
Examples
Basic Usage
import glob
# Get all Python files in the current directory
python_files = glob.glob('*.py')
print(python_files) # ['script1.py', 'script2.py']
Recursive Search
import glob
# Get all .txt files in the current directory and subdirectories
files = glob.glob('**/*.txt', recursive=True)
print(files) # ['dir1/file1.txt', 'dir2/file2.txt', 'dir1/dir3/file3.txt']
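With recursive=True, the ** pattern matches zero or more directory levels, so files directly in the starting directory are included as well. Without recursive=True, ** behaves like a plain *. The sketch below compares the two on a small temporary tree; relative_matches and the directory layout are assumptions made for the example.

```python
import glob
import os
import tempfile


def relative_matches(root, pattern, recursive):
    """Return sorted matches under `root` as '/'-separated relative paths.

    `relative_matches` is a hypothetical helper used only for this demo.
    """
    full_pattern = os.path.join(glob.escape(root), pattern)
    paths = glob.glob(full_pattern, recursive=recursive)
    return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in paths)


with tempfile.TemporaryDirectory() as tmp:
    # Build: top.txt, dir1/file1.txt, dir1/dir3/file3.txt
    os.makedirs(os.path.join(tmp, "dir1", "dir3"))
    for parts in (("top.txt",), ("dir1", "file1.txt"), ("dir1", "dir3", "file3.txt")):
        open(os.path.join(tmp, *parts), "w").close()

    print(relative_matches(tmp, "**/*.txt", recursive=True))
    # ['dir1/dir3/file3.txt', 'dir1/file1.txt', 'top.txt']
    print(relative_matches(tmp, "**/*.txt", recursive=False))
    # ['dir1/file1.txt']
```

On very large directory trees a ** search can take a long time, so narrowing the starting directory is usually worthwhile.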
Using iglob for Memory Efficiency
import glob
# Use iglob to iterate over matching files without loading them all at once
for file in glob.iglob('**/*.py', recursive=True):
    print(file)
Real-World Use Case
Finding and Processing Log Files
import glob
# Get all log files in the logs directory and its subdirectories
log_files = glob.glob('logs/**/*.log', recursive=True)
# Process each log file
for log_file in log_files:
    with open(log_file, 'r') as f:
        content = f.read()
    # Perform some processing on the content
    print(f"Processing {log_file}: {content[:100]}...")  # Print the first 100 characters of each log file
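For large log files, reading each file line by line avoids holding its whole content in memory. The sketch below is one possible variation on the example above rather than a definitive implementation: count_error_lines, the "ERROR" marker, and the sample log contents are all assumptions chosen for illustration.

```python
import glob
import os
import tempfile


def count_error_lines(log_root):
    """Map each .log file under `log_root` to its count of 'ERROR' lines.

    `count_error_lines` and the 'ERROR' marker are hypothetical choices.
    Keys are paths relative to `log_root`.
    """
    pattern = os.path.join(glob.escape(log_root), "**", "*.log")
    counts = {}
    # iglob yields paths one at a time; each file is read line by line.
    for path in glob.iglob(pattern, recursive=True):
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            counts[os.path.relpath(path, log_root)] = sum("ERROR" in line for line in f)
    return counts


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "service"))
    with open(os.path.join(tmp, "app.log"), "w", encoding="utf-8") as f:
        f.write("INFO start\nERROR disk full\nERROR retry failed\n")
    with open(os.path.join(tmp, "service", "db.log"), "w", encoding="utf-8") as f:
        f.write("INFO connected\n")

    for name, errors in sorted(count_error_lines(tmp).items()):
        print(f"{name}: {errors} error line(s)")
```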
Conclusion
The glob module in Python provides a powerful and flexible way to search for files matching specific patterns. It simplifies tasks related to file manipulation and allows for efficient processing of large numbers of files.