Python gzip Module

The gzip module in Python provides a simple interface for compressing and decompressing files using the GZIP format. This module is useful for working with large files that need to be compressed to save storage space or to be transferred over a network.

Table of Contents

  1. Introduction
  2. Key Functions and Classes
    • open
    • compress
    • decompress
    • GzipFile
  3. Examples
    • Compressing and Decompressing Files
    • Reading and Writing Gzipped Data
    • Using GzipFile Class
  4. Real-World Use Case
  5. Conclusion
  6. References

Introduction

The gzip module provides the ability to read and write GZIP files. The GZIP format is a popular file compression format that reduces the size of files without losing any information. It is commonly used in web protocols and for compressing large files.

Key Functions and Classes

open

Opens a GZIP-compressed file in binary or text mode.

import gzip

with gzip.open('file.txt.gz', 'wb') as f:
    f.write(b'This is a test.')

with gzip.open('file.txt.gz', 'rb') as f:
    file_content = f.read()
    print(file_content)  # b'This is a test.'

compress

Compresses the data provided and returns the compressed byte sequence.

import gzip

data = b'This is a test.'
compressed_data = gzip.compress(data)
print(compressed_data)

decompressed_data = gzip.decompress(compressed_data)
print(decompressed_data)  # b'This is a test.'

decompress

Decompresses the data provided and returns the decompressed byte sequence.

import gzip

data = b'This is a test.'
compressed_data = gzip.compress(data)
decompressed_data = gzip.decompress(compressed_data)
print(decompressed_data)  # b'This is a test.'

GzipFile

Class for reading and writing GZIP files.

import gzip

# Writing using GzipFile
with gzip.GzipFile('file.txt.gz', 'wb') as f:
    f.write(b'This is a test.')

# Reading using GzipFile
with gzip.GzipFile('file.txt.gz', 'rb') as f:
    file_content = f.read()
    print(file_content)  # b'This is a test.'

Examples

Compressing and Decompressing Files

import gzip

# Compressing a file
with open('file.txt', 'rb') as f_in:
    with gzip.open('file.txt.gz', 'wb') as f_out:
        f_out.writelines(f_in)

# Decompressing a file
with gzip.open('file.txt.gz', 'rb') as f_in:
    with open('file_decompressed.txt', 'wb') as f_out:
        f_out.writelines(f_in)

Reading and Writing Gzipped Data

import gzip

# Writing gzipped data
with gzip.open('file.txt.gz', 'wt', encoding='utf-8') as f:
    f.write('This is a test.')

# Reading gzipped data
with gzip.open('file.txt.gz', 'rt', encoding='utf-8') as f:
    file_content = f.read()
    print(file_content)  # This is a test.

Using GzipFile Class

import gzip

# Writing using GzipFile class
with gzip.GzipFile('file.txt.gz', 'wb') as f:
    f.write(b'This is a test.')

# Reading using GzipFile class
with gzip.GzipFile('file.txt.gz', 'rb') as f:
    file_content = f.read()
    print(file_content)  # b'This is a test.'

Real-World Use Case

Compressing Log Files

import gzip
import shutil
import os

def compress_log_file(log_file):
    with open(log_file, 'rb') as f_in:
        with gzip.open(log_file + '.gz', 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)
    os.remove(log_file)

def decompress_log_file(gz_file):
    with gzip.open(gz_file, 'rb') as f_in:
        with open(gz_file[:-3], 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)

# Compress a log file
compress_log_file('application.log')

# Decompress the log file
decompress_log_file('application.log.gz')

Conclusion

The gzip module in Python provides a convenient way to compress and decompress files using the GZIP format. It supports both file operations and in-memory data compression, making it used for managing large datasets and saving storage space.

References

Comments

Spring Boot 3 Paid Course Published for Free
on my Java Guides YouTube Channel

Subscribe to my YouTube Channel (165K+ subscribers):
Java Guides Channel

Top 10 My Udemy Courses with Huge Discount:
Udemy Courses - Ramesh Fadatare