Python re.sub Function

The re.sub function in Python's re module replaces occurrences of a pattern in a string with a specified replacement string. This function is useful for performing substitutions in text based on regular expression patterns.

Table of Contents

  1. Introduction
  2. re.sub Function Syntax
  3. Examples
    • Basic Usage
    • Using Groups in Patterns
    • Using Functions as Replacements
    • Limiting the Number of Substitutions
  4. Real-World Use Case
  5. Conclusion

Introduction

re.sub scans a string from left to right, finds the non-overlapping matches of a regular expression pattern, and returns a new string with each match replaced. The replacement can be a fixed string, a template that refers to captured groups, or a function that is called once per match. This makes re.sub well suited to text cleanup, reformatting, and data transformation.

re.sub Function Syntax

Here is how you use the re.sub function:

import re

result = re.sub(pattern, repl, string, count=0, flags=0)

Parameters:

  • pattern: The regular expression pattern to search for.
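  • repl: The replacement string, or a function that takes a match object and returns the replacement string. In a replacement string, backreferences such as \1 or \g<name> insert the text of captured groups.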
  • string: The string to be processed.
  • count: Optional. The maximum number of pattern occurrences to replace. The default is 0, which means replace all occurrences.
  • flags: Optional. Flags that modify the behavior of the pattern, such as re.IGNORECASE, re.MULTILINE, etc.

Returns:

  • A new string with the replacements made. If the pattern does not match, the string is returned unchanged.
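
As a quick illustration of the flags parameter and the return value, here is a minimal sketch that passes re.IGNORECASE as a keyword argument. The sample text is made up for illustration.

```python
import re

text = 'Python is fun. PYTHON is popular. python is readable.'

# flags=re.IGNORECASE makes the pattern match regardless of letter case
result = re.sub(r'python', 'Python', text, flags=re.IGNORECASE)
print(result)

# re.sub returns a new string; the original string is left unchanged
print(text)
```

Passing count and flags by keyword, as above, keeps the call readable and avoids mixing up the two optional arguments.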

Examples

Basic Usage

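Here is an example of how to use the re.sub function to replace every run of consecutive digits in a string with a single # character. The pattern \d+ matches one or more digits, so each whole number becomes one #.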

Example

import re

# Replacing each run of digits in a string with a single '#'
result = re.sub(r'\d+', '#', 'There are 123 apples and 45 bananas.')
print(result)

Output:

There are # apples and # bananas.
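
To make the role of the + quantifier concrete, the short sketch below runs the same substitution with \d and with \d+ on the same sentence. The variable names are illustrative only.

```python
import re

text = 'There are 123 apples and 45 bananas.'

# \d matches a single digit, so every digit gets its own '#'
per_digit = re.sub(r'\d', '#', text)
print(per_digit)

# \d+ matches a whole run of digits, so each number becomes one '#'
per_run = re.sub(r'\d+', '#', text)
print(per_run)
```

Output:

There are ### apples and ## bananas.
There are # apples and # bananas.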

Using Groups in Patterns

This example demonstrates how to use groups in a regular expression pattern to rearrange parts of the string.

Example

import re

# Swapping month and day: MM/DD/YYYY becomes DD/MM/YYYY
result = re.sub(r'(\d{2})/(\d{2})/(\d{4})', r'\2/\1/\3', 'Today is 12/31/2021.')
print(result)

Output:

Today is 31/12/2021.
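
Numbered backreferences can become hard to read as patterns grow. As an alternative sketch, the same kind of rearrangement can be written with named groups, which are referenced in the replacement as \g<name>. The group names month, day, and year are chosen here for illustration.

```python
import re

# Named groups document what each part of the pattern captures
pattern = r'(?P<month>\d{2})/(?P<day>\d{2})/(?P<year>\d{4})'

# Rearranging MM/DD/YYYY into YYYY-MM-DD using the group names
result = re.sub(pattern, r'\g<year>-\g<month>-\g<day>', 'Today is 12/31/2021.')
print(result)
```

Output:

Today is 2021-12-31.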

Using Functions as Replacements

This example demonstrates how to use a function as the replacement argument to perform more complex substitutions.

Example

import re

# Return the square of the matched number as a string
def square(match):
    return str(int(match.group()) ** 2)

# Replacing each number in the string with its square
result = re.sub(r'\d+', square, 'The numbers are 2, 3, and 4.')
print(result)

Output:

The numbers are 4, 9, and 16.
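
A replacement function is also handy when each match should be looked up in a table. The sketch below expands a few abbreviations; the ABBREVIATIONS dictionary and the expand function are hypothetical names used only for this example.

```python
import re

# Hypothetical lookup table mapping abbreviations to their full forms
ABBREVIATIONS = {'approx': 'approximately', 'temp': 'temperature', 'max': 'maximum'}

def expand(match):
    # The pattern only matches keys of the table, so the lookup always succeeds
    return ABBREVIATIONS[match.group(0)]

# Build an alternation of the escaped keys, bounded by \b to match whole words only
pattern = r'\b(?:' + '|'.join(map(re.escape, ABBREVIATIONS)) + r')\b'

result = re.sub(pattern, expand, 'The max temp is approx 30 degrees.')
print(result)
```

Output:

The maximum temperature is approximately 30 degrees.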

Limiting the Number of Substitutions

This example demonstrates how to limit the number of substitutions using the count parameter.

Example

import re

# Replacing only the first two runs of digits with '#'
result = re.sub(r'\d+', '#', 'There are 123 apples, 45 bananas, and 67 cherries.', count=2)
print(result)

Output:

There are # apples, # bananas, and 67 cherries.
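
When you also need to know how many substitutions were actually made, the related function re.subn accepts the same arguments and returns a tuple of the new string and the number of replacements. A minimal sketch, reusing the sentence from above:

```python
import re

text = 'There are 123 apples, 45 bananas, and 67 cherries.'

# re.subn returns (new_string, number_of_substitutions_made)
new_text, n = re.subn(r'\d+', '#', text, count=2)
print(new_text)
print(n)
```

Output:

There are # apples, # bananas, and 67 cherries.
2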

Real-World Use Case

Formatting Phone Numbers

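A common practical use of re.sub is normalizing phone numbers that were entered with different delimiters (hyphens, dots, or spaces) into one consistent format. The pattern below captures three groups of digits and allows any mix of those delimiters between them.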

Example

import re

def format_phone_number(phone):
    pattern = r'(\d{3})[-.\s]*(\d{3})[-.\s]*(\d{4})'
    return re.sub(pattern, r'(\1) \2-\3', phone)

# Example usage
phone_numbers = ['123-456-7890', '123.456.7890', '123 456 7890']
formatted_numbers = [format_phone_number(phone) for phone in phone_numbers]
print(formatted_numbers)

Output:

['(123) 456-7890', '(123) 456-7890', '(123) 456-7890']
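
The pattern above handles hyphens, dots, and spaces. One possible way to cope with other punctuation, sketched below, is to strip every non-digit first and then format what remains. This assumes 10-digit numbers without a country code; normalize_phone_number is a hypothetical helper, and inputs that do not contain exactly 10 digits are returned unchanged.

```python
import re

def normalize_phone_number(phone):
    # Remove everything that is not a digit (\D is the inverse of \d)
    digits = re.sub(r'\D', '', phone)
    # Assumption: only 10-digit numbers are reformatted
    if len(digits) != 10:
        return phone
    # Format the bare digits as (XXX) XXX-XXXX
    return re.sub(r'(\d{3})(\d{3})(\d{4})', r'(\1) \2-\3', digits)

samples = ['(123) 456-7890', '123/456/7890', '12345']
normalized = [normalize_phone_number(phone) for phone in samples]
print(normalized)
```

Output:

['(123) 456-7890', '(123) 456-7890', '12345']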

Conclusion

re.sub covers most regex-based substitution needs in Python: fixed replacement strings for simple cases, backreferences for rearranging captured groups, replacement functions for computed results, and the count parameter for limiting how many matches are replaced. Used together, these options make text cleanup and reformatting tasks short and readable.
