The re.sub function in Python's re module replaces occurrences of a pattern in a string with a specified replacement string. This function is useful for performing substitutions in text based on regular expression patterns.
Table of Contents
- Introduction
- re.sub Function Syntax
- Examples
- Basic Usage
- Using Groups in Patterns
- Using Functions as Replacements
- Limiting the Number of Substitutions
- Real-World Use Case
- Conclusion
Introduction
The re.sub function in Python's re module allows you to replace occurrences of a regular expression pattern in a string with a specified replacement string. This can be particularly useful for tasks such as text cleanup, formatting, and data transformation.
re.sub Function Syntax
Here is how you use the re.sub function:
import re
result = re.sub(pattern, repl, string, count=0, flags=0)
Parameters:
- pattern: The regular expression pattern to search for.
- repl: The replacement string, or a function that takes a match object and returns the replacement string.
- string: The string to be processed.
- count: Optional. The maximum number of pattern occurrences to replace. The default is 0, which means replace all occurrences.
- flags: Optional. Flags that modify the behavior of the pattern, such as re.IGNORECASE, re.MULTILINE, etc.
Returns:
- A new string with the replacements made.
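The flags parameter is easiest to understand next to a call that omits it. The following is a minimal sketch; the sample sentence and the replacement word are made up for illustration, and count and flags are passed by keyword so they cannot be confused with each other.

```python
import re

text = "Python is fun. python is popular. PYTHON is everywhere."

# Without flags, only the exact lowercase spelling "python" matches.
case_sensitive = re.sub(r"python", "Ruby", text)

# With re.IGNORECASE, every spelling of the word matches.
case_insensitive = re.sub(r"python", "Ruby", text, flags=re.IGNORECASE)

print(case_sensitive)    # Python is fun. Ruby is popular. PYTHON is everywhere.
print(case_insensitive)  # Ruby is fun. Ruby is popular. Ruby is everywhere.
```

In both calls the original string is left untouched, because re.sub returns a new string.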
Examples
Basic Usage
Here is an example of how to use the re.sub function to replace each run of digits in a string with a single # character.
Example
import re
# Replacing each run of digits in a string with '#'
result = re.sub(r'\d+', '#', 'There are 123 apples and 45 bananas.')
print(result)
Output:
There are # apples and # bananas.
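The output contains one # per number rather than one per digit because the pattern \d+ matches a whole run of digits at once. As a small comparison sketch, the same sentence can be processed with \d, which matches a single digit at a time; the variable names here are only illustrative.

```python
import re

text = 'There are 123 apples and 45 bananas.'

# \d+ consumes each run of digits as one match, so each number becomes one '#'.
per_run = re.sub(r'\d+', '#', text)

# \d matches one digit at a time, so each digit becomes its own '#'.
per_digit = re.sub(r'\d', '#', text)

print(per_run)    # There are # apples and # bananas.
print(per_digit)  # There are ### apples and ## bananas.
```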
Using Groups in Patterns
This example demonstrates how to use groups in a regular expression pattern to rearrange parts of the string.
Example
import re
# Reversing the order of day and month in a date string
result = re.sub(r'(\d{2})/(\d{2})/(\d{4})', r'\2/\1/\3', 'Today is 12/31/2021.')
print(result)
Output:
Today is 31/12/2021.
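Numbered references such as \1 and \2 become hard to read as a pattern grows. One alternative, sketched here, is to name the groups with (?P<name>...) and refer to them in the replacement with \g<name>. The group names month, day, and year and the ISO-style output format are choices made for this illustration, assuming the input is written month first.

```python
import re

# Named groups document what each part of the pattern captures.
pattern = r'(?P<month>\d{2})/(?P<day>\d{2})/(?P<year>\d{4})'

# \g<name> inserts the text captured by the group with that name.
iso = re.sub(pattern, r'\g<year>-\g<month>-\g<day>', 'Today is 12/31/2021.')

print(iso)  # Today is 2021-12-31.
```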
Using Functions as Replacements
This example demonstrates how to use a function as the replacement argument to perform more complex substitutions.
Example
import re
# Function that returns the square of the matched number as a string
def square(match):
    return str(int(match.group()) ** 2)

# Replacing each number in a string with its squared value
result = re.sub(r'\d+', square, 'The numbers are 2, 3, and 4.')
print(result)
Output:
The numbers are 4, 9, and 16.
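A replacement function can also decide, match by match, whether to change anything at all. The sketch below looks each number up in a small dictionary and returns the original text when there is no entry. The WORDS table and the to_word helper are hypothetical names introduced only for this example.

```python
import re

# Hypothetical lookup table, chosen only for illustration.
WORDS = {'2': 'two', '3': 'three', '4': 'four'}

def to_word(match):
    number = match.group()
    # Fall back to the matched text when the number is not in the table.
    return WORDS.get(number, number)

result = re.sub(r'\d+', to_word, 'The numbers are 2, 3, 4, and 10.')

print(result)  # The numbers are two, three, four, and 10.
```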
Limiting the Number of Substitutions
This example demonstrates how to limit the number of substitutions using the count parameter.
Example
import re
# Replacing only the first two runs of digits with '#'
result = re.sub(r'\d+', '#', 'There are 123 apples, 45 bananas, and 67 cherries.', count=2)
print(result)
Output:
There are # apples, # bananas, and 67 cherries.
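When it matters how many substitutions were actually made, the related function re.subn may be handy. It accepts the same arguments as re.sub and returns a tuple of the new string and the number of replacements. A brief sketch, reusing the sentence from the example above:

```python
import re

text = 'There are 123 apples, 45 bananas, and 67 cherries.'

# re.subn returns (new_string, number_of_substitutions_made).
result, replaced = re.subn(r'\d+', '#', text, count=2)

print(result)    # There are # apples, # bananas, and 67 cherries.
print(replaced)  # 2
```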
Real-World Use Case
Formatting Phone Numbers
In real-world applications, the re.sub function can be used to format phone numbers by replacing various delimiters with a consistent format.
Example
import re
def format_phone_number(phone):
    # Three groups of digits separated by optional dashes, dots, or whitespace
    pattern = r'(\d{3})[-.\s]*(\d{3})[-.\s]*(\d{4})'
    return re.sub(pattern, r'(\1) \2-\3', phone)

# Example usage
phone_numbers = ['123-456-7890', '123.456.7890', '123 456 7890']
formatted_numbers = [format_phone_number(phone) for phone in phone_numbers]
print(formatted_numbers)
Output:
['(123) 456-7890', '(123) 456-7890', '(123) 456-7890']
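Because re.sub only rewrites the matched portion of the input, the same idea also works on numbers embedded in longer text. The variation below is a sketch rather than a complete phone-number parser: it precompiles the pattern with re.compile and, as an added assumption, tolerates optional parentheses around the first three digits. The name PHONE is hypothetical.

```python
import re

# Compiled once and reused. The optional \(? and \)? are an assumption added
# for this sketch so that input like '(123)456-7890' is also recognized.
PHONE = re.compile(r'\(?(\d{3})\)?[-.\s]*(\d{3})[-.\s]*(\d{4})')

def format_phone_number(phone):
    # Pattern.sub behaves like re.sub with the compiled pattern.
    return PHONE.sub(r'(\1) \2-\3', phone)

single = format_phone_number('(123)456-7890')
embedded = format_phone_number('Call 123.456.7890 or 987 654 3210.')

print(single)    # (123) 456-7890
print(embedded)  # Call (123) 456-7890 or (987) 654-3210.
```

A pattern this loose will also rewrite any other ten-digit run that happens to appear in the text, so real input would likely need stricter boundaries.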
Conclusion
The re.sub function in Python's re module replaces occurrences of a pattern in a string with a specified replacement string. Combined with capture groups, replacement functions, and the count parameter, it handles text-processing tasks ranging from simple cleanup to reformatting structured data such as dates and phone numbers.