Regular Expressions (Regex) in Python
Regular expressions are powerful tools to search, match, and manipulate text using patterns. They help you find patterns in strings like emails, phone numbers, or specific words. Python's re module makes working with regex easy! Let's learn the basics step-by-step!
What is a Regular Expression?
A regular expression is a special sequence of characters that describes a search pattern. For example, you could use regex to find all email addresses in a document or validate user input.
1. Importing the re Module
import re
2. Basic Regex Functions
Function | Description | Result |
---|---|---|
re.search() | Searches the whole string for a pattern | Match object for the first match, or None |
re.findall() | Finds all non-overlapping matches in a string | List of matched substrings |
re.match() | Matches a pattern only at the start of the string | Match object, or None |
re.sub() | Replaces matched parts with new text | New string with the substitutions applied |
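To make the table concrete, here is a small sketch that runs all four functions on the same string. The sample text and variable names are made up for illustration.

```python
import re

text = "cat bat cat"  # made-up sample string

# re.search() scans the whole string and returns a match object (or None).
first_bat = re.search(r'bat', text)
print(first_bat.group(), first_bat.start())  # bat 4

# re.findall() returns every non-overlapping match as a list of strings.
all_ats = re.findall(r'[cb]at', text)
print(all_ats)  # ['cat', 'bat', 'cat']

# re.match() only tries at the very start of the string, so 'bat' is not found.
at_start = re.match(r'bat', text)
print(at_start)  # None

# re.sub() returns a new string; the original text is left unchanged.
replaced = re.sub(r'cat', 'dog', text)
print(replaced)  # dog bat dog
```

Because re.search() and re.match() return None when nothing matches, it is safest to check the result before calling .group() on it.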
3. Simple Patterns
Pattern | Meaning | Example |
---|---|---|
. | Any character except newline | a.c matches 'abc' |
\d | Digit (0-9) | \d\d matches '42' |
\w | Word character (letters, digits, _) | \w+ matches 'Hello' |
+ | One or more repetitions of the preceding item | a+ matches 'aaa' |
* | Zero or more repetitions of the preceding item | a* matches '', 'a', 'aaaa' |
? | Preceding item is optional (0 or 1) | ca?t matches 'cat' or 'ct' |
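Each row of the table can be tried out directly. The following is a minimal sketch with made-up sample strings; the variable names are only for illustration.

```python
import re

# . matches any single character except a newline.
dot_hits = re.findall(r'a.c', "abc a-c ac")
print(dot_hits)  # ['abc', 'a-c']

# \d\d needs two digits in a row, so the lone 7 is skipped.
digit_hits = re.findall(r'\d\d', "42 and 7")
print(digit_hits)  # ['42']

# \w+ grabs runs of letters, digits, and underscores.
word_hits = re.findall(r'\w+', "Hello, world_1!")
print(word_hits)  # ['Hello', 'world_1']

# a+ needs at least one 'a'; a* is also happy with none at all.
plus_hits = re.findall(r'a+', "caaandy a")
print(plus_hits)  # ['aaa', 'a']
star_matches_empty = re.fullmatch(r'a*', "") is not None
print(star_matches_empty)  # True

# a? makes the 'a' optional, but two of them is too many.
optional_hits = re.findall(r'ca?t', "cat ct caat")
print(optional_hits)  # ['cat', 'ct']
```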
4. Example: Find All Numbers in a Text
text = "I have 2 apples and 15 bananas."
numbers = re.findall(r'\d+', text)
print(numbers) # ['2', '15']
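Note that re.findall() returns strings, not numbers. If you want to do math with the results, one possible follow-up is to convert them, as sketched below. The second sentence and the decimal pattern are additions for illustration, not part of the original example.

```python
import re

text = "I have 2 apples and 15 bananas."

# Convert each matched string to an int so it can be used in arithmetic.
whole_numbers = [int(n) for n in re.findall(r'\d+', text)]
print(whole_numbers)       # [2, 15]
print(sum(whole_numbers))  # 17

# Made-up sentence with a decimal. (?:...) is a non-capturing group, so
# findall still returns the whole match instead of just the group.
juice_text = "I drank 3.5 liters and ate 2 apples."
decimals = [float(n) for n in re.findall(r'\d+(?:\.\d+)?', juice_text)]
print(decimals)  # [3.5, 2.0]
```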
5. Example: Validate an Email Address
email = "student@example.com"
# Simplified pattern (not full email syntax): local part, @, domain, dot, final label.
pattern = r'^[\w\.-]+@[\w\.-]+\.\w+$'
if re.match(pattern, email):
    print("Valid email!")
else:
    print("Invalid email!")
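If you need to check many addresses, one option is to wrap the check in a small function. The sketch below uses re.fullmatch(), which requires the whole string to match, so the ^ and $ anchors are not needed. The names EMAIL_PATTERN and is_valid_email are hypothetical, and the pattern is the same simplified one from above, not a complete email validator.

```python
import re

# Same simplified pattern as above, compiled once for reuse (hypothetical name).
EMAIL_PATTERN = re.compile(r'[\w.-]+@[\w.-]+\.\w+')

def is_valid_email(candidate):
    """Return True if the whole string looks like a simple email address."""
    return EMAIL_PATTERN.fullmatch(candidate) is not None

# Made-up test inputs.
for candidate in ["student@example.com", "no-at-sign.com", "student@example.com\n"]:
    print(repr(candidate), is_valid_email(candidate))
```

One difference worth knowing: with re.match() and a trailing $, a string that ends in a newline can still pass, because $ also matches just before a final newline. re.fullmatch() rejects it.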
6. Replacing Text
Replace all vowels in a text with *:
text = "Hello World"
new_text = re.sub(r'[aeiouAEIOU]', '*', text)
print(new_text) # H*ll* W*rld
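re.sub() also accepts optional count and flags arguments. Here is a brief sketch of both, reusing the same sample text; the variable names are just for illustration.

```python
import re

text = "Hello World"

# flags=re.IGNORECASE lets a lowercase-only class match uppercase vowels too.
all_vowels = re.sub(r'[aeiou]', '*', text, flags=re.IGNORECASE)
print(all_vowels)  # H*ll* W*rld

# count=2 stops after the first two replacements.
first_two = re.sub(r'[aeiou]', '*', text, count=2)
print(first_two)  # H*ll* World
```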
Practice Questions with Solutions
- Extract all words that start with "py" from a sentence.
- Validate if a phone number is in the format xxx-xxx-xxxx.
- Replace all whitespace in a string with underscores (_).
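Here is one possible set of solutions. Treat it as a sketch: the sample strings are made up, the first answer assumes that "Python" with a capital P should also count, and the phone check assumes each x stands for a digit.

```python
import re

# 1. Words starting with "py". \b anchors at a word boundary, so "happy"
#    is not picked up. Ignoring case is an assumption.
sentence = "Python and pytest make pyramids of code, but happy does not count."
py_words = re.findall(r'\bpy\w*', sentence, flags=re.IGNORECASE)
print(py_words)  # ['Python', 'pytest', 'pyramids']

# 2. Phone number in the format xxx-xxx-xxxx (x assumed to be a digit).
#    fullmatch() requires the entire string to fit the pattern.
def is_valid_phone(number):
    return re.fullmatch(r'\d{3}-\d{3}-\d{4}', number) is not None

print(is_valid_phone("555-123-4567"))  # True
print(is_valid_phone("5551234567"))    # False

# 3. Replace every whitespace character (space, tab, newline) with "_".
spaced = "a b\tc\nd"
underscored = re.sub(r'\s', '_', spaced)
print(underscored)  # a_b_c_d
```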
Mini Project: Extract Dates from Text
Find all dates in the format dd-mm-yyyy from a paragraph.
text = "John was born on 12-05-1990, and his sister on 25-12-1995."
dates = re.findall(r'\b\d{2}-\d{2}-\d{4}\b', text)
print("Dates found:", dates)
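The pattern above only checks the shape of a date, so something like 31-02-2000 would also be found. As an optional extension, the sketch below filters the matches with datetime.strptime() from the standard library. The extra "31-02-2000" in the sample text is made up to show the filtering.

```python
import re
from datetime import datetime

# Sample text with one impossible date added for illustration.
text = "John was born on 12-05-1990, and his sister on 25-12-1995. Ignore 31-02-2000."

candidates = re.findall(r'\b\d{2}-\d{2}-\d{4}\b', text)
print("Candidates:", candidates)

valid_dates = []
for candidate in candidates:
    try:
        # strptime raises ValueError for dates that cannot exist.
        valid_dates.append(datetime.strptime(candidate, "%d-%m-%Y").date())
    except ValueError:
        pass  # right shape, but not a real calendar date

print("Valid dates:", [d.isoformat() for d in valid_dates])
```

Letting the regex find the candidates and datetime judge whether they are real dates keeps the pattern short and readable.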
Checklist for This Chapter
- Imported and used Python's re module
- Used basic regex functions (search, findall, match, sub)
- Understood common regex symbols and patterns
- Performed text searching, matching, and replacing with regex
Regular expressions are a fantastic skill for text processing, data validation, and scraping. Keep practicing and soon you'll write complex pattern searches with ease!