Email pattern matching is a common text-processing task used to validate, extract or filter email addresses from raw text data. In Python, this is handled with the built-in re (regular expression) module, which allows precise pattern definition and matching based on email syntax rules. It is useful for data cleaning, form validation and log analysis.
A regular expression can be used to check whether a given string matches a valid email address format. The re.match() function tries to match the pattern at the beginning of the string, so a successful match confirms that the string starts with a correctly structured email.
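A minimal sketch of this check, assuming a simplified email pattern. The name EMAIL_PATTERN and the sample string are illustrative choices, and the pattern is not a full RFC 5322 validator:

```python
import re

# Simplified, illustrative pattern: local part, "@", domain, "." and a
# top-level domain of at least two letters. Real-world addresses can be
# more varied than this pattern allows.
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

text = "eshant@gfg.in"

# re.match() only tries to match at the start of the string
result = re.match(EMAIL_PATTERN, text)
print(result)
```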
Output:
<re.Match object; span=(0, 13), match='eshant@gfg.in'>
Because re.match() returned a match object rather than None, the string begins with a valid email address.
To extract only the email addresses from a given text, the findall() function is used. It returns a list of all non-overlapping substrings that match the specified email pattern in the text.
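As a rough sketch, assuming the same simplified pattern and an illustrative sample sentence (both are assumptions made for this example):

```python
import re

# Simplified, illustrative email pattern (not a full RFC 5322 validator)
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

text = "Please contact eshant@gfg.in for further details"

# findall() returns every non-overlapping match as a list of strings
emails = re.findall(EMAIL_PATTERN, text)
print(emails)
```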
Output:
['eshant@gfg.in']
Here, findall() returns only the email address because nothing else in the text fits the pattern.
To match email addresses belonging to a specific domain, the domain name is written explicitly in the regex pattern. This ensures that only emails ending with that domain are matched.
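One possible way to write such a pattern is sketched below. The name DOMAIN_PATTERN, the second sample address and the use of a trailing $ anchor are assumptions made for illustration:

```python
import re

# Illustrative pattern: a simplified local part followed by the literal
# domain gfg.org. The dot is escaped so it matches only a real ".", and
# the trailing $ keeps longer domains such as gfg.org.in from matching.
DOMAIN_PATTERN = r"[A-Za-z0-9._%+-]+@gfg\.org$"

first = re.match(DOMAIN_PATTERN, "eshant@gfg.org")
second = re.match(DOMAIN_PATTERN, "eshant@gfg.com")

print(first)
print(second)
```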
Output:
<re.Match object; span=(0, 14), match='eshant@gfg.org'>
None
The first email matches because it belongs to gfg.org, while the second one does not.
When multiple domains are permitted, the alternation (OR) operator | is used within parentheses to match any one of the specified domains.
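A short sketch of this idea, assuming the permitted domains are gfg.org and gfg.net. The name MULTI_DOMAIN_PATTERN and the sample addresses are illustrative:

```python
import re

# Illustrative pattern: (org|net) accepts either ending after "gfg."
MULTI_DOMAIN_PATTERN = r"[A-Za-z0-9._%+-]+@gfg\.(org|net)$"

samples = ["eshant@gfg.org", "eshant@gfg.net", "eshant@gfg.com"]

# Collect one result per address: a match object or None
results = [re.match(MULTI_DOMAIN_PATTERN, email) for email in samples]

for result in results:
    print(result)
```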
Output:
<re.Match object; span=(0, 14), match='eshant@gfg.org'>
<re.Match object; span=(0, 14), match='eshant@gfg.net'>
None
Emails with .org and .net are matched successfully, while .com is rejected because it is not included in the pattern.
Both re.match() and re.search() are used to find patterns in text, but they behave differently. re.match() checks for a match only at the beginning of the string, while re.search() scans the entire string and returns the first match it finds.
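The difference can be sketched with a sentence in which the email address does not come first. The sample text and the simplified pattern are assumptions made for illustration:

```python
import re

# Simplified, illustrative email pattern (not a full RFC 5322 validator)
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

text = "My email address is: eshant@gfg.in"

# match() fails because the string does not start with an email address
match_result = re.match(EMAIL_PATTERN, text)

# search() scans forward until it finds the first match
search_result = re.search(EMAIL_PATTERN, text)

print(match_result)
print(search_result)
```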
Output:
None
<re.Match object; span=(21, 34), match='eshant@gfg.in'>
re.search() detects the email anywhere in the text, while re.match() fails because the email is not at the beginning.
A document or paragraph can contain more than one email address. The findall() function is used to extract all substrings that match a given email pattern, making it suitable for collecting multiple email addresses from text data.
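A minimal sketch, assuming a sample sentence that contains two addresses (the text and the simplified pattern are illustrative):

```python
import re

# Simplified, illustrative email pattern (not a full RFC 5322 validator)
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

text = "Reach us at eshant@gfg.in or support@gfg.org for help"

# Matches are returned in the order they appear in the text
all_emails = re.findall(EMAIL_PATTERN, text)
print(all_emails)
```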
Here, findall() scans the entire text and returns a list of all email addresses that match the regex pattern.
Output:
['eshant@gfg.in', 'support@gfg.org']