Why Email Validation Is More Complex Than It Looks
“Validate an email” seems trivial. In practice, it’s a multi-dimensional problem with non-trivial trade-offs between precision, latency, UX, and implementation complexity.
An email address is “valid” according to four distinct criteria:
- Correct syntax: conforms to RFC 5322
- Reachable domain: has active MX records
- Existing mailbox: the server accepts emails for this specific address
- Legitimate address: not temporary, disposable, or fraudulent
Most implementations cover only the first two criteria. The third and fourth require different approaches.
Method 1: Regex Validation (Syntax)
How It Works
A regular expression checks that the address follows a basic structure: presence of an @, a domain, a TLD. For example:
    // Basic regex (acceptable for most cases)
    const basicEmailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

    // Stricter regex (often too restrictive). This is the pattern the HTML
    // specification uses for <input type="email">; it is deliberately narrower
    // than RFC 5322 (no quoted local parts, no comments).
    const rfcEmailRegex = /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

What It Detects
- ✅ Addresses without @ (john.smith.com)
- ✅ Addresses without domain (john@)
- ✅ Addresses with spaces (john [email protected])
- ✅ Obvious invalid characters
What It Doesn’t Detect
- ❌ Non-existent domains ([email protected])
- ❌ Disposable addresses ([email protected])
- ❌ Non-existent mailboxes ([email protected])
- ❌ Catch-all
Recommended Use Case
Mandatory for any web form, but insufficient alone. Use as a first line of defence before any more advanced verification.
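For server-side code, the same first line of defence can be sketched in Python. This is a minimal port of the basic pattern above, not a full RFC parser; the function name `looks_like_email` and the sample addresses are illustrative assumptions.

```python
import re

# Python port of the basic pattern: a single "@", no whitespace,
# and at least one dot in the domain part.
BASIC_EMAIL_RE = re.compile(r"[^\s@]+@[^\s@]+\.[^\s@]+")


def looks_like_email(address: str) -> bool:
    """Return True if the address passes the basic structural check."""
    # fullmatch anchors the pattern at both ends, like ^...$ in the JavaScript version.
    return BASIC_EMAIL_RE.fullmatch(address) is not None


# Illustrative inputs only; example.com is a reserved documentation domain.
samples = [
    "john.smith.com",          # no "@"
    "john@",                   # no domain
    "john smith@example.com",  # contains a space
    "john.smith@example.com",  # structurally fine
]
for sample in samples:
    print(f"{sample!r}: {looks_like_email(sample)}")
```

Only the last sample passes. A structurally correct address on a domain that does not exist would pass too, which is why this check cannot stand alone.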
Latency
0 ms (client-side). No network dependency.
Method 2: DNS/MX Verification
How It Works
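A DNS query retrieves the domain’s MX records. A domain with no MX record is very unlikely to receive emails. Strictly speaking, SMTP falls back to the domain’s A/AAAA record when no MX exists (RFC 5321, section 5.1), but few legitimate mail domains rely on that.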
    import dns.resolver

    def has_valid_mx(domain: str) -> bool:
        try:
            answers = dns.resolver.resolve(domain, 'MX')
            return len(answers) > 0
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            # NXDOMAIN: the domain does not exist
            # NoAnswer: the domain exists but publishes no MX record
            return False

You can go further by also retrieving SPF and DMARC records:

    def get_email_security(domain: str) -> dict:
        result = {'has_mx': False, 'has_spf': False, 'has_dmarc': False}

        try:
            result['has_mx'] = bool(dns.resolver.resolve(domain, 'MX'))
        except Exception:
            pass

        # SPF is published as a TXT record on the domain itself
        try:
            txt_records = dns.resolver.resolve(domain, 'TXT')
            for record in txt_records:
                text = record.to_text()
                if 'v=spf1' in text:
                    result['has_spf'] = True
        except Exception:
            pass

        # DMARC is published as a TXT record on the _dmarc subdomain
        try:
            dmarc = dns.resolver.resolve(f'_dmarc.{domain}', 'TXT')
            result['has_dmarc'] = bool(dmarc)
        except Exception:
            pass

        return result

What It Detects
- ✅ Non-existent domains (NXDOMAIN)
- ✅ Domains without MX (in practice, unlikely to receive emails)
- ✅ Missing SPF/DMARC (risk signal)
What It Doesn’t Detect
- ❌ Specific non-existent addresses on a valid domain
- ❌ Disposable addresses (Yopmail has perfectly valid MX servers)
- ❌ Catch-all (domain has MX, but all addresses accepted)
Recommended Use Case
Very effective for email campaigns and list imports. Executed server-side before/during import. Quickly detects “fake domains” and significantly reduces hard bounces.
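For a list import, most of the cost can be avoided by resolving each distinct domain once rather than once per address. The sketch below illustrates that idea under stated assumptions: `split_by_domain_mx` is a hypothetical helper, and the lookup is passed in as a function so the example runs without network access.

```python
from collections import defaultdict
from typing import Callable, Iterable


def split_by_domain_mx(
    emails: Iterable[str],
    mx_lookup: Callable[[str], bool],
) -> tuple[list[str], list[str]]:
    """Split addresses into (kept, rejected) with one MX lookup per distinct domain."""
    by_domain: dict[str, list[str]] = defaultdict(list)
    rejected: list[str] = []

    for email in emails:
        local, sep, domain = email.strip().rpartition("@")
        if not sep or not local or not domain:
            rejected.append(email)  # no usable domain part, nothing to look up
            continue
        by_domain[domain.lower()].append(email)

    kept: list[str] = []
    for domain, addresses in by_domain.items():
        # mx_lookup runs once per domain, however many addresses share it.
        (kept if mx_lookup(domain) else rejected).extend(addresses)
    return kept, rejected


# Stand-in lookup for the demo; a real import would query DNS instead.
def fake_mx_lookup(domain: str) -> bool:
    return domain == "example.com"


kept, rejected = split_by_domain_mx(
    ["a@example.com", "b@Example.com", "c@nowhere.invalid", "broken"],
    fake_mx_lookup,
)
print("kept:", kept)
print("rejected:", rejected)
```

In a real import, a function such as `has_valid_mx` from this section could be passed as `mx_lookup`.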
Latency
20–200 ms per lookup, depending on the resolver and on whether the answer is already cached. Acceptable server-side; avoid it for real-time client-side validation.
Method 3: SMTP Verification
How It Works
SMTP verification simulates sending an email up to the recipient verification point, without sending an actual message:
    import smtplib
    import dns.resolver

    def verify_smtp(email: str) -> bool:
        domain = email.split('@')[1]

        # 1. Get the MX with the lowest preference value (highest priority)
        mx_records = dns.resolver.resolve(domain, 'MX')
        mx_host = str(sorted(mx_records, key=lambda r: r.preference)[0].exchange).rstrip('.')

        # 2. SMTP connection
        with smtplib.SMTP(timeout=10) as smtp:
            smtp.connect(mx_host)
            smtp.helo('verify.example.com')  # Your sending domain
            # RCPT TO is only valid after MAIL FROM; this sender is a placeholder
            smtp.mail('probe@verify.example.com')
            code, message = smtp.rcpt(email)
            # 250 = accepted, 550 = rejected, 421 = temporary
            return code == 250

What It Detects
- ✅ Non-existent inboxes (if the server responds honestly)
- ✅ Addresses that have been deactivated (ex-employees)
Important Limitations
- ❌ Catch-all domains: respond 250 OK to everything, regardless of address
- ❌ Servers that block verifications: Gmail, Outlook, Yahoo refuse this type of request; you systematically get 252 (“cannot verify but will deliver”)
- ❌ Greylisting: some servers delay the response for new senders
- ❌ Honeypot addresses: addresses technically exist but are anti-spam traps
- ❌ Reputation impact: repeated SMTP requests from your IP can trigger anti-spam defences
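Because of these limitations, a bare `code == 250` test is often too coarse. One possible refinement is sketched below: it maps reply codes to a four-way verdict and probes a random address on the same domain to spot catch-all servers. The names `RcptVerdict`, `classify_rcpt_code` and `verify_with_catch_all_guard` are hypothetical, the code-to-verdict mapping is an assumption rather than a standard, and the probe is passed in as a function so the example runs without a mail server.

```python
import secrets
from enum import Enum
from typing import Callable


class RcptVerdict(Enum):
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    TEMPORARY = "temporary"
    UNKNOWN = "unknown"


def classify_rcpt_code(code: int) -> RcptVerdict:
    """Map an SMTP reply code to a verdict (assumed mapping, tune to your data)."""
    if code == 250:
        return RcptVerdict.ACCEPTED
    if code == 252:
        return RcptVerdict.UNKNOWN    # "cannot verify but will deliver"
    if 400 <= code < 500:
        return RcptVerdict.TEMPORARY  # e.g. 421, or 450/451 under greylisting
    if code in (550, 551, 553):
        return RcptVerdict.REJECTED   # mailbox-level permanent failures
    return RcptVerdict.UNKNOWN        # anything else proves nothing about the mailbox


def verify_with_catch_all_guard(
    email: str, rcpt_probe: Callable[[str], int]
) -> RcptVerdict:
    """Probe the address, then a random one on the same domain to detect catch-all."""
    verdict = classify_rcpt_code(rcpt_probe(email))
    if verdict is not RcptVerdict.ACCEPTED:
        return verdict
    domain = email.rpartition("@")[2]
    random_address = f"{secrets.token_hex(12)}@{domain}"
    if classify_rcpt_code(rcpt_probe(random_address)) is RcptVerdict.ACCEPTED:
        # The server accepts a mailbox that cannot exist, so 250 proves nothing.
        return RcptVerdict.UNKNOWN
    return RcptVerdict.ACCEPTED


# Demo with a stand-in probe that accepts every recipient, like a catch-all server.
print(verify_with_catch_all_guard("anyone@example.com", lambda address: 250))
```

Treating catch-all and 252 answers as "unknown" rather than "valid" keeps them out of the list of confirmed mailboxes without discarding them as bounces.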
Recommended Use Case
Useful for moderate-size B2B databases, after MX verification. Avoid for large databases (> 100,000 addresses) without dedicated infrastructure. Ineffective for Gmail, Yahoo, Outlook addresses.
Latency
500 ms – 10 s depending on the server. Incompatible with real-time client-side validation.
Method 4: Risk Score (API Approach)
How It Works
This approach combines the previous methods with an additional intelligence layer: domain blacklist, DNS fingerprinting, domain age, behavioural patterns.
    -H "Authorization: Bearer sv_your_key"

    {
      "is_risky": true,
      "risk_score": 100,
      "reason": "disposable",
      "is_free_provider": false,
      "is_corporate_email": false,
      "did_you_mean": null,
      "is_alias_email": false,
      "mx_provider_label": "Yopmail",
      "deliverability_score": 0
    }

What It Detects
- ✅ Known disposable addresses (blacklist of 500,000+ domains)
- ✅ New disposable domains (DNS fingerprinting)
- ✅ Catch-all addresses
- ✅ Domains without MX
- ✅ Malformed addresses
- ✅ Risk patterns (recent domain, missing email auth…)
Limitations
- ❌ Individually non-existent mailboxes on legitimate domains (Gmail, non-catch-all companies)
- ❌ Invalid addresses but on healthy domains ([email protected])
Recommended Use Case
Ideal for real-time validation in web forms, signups, checkout. Covers the 4 main threats (format, domain, disposable, risk) in a single call.
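On the form side, the response still has to be turned into a decision. The sketch below shows one way to do that using only fields that appear in the sample response above. The function `decide_signup`, the threshold of 80, the three outcomes, and the reading of `did_you_mean` as a typo correction are all assumptions made for illustration, not documented API behaviour.

```python
import json
from typing import Optional


def decide_signup(
    response_body: str, block_threshold: int = 80
) -> tuple[str, Optional[str]]:
    """Return (action, detail) where action is 'suggest', 'block' or 'allow'."""
    data = json.loads(response_body)

    # Assumed meaning: a non-null did_you_mean is a suggested correction.
    suggestion = data.get("did_you_mean")
    if suggestion:
        return "suggest", suggestion

    # Hypothetical policy: block only when flagged risky AND above the threshold.
    if data.get("is_risky") and data.get("risk_score", 0) >= block_threshold:
        return "block", data.get("reason")

    return "allow", None


# Subset of the sample response shown above.
sample = """
{
  "is_risky": true,
  "risk_score": 100,
  "reason": "disposable",
  "did_you_mean": null
}
"""
print(decide_signup(sample))  # ('block', 'disposable')
```

Keeping the threshold as a parameter lets a checkout flow be more lenient than a free-trial signup without changing the calling code.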
Latency
Very low — hosted in France, optimised for real-time validation in forms. Compatible with direct integration without intermediate proxy.
Comparative Summary
| Criterion | Regex | DNS/MX | SMTP | API/Score |
|---|---|---|---|---|
| Invalid syntax | ✅ | ✅ | ✅ | ✅ |
| Non-existent domain | ❌ | ✅ | ✅ | ✅ |
| Non-existent address | ❌ | ❌ | ✅* | ❌ |
| Disposable email | ❌ | ❌ | ❌ | ✅ |
| Catch-all | ❌ | ❌ | ❌ | ✅ |
| Latency | 0 ms | 20–200 ms | 500 ms–10 s | Very low (EU) |
| Client-side | ✅ | ❌ | ❌ | Via proxy |
| No third-party account | ✅ | ✅ | ⚠️ | ❌ |
*SMTP only works if the server responds honestly (not catch-all, no blocking).
The Right Combination for Your Use Case
Web form (SaaS, e-commerce, newsletter) → Client-side regex + server-side API score at submission
B2B list import → Regex + DNS/MX + batch API score (no SMTP)
Existing database validation (email campaign) → DNS/MX + API score + optional SMTP on B2B segment excluding Gmail/Yahoo
Strict validation (official documents, identity verification) → SMTP + email confirmation (send a verification code) — the only 100% reliable method
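The confirmation step in the strict scenario can be sketched as follows. This is a minimal illustration under stated assumptions: a 6-digit code, a 15-minute validity window, and the hypothetical names `PendingConfirmation`, `issue_code` and `confirm`. Sending the email and persisting the record are left out.

```python
import hashlib
import hmac
import secrets
import time
from dataclasses import dataclass
from typing import Optional

CODE_TTL_SECONDS = 15 * 60  # hypothetical validity window


@dataclass
class PendingConfirmation:
    email: str
    code_digest: str   # only a hash of the code is stored, never the code itself
    expires_at: float  # Unix timestamp


def _digest(code: str) -> str:
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def issue_code(
    email: str, now: Optional[float] = None
) -> tuple[str, PendingConfirmation]:
    """Create a 6-digit code to email to the user, plus the record to store."""
    issued_at = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"  # cryptographically strong randomness
    record = PendingConfirmation(email, _digest(code), issued_at + CODE_TTL_SECONDS)
    return code, record


def confirm(
    pending: PendingConfirmation,
    email: str,
    submitted_code: str,
    now: Optional[float] = None,
) -> bool:
    """Return True only for the right code, for the right address, before expiry."""
    current = time.time() if now is None else now
    if current > pending.expires_at or email != pending.email:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(_digest(submitted_code), pending.code_digest)


code, pending = issue_code("john@example.com")
print(confirm(pending, "john@example.com", code))  # True
```

A production version would also limit the number of attempts per record, since a 6-digit code can be guessed by brute force.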
For advanced DNS-level detection beyond these four methods, read our article on MX fingerprinting for detecting disposable domains. And for practical Python implementations, check how to validate emails with the Syvel Python API.