Why Email Validation Is More Complex Than It Looks

“Validate an email” seems trivial. In practice, it’s a multi-dimensional problem with non-trivial trade-offs between precision, latency, UX, and implementation complexity.

An email address is “valid” according to four distinct criteria:

1. Correct syntax: conforms to RFC 5322
2. Reachable domain: has active MX records
3. Existing mailbox: the server accepts emails for this specific address
4. Legitimate address: not temporary, disposable, or fraudulent

Most implementations only cover criteria 1 and 2. Criteria 3 and 4 require different approaches.

Method 1: Regex Validation (Syntax)

How It Works

A regular expression checks that the address follows a basic structure: presence of an @, a domain, a TLD. For example:

// Basic regex (acceptable for most cases)
const basicEmailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

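// Stricter pattern, essentially the one the HTML spec uses for <input type="email">.
// It covers a practical subset of RFC 5322, not the full grammar (quoted local parts and comments are rejected).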
const rfcEmailRegex = /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

What It Detects

✅ Addresses without @ (john.smith.com)
✅ Addresses without domain (john@)
✅ Addresses with spaces (john [email protected])
✅ Obvious invalid characters

What It Doesn’t Detect

❌ Non-existent domains ([email protected])
❌ Disposable addresses ([email protected])
❌ Non-existent mailboxes ([email protected])
❌ Catch-all domains (the server accepts every address)
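
To make these limits concrete, here is a minimal sketch that ports the basic pattern above to Python. The helper name `looks_like_email` and the sample addresses are ours, chosen for illustration only. It shows that structural mistakes are rejected, while any well-formed string passes whether or not its domain or mailbox exists.

```python
import re

# Same permissive pattern as the basic JavaScript regex above.
BASIC_EMAIL_RE = re.compile(r"^[^\s@]+@[^\s@]+\.[^\s@]+$")


def looks_like_email(address: str) -> bool:
    """Return True if the string has the shape local@domain.tld."""
    return BASIC_EMAIL_RE.fullmatch(address) is not None


# Structural errors are rejected...
print(looks_like_email("john.smith.com"))          # no @
print(looks_like_email("john@"))                   # no domain
print(looks_like_email("john smith@example.com"))  # contains a space

# ...but a well-formed address on a domain that cannot exist still passes.
print(looks_like_email("nobody@no-such-domain.invalid"))
```

The last call returns True even though the `.invalid` TLD is reserved and never resolves, which is why syntax checking is only the first layer.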

Recommended Use Case

Mandatory for any web form, but insufficient alone. Use as a first line of defence before any more advanced verification.

Latency

Effectively 0 ms: the check runs locally (client-side or server-side) with no network dependency.

Method 2: DNS/MX Verification

How It Works

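A DNS query retrieves the domain’s MX records. A domain with no MX record almost never receives email in practice: RFC 5321 lets senders fall back to the domain’s A/AAAA record, but few real mail domains rely on that.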

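import dns.exception
import dns.resolver

def has_valid_mx(domain: str) -> bool:
    """Return True if the domain publishes at least one MX record."""
    try:
        answers = dns.resolver.resolve(domain, 'MX')
        return len(answers) > 0
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer,
            dns.resolver.NoNameservers, dns.exception.Timeout):
        # No such domain, no MX record, broken nameservers, or no reply in time.
        # A timeout can be transient: consider retrying before rejecting.
        return False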

You can go further by also retrieving SPF and DMARC records:

def get_email_security(domain: str) -> dict:
    result = {'has_mx': False, 'has_spf': False, 'has_dmarc': False}

    try:
        result['has_mx'] = bool(dns.resolver.resolve(domain, 'MX'))
    except Exception:
        pass

    try:
        txt_records = dns.resolver.resolve(domain, 'TXT')
        for record in txt_records:
            text = record.to_text()
            if 'v=spf1' in text:
                result['has_spf'] = True
    except Exception:
        pass

    try:
        dmarc = dns.resolver.resolve(f'_dmarc.{domain}', 'TXT')
        # Only count TXT records that actually declare a DMARC policy
        result['has_dmarc'] = any('v=DMARC1' in r.to_text() for r in dmarc)
    except Exception:
        pass

    return result

What It Detects

✅ Non-existent domains (NXDOMAIN)
✅ Domains without MX (cannot receive emails)
✅ Missing SPF/DMARC (risk signal)

What It Doesn’t Detect

❌ Specific non-existent addresses on a valid domain
❌ Disposable addresses (Yopmail has perfectly valid MX servers)
❌ Catch-all domains (the domain has MX records, but accepts every address)
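
One refinement worth sketching: a domain can publish an MX record and still refuse all mail by using the "null MX" defined in RFC 7505 (a single record whose exchange is `.`). The helper below is a hypothetical illustration, not part of dnspython. It takes plain `(preference, exchange)` tuples, which you could build from a dnspython answer with `(r.preference, str(r.exchange))`, and returns the preferred host or `None`.

```python
from typing import List, Optional, Tuple

MxRecord = Tuple[int, str]  # (preference, exchange host)


def pick_mail_host(records: List[MxRecord]) -> Optional[str]:
    """Return the preferred MX host, or None if the domain accepts no mail."""
    usable = [
        (pref, host.rstrip(".").lower())
        for pref, host in records
        if host.rstrip(".")  # drop the null MX "." (RFC 7505)
    ]
    if not usable:
        return None
    # The lowest preference value is the highest-priority server.
    return min(usable)[1]


print(pick_mail_host([(20, "mx2.example.com."), (10, "MX1.example.com.")]))
print(pick_mail_host([(0, ".")]))
```

Keeping the DNS lookup outside the function makes the selection logic easy to test without network access.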

Recommended Use Case

Very effective for email campaigns and list imports. Run it server-side before or during the import: it quickly detects “fake domains” and significantly reduces hard bounces.
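
For list imports, many addresses share the same domain, so it is usually enough to query each domain once. The sketch below assumes addresses have already passed the syntax check. `split_by_mx` is a hypothetical helper, and the `has_mx` argument stands for any domain-level check, such as the `has_valid_mx` function above.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def split_by_mx(
    emails: Iterable[str],
    has_mx: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split addresses into (kept, rejected), querying each domain once."""
    cache: Dict[str, bool] = {}
    kept: List[str] = []
    rejected: List[str] = []
    for email in emails:
        # Domains are case-insensitive, so normalise before caching.
        domain = email.rsplit("@", 1)[-1].strip().lower()
        if domain not in cache:
            cache[domain] = has_mx(domain)
        (kept if cache[domain] else rejected).append(email)
    return kept, rejected
```

With a list of 50,000 addresses spread over a few thousand domains, this reduces the number of DNS queries to the number of distinct domains.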

Latency

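Typically 20–200 ms per uncached lookup; answers already cached by the resolver return faster until their TTL expires. Acceptable server-side. Browsers cannot issue MX queries directly, so this check does not belong in client-side validation.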

Method 3: SMTP Verification

How It Works

SMTP verification simulates sending an email up to the recipient verification point, without sending an actual message:

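import smtplib
import dns.resolver

def verify_smtp(email: str) -> bool:
    # Split on the last @ (a quoted local part may itself contain one)
    domain = email.rsplit('@', 1)[1]

    # 1. Get the MX with the lowest preference value (highest priority)
    mx_records = dns.resolver.resolve(domain, 'MX')
    best = min(mx_records, key=lambda r: r.preference)
    mx_host = str(best.exchange).rstrip('.')  # drop the trailing root dot

    # 2. SMTP conversation up to RCPT TO; no DATA command, so nothing is delivered
    try:
        with smtplib.SMTP(mx_host, 25, timeout=10) as smtp:
            smtp.helo('verify.example.com')  # Your sending domain
            smtp.mail('probe@verify.example.com')  # Placeholder sender on your own domain
            code, message = smtp.rcpt(email)
            # 250 = accepted, 5xx (e.g. 550) = rejected, 4xx (e.g. 421, 450) = temporary
            return code == 250
    except (smtplib.SMTPException, OSError):
        # Connection refused, timeout, or protocol error: treat as not verified
        return False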
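
A boolean hides the difference between "rejected" and "try again later". As a rough sketch, the reply code from the RCPT TO step can be mapped to a few coarse outcomes instead. The function name and the outcome labels are our own. The code ranges follow the SMTP convention that 4xx replies are transient and 5xx replies are permanent.

```python
def classify_rcpt_reply(code: int) -> str:
    """Map an SMTP reply code for RCPT TO to a coarse outcome."""
    if code in (250, 251):
        # 250 = OK, 251 = user not local, the server will forward
        return "accepted"
    if 400 <= code < 500:
        # Transient failure (greylisting, rate limiting, server busy)
        return "retry_later"
    if 500 <= code < 600:
        # Permanent failure (unknown mailbox, policy rejection)
        return "rejected"
    # Anything else, including 252, gives no usable signal
    return "unknown"


for code in (250, 451, 550, 252):
    print(code, classify_rcpt_reply(code))
```

Treating 4xx as "retry later" rather than "invalid" avoids discarding good addresses on servers that greylist new senders.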

What It Detects

✅ Non-existent inboxes (if the server responds honestly)
✅ Addresses that have been deactivated (ex-employees)

Important Limitations

❌ Catch-all domains: respond 250 OK to everything, regardless of address
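❌ Servers that block verifications: large providers such as Gmail, Outlook, and Yahoo tend to rate-limit or deflect this kind of probing, so their answers are unreliable. Depending on the server, you may see a blanket acceptance, a temporary error, or 252 (“cannot verify but will deliver”), which is the standard reply to the VRFY command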
❌ Greylisting: some servers delay the response for new senders
❌ Honeypot addresses: they technically exist, but are anti-spam traps
❌ Reputation impact: repeated SMTP requests from your IP can trigger anti-spam defences
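
A common way to spot a catch-all domain is to probe a random mailbox that almost certainly does not exist: if the server accepts it, a 250 for any other address on that domain proves nothing. The sketch below assumes a hypothetical `rcpt_code` callable that performs the RCPT TO step and returns the reply code (for instance, an adaptation of `verify_smtp` above). Each probe is still a real SMTP connection, so the reputation caveat applies here too.

```python
import secrets
from typing import Callable


def is_catch_all(domain: str, rcpt_code: Callable[[str], int]) -> bool:
    """Return True if the server accepts a random, made-up mailbox."""
    # 16 random hex characters make a collision with a real mailbox very unlikely.
    probe = f"probe-{secrets.token_hex(8)}@{domain}"
    return rcpt_code(probe) == 250


# Stub reply functions stand in for real SMTP calls in this example.
print(is_catch_all("example.com", lambda address: 250))
print(is_catch_all("example.com", lambda address: 550))
```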

Recommended Use Case

Useful for moderate-size B2B databases, after MX verification. Avoid for large databases (> 100,000 addresses) without dedicated infrastructure. Ineffective for Gmail, Yahoo, Outlook addresses.

Latency

500 ms – 10 s depending on the server. Incompatible with real-time client-side validation.

Method 4: Risk Score (API Approach)

How It Works

This approach combines the previous methods with an additional intelligence layer: domain blacklist, DNS fingerprinting, domain age, behavioural patterns.

curl -X GET "https://api.syvel.io/v1/check/[email protected]" \
  -H "Authorization: Bearer sv_your_key"

{
  "email": "a9****[email protected]",
  "is_risky": true,
  "risk_score": 100,
  "reason": "disposable",
  "is_free_provider": false,
  "is_corporate_email": false,
  "did_you_mean": null,
  "is_alias_email": false,
  "mx_provider_label": "Yopmail",
  "deliverability_score": 0
}
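
How you act on such a response is up to your application. The sketch below is one possible policy, working on the already-parsed JSON and using only field names visible in the sample above. The thresholds (80 and 50), the outcome labels, and the assumption that `did_you_mean` holds a suggested address when it is not null are ours, not documented behaviour of the API.

```python
from typing import Any, Dict


def decide(result: Dict[str, Any], block_at: int = 80, review_at: int = 50) -> str:
    """Turn a risk-score response into a form-level decision."""
    score = result.get("risk_score", 0)
    if result.get("reason") == "disposable" or score >= block_at:
        return "block"
    if result.get("did_you_mean"):
        # Assumed to be a corrected address for a likely typo
        return "suggest_correction"
    if result.get("is_risky") or score >= review_at:
        return "review"
    return "accept"


sample = {
    "is_risky": True,
    "risk_score": 100,
    "reason": "disposable",
    "did_you_mean": None,
}
print(decide(sample))
```

Keeping the thresholds as parameters makes it easy to be stricter at checkout than on a newsletter form.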

What It Detects

✅ Known disposable addresses (blacklist of 500,000+ domains)
✅ New disposable domains (DNS fingerprinting)
✅ Catch-all addresses
✅ Domains without MX
✅ Malformed addresses
✅ Risk patterns (recent domain, missing email auth…)
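
The blacklist part of this approach is conceptually simple, and a minimal local version can be sketched as a set lookup. The one-entry list below is purely illustrative (Yopmail is the example used in this article), and `is_disposable` is a hypothetical helper. A real list would be loaded from a maintained source and refreshed regularly.

```python
from typing import Set

# Tiny illustrative blocklist, not a usable dataset.
DISPOSABLE_DOMAINS: Set[str] = {"yopmail.com"}


def is_disposable(email: str, blocklist: Set[str] = DISPOSABLE_DOMAINS) -> bool:
    """Return True if the address's domain, or a parent of it, is blocklisted."""
    domain = email.rsplit("@", 1)[-1].strip().lower().rstrip(".")
    labels = domain.split(".")
    # Check the domain and each parent: mail.yopmail.com, then yopmail.com, then com
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


print(is_disposable("someone@mail.yopmail.com"))
print(is_disposable("someone@example.com"))
```

A static list only catches known domains, which is why the approach described here pairs it with DNS fingerprinting for newly created ones.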

Limitations

❌ Individually non-existent mailboxes on legitimate domains (Gmail, non-catch-all companies)
❌ Invalid addresses but on healthy domains ([email protected])

Recommended Use Case

Ideal for real-time validation in web forms, signups, checkout. Covers the 4 main threats (format, domain, disposable, risk) in a single call.

Latency

Very low — hosted in France, optimised for real-time validation in forms. Compatible with direct integration without intermediate proxy.

Comparative Summary

Criterion	Regex	DNS/MX	SMTP	API/Score
Invalid syntax	✅	✅	✅	✅
Non-existent domain	❌	✅	✅	✅
Non-existent address	❌	❌	✅*	❌
Disposable email	❌	❌	❌	✅
Catch-all	❌	❌	❌	✅
Latency	0 ms	20–200 ms	500 ms–10 s	Very low (EU)
Client-side	✅	❌	❌	Via proxy
No third-party account	✅	✅	⚠️	❌

*SMTP only works if the server responds honestly (not catch-all, no blocking).

The Right Combination for Your Use Case

Web form (SaaS, e-commerce, newsletter) → Client-side regex + server-side API score at submission

B2B list import → Regex + DNS/MX + batch API score (no SMTP)

Existing database validation (email campaign) → DNS/MX + API score + optional SMTP on B2B segment excluding Gmail/Yahoo

Strict validation (official documents, identity verification) → SMTP + email confirmation (send a verification code). Confirmation is the only method that proves the user actually controls the mailbox.
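
Whichever combination you pick, the layers are typically chained from cheapest to most expensive, stopping at the first failure. Here is a minimal sketch of that idea. The function name, the result labels, and the two optional callables (`has_mx` for a DNS check, `is_risky` for a scoring API or blocklist) are assumptions made for illustration.

```python
import re
from typing import Callable, Optional

EMAIL_RE = re.compile(r"^[^\s@]+@[^\s@]+\.[^\s@]+$")


def validate(
    email: str,
    has_mx: Optional[Callable[[str], bool]] = None,
    is_risky: Optional[Callable[[str], bool]] = None,
) -> str:
    """Run the cheapest checks first and stop at the first failure."""
    if not EMAIL_RE.fullmatch(email):
        return "invalid_syntax"
    domain = email.rsplit("@", 1)[1]
    if has_mx is not None and not has_mx(domain):
        return "no_mx"
    if is_risky is not None and is_risky(email):
        return "risky"
    return "ok"


# Web form: syntax only on the first pass, scoring at submission.
print(validate("not-an-email"))
print(validate("user@example.com", has_mx=lambda d: True, is_risky=lambda e: False))
```

Because each layer is optional, the same function can serve a web form (regex plus score) and a list import (regex, MX, then score).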

For advanced DNS-level detection beyond these four methods, read our article on MX fingerprinting for detecting disposable domains. And for practical Python implementations, check how to validate emails with the Syvel Python API.
