You've built a beautiful React, Vue, or Angular single-page application. It's fast, modern, and users love it. But when you share links on Facebook, you see generic previews with no images. Google Search Console shows disappointing rankings. Your marketing team is frustrated because social media posts don't convert.
What went wrong?
The answer lies in how single-page applications (SPAs) render content. Unlike traditional server-rendered websites, SPAs send minimal HTML to the browser and rely on JavaScript to build the page dynamically. This creates a critical problem: social media bots don't execute JavaScript at all, and search engine crawlers execute it late or unreliably, so they see near-empty HTML with no page-specific content or meta tags.
In this guide, I'll show you how to solve this problem using AWS Lambda@Edge—a serverless solution that detects bots at CloudFront's edge locations and dynamically injects meta tags into HTML responses. This approach is:
- Framework-agnostic: Works with any SPA (React, Vue, Angular, Svelte)
- Cost-effective: ~$1-2/month vs $99-249/month for pre-rendering services
- Low-latency: 8-15ms execution time at the edge
- Production-ready: Complete with infrastructure-as-code templates
- Non-invasive: No changes to your existing SPA code
Complete working example available: All code from this guide is available in the lambda-edge-spa-seo repository. Star it if you find it useful!
What You'll Learn
By the end of this guide, you'll understand:
- Why SPAs fail at SEO and social media sharing
- How Lambda@Edge works and when to use it
- How to implement bot detection and meta tag injection
- Complete infrastructure setup with Terraform
- Testing strategies and cost optimization techniques
Let's dive in.
The SPA Problem: Why Bots See Empty HTML
To understand the solution, we need to understand the problem. Let's examine how SPAs differ from traditional server-rendered applications.
How SPAs Work
In a traditional server-rendered application, when a user requests a page, the server generates complete HTML with all content and meta tags:
<!-- Traditional SSR: Server sends complete HTML -->
<!DOCTYPE html>
<html>
<head>
<title>Awesome Blog Post - My Site</title>
<meta name="description" content="This is an amazing post about...">
<meta property="og:title" content="Awesome Blog Post">
<meta property="og:description" content="This is an amazing post about...">
<meta property="og:image" content="https://example.com/images/post.jpg">
</head>
<body>
<h1>Awesome Blog Post</h1>
<p>This is an amazing post about Lambda@Edge and SEO...</p>
<p>More content here...</p>
</body>
</html>
In contrast, SPAs serve a minimal HTML shell and load content via JavaScript:
<!-- SPA: Server sends minimal HTML shell -->
<!DOCTYPE html>
<html>
<head>
<title>My App</title>
<meta name="description" content="Generic app description">
</head>
<body>
<div id="root"></div>
<script src="/bundle.js"></script>
</body>
</html>
The JavaScript bundle then:
- Executes in the browser
- Fetches data from APIs
- Renders content into the <div id="root"> element
- Updates meta tags via document.title or libraries like React Helmet
For human users, this works perfectly. The page loads, JavaScript executes, and content appears. But for bots? Not so much.
Why Bots Fail with SPAs
Search engine crawlers and social media bots have significant limitations:
Search Engine Crawlers (Google, Bing, DuckDuckGo):
- May execute JavaScript, but with strict resource constraints
- Rendering happens in a queue, can take hours or days
- Timeout after 5-10 seconds if JavaScript is slow
- Don't support all modern JavaScript features
- May miss content loaded after initial render
Social Media Crawlers (Facebook, Twitter, LinkedIn):
- Do NOT execute JavaScript at all
- Only read server-sent HTML
- Look for meta tags in the <head> section
- Timeout after 2-5 seconds
- Cache results aggressively
When Facebook's scraper (facebookexternalhit) visits your SPA, it sees:
<html>
<head>
<title>My App</title>
<meta name="description" content="Generic app description">
</head>
<body>
<div id="root"></div>
<script src="/bundle.js"></script>
</body>
</html>
No og:title, no og:image, no content. Result: broken social preview cards.
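To make the difference concrete, here's a minimal sketch that lists the Open Graph tags a non-JavaScript crawler could read from raw HTML. The helper name `extractOgTags` and the simplified regex (double-quoted attributes, `property` before `content`) are assumptions for illustration, not how any real crawler parses pages.

```javascript
// Hypothetical helper: collect og:* meta tags from raw HTML,
// i.e. what a crawler that does not run JavaScript can read.
function extractOgTags(html) {
  const tags = {};
  // Simplified: expects property="..." followed by content="..."
  const pattern = /<meta\s+property="(og:[^"]+)"\s+content="([^"]*)"\s*\/?>/gi;
  for (const match of html.matchAll(pattern)) {
    tags[match[1]] = match[2];
  }
  return tags;
}

// The SPA shell from this guide
const spaShell = `<!DOCTYPE html>
<html>
<head>
<title>My App</title>
<meta name="description" content="Generic app description">
</head>
<body>
<div id="root"></div>
<script src="/bundle.js"></script>
</body>
</html>`;

// A fragment of the server-rendered page from this guide
const ssrHead = `<head>
<title>Awesome Blog Post - My Site</title>
<meta property="og:title" content="Awesome Blog Post">
<meta property="og:image" content="https://example.com/images/post.jpg">
</head>`;

console.log(extractOgTags(spaShell)); // no og:* keys at all
console.log(extractOgTags(ssrHead));  // og:title and og:image are present
```

Running this against your own deployed shell is a quick way to confirm what the scraper has to work with before any injection is in place.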
Real-World Impact
Here's what happens in practice:
Facebook Sharing:
Generic Preview:
┌─────────────────────────────┐
│ My App │
│ example.com │
│ Generic app description │
└─────────────────────────────┘
What You Want:
Rich Preview:
┌─────────────────────────────┐
│ [Beautiful Hero Image] │
│ Awesome Blog Post │
│ This is an amazing post... │
│ example.com/blog/post │
└─────────────────────────────┘
Google Search Results:
- Wrong page title (shows generic "My App" instead of specific post title)
- Generic description instead of post-specific content
- Delayed or missing indexing
- Lower rankings due to poor content signals
Client-Side Meta Tag Updates Don't Work
Many developers try to fix this with client-side meta tag manipulation:
// React example - DOESN'T WORK FOR BOTS
import { Helmet } from 'react-helmet';
function BlogPost({ post }) {
return (
<>
<Helmet>
<title>{post.title} - My Blog</title>
<meta property="og:title" content={post.title} />
<meta property="og:description" content={post.description} />
<meta property="og:image" content={post.image} />
</Helmet>
<article>
<h1>{post.title}</h1>
<p>{post.content}</p>
</article>
</>
);
}
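This updates meta tags only after JavaScript executes. Social media crawlers never execute JavaScript, and search crawlers may render the page late or not at all, so they read the generic tags from the shell instead.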
Traditional Solutions & Limitations
Before Lambda@Edge, you had limited options:
1. Server-Side Rendering (SSR)
- Frameworks: Next.js, Nuxt.js, SvelteKit, Angular Universal
- Pros: Best solution, proper SSR for all requests
- Cons: Requires complete rewrite, high infrastructure costs
2. Pre-rendering Services
- Services: Prerender.io, Rendertron, Puppeteer-based solutions
- Pros: Minimal code changes, works with existing SPAs
- Cons: Expensive ($99-500/month), adds latency, single point of failure
3. Static Site Generation
- Tools: Gatsby, Hugo, Jekyll, 11ty
- Pros: Perfect SEO, fast, cheap hosting
- Cons: Not suitable for dynamic content, build time increases with content
4. Hybrid Approaches
- Example: Pre-render critical pages, SPA for everything else
- Pros: Balance between SEO and SPA benefits
- Cons: Complex to maintain, inconsistent user experience
All these approaches share common problems:
- High cost (time or money)
- Complexity in implementation
- Significant architectural changes required
Understanding Lambda@Edge
Lambda@Edge offers an elegant alternative: intercept bot requests at CloudFront's edge and inject meta tags dynamically, without changing your SPA architecture.
What is Lambda@Edge?
Lambda@Edge is AWS Lambda's extension that runs serverless functions at CloudFront edge locations—140+ locations worldwide, close to your users.
Key characteristics:
- Executes JavaScript (Node.js) or Python code
- Runs at CloudFront edge locations (not centralized)
- Intercepts CloudFront requests and responses
- No server management required
- Pay per execution (no idle costs)
CloudFront Event Types
Lambda@Edge can execute at four points in the CloudFront request/response lifecycle:
┌─────────────┐ Viewer ┌──────────────┐ Origin ┌─────────────┐
│ Client │────Request────>│ CloudFront │────Request────>│ Origin │
│ (Browser) │ │ Edge │ │ (S3/Server) │
│ │<───Response────│ │<───Response────│ │
└─────────────┘ └──────────────┘ └─────────────┘
▲ ▲ ▲ ▲
│ │ │ │
[1] Viewer │ │ [4] Viewer [2] Origin │ │ [3] Origin
Request │ └─ Response Request │ └─ Response
Event Triggers:
Viewer Request (before CloudFront forwards to origin)
- Use cases: Authentication, A/B testing, URI rewriting
- Timeout: 5 seconds
- Memory: 128MB (fixed)
Origin Request (before CloudFront sends to origin, on cache miss)
- Use cases: Adding headers, custom origin logic, external API calls
- Timeout: 30 seconds
- Memory: configurable above 128MB (not fixed, unlike viewer triggers)
Origin Response (after receiving from origin, before caching)
- Use cases: Modifying headers before caching, response transformation
- Timeout: 30 seconds
- Memory: configurable above 128MB (not fixed, unlike viewer triggers)
Viewer Response (before returning to client)
- Use cases: Meta tag injection ← Our approach
- Timeout: 5 seconds
- Memory: 128MB (fixed)
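Each invocation tells the function which of the four triggers fired through the `config.eventType` field of the CloudFront event record. Here's a small sketch that reads it; the `describeTrigger` helper and the hand-written sample event are illustrative and trimmed to the fields used here.

```javascript
// Sketch: read the trigger type from a Lambda@Edge event record.
function describeTrigger(event) {
  const cf = event.Records[0].cf;
  const eventType = cf.config.eventType; // e.g. 'viewer-response'
  return {
    eventType,
    // A response object is only present on the two response triggers
    hasResponse: eventType.endsWith('-response'),
    uri: cf.request.uri,
  };
}

// Hand-written sample event (not a full CloudFront payload)
const sampleEvent = {
  Records: [{
    cf: {
      config: { eventType: 'viewer-response' },
      request: { uri: '/blog/my-post', headers: {} },
      response: { status: '200', headers: {} },
    },
  }],
};

console.log(describeTrigger(sampleEvent));
```

Logging this once per invocation during development makes it easy to confirm the function is attached to the trigger you intended.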
Why Viewer-Response for Meta Tag Injection?
For our use case (meta tag injection), viewer-response is optimal:
| Consideration | Viewer-Response | Origin-Response |
|---|---|---|
| Execution frequency | Every request (cache hit + miss) | Cache misses only |
| Cache impact | Modifies after cache lookup | Modifies before caching |
| Latency | Lower (after cache) | Higher (on miss only) |
| Flexibility | Can modify per request | Same for all cached |
| Our use case | ✅ Perfect | ⚠️ Less suitable |
Why viewer-response wins:
- Executes on every request (both cache hits and misses)
- Can serve different HTML based on User-Agent (bot vs human)
- Leverages CloudFront caching (98%+ cache hit ratio)
- Minimal latency since it runs after cache lookup
Lambda@Edge Constraints
Understanding the limits is critical for production implementations:
// Viewer request/response constraints
const LIMITS = {
  memory: "128MB (fixed, cannot increase)",
  timeout: "5 seconds max",
  codeSize: "1MB max (including dependencies)",
  responseSize: "40KB max for a function-generated response (headers + body)",
  requestSize: "40KB max headers + body",
  // Important limitations
  noFileSystem: true, // No /tmp access
  noDynamoDB: "Not recommended", // High latency from edge
  noExternalAPIs: "Not recommended", // Must complete in 5s
};
Practical implications:
- ✅ DO: Use regex for HTML parsing (fast, lightweight)
- ✅ DO: Embed metadata in function code (no external calls)
- ✅ DO: Use simple string operations
- ❌ DON'T: Use heavy DOM parsers (cheerio, jsdom)
- ❌ DON'T: Call external APIs in viewer events
- ❌ DON'T: Use large npm packages
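Because size limits are counted in bytes rather than characters, it can help to measure the HTML before returning it. A minimal sketch, assuming a hypothetical `fitsByteBudget` helper; the limit is passed in as a parameter and the 40 * 1024 figure below is only an illustrative budget, so check the quota that applies to your trigger.

```javascript
// Sketch: check an HTML string against a byte budget before returning it.
function fitsByteBudget(html, limitBytes) {
  // Buffer.byteLength counts UTF-8 bytes, so multi-byte characters are handled
  return Buffer.byteLength(html, 'utf8') <= limitBytes;
}

const LIMIT_BYTES = 40 * 1024; // illustrative budget, not an authoritative quota
const page = '<html><head><title>Café</title></head><body></body></html>';

console.log(fitsByteBudget(page, LIMIT_BYTES));
```

This is plain string work with no dependencies, which keeps it in line with the DO list above.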
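CloudWatch Logs Location: Unlike regular Lambda, Lambda@Edge writes logs to the AWS Region closest to where the function executed, not only us-east-1. The log group name always carries the us-east-1 prefix, but the group itself lives in whichever Region served the request, so logs are distributed globally.
# Same log group name, created in each Region that served traffic
/aws/lambda/us-east-1.function-name # in us-east-1 (US East users)
/aws/lambda/us-east-1.function-name # in eu-west-1 (Europe users)
/aws/lambda/us-east-1.function-name # in ap-southeast-1 (Asia users)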
Solution Architecture
Now let's design the complete solution. Here's how Lambda@Edge solves the SPA SEO/OGP problem.
High-Level Architecture
┌──────────┐
│ Client │ (Human User or Bot)
└────┬─────┘
│
│ 1. GET /blog/my-post
│ User-Agent: facebookexternalhit/1.1
▼
┌────────────────────────────────────────┐
│ CloudFront Distribution │
│ (CDN + Lambda@Edge) │
│ │
│ ┌──────────────────────────────────┐ │
│ │ Cache Layer │ │
│ │ - Separate cache per User-Agent│ │
│ │ - 98%+ cache hit ratio │ │
│ └──────────────────────────────────┘ │
└────┬───────────────────────────────────┘
│
│ 2. Trigger: Viewer-Response
▼
┌──────────────────────────────────────────────────┐
│ Lambda@Edge Function │
│ ┌────────────────────────────────────────────┐ │
│ │ Step 1: Bot Detection │ │
│ │ - Extract User-Agent header │ │
│ │ - Match: /googlebot|facebookexternalhit/ │ │
│ │ - Result: isBot = true/false │ │
│ │ │ │
│ │ Step 2: If Bot Detected │ │
│ │ - Parse request URI: /blog/my-post │ │
│ │ - Lookup metadata from embedded config │ │
│ │ - Generate OGP meta tags │ │
│ │ - Inject into HTML <head> │ │
│ │ - Return modified HTML │ │
│ │ │ │
│ │ Step 3: If Human Detected │ │
│ │ - Return original HTML (SPA shell) │ │
│ │ - JavaScript hydrates normally │ │
│ └────────────────────────────────────────────┘ │
└────┬─────────────────────────────────────────────┘
│
│ 3. Return HTML (modified or original)
▼
┌──────────┐
│ Client │
│ - Bot: Sees meta tags in HTML <head>
│ - Human: Sees SPA shell, JS hydrates
└──────────┘
Component Architecture
1. S3 Bucket (Static Hosting)
my-spa-bucket/
├── index.html # SPA shell (entry point)
├── static/
│ ├── js/
│ │ ├── main.chunk.js
│ │ └── vendor.chunk.js
│ ├── css/
│ │ └── main.css
│ └── media/
│ └── logo.png
└── assets/
└── og-images/ # Open Graph images
├── default.jpg
└── blog-post.jpg
2. CloudFront Distribution
- Origin: S3 bucket with Origin Access Identity (OAI)
- Cache behavior: Forward User-Agent header to Lambda@Edge
- Vary header: Cache separately based on User-Agent pattern
- Error handling: 404 → 200 /index.html (SPA routing)
3. Lambda@Edge Function (Node.js 20)
// Function structure (simplified)
exports.handler = async (event) => {
const request = event.Records[0].cf.request;
const response = event.Records[0].cf.response;
// 1. Bot detection
const userAgent = request.headers['user-agent']?.[0]?.value || ''; // header may be absent
const isBot = detectBot(userAgent);
// 2. Return early if not bot
if (!isBot) return response;
// 3. Inject meta tags for bots
const metadata = getMetadata(request.uri);
const modifiedResponse = injectMetaTags(response, metadata);
return modifiedResponse;
};
4. Metadata Configuration
// Embedded in Lambda function (zero latency)
const METADATA_MAP = {
'/': {
title: 'My SPA - Home',
description: 'Welcome to my application',
image: 'https://cdn.example.com/og-home.jpg',
url: 'https://example.com/'
},
'/blog/lambda-edge': {
title: 'Solving SPA SEO with Lambda@Edge',
description: 'Learn how to fix SPA SEO issues using AWS Lambda@Edge',
image: 'https://cdn.example.com/blog/lambda-edge.jpg',
url: 'https://example.com/blog/lambda-edge'
}
};
Request Flow: Bot vs Human
Bot Request Flow (Facebook scraper):
1. Facebook bot requests: /blog/my-post
User-Agent: facebookexternalhit/1.1
2. CloudFront receives request
- Checks cache (key: URL + User-Agent pattern)
- Cache MISS (first time)
3. CloudFront fetches from S3
- Gets index.html (SPA shell)
4. Triggers viewer-response Lambda@Edge
- Detects bot: isBot = true
- Looks up metadata for /blog/my-post
- Injects OGP tags into HTML <head>
- Returns modified HTML
5. CloudFront returns the modified response
- The cache stores the origin's unmodified shell; viewer-response changes are applied on each request
6. Facebook receives HTML with meta tags:
<meta property="og:title" content="My Post" />
<meta property="og:image" content="image.jpg" />
7. Facebook scraper parses meta tags
- Extracts title, description, image
- Generates rich preview card
Human Request Flow (Chrome browser):
1. User requests: /blog/my-post
User-Agent: Mozilla/5.0 ... Chrome/120.0
2. CloudFront receives request
- Checks cache (key: URL + User-Agent pattern)
- Cache HIT (98% of the time)
3. Triggers viewer-response Lambda@Edge
- Detects human: isBot = false
- Returns original HTML (no modification)
4. User receives SPA shell:
<div id="root"></div>
<script src="/bundle.js"></script>
5. JavaScript executes
- Fetches post data from API
- Renders content
- Updates meta tags client-side (for display only)
Caching Strategy with Vary Header
CloudFront caches responses based on cache keys. We need separate cache entries for bots vs humans:
// CloudFront cache behavior configuration
{
"ForwardedValues": {
"QueryString": false,
"Headers": ["User-Agent"], // Forward to Lambda@Edge
"Cookies": { "Forward": "none" }
},
"MinTTL": 3600, // 1 hour min
"DefaultTTL": 86400, // 24 hours default
"MaxTTL": 604800 // 7 days max
}
Cache Key Composition:
Cache Key = URL + User-Agent Pattern
Example cache entries:
1. /blog/post + "bot" → HTML with injected meta tags
2. /blog/post + "browser" → Original SPA shell
Result: Bots get rich HTML, humans get SPA shell
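Cache Hit Ratio:
- Initial bot request: MISS (origin fetch, then Lambda@Edge executes)
- Subsequent bot requests: HIT (no origin fetch; Lambda@Edge still executes)
- Human requests: HIT (separate cache entry)
- Expected cache hit ratio: 98%+
This means only ~2% of requests reach the origin. A viewer-response function still runs on every request, cache hit or miss, so the cache reduces origin load and latency rather than the number of Lambda@Edge invocations.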
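One caveat with keying the cache on User-Agent: forwarding the raw header creates a separate cache entry for every distinct browser string. A common way to get the two-entry "bot" vs "browser" split described above is to collapse the header into a class before the cache lookup. Here's a sketch of a viewer-request handler doing that; the `x-viewer-class` header name and the shortened pattern are assumptions for this example, and the cache behavior would need to include that header in its cache key instead of User-Agent.

```javascript
// Sketch: collapse User-Agent into two classes in a viewer-request trigger.
// 'x-viewer-class' is a hypothetical header name chosen for this example.
const BOT_PATTERN = /googlebot|bingbot|facebookexternalhit|twitterbot|linkedinbot|slackbot/i;

function classifyViewer(userAgent) {
  return BOT_PATTERN.test(userAgent || '') ? 'bot' : 'browser';
}

async function viewerRequestHandler(event) {
  const request = event.Records[0].cf.request;
  const userAgent = request.headers['user-agent']?.[0]?.value;
  // Add the class as a request header so it can be part of the cache key
  request.headers['x-viewer-class'] = [
    { key: 'X-Viewer-Class', value: classifyViewer(userAgent) },
  ];
  return request; // CloudFront continues with the modified request
}

console.log(classifyViewer('facebookexternalhit/1.1'));
console.log(classifyViewer('Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0.0.0'));
```

With only two possible values, each URL ends up with at most two cache entries regardless of how many browser versions visit.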
Performance Characteristics
Latency Breakdown:
Bot Request (First Time - Cache Miss):
┌──────────────────────────────────────┐
│ CloudFront routing: ~5ms │
│ S3 fetch (origin): ~50ms │
│ Lambda@Edge execution: ~10ms │
│ - Bot detection: 1ms │
│ - Metadata lookup: 0.1ms │
│ - HTML injection: 8ms │
│ CloudFront caching: ~2ms │
│ ───────────────────────────────── │
│ Total: ~67ms │
└──────────────────────────────────────┘
Bot Request (Cached):
┌──────────────────────────────────────┐
│ CloudFront routing: ~5ms │
│ Cache hit (no origin): 0ms │
│ Lambda@Edge execution: ~10ms │
│ ───────────────────────────────── │
│ Total: ~15ms │
└──────────────────────────────────────┘
Human Request (Cached):
┌──────────────────────────────────────┐
│ CloudFront routing: ~5ms │
│ Cache hit (no origin): 0ms │
│ Lambda@Edge (no injection): ~2ms │
│ ───────────────────────────────── │
│ Total: ~7ms │
└──────────────────────────────────────┘
Key Metrics:
- Bot meta tag injection: 10-15ms added latency
- Human requests: 2-7ms added latency (negligible)
- Cache hit ratio: 98%+ with proper configuration
- Lambda@Edge cold start: 20-50ms (rare, 1-2% of requests)
Cost Structure
For a site with 1 million requests/month:
Traffic Distribution:
- Total requests: 1,000,000
- Bot traffic: 5% (50,000 requests)
- Human traffic: 95% (950,000 requests)
- Cache hit ratio: 98%
Lambda@Edge Execution:
- Invocations: 1,000,000 (viewer-response runs on every request, including cache hits)
- Average execution: 10ms per request
- Memory: 128MB (fixed)
Cost Breakdown:
┌─────────────────────────────────────────────┐
│ CloudFront │
│ - Requests: 1M × $0.0075/10k = $0.75 │
│ - Data transfer: ~1GB × $0.085 = $0.09 │
│ │
│ Lambda@Edge │
│ - Requests: 1M × $0.60/1M = $0.60 │
│ - Compute: 1M × 10ms at 128MB │
│ = 10,000 s × $0.00000625125 = $0.06 │
│ │
│ Total Monthly Cost: $1.50 │
└─────────────────────────────────────────────┘
Comparison:
- Lambda@Edge solution: ~$1.50/month
- Prerender.io Basic: $99/month
- Prerender.io Pro: $249/month
- Savings: ~98% vs Prerender.io Basic
This architecture delivers production-grade SEO and OGP support for SPAs at minimal cost and latency.
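To re-run the estimate for your own traffic, here's a small calculator sketch. It uses the unit prices quoted in this section (treat them as assumptions and check current AWS pricing), and it counts every request as an invocation because a viewer-response function runs on cache hits as well as misses. The `estimateMonthlyCost` name and its parameters are hypothetical.

```javascript
// Unit prices as quoted in this guide; verify against current AWS pricing.
const PRICES = {
  cloudFrontPer10kRequests: 0.0075,
  cloudFrontPerGbTransfer: 0.085,
  lambdaPerMillionRequests: 0.60,
  lambdaPerGbSecond: 0.00000625125 * 8, // per-128MB-second rate scaled to 1GB
};

function estimateMonthlyCost({ requests, invocations, avgMs, memoryMb, transferGb }) {
  const cloudFront =
    (requests / 10000) * PRICES.cloudFrontPer10kRequests +
    transferGb * PRICES.cloudFrontPerGbTransfer;
  // GB-seconds = invocations × duration in seconds × memory in GB
  const gbSeconds = invocations * (avgMs / 1000) * (memoryMb / 1024);
  const lambda =
    (invocations / 1e6) * PRICES.lambdaPerMillionRequests +
    gbSeconds * PRICES.lambdaPerGbSecond;
  return { cloudFront, lambda, total: cloudFront + lambda };
}

const estimate = estimateMonthlyCost({
  requests: 1000000,
  invocations: 1000000, // viewer-response runs on every request
  avgMs: 10,
  memoryMb: 128,
  transferGb: 1,
});

console.log(estimate.total.toFixed(2));
```

At this scale the per-request charge dominates the Lambda@Edge line, so trimming execution time matters less than the raw number of invocations.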
Implementation: Bot Detection
Now let's implement the core logic. Bot detection is the critical first step—we need to identify when a request comes from a search crawler or social media bot.
Bot User-Agent Patterns
Search engines and social platforms identify their crawlers using User-Agent strings. Here are the key patterns:
// Comprehensive bot detection patterns (2026)
const BOT_USER_AGENTS = [
// Google crawlers
/googlebot/i, // Main Googlebot
/google-inspectiontool/i, // Google Search Console
/adsbot-google/i, // Google Ads bot
// Other search engines
/bingbot/i, // Microsoft Bing
/slurp/i, // Yahoo
/duckduckbot/i, // DuckDuckGo
/baiduspider/i, // Baidu (China)
/yandexbot/i, // Yandex (Russia)
/sogou/i, // Sogou (China)
// Social media crawlers
/facebookexternalhit/i, // Facebook Link Preview
/facebot/i, // Facebook Bot (alternative)
/twitterbot/i, // Twitter Card Validator
/linkedinbot/i, // LinkedIn Preview
/slackbot/i, // Slack Link Unfurling
/discordbot/i, // Discord Link Embed
/whatsapp/i, // WhatsApp Link Preview
/telegrambot/i, // Telegram Link Preview
// Additional important bots
/pinterestbot/i, // Pinterest
/redditbot/i, // Reddit
/applebot/i, // Apple (Siri, Spotlight)
/ia_archiver/i, // Alexa
];
/**
* Detect if the request comes from a bot
* @param {string} userAgent - User-Agent header value
* @returns {boolean} - True if bot detected
*/
function isBot(userAgent) {
if (!userAgent) return false;
return BOT_USER_AGENTS.some(pattern => pattern.test(userAgent));
}
Detection Strategy
Our detection strategy prioritizes simplicity and reliability:
1. User-Agent Matching (Primary)
- Fast: Regex test completes in < 1ms
- Reliable: Bots consistently use identifiable User-Agent strings
- Comprehensive: Covers 99%+ of legitimate crawlers
- Easy to extend: Add new patterns as bots emerge
2. CloudFront Headers (Optional)
CloudFront can add device-detection headers:
// Optional: Use CloudFront device detection headers
const headers = request.headers;
const isMobile = headers['cloudfront-is-mobile-viewer']?.[0]?.value === 'true';
const isDesktop = headers['cloudfront-is-desktop-viewer']?.[0]?.value === 'true';
// CloudFront doesn't provide is-bot header by default
// but User-Agent matching is sufficient
Production Implementation
Here's the complete bot detection logic with proper error handling:
/**
* Lambda@Edge handler for bot detection and meta tag injection
*/
exports.handler = async (event) => {
const { request, response } = event.Records[0].cf;
try {
// Extract User-Agent header
const userAgentHeader = request.headers['user-agent'];
const userAgent = userAgentHeader?.[0]?.value || '';
// Perform bot detection
const isBotRequest = isBot(userAgent);
// Log detection result (CloudWatch)
console.log(JSON.stringify({
timestamp: new Date().toISOString(),
path: request.uri,
userAgent: userAgent.substring(0, 100), // Truncate for logs
isBot: isBotRequest,
country: request.headers['cloudfront-viewer-country']?.[0]?.value
}));
// If not a bot, return original response immediately
if (!isBotRequest) {
return response;
}
// Bot detected: proceed to meta tag injection
// (next section)
return response; // placeholder so the handler always returns a response
} catch (error) {
console.error('Bot detection error:', error);
// Always return response on error (fail gracefully)
return response;
}
};
/**
* Bot detection function
*/
function isBot(userAgent) {
if (!userAgent || typeof userAgent !== 'string') {
return false;
}
return BOT_USER_AGENTS.some(pattern => {
try {
return pattern.test(userAgent);
} catch (err) {
console.error('Regex test error:', err);
return false;
}
});
}
// Bot patterns (from above)
const BOT_USER_AGENTS = [
/googlebot|bingbot|slurp|duckduckbot|baiduspider|yandexbot/i,
/facebookexternalhit|facebot|twitterbot|linkedinbot|slackbot/i,
/discordbot|whatsapp|telegrambot|pinterestbot|redditbot|applebot/i
];
Handling Edge Cases
Missing User-Agent:
// Some requests may not have User-Agent header
const userAgent = request.headers['user-agent']?.[0]?.value || '';
if (!userAgent) {
return false; // Treat as non-bot (default behavior)
}
Malformed User-Agent:
// Wrap regex matching in try-catch
try {
return BOT_USER_AGENTS.some(pattern => pattern.test(userAgent));
} catch (error) {
console.error('Bot detection regex error:', error);
return false; // Fail gracefully, assume non-bot
}
Whitelist for Internal Tools:
// Whitelist certain UAs that match bot patterns but aren't bots
const WHITELIST = [
/my-company-monitoring-tool/i,
/internal-link-checker/i
];
function isBot(userAgent) {
// Check whitelist first
if (WHITELIST.some(pattern => pattern.test(userAgent))) {
return false;
}
// Then check bot patterns
return BOT_USER_AGENTS.some(pattern => pattern.test(userAgent));
}
Testing Bot Detection
Local Testing with Sample Events:
// test-bot-detection.js
// Assumes index.js also exports isBot, e.g. `exports.isBot = isBot;`
const { isBot } = require('./index');
const testCases = [
  {
    name: 'Facebook Bot',
    userAgent: 'facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)',
    expectedBot: true
  },
  {
    name: 'Googlebot',
    userAgent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    expectedBot: true
  },
  {
    name: 'Chrome Browser',
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0.0.0',
    expectedBot: false
  },
  {
    name: 'Twitter Bot',
    userAgent: 'Twitterbot/1.0',
    expectedBot: true
  }
];
// Run each case through isBot and report pass/fail
testCases.forEach(test => {
  const result = isBot(test.userAgent);
  const pass = result === test.expectedBot;
  console.log(`${pass ? '✓' : '✗'} ${test.name}: ${result} (expected ${test.expectedBot})`);
});
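To exercise the full handler locally rather than `isBot` alone, you need an event shaped like the one CloudFront sends. Here's a sketch of a `createMockEvent` helper; it is a hypothetical test utility, trimmed to the fields the handler in this guide reads, not a complete CloudFront payload.

```javascript
// Hypothetical test helper: build a minimal viewer-response event.
function createMockEvent(userAgent, uri = '/blog/my-post') {
  return {
    Records: [{
      cf: {
        config: { eventType: 'viewer-response' },
        request: {
          method: 'GET',
          uri,
          // Header keys are lowercase; each value is an array of { key, value }
          headers: userAgent
            ? { 'user-agent': [{ key: 'User-Agent', value: userAgent }] }
            : {},
        },
        response: {
          status: '200',
          statusDescription: 'OK',
          headers: {
            'content-type': [{ key: 'Content-Type', value: 'text/html' }],
          },
        },
      },
    }],
  };
}

const mockEvent = createMockEvent('Twitterbot/1.0');
console.log(mockEvent.Records[0].cf.request.headers['user-agent'][0].value);
```

Passing an empty string produces an event with no User-Agent header at all, which is useful for checking the missing-header edge case.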
cURL Testing (after deployment):
# Test with Facebook bot
curl -H "User-Agent: facebookexternalhit/1.1" \
https://your-cloudfront-domain.com/blog/post \
-v | grep -i "og:title"
# Test with Googlebot
curl -H "User-Agent: Googlebot/2.1" \
https://your-cloudfront-domain.com/blog/post \
-v | grep -i "og:title"
# Test with normal browser (should NOT inject)
curl -H "User-Agent: Mozilla/5.0 Chrome/120.0" \
https://your-cloudfront-domain.com/blog/post \
-v | grep -i "og:title"
Performance Optimization
Pre-compile Regex Patterns:
// Compile patterns once at Lambda initialization (cold start)
// Not on every request (warm execution)
const COMBINED_BOT_PATTERN = new RegExp(
'googlebot|bingbot|slurp|duckduckbot|baiduspider|yandexbot|' +
'facebookexternalhit|facebot|twitterbot|linkedinbot|slackbot|' +
'discordbot|whatsapp|telegrambot|pinterestbot|redditbot|applebot',
'i'
);
function isBot(userAgent) {
  // Boolean() keeps the return type consistent when userAgent is empty or undefined
  return Boolean(userAgent) && COMBINED_BOT_PATTERN.test(userAgent);
}
This single regex test is slightly faster than testing multiple patterns, but the difference is negligible (< 0.5ms).
Monitoring Bot Traffic
CloudWatch Logs Insights Query:
fields @timestamp, userAgent, path, isBot
| filter isBot = true
| stats count(*) as hits by userAgent
| sort hits desc
| limit 20
This query shows which bots are accessing your site and how frequently.
Implementation: Meta Tag Injection
Once we've detected a bot, we need to inject Open Graph Protocol (OGP) meta tags into the HTML response. This section covers the complete implementation.
Metadata Configuration
First, define metadata for your routes. We'll embed this in the Lambda function for zero-latency lookups:
/**
* Metadata map: URL path → OGP metadata
*/
const METADATA_MAP = {
'/': {
title: 'My SPA - Home',
description: 'Welcome to my modern single-page application',
image: 'https://cdn.example.com/images/og-home.jpg',
url: 'https://example.com/',
type: 'website'
},
'/blog/lambda-edge-seo': {
title: 'Solving SPA SEO with Lambda@Edge - My Blog',
description: 'Learn how to fix SPA SEO and social sharing issues using AWS Lambda@Edge. Complete production guide with code examples.',
image: 'https://cdn.example.com/images/blog/lambda-edge.jpg',
url: 'https://example.com/blog/lambda-edge-seo',
type: 'article'
},
'/about': {
title: 'About Us - My SPA',
description: 'Learn more about our company, mission, and team',
image: 'https://cdn.example.com/images/og-about.jpg',
url: 'https://example.com/about',
type: 'website'
}
};
/**
* Default metadata for unknown routes
*/
const DEFAULT_METADATA = {
title: 'My SPA Site',
description: 'A modern single-page application',
image: 'https://cdn.example.com/images/og-default.jpg',
url: 'https://example.com',
type: 'website',
siteName: 'My SPA'
};
/**
* Get metadata for a given path
* @param {string} path - Request URI path
* @returns {Object} - Metadata object
*/
function getMetadata(path) {
// Direct match
if (METADATA_MAP[path]) {
return { ...DEFAULT_METADATA, ...METADATA_MAP[path] };
}
// Pattern matching for dynamic routes
if (path.startsWith('/blog/')) {
const slug = path.split('/blog/')[1];
return {
...DEFAULT_METADATA,
title: `${formatTitle(slug)} - My Blog`,
description: `Read our post about ${formatTitle(slug)}`,
image: `https://cdn.example.com/images/blog/${slug}.jpg`,
url: `https://example.com${path}`,
type: 'article'
};
}
// Fallback to default
return DEFAULT_METADATA;
}
/**
* Format slug into title case
* @param {string} slug - URL slug (e.g., "my-post-title")
* @returns {string} - Title case string (e.g., "My Post Title")
*/
function formatTitle(slug) {
return slug
.split('-')
.map(word => word.charAt(0).toUpperCase() + word.slice(1))
.join(' ');
}
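The direct lookup above only matches when the path is spelled exactly like the map key, so `/about/` or `/About` would fall through to the defaults. Here's a sketch of a normalization step you could apply before the lookup. The `normalizePath` helper is hypothetical, and lowercasing assumes your routes are case-insensitive.

```javascript
// Sketch: normalize a request URI before looking it up in a metadata map.
function normalizePath(uri) {
  let path = uri || '/';
  // '/about/index.html' -> '/about/'
  path = path.replace(/\/index\.html$/i, '/');
  // Drop trailing slashes, but leave the root path alone
  if (path.length > 1) {
    path = path.replace(/\/+$/, '');
  }
  // Assumption: routes are case-insensitive
  return path === '' ? '/' : path.toLowerCase();
}

console.log(normalizePath('/about/'));
console.log(normalizePath('/Blog/Lambda-Edge-SEO'));
console.log(normalizePath('/index.html'));
```

Calling `getMetadata(normalizePath(request.uri))` instead of `getMetadata(request.uri)` would be the only change needed at the call site.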
Generating Meta Tags
Create properly formatted OGP meta tags with HTML escaping:
/**
* Generate OGP meta tags HTML
* @param {Object} metadata - Metadata object
* @returns {string} - HTML meta tags
*/
function generateMetaTags(metadata) {
const {
title,
description,
image,
url,
type = 'website',
siteName = 'My SPA'
} = metadata;
// Escape HTML special characters
const escape = (str) => {
if (!str) return '';
return String(str)
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&#039;');
};
// Generate meta tags
return `
<!-- Open Graph / Facebook -->
<meta property="og:type" content="${escape(type)}">
<meta property="og:url" content="${escape(url)}">
<meta property="og:title" content="${escape(title)}">
<meta property="og:description" content="${escape(description)}">
<meta property="og:image" content="${escape(image)}">
<meta property="og:site_name" content="${escape(siteName)}">
<!-- Twitter Card -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:url" content="${escape(url)}">
<meta name="twitter:title" content="${escape(title)}">
<meta name="twitter:description" content="${escape(description)}">
<meta name="twitter:image" content="${escape(image)}">
<!-- Standard Meta Tags -->
<meta name="description" content="${escape(description)}">
<title>${escape(title)}</title>
`.trim();
}
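The escaping step matters because titles and descriptions often contain quotes or ampersands that would otherwise break the attribute. Here's a quick standalone check, using the same replacement order as the `escape` helper above (pulled out as `escapeHtml` purely for this example):

```javascript
// Same replacements as the escape helper above; '&' must go first
// so the entities produced by later steps are not escaped again.
function escapeHtml(str) {
  if (!str) return '';
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

const rawTitle = 'Tom & Jerry: "Best <Friends>"';
const ogTitleTag = `<meta property="og:title" content="${escapeHtml(rawTitle)}">`;

console.log(ogTitleTag);
// <meta property="og:title" content="Tom &amp; Jerry: &quot;Best &lt;Friends&gt;&quot;">
```

Without the escaping, the first double quote in the title would close the `content` attribute early and the scraper would read a truncated value.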
HTML Injection Strategy
Inject meta tags into the HTML <head> section by locating the closing </head> tag with a simple string search:
/**
* Inject meta tags into HTML response
* @param {Object} response - CloudFront response object
* @param {Object} metadata - Metadata object
* @returns {Object} - Modified response object
*/
function injectMetaTags(response, metadata) {
// Check if response is HTML
const contentType = response.headers['content-type']?.[0]?.value || '';
if (!contentType.includes('text/html')) {
return response; // Not HTML, return unchanged
}
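try {
// Get response body
let html = response.body;
// Response events do not always carry the origin's body; without it
// there is nothing to inject into, so return the response unchanged
if (typeof html !== 'string' || html.length === 0) {
console.warn('Response body not available; skipping injection');
return response;
}
// Handle base64 encoding (CloudFront may encode)
const isBase64 = response.bodyEncoding === 'base64';
if (isBase64) {
html = Buffer.from(html, 'base64').toString('utf8');
}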
// Generate meta tags
const metaTags = generateMetaTags(metadata);
// Find </head> tag and inject before it
const headCloseIndex = html.indexOf('</head>');
if (headCloseIndex === -1) {
console.warn('No </head> tag found in HTML');
return response; // Malformed HTML, return unchanged
}
// Inject meta tags
const modifiedHtml =
html.slice(0, headCloseIndex) +
metaTags + '\n' +
html.slice(headCloseIndex);
// Update response body
if (isBase64) {
response.body = Buffer.from(modifiedHtml).toString('base64');
response.bodyEncoding = 'base64';
} else {
response.body = modifiedHtml;
}
// Update Content-Length header
const byteLength = Buffer.byteLength(modifiedHtml, 'utf8');
response.headers['content-length'] = [{
key: 'Content-Length',
value: byteLength.toString()
}];
return response;
} catch (error) {
console.error('Meta tag injection error:', error);
return response; // Return original on error
}
}
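One thing to watch: the SPA shell already contains a `<title>` and a `<meta name="description">`, and the injected block adds its own, so the resulting head would carry two of each. Here's a sketch of stripping the shell's copies before injecting. The `stripReplacedTags` helper is hypothetical, and the regexes assume a simple shell like the one in this guide (double-quoted attributes, `name` as the first attribute).

```javascript
// Sketch: remove the shell's <title> and meta description so the
// injected versions are the only ones left in <head>.
function stripReplacedTags(html) {
  return html
    .replace(/<title>[\s\S]*?<\/title>\s*/i, '')
    .replace(/<meta\s+name="description"[^>]*>\s*/i, '');
}

const shellHead =
  '<head>\n' +
  '<title>My App</title>\n' +
  '<meta name="description" content="Generic app description">\n' +
  '</head>';

console.log(stripReplacedTags(shellHead));
```

Calling this on the decoded HTML just before the `</head>` search would keep the rest of `injectMetaTags` unchanged.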
Complete Lambda@Edge Function
Here's the full production-ready implementation:
'use strict';
// Bot detection patterns
const BOT_USER_AGENTS = [
/googlebot|bingbot|slurp|duckduckbot|baiduspider|yandexbot/i,
/facebookexternalhit|facebot|twitterbot|linkedinbot|slackbot/i,
/discordbot|whatsapp|telegrambot|pinterestbot|redditbot|applebot/i
];
// Metadata configuration
const METADATA_MAP = {
'/': {
title: 'My SPA - Home',
description: 'Welcome to my modern single-page application',
image: 'https://cdn.example.com/images/og-home.jpg',
url: 'https://example.com/'
},
'/blog/lambda-edge-seo': {
title: 'Solving SPA SEO with Lambda@Edge',
description: 'Learn how to fix SPA SEO using AWS Lambda@Edge',
image: 'https://cdn.example.com/images/blog/lambda-edge.jpg',
url: 'https://example.com/blog/lambda-edge-seo',
type: 'article'
}
};
const DEFAULT_METADATA = {
title: 'My SPA Site',
description: 'A modern single-page application',
image: 'https://cdn.example.com/images/og-default.jpg',
url: 'https://example.com',
type: 'website',
siteName: 'My SPA'
};
/**
* Lambda@Edge handler
*/
exports.handler = async (event) => {
const { request, response } = event.Records[0].cf;
try {
// Extract User-Agent
const userAgentHeader = request.headers['user-agent'];
const userAgent = userAgentHeader?.[0]?.value || '';
// Detect bot
const isBotRequest = isBot(userAgent);
// Log request
console.log(JSON.stringify({
timestamp: Date.now(),
path: request.uri,
userAgent: userAgent.substring(0, 100),
isBot: isBotRequest
}));
// Return original response if not bot
if (!isBotRequest) {
return response;
}
// Get metadata for this path
const metadata = getMetadata(request.uri);
// Inject meta tags
return injectMetaTags(response, metadata);
} catch (error) {
console.error('Lambda@Edge error:', error);
return response; // Always return response on error
}
};
/**
* Bot detection
*/
function isBot(userAgent) {
if (!userAgent || typeof userAgent !== 'string') {
return false;
}
return BOT_USER_AGENTS.some(pattern => pattern.test(userAgent));
}
/**
* Get metadata for path
*/
function getMetadata(path) {
if (METADATA_MAP[path]) {
return { ...DEFAULT_METADATA, ...METADATA_MAP[path] };
}
// Dynamic route handling
if (path.startsWith('/blog/')) {
const slug = path.split('/blog/')[1];
return {
...DEFAULT_METADATA,
title: `${formatTitle(slug)} - My Blog`,
description: `Read about ${formatTitle(slug)}`,
image: `https://cdn.example.com/images/blog/${slug}.jpg`,
url: `https://example.com${path}`,
type: 'article'
};
}
return DEFAULT_METADATA;
}
/**
* Format slug to title case
*/
function formatTitle(slug) {
return slug
.split('-')
.map(w => w.charAt(0).toUpperCase() + w.slice(1))
.join(' ');
}
/**
* Generate meta tags HTML
*/
function generateMetaTags(metadata) {
const escape = (str) => {
if (!str) return '';
return String(str)
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&#039;');
};
const {
title,
description,
image,
url,
type = 'website',
siteName = 'My SPA'
} = metadata;
return `
<!-- Open Graph / Facebook -->
<meta property="og:type" content="${escape(type)}">
<meta property="og:url" content="${escape(url)}">
<meta property="og:title" content="${escape(title)}">
<meta property="og:description" content="${escape(description)}">
<meta property="og:image" content="${escape(image)}">
<meta property="og:site_name" content="${escape(siteName)}">
<!-- Twitter Card -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:url" content="${escape(url)}">
<meta name="twitter:title" content="${escape(title)}">
<meta name="twitter:description" content="${escape(description)}">
<meta name="twitter:image" content="${escape(image)}">
<!-- Standard Meta Tags -->
<meta name="description" content="${escape(description)}">
<title>${escape(title)}</title>
`.trim();
}
/**
* Inject meta tags into HTML
*/
function injectMetaTags(response, metadata) {
const contentType = response.headers['content-type']?.[0]?.value || '';
if (!contentType.includes('text/html')) {
return response;
}
try {
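let html = response.body;
// Nothing to rewrite if the event carries no body (response events may omit it)
if (typeof html !== 'string' || html.length === 0) {
return response;
}
const isBase64 = response.bodyEncoding === 'base64';
if (isBase64) {
html = Buffer.from(html, 'base64').toString('utf8');
}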
const metaTags = generateMetaTags(metadata);
const headCloseIndex = html.indexOf('</head>');
if (headCloseIndex === -1) {
return response;
}
const modifiedHtml =
html.slice(0, headCloseIndex) +
metaTags + '\n' +
html.slice(headCloseIndex);
if (isBase64) {
response.body = Buffer.from(modifiedHtml).toString('base64');
response.bodyEncoding = 'base64';
} else {
response.body = modifiedHtml;
}
const byteLength = Buffer.byteLength(modifiedHtml, 'utf8');
response.headers['content-length'] = [{
key: 'Content-Length',
value: byteLength.toString()
}];
return response;
} catch (error) {
console.error('Injection error:', error);
return response;
}
}
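One detail the injection step leaves open: most SPA shells already ship a `<title>` in index.html, so appending a second one before `</head>` leaves two titles in the document. A minimal sketch of one way to avoid that, assuming a hypothetical helper named `injectIntoHead` (not part of the function above) that strips any existing `<title>` before inserting the generated tags:

```javascript
'use strict';

// Hypothetical helper: removes an existing <title> element from the head,
// then inserts the generated tags immediately before </head>.
function injectIntoHead(html, metaTags) {
  const headCloseIndex = html.search(/<\/head>/i);
  if (headCloseIndex === -1) {
    return html; // No closing </head>; leave the document untouched
  }
  // Only the part before </head> is searched, so body content is never altered
  const head = html
    .slice(0, headCloseIndex)
    .replace(/<title\b[^>]*>[\s\S]*?<\/title>\s*/i, '');
  return head + metaTags + '\n' + html.slice(headCloseIndex);
}

// Example shell with a placeholder title (illustrative markup)
const shell =
  '<html><head><title>Loading...</title></head><body><div id="root"></div></body></html>';
const result = injectIntoHead(shell, '<title>My SPA - Home</title>');
console.log(result);
```

The helper works on the decoded HTML string, so it could slot in where `modifiedHtml` is assembled in `injectMetaTags`.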
OGP Image Requirements
For optimal social sharing, follow these image specifications:
// Facebook OGP image best practices (2026)
const OG_IMAGE_SPECS = {
// Recommended dimensions
recommended: { width: 1200, height: 630 },
aspectRatio: '1.91:1',
// Minimum dimensions
minimum: { width: 600, height: 315 },
// Maximum size
maxFileSize: '8 MB',
// Supported formats
formats: ['JPG', 'PNG', 'WebP', 'GIF'],
// Protocol
protocol: 'https://', // Required for Facebook
// Multiple images
fallback: 'First og:image tag is primary'
};
// Example with structured properties for a single image
const metaTags = `
<meta property="og:image" content="https://cdn.example.com/primary.jpg">
<meta property="og:image:secure_url" content="https://cdn.example.com/primary.jpg">
<meta property="og:image:width" content="1200">
<meta property="og:image:height" content="630">
<meta property="og:image:alt" content="Image description for accessibility">
`;
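To serve several images, each `og:image` tag can be followed by its own structured properties. The sketch below shows one way to generate those groups from plain objects. The helper name `buildImageTags`, the object shape, and the secondary image URL are assumptions made for illustration; the HTTPS check mirrors the protocol requirement listed above.

```javascript
'use strict';

// Escape characters that would break out of an HTML attribute value
const escapeAttr = (value) => String(value)
  .replace(/&/g, '&amp;')
  .replace(/</g, '&lt;')
  .replace(/>/g, '&gt;')
  .replace(/"/g, '&quot;');

// Hypothetical helper: builds the og:image tag group for one image.
// Returns an empty string for non-HTTPS URLs so they are skipped.
function buildImageTags({ url, width, height, alt }) {
  if (!/^https:\/\//i.test(url)) {
    return '';
  }
  const tags = [`<meta property="og:image" content="${escapeAttr(url)}">`];
  if (width && height) {
    tags.push(`<meta property="og:image:width" content="${escapeAttr(width)}">`);
    tags.push(`<meta property="og:image:height" content="${escapeAttr(height)}">`);
  }
  if (alt) {
    tags.push(`<meta property="og:image:alt" content="${escapeAttr(alt)}">`);
  }
  return tags.join('\n');
}

// One tag group per image; the first group is the primary image
const imageTags = [
  { url: 'https://cdn.example.com/primary.jpg', width: 1200, height: 630, alt: 'Primary preview' },
  { url: 'https://cdn.example.com/secondary.jpg', width: 600, height: 315 }
].map(buildImageTags).filter(Boolean).join('\n');

console.log(imageTags);
```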
This completes the meta tag injection implementation. Next, we'll deploy this infrastructure using Terraform.
Infrastructure as Code with Terraform
Now let's deploy the complete infrastructure using Terraform. This IaC approach ensures reproducible, version-controlled deployments.
Want to jump straight to deployment? Clone the repository and follow the README:
git clone https://github.com/khuongdo/lambda-edge-spa-seo.git
cd lambda-edge-spa-seo
Prerequisites
# Install Terraform (if not already installed)
brew tap hashicorp/tap && brew install hashicorp/tap/terraform # macOS
# or download from https://developer.hashicorp.com/terraform/install
# Configure AWS credentials
aws configure
# Enter your AWS Access Key ID, Secret Access Key, and region
# Verify setup
terraform --version
aws sts get-caller-identity
Project Structure
spa-seo-lambda-edge/
├── terraform/
│ ├── main.tf # Main infrastructure
│ ├── variables.tf # Input variables
│ ├── outputs.tf # Output values
│ └── lambda.tf # Lambda function config
├── lambda/
│ └── index.js # Lambda@Edge function
└── README.md
Lambda Function Packaging
First, save the Lambda function code:
# Create Lambda function file
mkdir -p lambda
cat > lambda/index.js << 'EOF'
'use strict';
const BOT_USER_AGENTS = [
/googlebot|bingbot|slurp|duckduckbot|baiduspider|yandexbot/i,
/facebookexternalhit|facebot|twitterbot|linkedinbot|slackbot/i,
/discordbot|whatsapp|telegrambot|pinterestbot|redditbot|applebot/i
];
const METADATA_MAP = {
'/': {
title: 'My SPA - Home',
description: 'Welcome to my modern single-page application',
image: 'https://cdn.example.com/images/og-home.jpg',
url: 'https://example.com/'
},
'/blog/lambda-edge-seo': {
title: 'Solving SPA SEO with Lambda@Edge',
description: 'Learn how to fix SPA SEO using AWS Lambda@Edge',
image: 'https://cdn.example.com/images/blog/lambda-edge.jpg',
url: 'https://example.com/blog/lambda-edge-seo',
type: 'article'
}
};
const DEFAULT_METADATA = {
title: 'My SPA Site',
description: 'A modern single-page application',
image: 'https://cdn.example.com/images/og-default.jpg',
url: 'https://example.com',
type: 'website',
siteName: 'My SPA'
};
exports.handler = async (event) => {
const { request, response } = event.Records[0].cf;
try {
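const userAgent = request.headers['user-agent']?.[0]?.value || '';
const isBotRequest = isBot(userAgent);
// Structured log line; the CloudWatch query in the testing section relies on these fields
console.log(JSON.stringify({
timestamp: Date.now(),
path: request.uri,
userAgent: userAgent.substring(0, 100),
isBot: isBotRequest
}));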
if (!isBotRequest) {
return response;
}
const metadata = getMetadata(request.uri);
return injectMetaTags(response, metadata);
} catch (error) {
console.error('Lambda@Edge error:', error);
return response;
}
};
function isBot(userAgent) {
if (!userAgent) return false;
return BOT_USER_AGENTS.some(pattern => pattern.test(userAgent));
}
function getMetadata(path) {
if (METADATA_MAP[path]) {
return { ...DEFAULT_METADATA, ...METADATA_MAP[path] };
}
if (path.startsWith('/blog/')) {
const slug = path.split('/blog/')[1];
return {
...DEFAULT_METADATA,
title: `${formatTitle(slug)} - My Blog`,
description: `Read about ${formatTitle(slug)}`,
image: `https://cdn.example.com/images/blog/${slug}.jpg`,
url: `https://example.com${path}`,
type: 'article'
};
}
return DEFAULT_METADATA;
}
function formatTitle(slug) {
return slug.split('-').map(w => w.charAt(0).toUpperCase() + w.slice(1)).join(' ');
}
function generateMetaTags(metadata) {
const escape = (str) => {
if (!str) return '';
return String(str)
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&#039;');
};
const { title, description, image, url, type = 'website', siteName = 'My SPA' } = metadata;
return `
<!-- Open Graph / Facebook -->
<meta property="og:type" content="${escape(type)}">
<meta property="og:url" content="${escape(url)}">
<meta property="og:title" content="${escape(title)}">
<meta property="og:description" content="${escape(description)}">
<meta property="og:image" content="${escape(image)}">
<meta property="og:site_name" content="${escape(siteName)}">
<!-- Twitter Card -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:url" content="${escape(url)}">
<meta name="twitter:title" content="${escape(title)}">
<meta name="twitter:description" content="${escape(description)}">
<meta name="twitter:image" content="${escape(image)}">
<!-- Standard Meta Tags -->
<meta name="description" content="${escape(description)}">
<title>${escape(title)}</title>
`.trim();
}
function injectMetaTags(response, metadata) {
const contentType = response.headers['content-type']?.[0]?.value || '';
if (!contentType.includes('text/html')) {
return response;
}
try {
let html = response.body;
const isBase64 = response.bodyEncoding === 'base64';
if (isBase64) {
html = Buffer.from(html, 'base64').toString('utf8');
}
const metaTags = generateMetaTags(metadata);
const headCloseIndex = html.indexOf('</head>');
if (headCloseIndex === -1) {
return response;
}
const modifiedHtml = html.slice(0, headCloseIndex) + metaTags + '\n' + html.slice(headCloseIndex);
if (isBase64) {
response.body = Buffer.from(modifiedHtml).toString('base64');
response.bodyEncoding = 'base64';
} else {
response.body = modifiedHtml;
}
response.headers['content-length'] = [{
key: 'Content-Length',
value: Buffer.byteLength(modifiedHtml, 'utf8').toString()
}];
return response;
} catch (error) {
console.error('Injection error:', error);
return response;
}
}
EOF
# Package Lambda function
cd lambda
zip -q function.zip index.js
cd ..
Complete Terraform Configuration
Create the Terraform files:
# terraform/main.tf
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
# Lambda@Edge MUST be in us-east-1
provider "aws" {
region = "us-east-1"
alias = "us_east_1"
}
provider "aws" {
region = var.aws_region
}
# S3 Bucket for SPA static files
resource "aws_s3_bucket" "spa_bucket" {
bucket = var.s3_bucket_name
tags = {
Name = "SPA Static Assets"
Environment = var.environment
}
}
# Block public access
resource "aws_s3_bucket_public_access_block" "spa_bucket" {
bucket = aws_s3_bucket.spa_bucket.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
# CloudFront Origin Access Identity
resource "aws_cloudfront_origin_access_identity" "spa_oai" {
comment = "OAI for ${var.s3_bucket_name}"
}
# S3 bucket policy - allow CloudFront OAI
resource "aws_s3_bucket_policy" "spa_bucket_policy" {
bucket = aws_s3_bucket.spa_bucket.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "AllowCloudFrontOAI"
Effect = "Allow"
Principal = {
AWS = aws_cloudfront_origin_access_identity.spa_oai.iam_arn
}
Action = "s3:GetObject"
Resource = "${aws_s3_bucket.spa_bucket.arn}/*"
}
]
})
}
# IAM Role for Lambda@Edge
resource "aws_iam_role" "lambda_edge_role" {
provider = aws.us_east_1
name = "${var.project_name}-lambda-edge-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = [
"lambda.amazonaws.com",
"edgelambda.amazonaws.com"
]
}
}
]
})
}
# IAM Policy for Lambda@Edge
resource "aws_iam_role_policy" "lambda_edge_policy" {
provider = aws.us_east_1
role = aws_iam_role.lambda_edge_role.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
]
Resource = "arn:aws:logs:*:*:*"
}
]
})
}
# Lambda@Edge Function
resource "aws_lambda_function" "seo_ogp_injector" {
provider = aws.us_east_1
filename = "${path.module}/../lambda/function.zip"
function_name = "${var.project_name}-seo-ogp-injector"
role = aws_iam_role.lambda_edge_role.arn
handler = "index.handler"
runtime = "nodejs20.x"
publish = true # Required for Lambda@Edge
timeout = 5
memory_size = 128
source_code_hash = filebase64sha256("${path.module}/../lambda/function.zip")
tags = {
Name = "SEO OGP Injector"
Environment = var.environment
}
}
# CloudFront Distribution
resource "aws_cloudfront_distribution" "spa_distribution" {
enabled = true
is_ipv6_enabled = true
default_root_object = "index.html"
price_class = var.cloudfront_price_class
comment = "${var.project_name} SPA with Lambda@Edge SEO"
origin {
domain_name = aws_s3_bucket.spa_bucket.bucket_regional_domain_name
origin_id = "S3-${var.s3_bucket_name}"
s3_origin_config {
origin_access_identity = aws_cloudfront_origin_access_identity.spa_oai.cloudfront_access_identity_path
}
}
default_cache_behavior {
allowed_methods = ["GET", "HEAD", "OPTIONS"]
cached_methods = ["GET", "HEAD"]
target_origin_id = "S3-${var.s3_bucket_name}"
forwarded_values {
query_string = false
headers = ["User-Agent"] # Required for bot detection
cookies {
forward = "none"
}
}
viewer_protocol_policy = "redirect-to-https"
min_ttl = 0
default_ttl = 86400 # 24 hours
max_ttl = 604800 # 7 days
compress = true
# Lambda@Edge association
lambda_function_association {
event_type = "viewer-response"
lambda_arn = aws_lambda_function.seo_ogp_injector.qualified_arn
include_body = false
}
}
# SPA routing: 404 -> index.html
custom_error_response {
error_code = 404
response_code = 200
response_page_path = "/index.html"
}
custom_error_response {
error_code = 403
response_code = 200
response_page_path = "/index.html"
}
restrictions {
geo_restriction {
restriction_type = "none"
}
}
viewer_certificate {
cloudfront_default_certificate = true
# For custom domain:
# acm_certificate_arn = var.acm_certificate_arn
# ssl_support_method = "sni-only"
# minimum_protocol_version = "TLSv1.2_2021"
}
tags = {
Name = "${var.project_name} CloudFront"
Environment = var.environment
}
}
# terraform/variables.tf
variable "aws_region" {
description = "AWS region for resources (CloudFront is global)"
type = string
default = "us-east-1"
}
variable "project_name" {
description = "Project name for resource naming"
type = string
default = "spa-seo-lambda-edge"
}
variable "s3_bucket_name" {
description = "S3 bucket name for SPA static files"
type = string
}
variable "environment" {
description = "Environment (dev, staging, production)"
type = string
default = "production"
}
variable "cloudfront_price_class" {
description = "CloudFront price class (PriceClass_100, PriceClass_200, PriceClass_All)"
type = string
default = "PriceClass_100" # US, Canada, Europe
}
# terraform/outputs.tf
output "cloudfront_domain_name" {
value = aws_cloudfront_distribution.spa_distribution.domain_name
description = "CloudFront distribution domain name"
}
output "cloudfront_distribution_id" {
value = aws_cloudfront_distribution.spa_distribution.id
description = "CloudFront distribution ID (for cache invalidation)"
}
output "s3_bucket_name" {
value = aws_s3_bucket.spa_bucket.id
description = "S3 bucket name"
}
output "lambda_function_arn" {
value = aws_lambda_function.seo_ogp_injector.qualified_arn
description = "Lambda@Edge function ARN with version"
}
output "cloudfront_url" {
value = "https://${aws_cloudfront_distribution.spa_distribution.domain_name}"
description = "Full CloudFront URL"
}
Deployment Steps
# 1. Create terraform.tfvars
cat > terraform/terraform.tfvars << 'EOF'
s3_bucket_name = "my-unique-spa-bucket-name-12345"
project_name = "my-spa-seo"
environment = "production"
EOF
# 2. Initialize Terraform
cd terraform
terraform init
# 3. Plan deployment (review changes)
terraform plan
# 4. Apply infrastructure
terraform apply
# Review the plan, type 'yes' to proceed
# 5. Deploy SPA files to S3
# (Replace with your SPA build output directory)
aws s3 sync ../spa-dist s3://my-unique-spa-bucket-name-12345/ --delete
# 6. Invalidate CloudFront cache
DISTRIBUTION_ID=$(terraform output -raw cloudfront_distribution_id)
aws cloudfront create-invalidation \
--distribution-id $DISTRIBUTION_ID \
--paths "/*"
# 7. Get CloudFront URL
terraform output cloudfront_url
Important Notes
Lambda@Edge Deployment Time:
- Lambda@Edge replication to all edge locations takes 15-30 minutes
- During this time, some requests may not have the Lambda@Edge function available
- Plan deployments accordingly (use blue/green or canary deployments for production)
Updating Lambda Function:
# After modifying lambda/index.js:
# 1. Re-package
cd lambda
zip -q function.zip index.js
cd ..
# 2. Apply Terraform
cd terraform
terraform apply
# 3. Wait for replication (15-30 minutes)
Cost Estimation:
# Use AWS Pricing Calculator
# https://calculator.aws/
# Typical monthly cost for 1M requests:
# - CloudFront: $0.75-1.00
# - Lambda@Edge: $0.60-0.70 ($0.60 per 1M invocations plus duration)
# - S3: $0.10-0.20
# Total: ~$1.45-1.90/month
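The Lambda@Edge line can be sanity-checked with a little arithmetic. This is a sketch under stated assumptions: list prices of $0.60 per million requests and $0.00005001 per GB-second (verify both against the current pricing page), the 128 MB memory size configured earlier, and a 10 ms average duration. The function name `estimateLambdaEdgeCost` is illustrative.

```javascript
'use strict';

// Assumed list prices in USD; check current AWS pricing before relying on them
const PRICE_PER_MILLION_REQUESTS = 0.60;
const PRICE_PER_GB_SECOND = 0.00005001;

// Returns the request charge, the duration charge, and their sum
function estimateLambdaEdgeCost({ requests, avgDurationMs, memoryMb }) {
  const requestCost = (requests / 1e6) * PRICE_PER_MILLION_REQUESTS;
  const gbSeconds = requests * (avgDurationMs / 1000) * (memoryMb / 1024);
  const durationCost = gbSeconds * PRICE_PER_GB_SECOND;
  return { requestCost, durationCost, total: requestCost + durationCost };
}

const estimate = estimateLambdaEdgeCost({
  requests: 1e6,      // monthly viewer requests
  avgDurationMs: 10,  // assumed average execution time
  memoryMb: 128       // matches memory_size in the Terraform config
});

console.log(estimate.total.toFixed(2)); // prints "0.66"
```

A viewer-response trigger runs on every request, cache hits included, so `requests` here means total viewer traffic rather than bot traffic alone.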
Testing & Validation
After deployment, validate the implementation using multiple methods.
1. Local cURL Testing
# Set your CloudFront domain
DOMAIN="d1234abcd5678.cloudfront.net"
# Test 1: Facebook bot (should inject meta tags)
curl -H "User-Agent: facebookexternalhit/1.1" \
https://$DOMAIN/ | grep -i "og:title"
# Expected output:
# <meta property="og:title" content="My SPA - Home">
# Test 2: Googlebot (should inject meta tags)
curl -H "User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1)" \
https://$DOMAIN/blog/test-post | grep -i "og:title"
# Test 3: Normal browser (should NOT inject)
curl -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0" \
https://$DOMAIN/ | grep -i "og:title"
# Expected: No og:title in output (original SPA shell)
2. Social Media Validators
Facebook Sharing Debugger:
- Visit https://developers.facebook.com/tools/debug/
- Enter your CloudFront URL: https://d1234abcd5678.cloudfront.net/blog/post
- Click "Scrape Again" to refresh cache
- Verify preview shows correct:
- Title
- Description
- Image (1200x630 recommended)
Twitter Card Validator:
- Visit https://cards-dev.twitter.com/validator
- Enter your URL
- Verify the card renders correctly (if the validator does not show a preview, paste the URL into the post composer instead)
- Check that both og:image and twitter:image are present
LinkedIn Post Inspector:
- Visit https://www.linkedin.com/post-inspector/
- Enter your URL
- Inspect the preview
- Verify professional appearance
3. CloudWatch Logs Verification
# Lambda@Edge writes logs to the region nearest the edge location that served the request,
# so check the region closest to where the test traffic originated
# US East Coast
aws logs tail /aws/lambda/us-east-1.my-spa-seo-seo-ogp-injector \
--region us-east-1 \
--follow
# Europe
aws logs tail /aws/lambda/eu-west-1.my-spa-seo-seo-ogp-injector \
--region eu-west-1 \
--follow
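# CloudWatch Logs Insights query for bot detection stats
# (date -d is GNU date; on macOS use: date -u -v-1H +%s)
QUERY_ID=$(aws logs start-query \
--region us-east-1 \
--log-group-name /aws/lambda/us-east-1.my-spa-seo-seo-ogp-injector \
--start-time $(date -u -d '1 hour ago' +%s) \
--end-time $(date -u +%s) \
--query-string 'fields @timestamp, userAgent, path, isBot | filter isBot = true | stats count() by userAgent' \
--query 'queryId' --output text)
# Queries run asynchronously; fetch the results once the status is Complete
aws logs get-query-results --region us-east-1 --query-id "$QUERY_ID"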
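The same tally can be computed locally from exported log lines, which helps when comparing several regions. A rough sketch, assuming each line contains the JSON object the handler logs (fields `path`, `userAgent`, `isBot`) after a runtime-added prefix. The helper names and sample lines are made up for illustration.

```javascript
'use strict';

// Extract the JSON payload from a log line; returns null if none parses
function parseLogLine(line) {
  const start = line.indexOf('{');
  if (start === -1) return null;
  try {
    return JSON.parse(line.slice(start));
  } catch {
    return null;
  }
}

// Count bot requests per user agent across all lines
function countBotsByUserAgent(lines) {
  const counts = new Map();
  for (const line of lines) {
    const entry = parseLogLine(line);
    if (!entry || entry.isBot !== true) continue;
    counts.set(entry.userAgent, (counts.get(entry.userAgent) || 0) + 1);
  }
  return counts;
}

// Sample lines shaped like the handler's log output (values are invented)
const sampleLines = [
  '2025-01-01T00:00:00.000Z\treq-1\tINFO\t{"timestamp":1,"path":"/","userAgent":"facebookexternalhit/1.1","isBot":true}',
  '2025-01-01T00:00:01.000Z\treq-2\tINFO\t{"timestamp":2,"path":"/","userAgent":"Mozilla/5.0 Chrome/120.0","isBot":false}',
  '2025-01-01T00:00:02.000Z\treq-3\tINFO\t{"timestamp":3,"path":"/blog/a","userAgent":"facebookexternalhit/1.1","isBot":true}',
  'START RequestId: req-4 Version: 1'
];

const botCounts = countBotsByUserAgent(sampleLines);
console.log(Object.fromEntries(botCounts));
```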
4. Performance Testing
# Measure latency with ApacheBench
ab -n 100 -c 10 \
-H "User-Agent: facebookexternalhit/1.1" \
https://$DOMAIN/
# Key metrics to check:
# - Time per request: should be < 100ms (after cache warm-up)
# - Failed requests: should be 0
# - 95th percentile: should be < 150ms
5. Automated Testing Script
#!/bin/bash
# test-lambda-edge.sh
DOMAIN="$1"
if [ -z "$DOMAIN" ]; then
echo "Usage: ./test-lambda-edge.sh <cloudfront-domain>"
exit 1
fi
echo "Testing Lambda@Edge deployment on: $DOMAIN"
echo "============================================"
# Test bots (should have meta tags)
echo -e "\n1. Testing Facebook bot..."
RESULT=$(curl -s -H "User-Agent: facebookexternalhit/1.1" https://$DOMAIN/ | grep -c "og:title")
if [ "$RESULT" -gt 0 ]; then
echo "✓ Facebook bot: Meta tags injected"
else
echo "✗ Facebook bot: Meta tags NOT found"
fi
echo -e "\n2. Testing Googlebot..."
RESULT=$(curl -s -H "User-Agent: Googlebot/2.1" https://$DOMAIN/ | grep -c "og:title")
if [ "$RESULT" -gt 0 ]; then
echo "✓ Googlebot: Meta tags injected"
else
echo "✗ Googlebot: Meta tags NOT found"
fi
# Test human (should NOT have injected meta tags in SPA shell)
echo -e "\n3. Testing human browser..."
RESULT=$(curl -s -H "User-Agent: Mozilla/5.0 Chrome/120.0" https://$DOMAIN/ | grep -c "og:title")
if [ "$RESULT" -eq 0 ]; then
echo "✓ Human browser: Original SPA shell returned"
else
echo "⚠ Human browser: Unexpected meta tags found"
fi
echo -e "\n============================================"
echo "Testing complete!"
Run the test script:
chmod +x test-lambda-edge.sh
./test-lambda-edge.sh d1234abcd5678.cloudfront.net
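The same checks can be written in Node.js if you would rather run them from a JavaScript test suite. This is a sketch, not part of the repository: `runChecks` and the `CASES` table are illustrative names, and the global `fetch` assumes Node.js 18 or later.

```javascript
'use strict';

// Each case pairs a User-Agent with whether injected tags are expected
const CASES = [
  { name: 'Facebook bot', userAgent: 'facebookexternalhit/1.1', expectTags: true },
  { name: 'Googlebot', userAgent: 'Googlebot/2.1', expectTags: true },
  { name: 'Human browser', userAgent: 'Mozilla/5.0 Chrome/120.0', expectTags: false }
];

// fetchImpl is injectable so the checks can be exercised without a network;
// it defaults to the global fetch.
async function runChecks(domain, fetchImpl = fetch) {
  const results = [];
  for (const { name, userAgent, expectTags } of CASES) {
    const res = await fetchImpl(`https://${domain}/`, {
      headers: { 'User-Agent': userAgent }
    });
    const html = await res.text();
    results.push({ name, passed: html.includes('og:title') === expectTags });
  }
  return results;
}

// Example invocation against a deployed distribution:
// runChecks('d1234abcd5678.cloudfront.net').then(console.table);
```

Returning a results array rather than printing lets a caller fail a CI job when any entry has `passed: false`.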
Common Issues & Solutions
| Issue | Cause | Solution |
|---|---|---|
| Meta tags not appearing | User-Agent header not forwarded | Check CloudFront cache behavior forwarded_values.headers = ["User-Agent"] |
| Old content cached | CloudFront cache not invalidated | Run aws cloudfront create-invalidation --distribution-id XXX --paths "/*" |
| Lambda not executing | Function not published | Ensure publish = true in Terraform |
| 403 Access Denied | S3 bucket policy missing | Verify OAI has GetObject permission |
| Logs not visible | Looking in wrong region | Check CloudWatch Logs in edge region closest to you |
Conclusion
You've now implemented a production-grade solution for SPA SEO and social media sharing using AWS Lambda@Edge. This approach delivers:
Key Benefits:
- ✅ Cost-effective: ~$1-2/month vs $99-249/month for pre-rendering services
- ✅ Low-latency: 8-15ms execution time at the edge
- ✅ Framework-agnostic: Works with any SPA (React, Vue, Angular, Svelte)
- ✅ Non-invasive: No changes to existing SPA code
- ✅ Production-ready: Complete with error handling and monitoring
When to Use This Approach:
- ✅ Existing SPA that can't be easily refactored to SSR
- ✅ Need SEO and social sharing improvements quickly
- ✅ Cost-conscious (< $5/month for most sites)
- ✅ Already using AWS CloudFront
- ✅ Want framework-agnostic solution
When NOT to Use This:
- ❌ Building new application → Use SSR framework (Next.js, Nuxt, SvelteKit)
- ❌ Need real-time dynamic content in meta tags → Consider full SSR
- ❌ Very complex metadata logic → May need origin-request trigger
- ❌ Not on AWS → Look at Cloudflare Workers or Vercel Edge Functions
Next Steps
Customize metadata:
- Update METADATA_MAP in the Lambda function
- Add routes specific to your application
- Configure fallback metadata
Optimize for your use case:
- Adjust CloudFront cache TTLs
- Monitor bot traffic patterns
- Fine-tune bot detection patterns
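Fine-tuning the patterns is easier with a small table of sample user agents to run them against. The sketch below reuses the `BOT_USER_AGENTS` patterns and `isBot` from the Lambda function. The sample strings are illustrative; replace them with user agents taken from your own logs.

```javascript
'use strict';

const BOT_USER_AGENTS = [
  /googlebot|bingbot|slurp|duckduckbot|baiduspider|yandexbot/i,
  /facebookexternalhit|facebot|twitterbot|linkedinbot|slackbot/i,
  /discordbot|whatsapp|telegrambot|pinterestbot|redditbot|applebot/i
];

function isBot(userAgent) {
  if (!userAgent || typeof userAgent !== 'string') {
    return false;
  }
  return BOT_USER_AGENTS.some((pattern) => pattern.test(userAgent));
}

// Illustrative samples: each user agent with the classification we expect
const samples = [
  { ua: 'facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)', expected: true },
  { ua: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', expected: true },
  { ua: 'Slackbot-LinkExpanding 1.0 (+https://api.slack.com/robots)', expected: true },
  { ua: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36', expected: false }
];

// Any entry left here is a pattern that needs adjusting
const failures = samples.filter(({ ua, expected }) => isBot(ua) !== expected);
console.log(failures.length === 0 ? 'All samples classified as expected' : failures);
```

The desktop Chrome sample guards against false positives: it contains "AppleWebKit", which must not match the `applebot` pattern.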
Add custom domain (optional):
# In terraform/main.tf viewer_certificate block:
viewer_certificate {
acm_certificate_arn = "arn:aws:acm:us-east-1:ACCOUNT:certificate/ID"
ssl_support_method = "sni-only"
minimum_protocol_version = "TLSv1.2_2021"
}
# Add aliases
aliases = ["www.example.com", "example.com"]
Set up monitoring:
- CloudWatch dashboard for Lambda invocations
- Alerts for error rates > 1%
- Cost monitoring with AWS Budgets
Consider advanced features:
- Multi-language support (og:locale)
- A/B testing different meta tags
- Dynamic image generation for OG images
- Structured data (JSON-LD) injection
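As an illustration of the last item, structured data can be generated from the same metadata object used for the Open Graph tags. This is a sketch under assumptions: `generateJsonLd` is a hypothetical helper, and the mapping to the schema.org `Article` and `WebPage` types is one reasonable choice rather than the only one.

```javascript
'use strict';

// Hypothetical helper: serializes page metadata as a JSON-LD script tag
function generateJsonLd({ title, description, image, url, type }) {
  const data = {
    '@context': 'https://schema.org',
    '@type': type === 'article' ? 'Article' : 'WebPage',
    name: title,
    description,
    image,
    url
  };
  // Escape "<" so a value containing "</script>" cannot close the tag early
  const json = JSON.stringify(data).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">${json}</script>`;
}

const jsonLdTag = generateJsonLd({
  title: 'Solving SPA SEO with Lambda@Edge',
  description: 'Learn how to fix SPA SEO using AWS Lambda@Edge',
  image: 'https://cdn.example.com/images/blog/lambda-edge.jpg',
  url: 'https://example.com/blog/lambda-edge-seo',
  type: 'article'
});

console.log(jsonLdTag);
```

The returned string can be appended to the generated meta tags before they are inserted into the head.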
Resources
Code & Examples:
- GitHub Repository - Complete working implementation with Terraform, tests, and example SPA
SEO & Social Media:
- Open Graph Protocol Specification
- Facebook Sharing Debugger
- Twitter Card Validator
- Google Search Central: JavaScript SEO
Lambda@Edge offers an elegant solution for SPAs struggling with SEO and social sharing. By intercepting bot traffic at the edge and dynamically injecting meta tags, you get the best of both worlds: modern SPA architecture for users and SEO-friendly responses for crawlers—all without rewriting your application.
If you found this guide helpful, share it on social media to test your OGP implementation!