SvelteKit Translation Caching: Upstash Redis + KV Variable Naming Explained

The problem article four leaves unsolved

After article four, on-demand translation works correctly but without any caching. Without a cache, every request to a translated article triggers a Claude API call, a cost that grows linearly with traffic and is entirely avoidable.

The request flow after this article:

request → cache hit  → serve immediately (zero API cost)
request → cache miss → Claude API → render → store HTML in cache → serve

The cache stores pre-rendered, Shiki-highlighted HTML - not raw prose segments. This means cache hits require zero additional work: no re-reading the source file, no segment reassembly, no Shiki pass. The server reads one Redis key and returns the string directly.

Two caching layers work together:

Redis - stores the final HTML per article per locale. Complete translations are stored permanently with no TTL; they only expire when the article source changes and a GitHub Actions deploy hook invalidates them. Incomplete translations (where the model dropped some segments) get a 2-hour TTL so they automatically retry without manual intervention.
Vercel CDN - caches the full HTTP response at the edge. The cache lifetime is set dynamically from the load function based on missingCount: permanent for complete translations, 2 hours for incomplete ones. This keeps both layers in sync rather than applying a flat expiration to everything.

One important clarification after production hardening: this Redis layer reduces translation cost and repeat latency, but on its own it was not the main fix for the later EMFILE incident. The real runtime failure came from excessive Redis fan-out on the admin analytics path. Caching helps, and lazy client initialisation helps, but Redis-heavy admin routes still need bounded concurrency.

If you want the deeper production debugging context behind those hardening choices, see Debugging EMFILE on Vercel in a Serverless SvelteKit App.

A note on Vercel KV

If you have seen @vercel/kv in older tutorials, be aware that Vercel KV has been sunset. New projects can no longer create Vercel KV stores. The replacement is Upstash Redis, available through the Vercel Marketplace.

However, Vercel kept the KV_ naming convention for environment variables when it migrated to Upstash, for backward compatibility with existing apps. This means the variable names Vercel generates look like KV_REST_API_URL rather than UPSTASH_REDIS_REST_URL, even though the underlying service is now Upstash. This naming difference is explained fully in step 2d below.

Do you need Vercel Pro?

No. Upstash Redis through the Vercel Marketplace is available on the free Hobby plan. The Upstash free tier includes 10,000 commands per day and 256 MB of storage - more than sufficient for a translation cache.

The number that tends to mislead here is visitor count. The instinct is to reason that 1,000 visitors per day across two locales means roughly 1,000 Redis reads per day, or 10 percent of the free tier limit. That sounds fine, but it is the wrong mental model. Understanding why it is wrong is what makes the free tier feel genuinely spacious rather than barely adequate.

The critical thing to understand is how the CDN layer changes the Redis math. Once a translated article has been served once, Vercel CDN caches the full HTTP response at the edge with s-maxage=31536000 (one year for complete translations). Every subsequent visitor to that same URL is served directly from the CDN, which means the SvelteKit server is never reached and Redis is never queried.

This means Redis commands scale with unique article-locale pairs receiving their first visit, not with visitor count. To see why, it helps to walk through what actually happens per command type.

A Redis read happens only when the CDN misses - that is, when a translated URL has never been cached before, when an article has been invalidated after a source update, or when an incomplete translation’s 2-hour TTL has expired and the CDN re-fetches from the server. Every other request for that URL hits the CDN edge and costs zero Redis commands.

A Redis write happens when a translation is produced for the first time: one write for the full HTML entry (t: key) and one for the compact title map (tm: key), so 2 writes per article-locale pair. An invalidation on article update costs 2 deletes for the same pair of keys.

That gives a concrete formula for daily Redis usage at steady state:

Daily Redis commands =
  (new articles published per day × locales × 3)   ← 1 read miss + 2 writes per new pair
  + (articles updated per day × locales × 5)        ← 2 deletes, then 1 read miss + 2 writes per updated pair
  + (incomplete TTL expirations × 3)                ← re-read miss + 2 writes on retry

To put real numbers to it: suppose you publish 3 articles per week in 2 locales and update 1 existing article per week. New article-locale pairs per week: 3 × 2 = 6, costing 6 × 3 = 18 Redis commands. The one updated article covers 2 pairs, so it costs 2 × 2 = 4 deletes, and the first visitor to each pair after invalidation triggers 1 read miss + 2 writes, which is 2 × 3 = 6 more, for 10 in total. Weekly commands: 18 + 10 = 28, or 4 commands per day from a publishing cadence that most developers would consider quite active.

For your existing catalog, say 50 articles across 2 locales, once every article-locale pair has been visited at least once and cached by the CDN with a one-year TTL, those 100 pairs contribute essentially zero ongoing Redis commands regardless of traffic volume. A thousand visitors per day all hitting cached translated articles: still zero commands.

The free tier limit of 10,000 commands per day is roughly 2,500 times that usage under a typical publishing rhythm. You would need to be publishing hundreds of new translated articles every single day before Redis usage became something to think about.
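The arithmetic above can be sketched as a small estimator. This is an illustration only: the names `PublishingRhythm` and `estimateRedisCommands` are hypothetical, and the sketch assumes that every locale of an updated article is revisited after invalidation, so it charges both the deletes and the follow-up re-translation for each pair.

```typescript
// Per-pair command costs, taken from the description above.
const FIRST_TRANSLATION_COST = 3 // 1 read miss + 2 writes (t: and tm: keys)
const INVALIDATION_COST = 2 // 2 deletes (t: and tm: keys)

// Hypothetical input shape: counts for whatever period you want to estimate.
interface PublishingRhythm {
	newArticles: number
	updatedArticles: number
	incompleteExpirations: number
	locales: number
}

interface CommandBreakdown {
	firstTranslations: number
	invalidations: number
	retranslations: number
	incompleteRetries: number
	total: number
}

function estimateRedisCommands(r: PublishingRhythm): CommandBreakdown {
	const firstTranslations = r.newArticles * r.locales * FIRST_TRANSLATION_COST
	const invalidations = r.updatedArticles * r.locales * INVALIDATION_COST
	// Assumption: each invalidated pair is visited again and re-translated once.
	const retranslations = r.updatedArticles * r.locales * FIRST_TRANSLATION_COST
	const incompleteRetries = r.incompleteExpirations * FIRST_TRANSLATION_COST
	const total = firstTranslations + invalidations + retranslations + incompleteRetries
	return { firstTranslations, invalidations, retranslations, incompleteRetries, total }
}

// One week: 3 new articles, 1 updated article, 2 locales, no incomplete retries.
const weekly = estimateRedisCommands({
	newArticles: 3,
	updatedArticles: 1,
	incompleteExpirations: 0,
	locales: 2
})
const perDay = weekly.total / 7
const headroom = 10000 / perDay // how many times the daily free tier exceeds usage
```

Visitor count never appears as an input, which is the point: once the CDN holds a response, traffic to that URL does not reach Redis.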

Step 1 - Install the client package

pnpm add @upstash/redis

Step 2 - Create and link the Upstash Redis database on Vercel

This is the complete walkthrough for a new project. If you already have an Upstash database linked, skip to step 3.

2a - Open your project in the Vercel dashboard

Go to vercel.com/dashboard, select your project, and click the Storage tab in the top navigation.

2b - Add the Upstash integration and choose the right product

Click Browse Marketplace. Find Upstash and click it. You will see the Upstash parent entry expand to reveal four sub-products:

Upstash for Redis ← choose this one
Upstash Vector - for vector/embedding storage (AI similarity search, not what we need)
Upstash QStash/Workflow - for message queues and background jobs
Upstash Search - for full-text search

Click Upstash for Redis. If you do not have an Upstash account, the integration flow will create one tied to your Vercel account - no separate signup is required.

2c - Create a new Redis database

In the Upstash integration flow, click Create new database. Choose a database name (e.g. my-project-translations). Select the region closest to your Vercel deployment region - for most European deployments that is eu-west-1, for US East it is us-east-1. The closer the region, the lower the Redis read latency.

Leave the plan as Free - the free tier is sufficient for translation caching at any reasonable blog scale.

2d - Link the database and understand the generated variable names

After the database is created, you are prompted to link it to one or more Vercel projects. Select your project and confirm.

Vercel will inject environment variables into your project - but they will not be named UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN. Instead, they will be prefixed with your project name and use the legacy KV_ naming convention that Vercel kept for backward compatibility. For a project named my-blog, you will see:

MY_BLOG_KV_REST_API_URL
MY_BLOG_KV_REST_API_TOKEN
MY_BLOG_KV_REST_API_READ_ONLY_TOKEN
MY_BLOG_KV_URL
MY_BLOG_REDIS_URL

The PROJECT_NAME_ prefix is applied by Vercel to avoid collisions when a project has multiple integrations. The KV_ naming comes from Vercel’s old KV product and was preserved to avoid breaking existing apps during the migration to Upstash.

Of the variables injected, only two are needed for @upstash/redis:

Vercel generates	What it maps to
`YOUR_PROJECT_KV_REST_API_URL`	The Redis REST endpoint
`YOUR_PROJECT_KV_REST_API_TOKEN`	The read-write token
`YOUR_PROJECT_KV_REST_API_READ_ONLY_TOKEN`	Not needed
`YOUR_PROJECT_KV_URL`	TCP connection string - not used by `@upstash/redis`
`YOUR_PROJECT_REDIS_URL`	TCP connection string - not used by `@upstash/redis`

The @upstash/redis client only uses the REST API (HTTPS), not the TCP protocol - so the KV_URL and REDIS_URL variables are irrelevant for this setup.

2e - Add environment variable aliases

The code in src/lib/i18n/cache.ts reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN - the canonical names used in all Upstash documentation. The simplest fix is to add two alias variables in Vercel that point to the same values, without touching any code.

Go to Vercel → your project → Settings → Environment Variables and add:

Variable name	Value (copy from)	Environments
`UPSTASH_REDIS_REST_URL`	value of `YOUR_PROJECT_KV_REST_API_URL`	Production, Preview, Development
`UPSTASH_REDIS_REST_TOKEN`	value of `YOUR_PROJECT_KV_REST_API_TOKEN`	Production, Preview, Development

You can find the actual values by clicking the eye icon next to each variable in the Vercel dashboard, or from the Upstash console at console.upstash.com.

This approach keeps the code exactly as written without any changes - and it keeps the code portable for anyone else following this series.
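If you would rather not maintain alias variables, one alternative is to resolve the credentials in code. The sketch below is not part of the series' cache module: `resolveRedisCredentials` and `findBySuffix` are hypothetical helpers that prefer the canonical UPSTASH_ names and fall back to any variable ending in KV_REST_API_URL or KV_REST_API_TOKEN. It takes a plain object, so you could pass it whatever environment object your app uses.

```typescript
type Env = Record<string, string | undefined>

interface RedisCredentials {
	url: string
	token: string
}

// Returns the first non-empty value whose variable name matches the suffix pattern.
function findBySuffix(env: Env, suffix: RegExp): string | undefined {
	for (const key of Object.keys(env)) {
		const value = env[key]
		if (suffix.test(key) && value) return value
	}
	return undefined
}

// Hypothetical helper: canonical names win, project-prefixed KV_ names are the fallback.
function resolveRedisCredentials(env: Env): RedisCredentials | null {
	const url = env['UPSTASH_REDIS_REST_URL'] || findBySuffix(env, /KV_REST_API_URL$/)
	// The pattern does not match ...KV_REST_API_READ_ONLY_TOKEN, so the read-write token is used.
	const token = env['UPSTASH_REDIS_REST_TOKEN'] || findBySuffix(env, /KV_REST_API_TOKEN$/)
	return url && token ? { url, token } : null
}

// Example with the variable names Vercel generates for a project named my-blog.
const fromVercelNames = resolveRedisCredentials({
	MY_BLOG_KV_REST_API_URL: 'https://example.invalid',
	MY_BLOG_KV_REST_API_TOKEN: 'read-write-token',
	MY_BLOG_KV_REST_API_READ_ONLY_TOKEN: 'read-only-token'
})
const unconfigured = resolveRedisCredentials({})
```

The tradeoff is that suffix matching becomes ambiguous if a project links more than one Upstash database, which is one reason the explicit aliases remain the simpler choice.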

2f - Pull the variables to your local environment

vercel env pull .env.local

This writes all project environment variables - including your new aliases - to .env.local. SvelteKit’s Vite integration loads .env.local automatically on pnpm dev.

Confirm .env.local is in .gitignore - it should already be, as SvelteKit includes it by default. Both the Redis token and your ANTHROPIC_API_KEY live here and must never be committed.

After pulling, verify the aliases are present:

grep UPSTASH .env.local
# Should show:
# UPSTASH_REDIS_REST_URL=https://...
# UPSTASH_REDIS_REST_TOKEN=...

Step 3 - The cache module

The final production cache module in this project is slightly richer than the teaching version in this article. It adds missingIds, stored proseTranslations for background healing, getCachedStatuses() for admin tooling, and a force option for admin cache writes in development. The core architectural ideas are the same: lazy Redis init, dual-key writes, conditional TTLs, and cached HTML rather than rebuilding on every hit.

Create src/lib/i18n/cache.ts. A few design decisions shape this module, explained after the code.

// src/lib/i18n/cache.ts
import { env as privateEnv } from '$env/dynamic/private'
import { dev } from '$app/environment'

// ─────────────────────────────────────────────────────────────
// TTL STRATEGY
//
// Complete translations   (missingCount === 0) → no expiry (permanent)
//   Stored until the article source changes. A GitHub Actions deploy
//   hook calls invalidateTranslation() when source files are updated.
//
// Incomplete translations (missingCount  >  0) → 2-hour TTL
//   Auto-expires so the next visitor triggers a fresh full attempt.
//   Once a complete translation succeeds the entry becomes permanent.
// ─────────────────────────────────────────────────────────────
export const INCOMPLETE_TTL_SECONDS = 60 * 60 * 2 // 2 hours

// ─────────────────────────────────────────────────────────────
// KEY NAMESPACES
//
//   t:<locale>:<slug>   Full translation cache - html + title + description
//   tm:<locale>:<slug>  Compact title map - title + description only
//
// The compact key exists so that getTranslatedTitles() can fetch N titles
// in a single mget without pulling megabytes of HTML for each article.
// ─────────────────────────────────────────────────────────────
function cacheKey(slug: string, locale: string): string {
	return `t:${locale}:${slug}`
}

function titleKey(slug: string, locale: string): string {
	return `tm:${locale}:${slug}`
}

// ─────────────────────────────────────────────────────────────
// TYPES
// ─────────────────────────────────────────────────────────────
export interface CachedTranslation {
	/**
	 * Final Shiki-highlighted HTML - injected directly via {@html} on a cache hit.
	 * Storing HTML (not prose segments) means cache hits need zero extra work:
	 * no raw post lookup, no extractSegments, no reassemble, no Shiki.
	 */
	html: string
	title: string
	description: string
	translatedAt: number
	/**
	 * 0 = fully translated and cached permanently.
	 * >0 = some segments fell back to English; cached with INCOMPLETE_TTL_SECONDS.
	 */
	missingCount: number
}

/** Compact title-map entry - stored under the tm: key to avoid
 *  fetching full HTML payloads just to render article titles in navigation. */
export interface CachedTitles {
	title: string
	description: string
}

// ─────────────────────────────────────────────────────────────
// CLIENT SINGLETON - lazy, using $env/dynamic/private
//
// Lazy-initialised to prevent the Upstash SDK from logging warnings
// when url/token are absent (e.g. during local dev without .env.local).
// Returns null when unconfigured - callers treat null as a cache miss.
// ─────────────────────────────────────────────────────────────
type Redis = import('@upstash/redis').Redis
let _redis: Redis | null = null

async function getRedis(): Promise<Redis | null> {
	if (_redis) return _redis
	const url = privateEnv.UPSTASH_REDIS_REST_URL
	const token = privateEnv.UPSTASH_REDIS_REST_TOKEN
	if (!url || !token) return null
	const { Redis } = await import('@upstash/redis')
	_redis = new Redis({ url, token })
	return _redis
}

// ─────────────────────────────────────────────────────────────
// READ - full translation
//
// Returns null in dev so every load triggers a fresh API call -
// useful when iterating on the translation prompt.
// ─────────────────────────────────────────────────────────────
export async function getCachedTranslation(
	slug: string,
	locale: string
): Promise<CachedTranslation | null> {
	if (dev) return null
	const redis = await getRedis()
	if (!redis) return null
	try {
		return await redis.get<CachedTranslation>(cacheKey(slug, locale))
	} catch (err) {
		console.warn(`[i18n cache] read failed for ${cacheKey(slug, locale)}:`, err)
		return null
	}
}

// ─────────────────────────────────────────────────────────────
// READ - batch title lookup (single Redis round trip via mget)
//
// Returns a map of slug → { title, description } using the compact tm: keys.
// Used by the locale +page.server.ts to overlay translated titles on
// prev / next / relatedPosts without fetching full HTML payloads.
// ─────────────────────────────────────────────────────────────
export async function getTranslatedTitles(
	slugs: string[],
	locale: string
): Promise<Record<string, CachedTitles>> {
	if (dev || slugs.length === 0) return {}
	const redis = await getRedis()
	if (!redis) return {}
	const keys = slugs.map((s) => titleKey(s, locale))
	try {
		const results = await redis.mget<(CachedTitles | null)[]>(...keys)
		const out: Record<string, CachedTitles> = {}
		for (let i = 0; i < slugs.length; i++) {
			const entry = results[i]
			if (entry?.title) out[slugs[i]] = entry
		}
		return out
	} catch (err) {
		console.warn(`[i18n cache] mget titles failed for locale ${locale}:`, err)
		return {}
	}
}

// ─────────────────────────────────────────────────────────────
// WRITE  (always fire-and-forget - never await inside load functions)
//
// Writes two keys per translation:
//   t:<locale>:<slug>   Full entry (html + title + description)
//   tm:<locale>:<slug>  Compact title map (title + description only)
//
// TTL policy:
//   missingCount === 0 → no expiry (permanent until invalidated)
//   missingCount  >  0 → INCOMPLETE_TTL_SECONDS (2h, auto self-heals)
// ─────────────────────────────────────────────────────────────
export async function setCachedTranslation(
	slug: string,
	locale: string,
	translation: CachedTranslation
): Promise<void> {
	if (dev) return
	const redis = await getRedis()
	if (!redis) return
	const titles: CachedTitles = {
		title: translation.title,
		description: translation.description
	}
	try {
		if (translation.missingCount > 0) {
			await Promise.all([
				redis.set(cacheKey(slug, locale), translation, { ex: INCOMPLETE_TTL_SECONDS }),
				redis.set(titleKey(slug, locale), titles, { ex: INCOMPLETE_TTL_SECONDS })
			])
		} else {
			// No expiry - both keys persist until explicitly invalidated
			await Promise.all([
				redis.set(cacheKey(slug, locale), translation),
				redis.set(titleKey(slug, locale), titles)
			])
		}
	} catch (err) {
		console.warn(`[i18n cache] write failed for ${cacheKey(slug, locale)}:`, err)
	}
}

// ─────────────────────────────────────────────────────────────
// DELETE - deletes both keys for a slug+locale pair
// ─────────────────────────────────────────────────────────────
export async function invalidateTranslation(slug: string, locale: string): Promise<void> {
	const redis = await getRedis()
	if (!redis) return
	try {
		await Promise.all([redis.del(cacheKey(slug, locale)), redis.del(titleKey(slug, locale))])
	} catch {
		// Non-fatal - stale entry serves until naturally invalidated
	}
}

export async function invalidateAllLocales(
	slug: string,
	locales: readonly string[]
): Promise<void> {
	await Promise.allSettled(locales.map((locale) => invalidateTranslation(slug, locale)))
}

Why store HTML instead of prose segments?

The original approach stored prose: Record<string, string> - a map of segment IDs to translated text. On a cache hit, the server had to read the source markdown file, run extractSegments, call reassemble to reconstruct the translated markdown, and then run the full Shiki highlighting pipeline before it could return HTML.

The current approach stores the final Shiki-highlighted HTML string directly. A cache hit is now: read one Redis key → return string. No file I/O, no segment processing, no rendering. This is the dominant code path in production (any article that has been visited once in a given locale is cached), so the saving compounds across every request.

The tradeoff is that Redis stores more data per entry - highlighted HTML is larger than raw prose text. In practice this is negligible: Upstash free tier gives 256 MB, and a heavily highlighted article rarely exceeds 200 KB.
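As a rough sanity check on that storage claim, here is a back-of-the-envelope estimate. It treats every entry as the 200 KB worst case and ignores the much smaller tm: keys and any serialisation overhead, so it is an upper-bound sketch rather than a measurement. The function name is hypothetical, and the 50-article, 2-locale catalog is the example used earlier in this article.

```typescript
const FREE_TIER_MB = 256

// Upper-bound estimate: every article-locale pair stored at the same size.
function estimateCacheSizeMb(articles: number, locales: number, kbPerEntry: number): number {
	return (articles * locales * kbPerEntry) / 1024
}

const worstCaseMb = estimateCacheSizeMb(50, 2, 200) // 100 pairs at 200 KB each
const shareOfFreeTier = worstCaseMb / FREE_TIER_MB
```

Even with every entry at the worst-case size, the example catalog uses under 20 MB, well below a tenth of the free tier.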

The TTL strategy

TTL (Time-To-Live) is the most consequential design decision in the caching system, and it all hinges on a single field: missingCount in CachedTranslation. That field records how many prose segments the model failed to return, segments that fell back to English rather than being translated.

Its value drives two very different storage outcomes, and understanding why those outcomes are correct requires understanding that a cached translation can become stale in exactly two fundamentally different ways.

A translation becomes stale in exactly two ways

The first way is that the source article changes. An editor fixes a code example, adds a new section, or rewrites a paragraph. The stored translation no longer matches what is in the .md file. This is a content-change event - it happens because a human made a deliberate edit, and it is detectable at the moment of deployment.

The second way is that the translation was never quite right to begin with. When an article is very long, the request can run into token limits, and the Claude API may return only some of the prose segments, for example 47 of 50. The remaining three fall back to English because the model did not include them in its response.

The translation is stored, but it already has visible cracks on the day it was cached. This is an immediate-quality problem - the translation was incomplete from the first request.

These two types of staleness demand different responses. A time-based TTL handles neither of them well.

Why complete translations have no TTL

A perfectly complete translation does not become stale due to the passage of time. Thirty days from now, if no one has edited the source article, the stored translation is just as accurate as it was on day one. Setting a 30-day TTL means expiring a valid translation and paying for an identical API call that produces identical output. The money is real and the output is unchanged.

The correct mental model is: a complete translation is never stale until the source changes, and source changes are deterministic events you can observe. Every time a .md file is modified and pushed to the repository, the GitHub Actions workflow fires, the invalidation script deletes the Redis keys for that slug across every locale, and the next visitor triggers a fresh translation. That is the exact right moment to refresh - not 30 days after the last API call.

Storing complete translations permanently is not carelessness. It is matching the expiry signal to the actual cause of staleness. The timer is always wrong; the deploy hook is always right.

Why incomplete translations use a 2-hour TTL

When missingCount > 0, the situation is the opposite: the translation is already imperfect the moment it is stored. Some readers will see one or two paragraphs in English in an otherwise translated article. You want this to heal as quickly as reasonably possible.

The naive solution is to not cache incomplete translations at all - skip the write, and every visitor triggers a fresh attempt. This creates a worse problem: if the same three segments consistently fail (due to token limits, a tricky prompt, or a momentary API quirk), every single visitor hammers the Claude API simultaneously, paying full translation cost each time with no guarantee the next attempt succeeds. You have turned an imperfect cache hit into repeated expensive misses.

A 2-hour TTL threads the needle. The imperfect translation is served for 2 hours, which means a burst of simultaneous visitors all get the cached version and no redundant API calls fire. After 2 hours, Redis automatically expires the entry. The next visitor (likely a different person entirely) triggers a fresh full translation attempt.

If the model succeeds this time (which it usually does, since the previous failure was often a probabilistic miss rather than a systematic one), the new translation is complete and stored permanently. The problem has self-healed with zero manual intervention.

The choice of 2 hours specifically comes down to balancing two competing goals. Too short (say, 15 minutes) heals quickly, but it offers little burst protection, and if the same segments keep failing you pay for a full translation attempt every 15 minutes.

Too long (say, 24 hours) means a reader who notices the broken paragraph and returns the next day still sees the same broken translation. Two hours is short enough to feel responsive and long enough to provide meaningful burst protection.
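One way to compare these options is to look at the worst case, where the same segments fail on every attempt and there is always a visitor ready to trigger the retry. The sketch below assumes at most one translation attempt per TTL window, which follows from the cache absorbing all other requests during that window. `maxAttemptsPerDay` is a hypothetical helper.

```typescript
const SECONDS_PER_DAY = 60 * 60 * 24

// Worst case: one full translation attempt each time the incomplete entry expires.
function maxAttemptsPerDay(ttlSeconds: number): number {
	return Math.floor(SECONDS_PER_DAY / ttlSeconds)
}

const fifteenMinutes = maxAttemptsPerDay(15 * 60) // 96
const twoHours = maxAttemptsPerDay(60 * 60 * 2) // 12
const oneDay = maxAttemptsPerDay(SECONDS_PER_DAY) // 1
```

At 2 hours, a persistently failing article costs at most 12 attempts per day per locale, while a reader who returns later the same day still has a good chance of seeing the healed version.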

Why you should not need to know about incomplete translations

The most important property of the 2-hour TTL is that it makes incomplete translations operationally invisible. There is no monitoring dashboard you need to check, no alert you need to clear, no manual Redis key deletion to schedule. You simply deploy the system and it self-corrects.

A reader gets a slightly imperfect page, the entry expires in 2 hours, the next reader triggers a retry, and either the translation completes (permanent storage from then on) or it fails again (another 2-hour TTL, another auto-retry). The correction loop runs without you.
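The correction loop can be simulated without Redis or the Claude API. The sketch below uses a hypothetical in-memory `FakeCache` with an injectable clock and a stubbed translator that drops 3 segments on the first attempt and succeeds on the second. It only mirrors the TTL rule described here and is not the production module.

```typescript
const INCOMPLETE_TTL_SECONDS = 60 * 60 * 2 // 2 hours, as in the cache module

interface Entry {
	missingCount: number
	expiresAt: number | null // null = no expiry
}

// Hypothetical stand-in for Redis; `now` is passed in so time can be simulated.
class FakeCache {
	private entries: Record<string, Entry> = {}

	get(key: string, now: number): Entry | null {
		const entry = this.entries[key]
		if (!entry) return null
		if (entry.expiresAt !== null && now >= entry.expiresAt) {
			delete this.entries[key] // expired, behaves like a miss
			return null
		}
		return entry
	}

	set(key: string, missingCount: number, now: number): void {
		// Incomplete entries expire after 2 hours; complete entries never expire.
		const expiresAt = missingCount > 0 ? now + INCOMPLETE_TTL_SECONDS : null
		this.entries[key] = { missingCount, expiresAt }
	}
}

function visit(
	cache: FakeCache,
	key: string,
	now: number,
	translate: () => number
): 'hit' | 'translated' {
	if (cache.get(key, now)) return 'hit'
	cache.set(key, translate(), now)
	return 'translated'
}

// Stubbed translator: returns the missingCount of each successive attempt.
const outcomes = [3, 0]
let attempts = 0
const translate = (): number => {
	const missing = outcomes[attempts] ?? 0
	attempts += 1
	return missing
}

const cache = new FakeCache()
const visitTimes = [0, 3600, 7200, 60 * 60 * 24 * 365] // seconds since first request
const timeline = visitTimes.map((t) => visit(cache, 't:de:my-article', t, translate))
// timeline: translated, hit, translated, hit
```

Four visits produce exactly two translation attempts: the visit inside the 2-hour window is absorbed by the cache, the visit after expiry triggers the retry, and the successful retry is still a hit a year later.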

This matters more as your catalog grows. At 50 articles in two locales, you could plausibly track incomplete translations manually. At 500 articles in three locales, you cannot. The self-healing mechanism needs to work without attention.

The CDN and Redis must expire together

The Cache-Control header values set by setHeaders() are chosen to match the Redis TTL exactly:

Complete translation:    Redis: no TTL       CDN: s-maxage=31536000 (1 year)
Incomplete translation:  Redis: 7200s        CDN: s-maxage=7200

The symmetry is intentional. If the CDN cached a response for 2 hours but Redis held the entry for 30 minutes, a CDN hit at the 45-minute mark would serve the cached page correctly. But a request that bypassed the CDN (a logged-in admin, a bot, a cache invalidation check) would hit the server, find a Redis miss, and trigger a fresh translation.

This creates a split where CDN users see one thing and bypass users see another. Worse: the CDN entry would be re-populated from the fresh translation, resetting the CDN clock, while the original CDN-cached version continued to serve for the remaining 75 minutes. The two layers fall out of sync.

By setting s-maxage=7200 for incomplete translations (matching INCOMPLETE_TTL_SECONDS = 60 * 60 * 2 exactly), Redis and the CDN expire at the same moment. The next request after expiry always goes through to the server, hits a Redis miss, and triggers a coherent fresh attempt. Both layers are refreshed together.
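One way to keep the two layers from drifting apart is to derive both values from the same constant. The sketch below is an assumption about how that could be organised, not the article's actual code: `redisSetOptions` and `cdnCacheControl` are hypothetical helpers, and the header strings are the ones used in the load function.

```typescript
const INCOMPLETE_TTL_SECONDS = 60 * 60 * 2 // 7200
const ONE_YEAR_SECONDS = 60 * 60 * 24 * 365 // 31536000

// Expiry options for the Redis write: undefined means "store with no expiry".
function redisSetOptions(missingCount: number): { ex: number } | undefined {
	return missingCount > 0 ? { ex: INCOMPLETE_TTL_SECONDS } : undefined
}

// Cache-Control value for the CDN, built from the same constant as the Redis TTL.
function cdnCacheControl(missingCount: number): string {
	return missingCount === 0
		? `public, s-maxage=${ONE_YEAR_SECONDS}, stale-while-revalidate=86400`
		: `public, s-maxage=${INCOMPLETE_TTL_SECONDS}, stale-while-revalidate=3600`
}

const completeHeader = cdnCacheControl(0)
const incompleteHeader = cdnCacheControl(3)
const incompleteRedis = redisSetOptions(3)
```

With this shape, changing the incomplete TTL in one place changes both the Redis expiry and the s-maxage value together.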

For complete translations, the notional s-maxage=31536000 is protective rather than functional: it prevents CDN churn on unchanged content. The real eviction mechanism for complete translations is the GitHub Actions revalidatePath() call that fires alongside the Redis delete during a deploy. Both the timer and the deploy hook can clear the CDN entry, but the timer is long enough that it never fires before the next relevant deploy.

The article update path is event-driven, not timer-driven

This is the clearest argument for why complete translations have no TTL. Consider what happens when you update an article.

With a time-based TTL:

source changes → deploy → readers see stale translation
                       → wait up to 30 days
                       → Redis auto-expires
                       → next visitor triggers fresh translation

With event-driven invalidation:

source changes → deploy → GitHub Actions fires
                       → invalidation script deletes Redis keys
                       → CDN entry cleared
                       → next visitor triggers fresh translation

The first flow means readers could see an outdated translation for weeks after you fixed a bug in the article’s code example. No technical writer accepts that. The second flow means the stale translation is gone within seconds of the deploy completing.

The GitHub Actions workflow that article six implements runs on every push to main that touches .md files under src/posts/. It detects which files changed using git diff HEAD~1, converts those file paths to slugs, and calls the invalidation script with each slug.

The script deletes both Redis keys (t: and tm:) for every supported locale. By the time the Vercel deployment finishes propagating, the translation cache is already cleared for every modified article.
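The path-to-slug and key-expansion steps can be sketched as two pure functions. This is a guess at the shape of the logic, not the script from article six: `changedSlugs` and `keysToDelete` are hypothetical names, the input is assumed to be one file path per line (as a name-only git diff would produce), and the slug is assumed to be the path under src/posts/ without the .md extension.

```typescript
// Extracts slugs from changed file paths, ignoring anything outside src/posts/*.md.
function changedSlugs(diffOutput: string): string[] {
	const slugs: string[] = []
	for (const line of diffOutput.split('\n')) {
		const match = /^src\/posts\/(.+)\.md$/.exec(line.trim())
		if (match) slugs.push(match[1])
	}
	return slugs
}

// Expands slugs into the t: and tm: keys for every supported locale.
function keysToDelete(slugs: string[], locales: readonly string[]): string[] {
	const keys: string[] = []
	for (const slug of slugs) {
		for (const locale of locales) {
			keys.push(`t:${locale}:${slug}`, `tm:${locale}:${slug}`)
		}
	}
	return keys
}

// Example diff output and locale list (both made up for illustration).
const exampleDiff = 'src/posts/caching.md\nsrc/lib/i18n/cache.ts\nsrc/posts/series/part-one.md\n'
const slugs = changedSlugs(exampleDiff)
const keys = keysToDelete(['caching'], ['de', 'fr'])
```

Keeping these steps free of I/O makes them easy to test; the actual deletes would then go through invalidateAllLocales() or an equivalent call in the script.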

This is why article updates are handled by an event hook rather than a short TTL. The event is precise, immediate, and scoped to exactly the articles that changed. A TTL would be imprecise, delayed, and apply equally to articles that changed and articles that did not.

Why two Redis keys per translation?

The sidebar and prev/next navigation need to show translated article titles for neighbouring articles. Fetching the full t: key for each of those articles would pull megabytes of Shiki-highlighted HTML just to display a title string.

The tm: key (title map) stores only { title, description }, which is just a few dozen bytes. getTranslatedTitles() fetches all required slugs in a single mget round trip, making the nav overlay essentially free. Both keys are written together and deleted together on invalidation.

Step 4 - The load function with caching and streaming

This replaces the uncached implementation from article four. Two structural changes beyond the cache lookup are worth calling out before the code: the translation happens in a helper that returns a pending Promise rather than being awaited, and the CDN cache lifetime is set dynamically via setHeaders rather than with a static ISR expiration value. Both are explained in the sections below.

// src/routes/(locale)/[lang=locale]/[...slug]/+page.server.ts
import type { Config } from '@sveltejs/adapter-vercel'
import { error } from '@sveltejs/kit'
import type { PageServerLoad } from './$types'
import { readPost } from '$lib/server/posts'
import { getContent, content } from '$lib/content'
import { getPosition, relatedArticles } from '$lib/content/helpers'
import type { Article } from '$lib/content/types'
import { extractSegments, proseOnly, reassemble } from '$lib/i18n/extract'
import { getCachedTranslation, setCachedTranslation, getTranslatedTitles } from '$lib/i18n/cache'

// ISR with on-demand expiration (expiration: false).
// CDN eviction is driven by the Cache-Control header set dynamically in load():
//
//   Complete translation   (missingCount === 0)
//     s-maxage=31536000  - CDN caches for one year; only cleared when the
//                          article source changes and Redis is invalidated.
//
//   Incomplete translation (missingCount  >  0)
//     s-maxage=7200      - mirrors INCOMPLETE_TTL_SECONDS exactly so that
//                          CDN and Redis expire together.
//
//   Cache miss (streaming)
//     no header          - Vercel will not cache a streaming response.
export const config: Config = {
	isr: { expiration: false }
}

// Overlays translated titles onto nav articles so the sidebar and
// prev/next links show translated article titles rather than English ones.
function overlayTitles<T extends Article | undefined>(
	article: T,
	titleMap: Record<string, { title: string; description: string }>
): T {
	if (!article) return article
	const entry = titleMap[article.slug]
	if (!entry) return article
	return { ...article, title: entry.title, description: entry.description }
}

function overlayTitlesArray(
	articles: Article[],
	titleMap: Record<string, { title: string; description: string }>
): Article[] {
	return articles.map((a) => {
		const entry = titleMap[a.slug]
		return entry ? { ...a, title: entry.title, description: entry.description } : a
	})
}

// doTranslation returns a Promise that load() passes to the client WITHOUT awaiting.
// SvelteKit streams the pending promise: the page shell renders immediately
// while Claude and Shiki run in the background (10–15s on a first cache miss).
// The Redis write is fire-and-forget so it does not delay stream resolution.
async function doTranslation(
	slug: string,
	lang: string,
	englishArticle: Article,
	writeCache: boolean
): Promise<string | null> {
	let rawMarkdown = ''
	try {
		rawMarkdown = await readPost(slug)
	} catch {
		return null
	}

	const segments = extractSegments(rawMarkdown)
	const prose = proseOnly(segments)

	const { translateBatch } = await import('$lib/i18n/translate')
	const { proseTranslations, title, description, diagnostics } = await translateBatch(
		prose,
		englishArticle.title,
		englishArticle.description,
		lang
	)

	const translatedMarkdown = reassemble(segments, proseTranslations)
	const { renderMarkdownHtml } = await import('$lib/i18n/render')
	const html = await renderMarkdownHtml(translatedMarkdown)

	if (writeCache) {
		void setCachedTranslation(slug, lang, {
			html,
			title,
			description,
			translatedAt: Date.now(),
			missingCount: diagnostics.missingIds.length
		})
	}

	return html
}

export const load: PageServerLoad = async ({ params, parent, url, setHeaders }) => {
	const { lang } = await parent()
	const slugPath = params.slug
	const bypassCache = url.searchParams.has('nocache')

	const englishArticle = content.articles.find((a) => a.slug === slugPath)
	if (!englishArticle) error(404, `Could not find ${slugPath}`)

	const localeContent = getContent(lang)

	let currentPost: Article = { ...englishArticle, lang, isFallback: true }
	let translatedHtml: Promise<string | null>

	const cached = bypassCache ? null : await getCachedTranslation(slugPath, lang)

	if (cached) {
		// ── Cache hit: zero API cost ───────────────────────────────────
		currentPost = {
			...englishArticle,
			title: cached.title || englishArticle.title,
			description: cached.description || englishArticle.description,
			lang,
			isFallback: false,
			translationStatus: 'machine-translated'
		}
		// Set CDN cache lifetime based on translation completeness.
		// Complete: permanent at CDN (cleared only on content update + invalidation).
		// Incomplete: 2 hours, mirrors INCOMPLETE_TTL_SECONDS so CDN and Redis
		// expire together and the next regeneration hits a Redis miss.
		setHeaders({
			'Cache-Control':
				cached.missingCount === 0
					? 'public, s-maxage=31536000, stale-while-revalidate=86400'
					: 'public, s-maxage=7200, stale-while-revalidate=3600'
		})
		// Already resolved - no loading state shown
		translatedHtml = Promise.resolve(cached.html)
	} else {
		// ── Cache miss: stream the translation ──────────────────────────
		// Title stays English during streaming; translated title appears on
		// the next cache-hit load once the Redis entry is written.
		currentPost = {
			...englishArticle,
			lang,
			isFallback: false,
			translationStatus: 'machine-translated'
		}
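		// Return as a PENDING promise - SvelteKit streams it, page shell renders first.
		translatedHtml = doTranslation(slugPath, lang, englishArticle, !bypassCache)
		// load() still awaits getTranslatedTitles() below. Mark the promise as handled so
		// a rejection before rendering starts is not an unhandled rejection; the {:catch}
		// block in +page.svelte still receives the error.
		translatedHtml.catch(() => {})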
	}

	// Prev / next / related - all synchronous after this point.
	const { prev, next } = getPosition({ ...currentPost, slug: slugPath }, localeContent.articles)

	const prerequisites: Article[] = (currentPost.prerequisites ?? [])
		.map((id) => content.articles.find((a) => a.id === id))
		.filter((a): a is Article => a !== undefined)

	let relatedPosts: Article[] = []
	let isSeries = false

	if (currentPost.track?.id && currentPost.topic?.id) {
		isSeries = true
		const trackData = localeContent.tracks[currentPost.track.id]
		relatedPosts = trackData
			? trackData.topics
					.slice()
					.sort((a, b) => a.order - b.order)
					.flatMap((t) => t.articles)
			: []
	} else {
		const curated = (currentPost.related ?? [])
			.map((id) => content.articles.find((a) => a.id === id))
			.filter((a): a is Article => a !== undefined)
		relatedPosts = curated.length > 0 ? curated : relatedArticles(currentPost, content.articles)
	}

	// Single mget round trip to overlay translated titles on nav articles
	const navSlugs = [...relatedPosts.map((a) => a.slug), prev?.slug, next?.slug].filter(
		(s): s is string => !!s
	)
	const titleMap = await getTranslatedTitles([...new Set(navSlugs)], lang)

	return {
		meta: currentPost,
		translatedHtml, // Promise<string | null> - may be pending
		slugPath,
		prev: overlayTitles(prev, titleMap),
		next: overlayTitles(next, titleMap),
		prerequisites,
		relatedPosts: overlayTitlesArray(relatedPosts, titleMap),
		isSeries,
		currentTopicId: currentPost.topic?.id ?? null,
		lang,
		isFallback: currentPost.isFallback ?? false
	}
}

Why stream the translation promise?

The original implementation awaited doTranslation() inside load(), so the entire page was blocked until Claude finished translating and Shiki finished rendering, typically 10–15 seconds on a cold cache miss. The header, sidebar, and navigation were invisible to the user for that entire duration.

SvelteKit’s streaming support allows load() to return a pending Promise as part of its data. Update +page.svelte to wrap data.translatedHtml in an {#await} block:

<!-- src/routes/(locale)/[lang=locale]/[...slug]/+page.svelte -->
<script lang="ts">
	import type { PageData } from './$types'
	import ArticleMeta from '$lib/components/article/ArticleMeta.svelte'
	import ProgressBar from '$lib/components/article/ProgressBar.svelte'

	let { data }: { data: PageData } = $props()
</script>

<svelte:head>
	<title>{data.meta.seo?.title ?? data.meta.title}</title>
	<meta name="description" content={data.meta.seo?.description ?? data.meta.description} />
	<meta property="og:title" content={data.meta.seo?.title ?? data.meta.title} />
	<meta property="og:description" content={data.meta.seo?.description ?? data.meta.description} />
	<meta property="og:type" content="article" />
</svelte:head>

<article class="article-page">
	{#if data.isFallback}
		<div class="fallback-notice" role="status" aria-live="polite">
			<span aria-hidden="true">🌐</span>
			<p>
				This article is not yet translated into
				<strong>{data.lang.toUpperCase()}</strong>. You are reading the English original.
			</p>
		</div>
	{/if}

	{#if data.meta.translationStatus === 'machine-translated' && !data.isFallback}
		<div class="translation-notice" role="status">
			<span aria-hidden="true">🤖</span>
			<p>Machine-translated. Technical terms and code are preserved in English.</p>
		</div>
	{/if}

	<ArticleMeta post={data.meta} />

	{#if data.meta.position && (data.meta.position.total ?? 1) > 1}
		<ProgressBar
			index={data.meta.position.index ?? data.meta.article?.order ?? 1}
			total={data.meta.position.total ?? 1}
			trackType={data.meta.track?.type ?? 'reference'}
		/>
	{/if}

	{#if data.prerequisites.length > 0}
		<aside class="prerequisites" aria-label="Prerequisites">
			<h2>Before you read this</h2>
			<ul>
				{#each data.prerequisites as prereq (prereq.id ?? prereq.slug)}
					<li>
						<a href="/{data.lang}/{prereq.slug}">{prereq.title}</a>
					</li>
				{/each}
			</ul>
		</aside>
	{/if}

	<!-- {#await} handles three states:
	     pending  - cache miss, Claude is translating (10-15s on first visit)
	     then     - translation resolved: render HTML or show English fallback
	     catch    - translation threw: link to English version
	     On a cache hit, Promise.resolve(cached.html) is already fulfilled when
	     load() returns, so no noticeable flicker should occur. -->
	{#await data.translatedHtml}
		<div class="translating-loader" role="status" aria-live="polite">
			<div class="loader-skeleton"></div>
			<p class="loader-label">Translating article…</p>
		</div>
	{:then html}
		{#if html}
			<div class="prose">
				<!-- eslint-disable-next-line svelte/no-at-html-tags -->
				{@html html}
			</div>
		{:else}
			<div class="fallback-content">
				<p>
					Translation not yet available.
					<a href="/{data.slugPath}">Read the English version →</a>
				</p>
			</div>
		{/if}
	{:catch}
		<div class="fallback-content">
			<p>
				Translation failed.
				<a href="/{data.slugPath}">Read the English version →</a>
			</p>
		</div>
	{/await}

	{#if data.prev || data.next}
		<nav class="series-nav" aria-label="Series navigation">
			{#if data.prev}
				<a href="/{data.lang}/{data.prev.slug}" class="series-nav-link series-nav-prev" rel="prev">
					<span class="nav-label">← Previous</span>
					<span class="nav-title">{data.prev.title}</span>
				</a>
			{/if}
			{#if data.next}
				<a href="/{data.lang}/{data.next.slug}" class="series-nav-link series-nav-next" rel="next">
					<span class="nav-label">Next →</span>
					<span class="nav-title">{data.next.title}</span>
				</a>
			{/if}
		</nav>
	{/if}

	{#if data.relatedPosts.length > 0}
		<aside class="related-posts" aria-label="Related articles">
			<h2>Related articles</h2>
			<ul>
				{#each data.relatedPosts.slice(0, 4) as post (post.id ?? post.slug)}
					<li>
						<a href="/{data.lang}/{post.slug}">{post.title}</a>
					</li>
				{/each}
			</ul>
		</aside>
	{/if}
</article>

On a cache miss, the page shell - article header with the English title, the progress bar, prev/next navigation - renders in milliseconds. The {#await} block shows the loading state in the content area until the promise resolves. On a cache hit, Promise.resolve(cached.html) is already fulfilled when load() returns, so the loading state is replaced almost immediately and there should be no noticeable flicker.

The {@html} directive is safe here: the content originates from your own Markdown source files processed through the Claude API, never from user input.

Why dynamic Cache-Control instead of a fixed ISR expiration?

The earlier isr: { expiration: 7200 } applied the same 2-hour CDN cache lifetime to every translated page regardless of whether the translation was complete. This meant a complete translation - which should never expire at the CDN unless the article changes - was unnecessarily regenerated every 2 hours. The regeneration was free (a Redis hit returns in milliseconds), but it was semantically wrong: the data had not changed.

The current approach uses isr: { expiration: false } (Vercel on-demand ISR) and sets the Cache-Control header dynamically from inside load() based on cached.missingCount:

State	Redis TTL	CDN `s-maxage`	Behaviour
Complete (`missingCount === 0`)	no expiry	31536000 (1 year)	Cached permanently; cleared when article source changes
Incomplete (`missingCount > 0`)	7200s	7200s	CDN and Redis expire together; next regeneration re-translates
First visit (streaming)	written after translation completes	none	Vercel does not cache streaming responses
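
The mapping in the table can be isolated as a small pure function, which makes the Redis and CDN lifetimes easier to keep in step. This is a minimal sketch: the values come from the table and from the load() code above, but the cachePolicy name and the CachePolicy type are illustrative and not part of the article's codebase.

```typescript
// Values mirror the table above. The names here are hypothetical.
const INCOMPLETE_TTL_SECONDS = 7200

interface CachePolicy {
	/** Redis TTL in seconds, or null for "no expiry". */
	redisTtlSeconds: number | null
	/** Cache-Control header value sent to the CDN. */
	cacheControl: string
}

function cachePolicy(missingCount: number): CachePolicy {
	if (missingCount === 0) {
		// Complete translation: no Redis expiry, one year at the CDN.
		return {
			redisTtlSeconds: null,
			cacheControl: 'public, s-maxage=31536000, stale-while-revalidate=86400'
		}
	}
	// Incomplete translation: both layers expire after two hours.
	return {
		redisTtlSeconds: INCOMPLETE_TTL_SECONDS,
		cacheControl: `public, s-maxage=${INCOMPLETE_TTL_SECONDS}, stale-while-revalidate=3600`
	}
}
```

With a helper like this, load() could call setHeaders({ 'Cache-Control': cachePolicy(cached.missingCount).cacheControl }) and the cache writer could read redisTtlSeconds from the same place.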

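For article source updates, the deploy-time invalidation script in article six deletes the Redis keys. Because isr: { expiration: false } makes the CDN cache on-demand, you can pair the Redis invalidation with an on-demand revalidation request in the same deploy hook. With SvelteKit's adapter-vercel, that means configuring a bypassToken in the isr options and sending a GET or HEAD request for the page with an x-prerender-revalidate header set to that token. The two evictions are separate operations rather than one atomic step, so delete the Redis keys first; otherwise the regenerated page can be rebuilt from the stale Redis entry.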
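
As an illustration of the CDN half of that invalidation, the sketch below builds the request a deploy hook might send. It assumes the route config sets isr: { expiration: false, bypassToken } as supported by adapter-vercel; the helper name, the RevalidateRequest type, and the way the token is passed in are assumptions for illustration, not code from this series.

```typescript
// Hypothetical helper: describes the revalidation request for one translated page.
interface RevalidateRequest {
	url: string
	method: 'HEAD'
	headers: Record<string, string>
}

function buildRevalidateRequest(
	origin: string, // e.g. the production site URL (assumption)
	lang: string,
	slug: string,
	bypassToken: string // must match the bypassToken in the route's isr config
): RevalidateRequest {
	// Translated pages live at /{lang}/{slug}, matching the links in +page.svelte.
	const base = origin.replace(/\/+$/, '')
	return {
		url: `${base}/${lang}/${slug}`,
		method: 'HEAD',
		headers: { 'x-prerender-revalidate': bypassToken }
	}
}
```

A deploy script would then pass the result to fetch(req.url, { method: req.method, headers: req.headers }) once per changed article and locale, after the matching Redis keys have been deleted.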

Step 5 - Adding Vercel secrets for GitHub Actions

The deploy-time cache invalidation workflow in article six runs after each deploy to clear stale translation entries for changed articles. It needs Redis credentials available in GitHub Actions.

Add both environment variables as repository secrets:

Go to your GitHub repository → Settings → Secrets and variables → Actions
Click New repository secret
Add UPSTASH_REDIS_REST_URL with the value from your Upstash dashboard or from .env.local
Add UPSTASH_REDIS_REST_TOKEN with the token value

These are the alias variables you created in step 2e - their values come from YOUR_PROJECT_KV_REST_API_URL and YOUR_PROJECT_KV_REST_API_TOKEN respectively. Using the canonical alias names in GitHub Actions keeps the workflow portable and consistent with the code.

You also need three Vercel-specific secrets for the deployment step. The Vercel token is created at vercel.com/account/tokens. The org ID and project ID are found in your Vercel project settings under General.

Secret name	Where to find it
`UPSTASH_REDIS_REST_URL`	Upstash console or `.env.local` after `vercel env pull`
`UPSTASH_REDIS_REST_TOKEN`	Upstash console or `.env.local`
`ANTHROPIC_API_KEY`	console.anthropic.com → API Keys
`VERCEL_TOKEN`	vercel.com/account/tokens
`VERCEL_ORG_ID`	Vercel project settings → General
`VERCEL_PROJECT_ID`	Vercel project settings → General

Article six provides the complete GitHub Actions workflow file and the invalidation script.

Local development and testing

Local development raises two distinct concerns: avoiding the cache entirely while iterating on translation prompt changes, and testing the caching layer itself before deploying.

Testing translation without the cache

The dev guard at the top of getCachedTranslation and setCachedTranslation already handles this. In dev mode, getCachedTranslation always returns null and setCachedTranslation is a no-op, so every page load in development goes directly to the Claude API and produces a fresh translation. This is correct behaviour for development - you want to see the effect of prompt changes immediately rather than serving a stale cached result.

The one downside is that every locale page load in dev hits the Claude API, adding 3 to 6 seconds of latency and a small cost. For most development work this is acceptable. For testing layout or component changes that have nothing to do with translation, temporarily hardcode a translated string in the load function to skip the API call entirely.

Article six provides the dev preview tool at http://localhost:5173/dev/translations - a better interface for iterating on the translation prompt than navigating to translated article URLs.

Testing the cache layer locally

To test that caching, invalidation, and TTL behaviour work correctly before deploying, use the Upstash local emulator:

npx @upstash/redis-local

Then point your client at the emulator in .env.local:

# .env.local - local emulator override (only while testing cache locally)
UPSTASH_REDIS_REST_URL=http://localhost:8079
UPSTASH_REDIS_REST_TOKEN=local-dev-token

The emulator accepts any token value. Data is in-memory and resets on restart, which makes it ideal for testing cache miss → hit → invalidation flows without touching your production database.

You also need to temporarily comment out the dev guard in cache.ts while testing locally, since the dev flag would otherwise skip all cache operations regardless of which URL the client points to.
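
If commenting the guard in and out becomes tedious, one alternative is an explicit opt-in flag. The sketch below is only a suggestion under stated assumptions: CACHE_IN_DEV is a hypothetical variable name, and cacheEnabled is not a function from the cache.ts shown earlier.

```typescript
// Hypothetical guard: cache operations run in production, and in dev only on opt-in.
function cacheEnabled(dev: boolean, env: Record<string, string | undefined>): boolean {
	if (!dev) return true
	// CACHE_IN_DEV is an illustrative name; set it in .env.local while testing the emulator.
	return env.CACHE_IN_DEV === '1'
}
```

getCachedTranslation and setCachedTranslation could then return early when cacheEnabled(dev, process.env) is false, which keeps the default dev behaviour (always translate fresh) unchanged.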

A plain Docker Redis instance (docker run -d -p 6379:6379 redis:alpine) does not work with @upstash/redis - the client uses Upstash’s HTTP REST API, not the standard Redis TCP wire protocol. Use the official emulator.

The complete cost picture

All figures assume Claude Sonnet 4.6 ($3 input / $15 output per million tokens), an average article of 3,000 words with 1,800 words of translatable prose, 2 locales, and 5 new articles per month. The Batch API 50% discount applies to the initial catalog translation. Upstash Redis is free on the Hobby plan at these volumes.

Catalog size	Initial batch (2 locales)	Monthly ongoing	Year 1 total	Year 2+ annual
50 articles	$2	$0.44	$7	$5
100 articles	$4	$0.44	$9	$5
250 articles	$11	$0.44	$16	$5
500 articles	$21	$0.44	$26	$5
1,000 articles	$42	$0.44	$47	$5

Year 2 and beyond are almost entirely driven by new publications - the existing catalog sits in Redis and costs nothing to serve. Monthly ongoing cost of $0.44 is determined entirely by publication cadence, not catalog size.
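
The totals in the table are consistent with a simple relationship: year 1 is the initial batch plus twelve months of ongoing cost, and each later year is the ongoing cost alone. The article does not state this formula explicitly, so treat the sketch below as a reading of the table rather than the original calculation; the function names are illustrative.

```typescript
// Costs in US dollars. Inputs are the per-row figures from the table above.
function yearOneTotal(initialBatch: number, monthlyOngoing: number): number {
	return initialBatch + 12 * monthlyOngoing
}

function laterYearTotal(monthlyOngoing: number): number {
	return 12 * monthlyOngoing
}

// Example: the 500-article row.
const yearOne = Math.round(yearOneTotal(21, 0.44)) // 21 + 5.28, rounded
const yearTwo = Math.round(laterYearTotal(0.44)) // 5.28, rounded
```

Because monthlyOngoing depends only on how many new articles are published, the later-year figure stays flat as the catalog grows.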

Environment variables summary

Here is every environment variable the i18n system uses, where it comes from, and which contexts need it.

Variables Vercel generates automatically (from the Upstash for Redis integration):

Variable Vercel generates	What it is
`YOUR_PROJECT_KV_REST_API_URL`	Redis REST endpoint - the one you need
`YOUR_PROJECT_KV_REST_API_TOKEN`	Read-write token - the one you need
`YOUR_PROJECT_KV_REST_API_READ_ONLY_TOKEN`	Read-only token - not used
`YOUR_PROJECT_KV_URL`	TCP connection string - not used by `@upstash/redis`
`YOUR_PROJECT_REDIS_URL`	TCP connection string - not used

Variables you add manually (canonical alias names the code expects):

Variable	Value (copy from)	Required in
`UPSTASH_REDIS_REST_URL`	`YOUR_PROJECT_KV_REST_API_URL`	Vercel (all envs), `.env.local`, GitHub Actions
`UPSTASH_REDIS_REST_TOKEN`	`YOUR_PROJECT_KV_REST_API_TOKEN`	Vercel (all envs), `.env.local`, GitHub Actions
`ANTHROPIC_API_KEY`	console.anthropic.com → API Keys	Vercel (all envs), `.env.local`, GitHub Actions
`VERCEL_TOKEN`	vercel.com/account/tokens	GitHub Actions only
`VERCEL_ORG_ID`	Vercel project settings	GitHub Actions only
`VERCEL_PROJECT_ID`	Vercel project settings	GitHub Actions only

After adding the aliases in Vercel Settings, pull everything to local in one command:

vercel env pull .env.local

Add ANTHROPIC_API_KEY manually since it comes from the Anthropic console, not Vercel.
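
A quick way to catch a missing variable before it surfaces as a silent cache miss is to check the "Required in" column programmatically. This is a minimal sketch based on the table above; the Context type, the REQUIRED_IN map, and missingVariables are illustrative names and not part of the article's code.

```typescript
// Contexts from the "Required in" column; 'local' stands for .env.local.
type Context = 'vercel' | 'local' | 'github-actions'

const REQUIRED_IN: Record<string, Context[]> = {
	UPSTASH_REDIS_REST_URL: ['vercel', 'local', 'github-actions'],
	UPSTASH_REDIS_REST_TOKEN: ['vercel', 'local', 'github-actions'],
	ANTHROPIC_API_KEY: ['vercel', 'local', 'github-actions'],
	VERCEL_TOKEN: ['github-actions'],
	VERCEL_ORG_ID: ['github-actions'],
	VERCEL_PROJECT_ID: ['github-actions']
}

// Returns the names of variables required in `context` that are unset or empty in `env`.
function missingVariables(
	context: Context,
	env: Record<string, string | undefined>
): string[] {
	return Object.entries(REQUIRED_IN)
		.filter(([, contexts]) => contexts.includes(context))
		.map(([name]) => name)
		.filter((name) => !env[name])
}
```

Calling missingVariables('local', process.env) after vercel env pull should return only ANTHROPIC_API_KEY until you add that key by hand.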

Key takeaways

Upstash for Redis is the correct product in the Vercel Marketplace - not Vector, QStash, or Search
Vercel generates YOUR_PROJECT_KV_REST_API_URL / YOUR_PROJECT_KV_REST_API_TOKEN - the KV_ naming is legacy from the sunset Vercel KV product, kept for backward compatibility
Add UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN as alias variables in Vercel Settings - no code changes needed
Upstash Redis is available free on the Vercel Hobby plan - no Pro subscription required
The Redis client uses the same lazy singleton pattern as the Anthropic client - getRedis() returns null rather than throwing when env vars are absent, so callers treat a missing client as a cache miss
Store pre-rendered HTML, not prose segments. Cache hits require zero extra work: no file I/O, no segment assembly, no Shiki pass
Complete translations have no TTL because staleness is caused by source changes, not the passage of time; the GitHub Actions invalidation hook is the right signal, not a timer
Incomplete translations use a 2-hour TTL so they self-heal automatically without monitoring: the broken entry expires, the next visitor triggers a retry, and a successful retry stores the result permanently
CDN s-maxage tracks the Redis TTL (7200s in both layers for incomplete translations; one year at the CDN and no expiry in Redis for complete ones) so the two layers do not fall out of sync
Dual-key writes: t: stores the full HTML entry, tm: stores only title + description for efficient batch nav title overlays via a single mget call
Dynamic imports inside doTranslation keep the Anthropic SDK and Shiki out of the shared cold-start bundle: await import('$lib/i18n/translate') and await import('$lib/i18n/render') are only executed on cache misses, not on every route initialisation
Stream the translation promise. Returning doTranslation() unawaited lets the page shell render immediately; the {#await} block in +page.svelte shows a loading state while Claude and Shiki run in the background
Dynamic Cache-Control via setHeaders() replaces the static ISR expiration. Complete translations set s-maxage=31536000; incomplete ones set s-maxage=7200 to match the Redis TTL and auto-retry together
setCachedTranslation is void-called inside doTranslation - fire-and-forget prevents Redis write latency from blocking the stream resolution
Redis caching reduces translation cost, but EMFILE-style failures still require bounded concurrency on Redis-heavy admin routes - caching alone does not fix fan-out
Article six provides the complete CLI scripts, dev preview tool, deploy-time invalidation workflow, and Batch API catalog pipeline
