Length³: A Technical Blog Built on Astro 6 and Cloudflare Workers
A deep dive into how this blog is built — Astro 6, MDX content collections, Pagefind search, and Japanese typography, all deployed at the edge.
Length³ is a technical blog engineered as a portfolio piece: every architectural decision is deliberate, every dependency earns its place, and the reading experience is treated as a first-class concern. This post documents the full stack — from content pipeline to edge deployment — so you can understand what was built and why.
Architecture Overview
The stack has three layers:
| Layer | Technology |
|---|---|
| Framework | Astro 6 (static output + Cloudflare adapter) |
| Content | MDX + Content Layer API (Zod-validated) |
| Runtime | Cloudflare Workers (edge-native) |
The blog is statically generated at build time (output: 'static').
Static assets — HTML pages, fonts, and the Pagefind index — are bundled into the Worker and served from Cloudflare's edge network with no origin server.
Content System
Content Layer API and Zod Schema
Every article lives in src/content/blog/ as either .md or .mdx.
The collection is defined with a strict Zod schema:
const blog = defineCollection({
loader: glob({
pattern: '**/*.{md,mdx}',
base: './src/content/blog',
}),
schema: z.object({
title: z.string(),
description: z.string(),
publishDate: z.coerce.date(),
updatedDate: z.coerce.date().optional(),
tags: z.array(z.string()).default([]),
draft: z.boolean().default(false),
lang: z.enum(['en', 'ja']).default('en'),
}),
});
z.coerce.date() lets you write 2026-03-28 in frontmatter and receive a native Date object in templates — no manual parsing.
draft: true excludes a post from all listing pages via a shared predicate:
export const notDraft = (entry: BlogPost) => !entry.data.draft;
// Usage:
const allPosts = await getCollection('blog', notDraft);
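To see the predicate at work without the Astro runtime, here is a minimal sketch that applies it to plain objects. The PostLike type and the listPublished helper are hypothetical stand-ins for CollectionEntry<'blog'> and getCollection; they are not taken from the site's code.

```typescript
// Hypothetical stand-in for CollectionEntry<'blog'>; only the fields used here.
interface PostLike {
  id: string;
  data: { title: string; publishDate: Date; draft: boolean };
}

// Same shape as the shared predicate: keep everything that is not a draft.
const notDraft = (entry: PostLike): boolean => !entry.data.draft;

// Assumed listing helper: drop drafts, then sort newest first.
function listPublished(entries: PostLike[]): PostLike[] {
  return entries
    .filter(notDraft)
    .sort((a, b) => b.data.publishDate.getTime() - a.data.publishDate.getTime());
}

const sample: PostLike[] = [
  { id: 'old', data: { title: 'Old', publishDate: new Date('2025-01-10'), draft: false } },
  { id: 'wip', data: { title: 'WIP', publishDate: new Date('2026-03-28'), draft: true } },
  { id: 'new', data: { title: 'New', publishDate: new Date('2026-02-01'), draft: false } },
];

console.log(listPublished(sample).map((p) => p.id)); // [ 'new', 'old' ]
```

Keeping the predicate in one shared module means every listing page filters drafts the same way.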
Frontmatter Fields
| Field | Type | Purpose |
|---|---|---|
| title | string | Required. Article heading and <title> |
| description | string | Required. Lead paragraph and meta description |
| publishDate | YYYY-MM-DD | Required. Shown in article header and sort order |
| updatedDate | YYYY-MM-DD | Optional. Shown as "(upd. …)" next to publish date |
| tags | string[] | Routed to /tags/[slug], CJK-safe slugification |
| draft | boolean | Defaults false. true hides the post from all listings |
| lang | 'en' \| 'ja' | Controls date format and lang attribute |
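The publishDate and lang fields meet when a date is displayed. The sketch below shows one way that could work. formatPostDate is a hypothetical name, and the locale mapping and the UTC time zone are assumptions rather than the site's actual formatDate implementation.

```typescript
type PostLang = 'en' | 'ja';

// Assumed mapping from the frontmatter `lang` value to a BCP 47 locale.
const LOCALES: Record<PostLang, string> = { en: 'en-US', ja: 'ja-JP' };

// Format a publish date for display. A date-only string such as 2026-03-28 is
// parsed as midnight UTC, so formatting in UTC avoids an off-by-one day.
function formatPostDate(date: Date, lang: PostLang): string {
  return new Intl.DateTimeFormat(LOCALES[lang], {
    year: 'numeric',
    month: 'long',
    day: 'numeric',
    timeZone: 'UTC',
  }).format(date);
}

// Comparable to the Date a coerced frontmatter value would produce.
const published = new Date('2026-03-28');
console.log(formatPostDate(published, 'en'));
console.log(formatPostDate(published, 'ja'));
```

The exact strings come from the runtime's locale data, so the sketch prints them rather than hard-coding an expected result.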
MDX and Component Imports
MDX lets you import Astro components directly into article prose.
The RubyText component, for example, wraps a base text and its phonetic reading into a proper <ruby> element:
---
interface Props {
base: string;
reading: string;
}
const { base, reading } = Astro.props;
---
<ruby>{base}<rp>(</rp><rt>{reading}</rt><rp>)</rp></ruby>
In an article you write:
import RubyText from '../../components/RubyText.astro';
<RubyText base="東京" reading="とうきょう" />
は日本の首都です。
Which renders as: 東京は日本の首都です。
The <rp> fallback parentheses are included so the annotation degrades gracefully in browsers or screen readers that do not support <ruby>.
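The markup and its fallback can be checked outside Astro with a small string-based sketch. renderRuby and escapeHtml are hypothetical helpers written for this illustration; the real component relies on Astro's own template escaping.

```typescript
// Escape the characters that are significant in HTML text content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical string-based equivalent of the RubyText template.
function renderRuby(base: string, reading: string): string {
  return `<ruby>${escapeHtml(base)}<rp>(</rp><rt>${escapeHtml(reading)}</rt><rp>)</rp></ruby>`;
}

const html = renderRuby('東京', 'とうきょう');
console.log(html);
// <ruby>東京<rp>(</rp><rt>とうきょう</rt><rp>)</rp></ruby>

// What a consumer that ignores ruby markup is left with: tags stripped, text kept.
console.log(html.replace(/<[^>]+>/g, ''));
// 東京(とうきょう)
```

The second output shows why the <rp> elements matter: without them the reading would run straight into the base text.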
Article Layout and Navigation
Three-Column Shell
The article page uses a CSS grid with three columns:
[ TOC (sticky) ] [ prose (680px measure) ] [ empty right margin ]
The right column is intentionally empty — no widgets, no related posts, no ads. Hierarchy is communicated through spacing and typography alone, following a print-editorial model.
Table of Contents
The TableOfContents component extracts h2 and h3 headings from the MDX render tree and renders an accessible <nav>:
- Desktop (≥ 720px): TOC stays visible in the left rail as secondary navigation
- Mobile (< 720px): TOC is hidden so the reader reaches title and prose immediately
- Active section: an IntersectionObserver tracks which heading is in the top 45% of the viewport and highlights the corresponding TOC link in amber
- Smooth scroll: TOC link clicks call scrollIntoView() and push the hash to history without a hard jump
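The active-section rule can be modelled as a pure function. This is a sketch of the decision an observer callback might make, not the component's actual code; HeadingPosition, pickActiveHeading, and the rootMargin value are assumptions derived from the 45% figure above.

```typescript
// Hypothetical input: each heading's id and its top edge relative to the viewport (px).
interface HeadingPosition {
  id: string;
  top: number;
}

// A rootMargin that shrinks the observer's root to the top 45% of the viewport:
// the bottom edge is pulled up by the remaining 55%.
const ACTIVE_ZONE_ROOT_MARGIN = '0px 0px -55% 0px';

// Pick the heading to highlight: the last one (in document order) whose top edge
// has reached the active zone. Returns null before the first heading arrives.
function pickActiveHeading(headings: HeadingPosition[], viewportHeight: number): string | null {
  const zoneBottom = viewportHeight * 0.45;
  let active: string | null = null;
  for (const heading of headings) {
    if (heading.top <= zoneBottom) active = heading.id;
  }
  return active;
}

const positions: HeadingPosition[] = [
  { id: 'intro', top: -300 }, // already scrolled past
  { id: 'setup', top: 120 },  // inside the top 45% of a 1000px viewport
  { id: 'deploy', top: 700 }, // still below the active zone
];
console.log(pickActiveHeading(positions, 1000)); // setup
```

In a browser, the same zone could be expressed by passing ACTIVE_ZONE_ROOT_MARGIN as the rootMargin option of an IntersectionObserver.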
Reading Time
Reading time is estimated with a CJK-aware regex-based algorithm:
export function estimateReadingTime(text: string): number {
const cjkChars = text.match(/[\u3000-\u9FFF\uF900-\uFAFF]/gu)?.length ?? 0;
const latinWords = text.replace(/[\u3000-\u9FFF\uF900-\uFAFF]/gu, ' ').match(/\S+/g)?.length ?? 0;
// CJK reading ~400 chars/min, Latin ~200 words/min
const minutes = cjkChars / 400 + latinWords / 200;
return Math.max(1, Math.ceil(minutes));
}
The implementation keeps the rules simple: CJK characters are counted directly, non-CJK tokens are counted with a whitespace split, and the result is rounded up to a minimum of one minute.
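A few worked inputs make the arithmetic concrete. readingTimeBreakdown is a hypothetical helper that applies the same counting rules as estimateReadingTime but also returns the intermediate counts.

```typescript
const CJK_PATTERN = /[\u3000-\u9FFF\uF900-\uFAFF]/gu;

// Same rules as estimateReadingTime, with the intermediate values exposed.
function readingTimeBreakdown(text: string): { cjkChars: number; latinWords: number; minutes: number } {
  const cjkChars = text.match(CJK_PATTERN)?.length ?? 0;
  const latinWords = text.replace(CJK_PATTERN, ' ').match(/\S+/g)?.length ?? 0;
  const rawMinutes = cjkChars / 400 + latinWords / 200;
  return { cjkChars, latinWords, minutes: Math.max(1, Math.ceil(rawMinutes)) };
}

// Short mixed text: 3 CJK characters and 2 Latin tokens, clamped up to 1 minute.
console.log(readingTimeBreakdown('Astro 6 は速い'));
// { cjkChars: 3, latinWords: 2, minutes: 1 }

// 600 CJK characters (1.5 min) plus 300 Latin words (1.5 min) gives 3 minutes.
console.log(readingTimeBreakdown('漢'.repeat(600) + ' word'.repeat(300)));
// { cjkChars: 600, latinWords: 300, minutes: 3 }
```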
Tag System
Tags are normalised into canonical slugs so that CJK tags survive the transformation and punctuation-heavy names remain routable:
export function canonicalizeTagSlug(tag: string): string {
const normalized = tag.normalize('NFKC').trim().toLowerCase();
const base = normalized
.replace(/\s+/g, '-')
.replace(/[^\p{L}\p{N}-]/gu, '')
.replace(/-+/g, '-')
.replace(/^-+|-+$/g, '');
return base || `tag-${hashText(normalized || tag)}`;
}
Unicode property escapes keep 機械学習 intact, while punctuation cleanup and hyphen collapsing normalise names like TypeScript / JS to a stable canonical route.
When every character is stripped (e.g. a tag consisting entirely of punctuation), the function falls back to an FNV-1a hash to guarantee a routable slug.
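The following sketch runs the slug function on the three cases just described. The hashText shown here is an assumption: a 32-bit FNV-1a over the UTF-8 bytes, formatted as hex. The site's real helper is not shown in this post and may differ in encoding or output format.

```typescript
// Assumed FNV-1a (32-bit) over UTF-8 bytes, returned as 8 hex digits.
function hashText(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i]!;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash.toString(16).padStart(8, '0');
}

// Copied from the article so the example is self-contained.
function canonicalizeTagSlug(tag: string): string {
  const normalized = tag.normalize('NFKC').trim().toLowerCase();
  const base = normalized
    .replace(/\s+/g, '-')
    .replace(/[^\p{L}\p{N}-]/gu, '')
    .replace(/-+/g, '-')
    .replace(/^-+|-+$/g, '');
  return base || `tag-${hashText(normalized || tag)}`;
}

console.log(canonicalizeTagSlug('TypeScript / JS')); // typescript-js
console.log(canonicalizeTagSlug('機械学習'));         // 機械学習
console.log(canonicalizeTagSlug('!!!'));             // tag- followed by a hash of '!!!'
```

Because the fallback hashes the normalised input, the same punctuation-only tag always maps to the same route.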
Tag archive pages are generated at /tags/[tag] via getStaticPaths.
The buildTagRoutes utility centralises aggregation so the topic directory and per-tag listing pages share exactly the same routing model:
export function buildTagRoutes(posts: BlogPost[]): TagRoute[] {
// Canonical slugs are collision-safe and deterministic.
}
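The body of buildTagRoutes is elided above. As a rough idea of what such an aggregation could look like, here is a sketch under stated assumptions: TaggedPost, TagRouteSketch, and buildTagRoutesSketch are hypothetical names, and the slug function is passed in as a parameter rather than imported.

```typescript
// Hypothetical minimal post shape: an id plus its raw tag names.
interface TaggedPost {
  id: string;
  tags: string[];
}

// Hypothetical route record for one /tags/[tag] page.
interface TagRouteSketch {
  slug: string;      // route parameter
  label: string;     // first-seen display name for the tag
  postIds: string[]; // posts listed on the tag page
}

function buildTagRoutesSketch(
  posts: TaggedPost[],
  toSlug: (tag: string) => string,
): TagRouteSketch[] {
  const routes = new Map<string, TagRouteSketch>();
  for (const post of posts) {
    for (const tag of post.tags) {
      const slug = toSlug(tag);
      const route = routes.get(slug) ?? { slug, label: tag, postIds: [] };
      // Tags that differ only in case or spacing share one slug, so avoid duplicates.
      if (!route.postIds.includes(post.id)) route.postIds.push(post.id);
      routes.set(slug, route);
    }
  }
  // Sort by slug so the output does not depend on input order.
  return Array.from(routes.values()).sort((a, b) =>
    a.slug < b.slug ? -1 : a.slug > b.slug ? 1 : 0,
  );
}

// Simplified slug function for the demo only.
const demoSlug = (tag: string): string => tag.trim().toLowerCase().replace(/\s+/g, '-');

const tagRoutes = buildTagRoutesSketch(
  [
    { id: 'a', tags: ['Astro', 'Edge Runtime'] },
    { id: 'b', tags: ['astro', '機械学習'] },
  ],
  demoSlug,
);
console.log(tagRoutes.map((r) => `${r.slug}: ${r.postIds.join(',')}`));
```

Both the topic directory and the per-tag pages could consume the same array, which is the point of centralising the aggregation.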
Full-Text Search with Pagefind
Search is powered by Pagefind, a static-site search library: the index is shipped as static files alongside the pages and queries run in the reader's browser, with no external search service.
Custom segmented-pagefind Integration
Rather than using the off-the-shelf astro-pagefind npm package, the site ships a bespoke integration (src/integrations/segmented-pagefind.ts) that solves a fundamental limitation: Pagefind's WASM tokenizer was designed for space-delimited languages and does not recognise Japanese word boundaries.
Indexing raw Japanese HTML would produce a single unsplittable token per sentence, making any non-trivial query unreliable.
The integration works in two phases:
Build phase (astro:build:done):
- After astro build produces static HTML, the integration scans the output and identifies only article pages (matched against the src/content/blog/ source files).
- For each Japanese article (<html lang="ja">), buildSyntheticSearchHtml is called: JSDOM parses the page, every text node inside [data-pagefind-body] that contains Han/Hiragana/Katakana characters is visited, and Intl.Segmenter inserts spaces at morpheme boundaries.
- The rewritten HTML, with lang forced to en so Pagefind routes through its Latin tokenizer, is fed to Pagefind's Node.js API (createIndex / addHTMLFile).
- A single unified index is written to dist/pagefind/.
// Segment Japanese text before handing it to Pagefind's Latin tokenizer
export function segmentJapaneseSearchText(value: string): string {
let result = '';
let lastWasSearchable = false;
for (const segment of SEGMENTER.segment(value)) {
const isSearchable =
segment.isWordLike ||
JAPANESE_SCRIPT_PATTERN.test(segment.segment) ||
WORDLIKE_PATTERN.test(segment.segment);
if (isSearchable && lastWasSearchable && !result.endsWith(' ')) result += ' ';
result += segment.segment;
lastWasSearchable = isSearchable;
}
return result;
}
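The function above depends on module-level constants that this post does not show. The sketch below fills in two of them with assumed definitions and lists the segments Intl.Segmenter produces, which is the raw material the loop works on. describeSegments is a hypothetical helper for illustration only.

```typescript
// Assumed definitions; the real constants in the integration may differ.
const SEGMENTER = new Intl.Segmenter('ja', { granularity: 'word' });
// No `g` flag, so repeated .test() calls carry no lastIndex state.
const JAPANESE_SCRIPT_PATTERN = /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]/u;

// List each segment together with the flags the loop above branches on.
function describeSegments(value: string): { text: string; isWordLike: boolean; isJapanese: boolean }[] {
  return Array.from(SEGMENTER.segment(value), (s) => ({
    text: s.segment,
    isWordLike: s.isWordLike ?? false,
    isJapanese: JAPANESE_SCRIPT_PATTERN.test(s.segment),
  }));
}

const sentence = '東京は日本の首都です。';
const segments = describeSegments(sentence);
console.log(segments.map((s) => s.text).join(' | '));
```

The exact boundaries depend on the ICU data bundled with the runtime, so the sketch prints them instead of asserting a fixed split. Whatever the split, concatenating the segments always reproduces the input, which is why inserting spaces between word-like segments loses no text.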
Dev-server phase (astro:server:setup):
A Vite dev-server middleware intercepts /pagefind/* requests and serves files from the last written index directory, so search remains functional during astro dev without re-indexing on every hot reload.
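The core of such a middleware is mapping a request URL onto a file inside the index directory. The sketch below shows only that mapping, as a pure function; resolvePagefindAsset, the prefix constant, and the traversal checks are assumptions for illustration and are not taken from the integration.

```typescript
const PAGEFIND_PREFIX = '/pagefind/';

// Map a request URL to a file path under indexDir, or null if the request
// is not a Pagefind asset or tries to escape the index directory.
function resolvePagefindAsset(requestUrl: string, indexDir: string): string | null {
  const pathname = requestUrl.split(/[?#]/)[0] ?? '';
  if (!pathname.startsWith(PAGEFIND_PREFIX)) return null;

  let relative: string;
  try {
    relative = decodeURIComponent(pathname.slice(PAGEFIND_PREFIX.length));
  } catch {
    return null; // malformed percent-encoding
  }

  const parts = relative.split('/');
  const unsafe = parts.some((p) => p === '' || p === '.' || p === '..' || p.includes('\\'));
  if (unsafe) return null;

  return `${indexDir.replace(/\/+$/, '')}/${parts.join('/')}`;
}

console.log(resolvePagefindAsset('/pagefind/pagefind.js?ts=1', 'dist/pagefind/'));
// dist/pagefind/pagefind.js
console.log(resolvePagefindAsset('/pagefind/%2e%2e/secret', 'dist/pagefind'));
// null
```

A middleware built on this would call the next handler when the function returns null and stream the file otherwise.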
The article body is annotated data-pagefind-body in ArticleLayout.astro:
<article class="prose" data-pagefind-body lang={lang}></article>
This scopes the index to article content, excluding header, footer, and sidebar noise.
On the client, Pagefind's WASM engine loads the unified index from the same edge node that served the page — no extra round-trips, no Worker-side search service.
Typography
Fraunces Variable Font
Display headings use Fraunces, a variable optical-size serif with a distinctively soft character.
Both files (Fraunces-Variable.woff2, Fraunces-Italic-Variable.woff2) are self-hosted under public/fonts/ and declared with font-display: optional, so first paint falls back immediately while repeat visits reuse the cached face.
Font Stack
--font-display:
'Fraunces', 'Hiragino Mincho ProN', 'Yu Mincho', 'Noto Serif JP', Georgia, 'Noto Sans JP', serif;
--font-sans:
'Noto Sans JP', 'Hiragino Kaku Gothic ProN', 'Yu Gothic UI', 'Meiryo', system-ui, sans-serif;
--font-mono:
'JetBrains Mono', ui-monospace, 'Fira Code', 'Cascadia Code', 'Noto Sans JP',
'Hiragino Kaku Gothic ProN', 'Yu Gothic UI', 'Meiryo', monospace;
Japanese mincho fallbacks are placed before Georgia in the display stack so that CJK characters in headings resolve to a matching serif/mincho face.
Japanese sans-serif fonts are listed before system-ui so that CJK body text resolves correctly on machines that have only local Japanese fonts installed.
Japanese Typography Features
The blog has first-class support for Japanese composition:
- <ruby> annotations via the RubyText component, e.g. 情報処理
- line-break: strict, which prevents breaks before small kana (ぁぃぅぇぉ…) and other restricted characters
- text-wrap: pretty, so multi-line headings break at semantically reasonable points
- font-feature-settings: 'palt' for proportional alternate spacing of punctuation (。、…); it is applied to headings globally and to the prose body when lang="ja", tightening the visual rhythm of CJK text
- text-autospace: ideograph-alpha for automatic thin-space insertion between CJK and Latin characters, so Astro 6 embedded in Japanese prose reads naturally
Amber Accent
The single accent colour throughout the UI is amber: #d4820a for decorative elements and #b45309 for text and interactive elements, which clears the WCAG AA minimum of 4.5:1 contrast against the paper background.
It appears on the rule below article titles, TOC active-section indicators, tag links, and dates.
Everything else is ink (#1c1a18) on paper (#f9f8f5).
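The contrast figures can be checked with the WCAG 2.x relative-luminance formula. The helper names below are hypothetical; the formula and the 4.5:1 AA threshold for normal text come from WCAG itself.

```typescript
// Relative luminance of a '#rrggbb' colour per the WCAG 2.x definition.
function relativeLuminance(hex: string): number {
  const [r, g, b] = [1, 3, 5]
    .map((i) => parseInt(hex.slice(i, i + 2), 16) / 255)
    .map((c) => (c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4)));
  return 0.2126 * r! + 0.7152 * g! + 0.0722 * b!;
}

// Contrast ratio between two colours: lighter over darker, each offset by 0.05.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

const PAPER = '#f9f8f5';
for (const [name, colour] of [
  ['text amber', '#b45309'],
  ['decorative amber', '#d4820a'],
  ['ink', '#1c1a18'],
] as const) {
  console.log(name, contrastRatio(colour, PAPER).toFixed(2));
}
```

Running this shows the darker amber above 4.5:1 and the lighter amber below it, which matches the split between text use and decorative use.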
Developer Experience
TypeScript in Strict Mode
tsconfig.json enables strict: true.
All content collection types flow from CollectionEntry<'blog'> through ProcessedPost into each page component — there is no any in the data path.
Biome and Prettier
Linting and formatting use two tools with complementary scopes:
| Tool | Target |
|---|---|
| Biome | *.ts, *.tsx (lint + format) |
| Prettier + prettier-plugin-astro | *.astro (format only) |
Biome handles TypeScript linting and formatting in a single binary; Prettier covers Astro template formatting via prettier-plugin-astro.
Husky Pre-commit Hooks
lint-staged runs automatically on git commit via Husky:
"lint-staged": {
"src/**/*.{ts,tsx}": ["biome check --write", "biome format --write"],
"src/**/*.astro": ["prettier --write"],
"*.{json,md}": ["prettier --write"]
}
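This keeps unformatted or unlinted code out of the repository in normal use (a commit made with --no-verify skips the hook). Because lint-staged only processes staged files, in-progress work in the working tree is left alone.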
Testing
| Suite | Tool | Scope |
|---|---|---|
| Unit | Vitest 4 | estimateReadingTime, canonical tag routing, TagResolver, formatDate |
| End-to-end | Playwright | Route smoke, search modal bootstrap, tag navigation |
Vitest shares the Vite config and runs in the node environment, so TypeScript compilation goes through the same pipeline as the main build.
Playwright is a smoke suite that runs against the built dist/ artifact (downloaded from the CI build job artifact), while browser interaction logic is covered in Vitest with jsdom.
During astro dev, Pagefind search reuses the last written index, so run pnpm build after content changes to get fresh search results.
Deployment
Cloudflare Workers
The site is delivered to Cloudflare's global network as a named Worker (length3); there is no Cloudflare Pages project. The repository itself keeps the Wrangler-compatible output, while production rollout is handled by Cloudflare Workers Builds.
There is no hand-authored wrangler.toml in the repository either; the @astrojs/cloudflare adapter generates dist/server/wrangler.json at build time, derived directly from the Astro config.
The generated config is still what local verification and emergency manual deploys run against:
pnpm build
pnpm exec wrangler deploy --config dist/server/wrangler.json
The adapter is configured with prerenderEnvironment: 'node' so that Astro's prerender phase runs under Node.js (faster, full API surface) while the final Worker bundle remains edge-native:
adapter: cloudflare({
imageService: 'passthrough',
prerenderEnvironment: 'node',
})
The Vite SSR target is set to 'webworker' so Node.js-specific modules are excluded from the bundle.
Pagefind's WASM module and the Cloudflare adapter server entrypoint are excluded from Vite's dep-optimizer to prevent double-bundling:
vite: {
ssr: {
target: 'webworker',
},
optimizeDeps: {
exclude: ['pagefind', '@astrojs/cloudflare/entrypoints/server'],
},
build: {
chunkSizeWarningLimit: 1024,
},
}
Two build options keep output tidy and fast: format: 'file' generates flat slug.html files (no slug/index.html nesting), and concurrency: 4 parallelises page rendering across the static build.
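To illustrate what format: 'file' changes in practice, the sketch below maps a route to the file each format would emit. outputPathFor is a hypothetical helper that mirrors the documented behaviour of the option; it is not Astro's internal code.

```typescript
type BuildFormat = 'file' | 'directory';

// Illustrative mapping from a route to its emitted HTML file.
function outputPathFor(route: string, format: BuildFormat): string {
  const slug = route.replace(/^\/+|\/+$/g, ''); // trim leading and trailing slashes
  if (slug === '') return 'index.html';         // the home page is index.html either way
  return format === 'file' ? `${slug}.html` : `${slug}/index.html`;
}

console.log(outputPathFor('/blog/astro-6', 'file'));      // blog/astro-6.html
console.log(outputPathFor('/blog/astro-6', 'directory')); // blog/astro-6/index.html
```

The flat layout produces one file per page, which keeps the asset list bundled into the Worker easy to inspect.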
Because the site is fully prerendered and never uses runtime sessions, the Cloudflare adapter's default KV session provisioning is bypassed with an in-process LRU driver:
session: {
driver: sessionDrivers.lruCache(),
}
This prevents Wrangler from requiring a KV namespace binding during wrangler deploy for a site that has no dynamic routes.
CI/CD
GitHub Actions runs a single workflow, ci.yml, on every push and PR. It has three jobs:
- Lint, type check, and unit tests (biome check, astro check, vitest run)
- Build (uploads dist/ as an artifact)
- Playwright E2E (downloads the build artifact, runs playwright test)
The build and E2E jobs are chained so Playwright always tests the exact build that was just produced.
Production rollout is handled by Cloudflare Workers Builds, so this repository no longer carries a separate deploy workflow.
Because Pagefind indexing is handled inside the astro:build:done hook, the search index is always produced as part of the single pnpm build step — no separate post-processing pass required.
Design Principles
Length³ is designed as a reading-first interface. Rather than feeling like a typical web application, it leans toward an editorial, print-like structure:
- Restrained typography — two weights of two typefaces, one accent colour
- Generous whitespace — hierarchy communicated through spacing alone, not decorative borders
- No sidebar on article pages — the right column is intentionally left empty
- 680 px prose measure — optimised for a comfortable line length across all viewport sizes
The result is an interface whose character comes entirely from spacing and type treatment, not from decoration.