2026/03/20 upd. 2026/03/29 12 min read

Length³: A Technical Blog Built on Astro 6 and Cloudflare Workers

A deep dive into how this blog is built — Astro 6, MDX content collections, Pagefind search, and Japanese typography, all deployed at the edge.

Length³ is a technical blog engineered as a portfolio piece: every architectural decision is deliberate, every dependency earns its place, and the reading experience is treated as a first-class concern. This post documents the full stack — from content pipeline to edge deployment — so you can understand what was built and why.

Architecture Overview

The stack has three layers:

Layer	Technology
Framework	Astro 6 (static output + Cloudflare adapter)
Content	MDX + Content Layer API (Zod-validated)
Runtime	Cloudflare Workers (edge-native)

The blog is statically generated at build time (output: 'static'). Static assets — HTML pages, fonts, and the Pagefind index — are bundled into the Worker and served from Cloudflare's edge network with no origin server.

Content System

Content Layer API and Zod Schema

Every article lives in src/content/blog/ as either .md or .mdx. The collection is defined with a strict Zod schema:

const blog = defineCollection({
  loader: glob({
    pattern: '**/*.{md,mdx}',
    base: './src/content/blog',
  }),
  schema: z.object({
    title: z.string(),
    description: z.string(),
    publishDate: z.coerce.date(),
    updatedDate: z.coerce.date().optional(),
    tags: z.array(z.string()).default([]),
    draft: z.boolean().default(false),
    lang: z.enum(['en', 'ja']).default('en'),
  }),
});

z.coerce.date() lets you write 2026-03-28 in frontmatter and receive a native Date object in templates — no manual parsing. draft: true excludes a post from all listing pages via a shared predicate:

export const notDraft = (entry: BlogPost) => !entry.data.draft;
// Usage:
const allPosts = await getCollection('blog', notDraft);
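
As a rough illustration of what these two pieces do at listing time, the sketch below mimics them without Astro or Zod. The `PostEntry` shape, the sample entries, and the `listPublished` helper are hypothetical, and the newest-first ordering is an assumption based on `publishDate` driving sort order. `new Date(value)` stands in for `z.coerce.date()`, which coerces its input the same way before validating it.

```typescript
// Hypothetical stand-in for a collection entry; the real type comes from Astro.
interface PostEntry {
  id: string;
  data: { title: string; publishDate: Date; draft: boolean };
}

// z.coerce.date() wraps its input in new Date(...) before validating,
// so a frontmatter value such as 2026-03-28 arrives as a Date.
const toDate = (value: string): Date => new Date(value);

const entries: PostEntry[] = [
  { id: 'a', data: { title: 'Older post', publishDate: toDate('2026-03-20'), draft: false } },
  { id: 'b', data: { title: 'Unfinished', publishDate: toDate('2026-03-29'), draft: true } },
  { id: 'c', data: { title: 'Newer post', publishDate: toDate('2026-03-28'), draft: false } },
];

// Same predicate as in the article, typed against the stand-in shape.
const notDraft = (entry: PostEntry): boolean => !entry.data.draft;

// Assumed listing order: newest first.
function listPublished(all: PostEntry[]): PostEntry[] {
  return all
    .filter(notDraft)
    .sort((a, b) => b.data.publishDate.getTime() - a.data.publishDate.getTime());
}

const published = listPublished(entries);
```

Here `published` holds entries `c` and `a` in that order; the draft `b` never reaches a listing even though it has the most recent date.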

Frontmatter Fields

Field	Type	Purpose
`title`	`string`	Required. Article heading and `<title>`
`description`	`string`	Required. Lead paragraph and meta description
`publishDate`	`YYYY-MM-DD`	Required. Shown in article header and sort order
`updatedDate`	`YYYY-MM-DD`	Optional. Shown as "(upd. …)" next to publish date
`tags`	`string[]`	Routed to `/tags/[slug]`, CJK-safe slugification
`draft`	`boolean`	Defaults `false`. `true` hides the post from all listings
`lang`	`'en' | 'ja'`	Controls date format and `lang` attribute

MDX and Component Imports

MDX lets you import Astro components directly into article prose. The RubyText component, for example, wraps a base text and its phonetic reading into a proper <ruby> element:

---
interface Props {
  base: string;
  reading: string;
}
const { base, reading } = Astro.props;
---

<ruby>{base}<rp>(</rp><rt>{reading}</rt><rp>)</rp></ruby>

In an article you write:

import RubyText from '../../components/RubyText.astro';

<RubyText base="東京" reading="とうきょう" />
は日本の首都です。

Which renders as: 東京(とうきょう)は日本の首都です。

The <rp> fallback parentheses are included so the annotation degrades gracefully in browsers or screen readers that do not support <ruby>.

Article Layout and Navigation

Three-Column Shell

The article page uses a CSS grid with three columns:

[ TOC (sticky) ] [ prose (680px measure) ] [ empty right margin ]

The right column is intentionally empty — no widgets, no related posts, no ads. Hierarchy is communicated through spacing and typography alone, following a print-editorial model.

Table of Contents

The TableOfContents component extracts h2 and h3 headings from the MDX render tree and renders an accessible <nav>:

Desktop (≥ 720px): TOC stays visible in the left rail as secondary navigation
Mobile (< 720px): TOC is hidden so the reader reaches title and prose immediately
Active section: an IntersectionObserver tracks which heading is in the top 45% of the viewport and highlights the corresponding TOC link in amber
Smooth scroll: TOC link clicks call scrollIntoView() and push the hash to history without a hard jump
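
The active-section rule can be expressed as a small pure function, which is easier to reason about than the observer wiring. This is only a sketch under assumptions: `HeadingPosition`, `ACTIVE_BAND`, and `pickActiveHeading` are hypothetical names, and the site's real component uses an IntersectionObserver rather than measured positions.

```typescript
// Hypothetical measurement of one heading: its id and its top edge
// relative to the viewport (what getBoundingClientRect().top reports).
interface HeadingPosition {
  id: string;
  top: number;
}

// Fraction of the viewport height that counts as the "active" band.
const ACTIVE_BAND = 0.45;

// Returns the id of the last heading (in document order) whose top edge
// sits at or above the bottom of the band, or null if none has reached it.
function pickActiveHeading(
  headings: HeadingPosition[],
  viewportHeight: number,
): string | null {
  const bandBottom = viewportHeight * ACTIVE_BAND;
  let active: string | null = null;
  for (const heading of headings) {
    if (heading.top <= bandBottom) active = heading.id;
  }
  return active;
}

const activeId = pickActiveHeading(
  [
    { id: 'content-system', top: -300 }, // already scrolled past
    { id: 'search', top: 100 },          // inside the top 45% band
    { id: 'typography', top: 700 },      // still below the band
  ],
  1000,
);
```

With an IntersectionObserver, one way to get a comparable band is a `rootMargin` of `'0px 0px -55% 0px'`, which shrinks the observed area to the top 45% of the viewport. Whether the site uses exactly that margin is an assumption.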

Reading Time

Reading time is estimated with a CJK-aware regex-based algorithm:

export function estimateReadingTime(text: string): number {
  const cjkChars = text.match(/[\u3000-\u9FFF\uF900-\uFAFF]/gu)?.length ?? 0;
  const latinWords = text.replace(/[\u3000-\u9FFF\uF900-\uFAFF]/gu, ' ').match(/\S+/g)?.length ?? 0;

  // CJK reading ~400 chars/min, Latin ~200 words/min
  const minutes = cjkChars / 400 + latinWords / 200;
  return Math.max(1, Math.ceil(minutes));
}

The implementation keeps the rules simple: CJK characters are counted directly, non-CJK tokens are counted with a whitespace split, and the result is rounded up to a minimum of one minute.
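
A few worked inputs make the arithmetic concrete. The function is repeated so the example runs on its own; the sample strings are made up for illustration.

```typescript
// Copy of the function above so the example is self-contained.
function estimateReadingTime(text: string): number {
  const cjkChars = text.match(/[\u3000-\u9FFF\uF900-\uFAFF]/gu)?.length ?? 0;
  const latinWords = text.replace(/[\u3000-\u9FFF\uF900-\uFAFF]/gu, ' ').match(/\S+/g)?.length ?? 0;
  const minutes = cjkChars / 400 + latinWords / 200;
  return Math.max(1, Math.ceil(minutes));
}

const japaneseOnly = '漢'.repeat(800);                       // 800 / 400 = 2.0 minutes
const englishOnly = 'word '.repeat(300);                     // 300 / 200 = 1.5 minutes
const mixed = '漢'.repeat(400) + ' ' + 'word '.repeat(100);  // 1.0 + 0.5 = 1.5 minutes

const estimates = {
  japaneseOnly: estimateReadingTime(japaneseOnly), // 2
  englishOnly: estimateReadingTime(englishOnly),   // 2 (1.5 rounded up)
  mixed: estimateReadingTime(mixed),               // 2 (1.5 rounded up)
  empty: estimateReadingTime(''),                  // 1 (the minimum)
};
```

Because the two rates are simply added, a bilingual article is charged for each script separately rather than being averaged into a single rate.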

Tag System

Tags are normalised into canonical slugs so that CJK tags survive the transformation and punctuation-heavy names remain routable:

export function canonicalizeTagSlug(tag: string): string {
  const normalized = tag.normalize('NFKC').trim().toLowerCase();
  const base = normalized
    .replace(/\s+/g, '-')
    .replace(/[^\p{L}\p{N}-]/gu, '')
    .replace(/-+/g, '-')
    .replace(/^-+|-+$/g, '');

  return base || `tag-${hashText(normalized || tag)}`;
}

Unicode property escapes keep 機械学習 intact, while punctuation cleanup and hyphen collapsing normalise names like TypeScript / JS to a stable canonical route. When every character is stripped (e.g. a tag consisting entirely of punctuation), the function falls back to an FNV-1a hash to guarantee a routable slug.
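
The three cases described above can be checked directly. In this sketch `canonicalizeTagSlug` is copied from the article, while `hashText` is a hypothetical stand-in: the real helper is not shown, so a plain 32-bit FNV-1a over UTF-8 bytes, formatted as eight hex digits, is assumed here.

```typescript
// Hypothetical stand-in for hashText: 32-bit FNV-1a over the UTF-8 bytes,
// returned as 8 hex digits. The real helper is not shown in the article.
function hashText(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return ('0000000' + hash.toString(16)).slice(-8);
}

// Copy of the function above so the example is self-contained.
function canonicalizeTagSlug(tag: string): string {
  const normalized = tag.normalize('NFKC').trim().toLowerCase();
  const base = normalized
    .replace(/\s+/g, '-')
    .replace(/[^\p{L}\p{N}-]/gu, '')
    .replace(/-+/g, '-')
    .replace(/^-+|-+$/g, '');

  return base || `tag-${hashText(normalized || tag)}`;
}

const slugs = {
  cjk: canonicalizeTagSlug('機械学習'),           // '機械学習' (letters are kept as-is)
  mixed: canonicalizeTagSlug('TypeScript / JS'),  // 'typescript-js'
  spaced: canonicalizeTagSlug('  Web  Dev '),     // 'web-dev'
  punctuation: canonicalizeTagSlug('!!!'),        // 'tag-' followed by a hash
};
```

The fallback only fires when nothing survives the cleanup, so ordinary tags never carry a hash in their URL.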

Tag archive pages are generated at /tags/[tag] via getStaticPaths. The buildTagRoutes utility centralises aggregation so the topic directory and per-tag listing pages share exactly the same routing model:

export function buildTagRoutes(posts: BlogPost[]): TagRoute[] {
  // Canonical slugs are collision-safe and deterministic.
}
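
The body of buildTagRoutes is elided above. The following is a minimal sketch of the kind of aggregation such a utility might perform, not the site's implementation: `PostLike`, `TagRouteSketch`, `buildTagRoutesSketch`, and the `toSlug` parameter are hypothetical, and grouping tags that share a slug under the first label seen is an assumption.

```typescript
// Hypothetical shapes; the article's BlogPost and TagRoute types are not shown.
interface PostLike {
  id: string;
  data: { tags: string[] };
}

interface TagRouteSketch {
  slug: string;
  label: string; // first spelling of the tag encountered
  posts: PostLike[];
}

// Groups posts by canonical slug and returns the routes in a stable order.
function buildTagRoutesSketch(
  posts: PostLike[],
  toSlug: (tag: string) => string,
): TagRouteSketch[] {
  const routes = new Map<string, TagRouteSketch>();
  for (const post of posts) {
    for (const tag of post.data.tags) {
      const slug = toSlug(tag);
      const route = routes.get(slug) ?? { slug, label: tag, posts: [] };
      // A post that lists the same tag twice is only counted once.
      if (route.posts.indexOf(post) === -1) route.posts.push(post);
      routes.set(slug, route);
    }
  }
  // Sort by slug so the output does not depend on post order.
  return Array.from(routes.values()).sort((a, b) =>
    a.slug < b.slug ? -1 : a.slug > b.slug ? 1 : 0,
  );
}

const simpleSlug = (tag: string): string => tag.trim().toLowerCase().replace(/\s+/g, '-');

const tagRoutes = buildTagRoutesSketch(
  [
    { id: 'p1', data: { tags: ['Astro', 'Search'] } },
    { id: 'p2', data: { tags: ['astro', 'astro'] } },
  ],
  simpleSlug,
);
```

Both the topic directory and the per-tag pages could then read from the same array, which is the property the article describes.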

Full-Text Search with Pagefind

Search is powered by Pagefind, a static-site search library that runs entirely in the reader's browser against a prebuilt static index, with no external search service.

Custom `segmented-pagefind` Integration

Rather than using the off-the-shelf astro-pagefind npm package, the site ships a bespoke integration (src/integrations/segmented-pagefind.ts) that solves a fundamental limitation: Pagefind's WASM tokenizer was designed for space-delimited languages and does not recognise Japanese word boundaries. Indexing raw Japanese HTML would produce a single unsplittable token per sentence, making any non-trivial query unreliable.

The integration works in two phases:

Build phase (astro:build:done):

After astro build produces static HTML, the integration scans the output and identifies only article pages (matched against the src/content/blog/ source files).
For each Japanese article (<html lang="ja">), buildSyntheticSearchHtml is called: JSDOM parses the page, every text node inside [data-pagefind-body] that contains Han/Hiragana/Katakana characters is visited, and spaces are inserted at the word boundaries reported by Intl.Segmenter.
The rewritten HTML — with lang forced to en so Pagefind routes through its Latin tokenizer — is fed to Pagefind's Node.js API (createIndex / addHTMLFile).
A single unified index is written to dist/pagefind/.

// Segment Japanese text before handing it to Pagefind's Latin tokenizer
export function segmentJapaneseSearchText(value: string): string {
  let result = '';
  let lastWasSearchable = false;
  for (const segment of SEGMENTER.segment(value)) {
    const isSearchable =
      segment.isWordLike ||
      JAPANESE_SCRIPT_PATTERN.test(segment.segment) ||
      WORDLIKE_PATTERN.test(segment.segment);
    if (isSearchable && lastWasSearchable && !result.endsWith(' ')) result += ' ';
    result += segment.segment;
    lastWasSearchable = isSearchable;
  }
  return result;
}
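
The function above relies on three module-level constants that the article does not show. The sketch below fills them in with plausible definitions so the function can be run in isolation; the locale, the granularity, and both patterns are assumptions rather than the site's actual values.

```typescript
// Assumed definitions; the article does not show these constants.
// Neither pattern uses the g flag, because test() on a global regex is stateful.
const SEGMENTER = new Intl.Segmenter('ja', { granularity: 'word' });
const JAPANESE_SCRIPT_PATTERN = /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]/u;
const WORDLIKE_PATTERN = /[\p{L}\p{N}]/u;

// Copy of the function above so the example is self-contained.
function segmentJapaneseSearchText(value: string): string {
  let result = '';
  let lastWasSearchable = false;
  for (const segment of SEGMENTER.segment(value)) {
    const isSearchable =
      segment.isWordLike ||
      JAPANESE_SCRIPT_PATTERN.test(segment.segment) ||
      WORDLIKE_PATTERN.test(segment.segment);
    // Only add a space between two adjacent searchable segments.
    if (isSearchable && lastWasSearchable && !result.endsWith(' ')) result += ' ';
    result += segment.segment;
    lastWasSearchable = isSearchable;
  }
  return result;
}

const sentence = '東京は日本の首都です';
// Exact boundaries depend on the runtime's ICU data, so no output is asserted here.
const segmented = segmentJapaneseSearchText(sentence);
const latin = segmentJapaneseSearchText('hello world');
```

Two properties hold regardless of where the segmenter places its boundaries: removing the inserted spaces gives back the original text, and text that is already space-delimited passes through unchanged.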

Dev-server phase (astro:server:setup):

A Vite dev-server middleware intercepts /pagefind/* requests and serves files from the last written index directory, so search remains functional during astro dev without re-indexing on every hot reload.
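
The core of such a middleware is mapping a request URL onto a file inside the index directory. The helper below is a hypothetical sketch of that mapping only (the name `resolvePagefindRequest`, the prefix constant, and the traversal guard are assumptions); reading the file and writing the response are left out.

```typescript
import * as path from 'node:path';

const PAGEFIND_PREFIX = '/pagefind/';

// Maps a request URL to an absolute path inside indexDir, or returns null
// when the request is not for the search index or tries to escape indexDir.
// Malformed percent-escapes would make decodeURIComponent throw; a real
// middleware would want to catch that.
function resolvePagefindRequest(url: string, indexDir: string): string | null {
  const pathname = url.split('?')[0].split('#')[0];
  if (!pathname.startsWith(PAGEFIND_PREFIX)) return null;

  const relative = decodeURIComponent(pathname.slice(PAGEFIND_PREFIX.length));
  const root = path.resolve(indexDir);
  const candidate = path.resolve(root, relative);

  // Reject paths such as /pagefind/../secret that resolve outside the index.
  if (candidate !== root && !candidate.startsWith(root + path.sep)) return null;
  return candidate;
}

const indexRoot = path.resolve('dist', 'pagefind');
const served = resolvePagefindRequest('/pagefind/pagefind.js?v=1', indexRoot);
const ignored = resolvePagefindRequest('/blog/hello', indexRoot);
const blocked = resolvePagefindRequest('/pagefind/../secret.txt', indexRoot);
```

A middleware built on this would call `next()` whenever the helper returns null, leaving every other route to the dev server.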

The article body is annotated data-pagefind-body in ArticleLayout.astro:

<article class="prose" data-pagefind-body lang={lang}></article>

This scopes the index to article content, excluding header, footer, and sidebar noise.

On the client, Pagefind's WASM engine loads the unified index from the same edge node that served the page — no extra round-trips, no Worker-side search service.

Typography

Fraunces Variable Font

Display headings use Fraunces, a variable optical-size serif with a distinctively soft character. Both files (Fraunces-Variable.woff2, Fraunces-Italic-Variable.woff2) are self-hosted under public/fonts/ and declared with font-display: optional, so first paint falls back immediately while repeat visits reuse the cached face.

Font Stack

--font-display:
  'Fraunces', 'Hiragino Mincho ProN', 'Yu Mincho', 'Noto Serif JP', Georgia, 'Noto Sans JP', serif;

--font-sans:
  'Noto Sans JP', 'Hiragino Kaku Gothic ProN', 'Yu Gothic UI', 'Meiryo', system-ui, sans-serif;

--font-mono:
  'JetBrains Mono', ui-monospace, 'Fira Code', 'Cascadia Code', 'Noto Sans JP',
  'Hiragino Kaku Gothic ProN', 'Yu Gothic UI', 'Meiryo', monospace;

Japanese mincho fallbacks are placed before Georgia in the display stack so that CJK characters in headings resolve to a matching serif/mincho face. Japanese sans-serif fonts are listed before system-ui so that CJK body text resolves correctly on machines that have only local Japanese fonts installed.

Japanese Typography Features

The blog has first-class support for Japanese composition:

<ruby> annotations via the RubyText component — e.g. 情報処理(じょうほうしょり)
line-break: strict — prevents breaks before small kana (ぁぃぅぇぉ…) and other restricted characters
text-wrap: pretty, which asks the browser to favour better line breaks so that multi-line headings avoid a short, orphaned last line
font-feature-settings: 'palt' — proportional alternate spacing for punctuation (。、…), applied to headings globally and to the prose body when lang="ja", tightening the visual rhythm of CJK text
text-autospace: ideograph-alpha — automatic thin-space insertion between CJK and Latin characters, so Astro 6 embedded in Japanese prose reads naturally

Amber Accent

The single accent colour throughout the UI is amber: #d4820a for decorative elements and #b45309 for text and interactive elements (which clears the 4.5:1 WCAG AA contrast threshold against the paper background). It appears on the rule below article titles, TOC active-section indicators, tag links, and dates. Everything else is ink (#1c1a18) on paper (#f9f8f5).
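
The contrast claim can be checked with the WCAG 2.x relative-luminance formula. The helper names below are hypothetical, and the sketch only handles six-digit hex colours.

```typescript
// Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition).
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a colour written as #rrggbb.
function relativeLuminance(hex: string): number {
  const value = parseInt(hex.replace('#', ''), 16);
  const r = (value >> 16) & 0xff;
  const g = (value >> 8) & 0xff;
  const b = value & 0xff;
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio as defined by WCAG: (lighter + 0.05) / (darker + 0.05).
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

const PAPER = '#f9f8f5';
const textAmberOnPaper = contrastRatio('#b45309', PAPER);
const inkOnPaper = contrastRatio('#1c1a18', PAPER);
const passesAA = textAmberOnPaper >= 4.5; // AA threshold for normal-size text
```

Running the same check on the decorative #d4820a is a quick way to see why it is kept away from text.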

Developer Experience

TypeScript in Strict Mode

tsconfig.json enables strict: true. All content collection types flow from CollectionEntry<'blog'> through ProcessedPost into each page component — there is no any in the data path.

Biome and Prettier

Linting and formatting use two tools with complementary scopes:

Tool	Target
Biome	`.ts`, `.tsx` — lint + format
Prettier + `prettier-plugin-astro`	`*.astro` — format only

Biome handles TypeScript linting and formatting in a single binary; Prettier covers Astro template formatting via prettier-plugin-astro.

Husky Pre-commit Hooks

lint-staged runs automatically on git commit via Husky:

"lint-staged": {
  "src/**/*.{ts,tsx}":  ["biome check --write", "biome format --write"],
  "src/**/*.astro":     ["prettier --write"],
  "*.{json,md}":        ["prettier --write"]
}

Because lint-staged only processes staged files, every commit made through the hook is linted and formatted, while unstaged work in progress is left untouched.

Testing

Suite	Tool	Scope
Unit	Vitest 4	`estimateReadingTime`, canonical tag routing, `TagResolver`, `formatDate`
End-to-end	Playwright	Route smoke, search modal bootstrap, tag navigation

Vitest shares the Vite config and runs in the node environment, so TypeScript compilation goes through the same pipeline as the main build. Playwright is a smoke suite that runs against the built dist/ artifact (downloaded from the CI build job artifact), while browser interaction logic is covered in Vitest with jsdom. During astro dev, Pagefind search reuses the last written index, so run pnpm build after content changes to get fresh search results.

Deployment

Cloudflare Workers

The site is delivered to Cloudflare's global network as a named Worker (length3); there is no Cloudflare Pages project. Production rollout is handled by Cloudflare Workers Builds, and the repository only has to produce Wrangler-compatible build output. There is no hand-authored wrangler.toml in the repository either; the @astrojs/cloudflare adapter generates dist/server/wrangler.json at build time, derived directly from the Astro config. The generated config is still what local verification and emergency manual deploys run against:

pnpm build
pnpm exec wrangler deploy --config dist/server/wrangler.json

The adapter is configured with prerenderEnvironment: 'node' so that Astro's prerender phase runs under Node.js (faster, full API surface) while the final Worker bundle remains edge-native:

adapter: cloudflare({
  imageService: 'passthrough',
  prerenderEnvironment: 'node',
})

The Vite SSR target is set to 'webworker' so Node.js-specific modules are excluded from the bundle. Pagefind's WASM module and the Cloudflare adapter server entrypoint are excluded from Vite's dep-optimizer to prevent double-bundling:

vite: {
  ssr: {
    target: 'webworker',
  },
  optimizeDeps: {
    exclude: ['pagefind', '@astrojs/cloudflare/entrypoints/server'],
  },
  build: {
    chunkSizeWarningLimit: 1024,
  },
}

Two build options keep output tidy and fast: format: 'file' generates flat slug.html files (no slug/index.html nesting), and concurrency: 4 parallelises page rendering across the static build.
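
The effect of the format option is easiest to see as a mapping from route to output file. The helper below is a hypothetical illustration of that mapping, not Astro's own code; it covers only the 'file' and 'directory' values, and the route used is made up.

```typescript
type BuildFormat = 'file' | 'directory';

// Maps a route such as "blog/hello" to the file a static build would write.
function outputPathFor(route: string, format: BuildFormat): string {
  const clean = route.replace(/^\/+|\/+$/g, ''); // drop leading and trailing slashes
  if (clean === '') return 'index.html';         // the site root is index.html either way
  return format === 'file' ? `${clean}.html` : `${clean}/index.html`;
}

const flat = outputPathFor('/blog/hello/', 'file');        // 'blog/hello.html'
const nested = outputPathFor('/blog/hello/', 'directory'); // 'blog/hello/index.html'
```

With the flat layout each article is a single file, which roughly halves the number of directory entries the build has to create.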

Because the site is fully prerendered and never uses runtime sessions, the Cloudflare adapter's default KV session provisioning is bypassed with an in-process LRU driver:

session: {
  driver: sessionDrivers.lruCache(),
}

This prevents Wrangler from requiring a KV namespace binding during wrangler deploy for a site that has no dynamic routes.

CI/CD

GitHub Actions runs a single workflow:

ci.yml — on every push and PR, three jobs: (1) lint + type check + unit tests (biome check, astro check, vitest run), (2) build (uploads dist/ as an artifact), (3) Playwright E2E (downloads the build artifact, runs playwright test). The build and E2E jobs are chained so Playwright always tests the exact build that was just produced.
Production rollout is handled by Cloudflare Workers Builds, so this repository no longer carries a separate deploy workflow.

Because Pagefind indexing is handled inside the astro:build:done hook, the search index is always produced as part of the single pnpm build step — no separate post-processing pass required.

Design Principles

Length³ is designed as a reading-first interface. Rather than feeling like a typical web application, it leans toward an editorial, print-like structure:

Restrained typography — two weights of two typefaces, one accent colour
Generous whitespace — hierarchy communicated through spacing alone, not decorative borders
No sidebar on article pages — the right column is intentionally left empty
680 px prose measure — optimised for a comfortable line length across all viewport sizes

The result is an interface whose character comes entirely from spacing and type treatment, not from decoration.
