Building Durable AI Agents with Cloudflare Project Think

Introduction
On June 16, 2026, Cloudflare shipped Agents SDK v0.16.1 and with it Project Think, a durable runtime for long-running agents. The headline primitive is the fiber: a unit of agent work that checkpoints its own progress to a local SQLite database and resumes from the last checkpoint after a crash. This tutorial builds a research agent that survives a worker restart mid-run instead of starting the whole job over.
If you have ever shipped an agent that ran for two minutes, hit a deploy, and lost everything it had done, this is the part you were missing.
What durable execution actually means
Durable execution means the progress of a long task is persisted as the task runs, not after it finishes. A normal request is stateless. If the process dies at 80%, the 80% is gone, and the only honest thing the system can do is run all of it again.
Durable execution flips that. The runtime records where the work got to, in storage that outlives the process. When the process comes back, it reads the last recorded point and continues. The unit of work becomes resumable rather than disposable. That is the whole idea, and everything in Project Think is an implementation detail underneath it.
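
As a rough illustration of the idea, independent of any Cloudflare API, here is a minimal sketch in which a Map stands in for storage that outlives the process. The names runJob, durableStore, and crashBefore are hypothetical and exist only for this example.

```typescript
// Toy model of durable execution. The Map stands in for storage that
// outlives the process; every name here is hypothetical.
type Checkpoint = { next: number; results: string[] };

const durableStore = new Map<string, Checkpoint>();
let stepRuns = 0; // counts how many times a step body actually executes

function runJob(jobId: string, steps: string[], crashBefore?: number): string[] {
  // Resume from the last recorded point, or start fresh.
  const saved = durableStore.get(jobId);
  const results = saved ? [...saved.results] : [];
  for (let i = saved ? saved.next : 0; i < steps.length; i++) {
    if (i === crashBefore) throw new Error(`simulated crash before step ${i}`);
    stepRuns++;
    results.push(`done:${steps[i]}`);
    // Persist progress as the task runs, not after it finishes.
    durableStore.set(jobId, { next: i + 1, results: [...results] });
  }
  return results;
}

const steps = ["a", "b", "c", "d", "e"];
let output: string[] = [];
try {
  output = runJob("job-1", steps, 4); // dies at 80%
} catch {
  output = runJob("job-1", steps); // new "process": reads the checkpoint
}
console.log(output.length, stepRuns);
```

In this sketch the second call executes only the fifth step, so the job costs five step executions across the crash. Without the checkpoint it would cost nine: four lost, then all five again.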
What a fiber is in the Agents SDK
In Project Think, a fiber is a durable invocation that checkpoints its own progress. You wrap a block of agent work in runFiber(name, callback). Inside the callback you call ctx.stash(data) to persist intermediate state to the agent's co-located SQLite database. If the worker restarts mid-fiber, the SDK recovers the fiber, fires onFiberRecovered(ctx), and hands you back the stashed state so you can resume from the last checkpoint instead of the beginning. What survives is exactly what you stashed; the recovery hook decides how to continue from it.
The Agents SDK runs on Cloudflare Workers backed by Durable Objects, which is why each agent gets its own SQLite instance sitting next to its code. The fiber writes there. No external queue, no separate database to provision.
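
To make the lifecycle concrete, here is a toy stand-in written in plain TypeScript. It is not the Agents SDK: ToyFiberHost, its Map-backed storage, the recover method, and the JSON serialization are all assumptions, chosen only to mirror the names used above.

```typescript
// Toy stand-in for the fiber lifecycle. NOT the Agents SDK: the class,
// its storage, and the method shapes are hypothetical.
type FiberCtx<S> = { stashed: S | undefined; stash(data: S): void };

class ToyFiberHost<S> {
  // Stands in for the agent's co-located SQLite database.
  private rows = new Map<string, string>();

  private ctxFor(name: string): FiberCtx<S> {
    const raw = this.rows.get(name);
    return {
      stashed: raw === undefined ? undefined : (JSON.parse(raw) as S),
      stash: (data: S) => {
        this.rows.set(name, JSON.stringify(data));
      },
    };
  }

  runFiber<R>(name: string, work: (ctx: FiberCtx<S>) => R): R {
    const result = work(this.ctxFor(name));
    this.rows.delete(name); // a finished fiber needs no recovery
    return result;
  }

  // Simulates the runtime after a restart: every fiber that still has a
  // row was interrupted, so its stashed state goes to the recovery hook.
  recover<R>(onFiberRecovered: (ctx: FiberCtx<S>) => R): R[] {
    return Array.from(this.rows.keys()).map((name) =>
      onFiberRecovered(this.ctxFor(name)),
    );
  }
}

const host = new ToyFiberHost<{ step: number }>();
try {
  host.runFiber("research", (ctx) => {
    ctx.stash({ step: 1 });
    ctx.stash({ step: 2 });
    throw new Error("simulated restart");
  });
} catch {
  // The process died; only the stored row survives.
}
const resumedFrom = host.recover((ctx) => ctx.stashed?.step ?? 0);
console.log(resumedFrom);
```

The point of the toy is the shape of the contract: the callback's local variables are gone after the restart, and the recovery hook sees only the last value passed to stash.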
Scaffold a Think agent
Start with the base class. Think<Env> is the opinionated harness from @cloudflare/think; the one method you must override is getModel(). Install the current packages first.
npm i agents@latest @cloudflare/think@latest ai @cloudflare/shell zod workers-ai-provider

// src/agent.ts
import { Think } from "@cloudflare/think";
import { createWorkersAI } from "workers-ai-provider";

interface Env {
  AI: Ai;
}

export class ResearchAgent extends Think<Env> {
  getModel() {
    const workersai = createWorkersAI({ binding: this.env.AI });
    return workersai("@cf/meta/llama-3.3-70b-instruct");
  }
}

That is a complete, deployable agent with a chat lifecycle already wired in. It is also not yet durable. A long task inside it is still disposable. We fix that next.
Run the work inside a fiber
Wrap the long task in runFiber. The first argument names the fiber so the runtime can find it again after a restart. Everything inside the callback is now a recoverable unit of work.

async research(topic: string) {
  return this.runFiber("research", async (ctx) => {
    const subtopics = await this.plan(topic);
    const findings: string[] = [];
    for (let i = 0; i < subtopics.length; i++) {
      const answer = await this.callModel(subtopics[i]);
      findings.push(answer);
    }
    return findings;
  });
}

This runs. It is also still fragile, because the work inside the loop holds its progress only in the findings array in memory. A restart at subtopic four throws away subtopics one through three. The fiber wrapper alone does not save you. The checkpoint does.
Checkpoint progress with ctx.stash
ctx.stash writes the current state to SQLite before the next expensive step. Call it after each unit of progress you would hate to repeat. Here every completed subtopic is stashed with the index, so the recovered run knows exactly how far it got.
async research(topic: string) {
  return this.runFiber("research", async (ctx) => {
    const subtopics = await this.plan(topic);
    const findings: string[] = [];
    for (let i = 0; i < subtopics.length; i++) {
      const answer = await this.callModel(subtopics[i]);
      findings.push(answer);
      // persist after each completed unit, before the next model call
      ctx.stash({ topic, step: i + 1, findings });
    }
    return findings;
  });
}

The stash is the load-bearing line. The model call above it is the expensive part; the stash below it is the receipt. After the loop has run four times, SQLite holds four findings and step: 4. That record is what a recovery reads.
Recover after a crash with onFiberRecovered
When the worker restarts mid-fiber, the runtime re-enters through onFiberRecovered. You read the stashed state and resume from step rather than from zero. Recovery in v0.16.1 runs with backoff, so a worker that keeps dying does not hot-loop.
async onFiberRecovered(ctx) {
  const state = ctx.stashed as {
    topic: string;
    step: number;
    findings: string[];
  };
  const subtopics = await this.plan(state.topic);
  const findings = state.findings;
  // resume at the step after the last checkpoint
  for (let i = state.step; i < subtopics.length; i++) {
    findings.push(await this.callModel(subtopics[i]));
    ctx.stash({ topic: state.topic, step: i + 1, findings });
  }
  return findings;
}

The recovered run repeats none of the checkpointed work. It re-plans the subtopics, then picks up at the first index it never finished. Three completed subtopics cost three callModel calls total across the crash, not six; the only repeated call is the plan. Note the assumption hiding in that re-plan: it is only safe if plan returns the same list for the same topic. Because plan here is itself a model call, that is not guaranteed, and stashing the subtopics list alongside the findings removes the assumption.
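
If plan is backed by a model, its output can differ between calls, and an index into a fresh plan may no longer point at the subtopic you meant. One way around that is to checkpoint the plan itself. The sketch below is a self-contained simulation, not SDK code: ResearchState, the alternating plan stub, and resume are hypothetical names used only to show the difference.

```typescript
// Sketch: checkpoint the plan so resume never depends on the planner
// returning the same list twice. All names here are hypothetical.
type ResearchState = { subtopics: string[]; step: number; findings: string[] };

let planCalls = 0;
// Stand-in for a model-backed planner: the order flips on every call.
function plan(topic: string): string[] {
  planCalls++;
  const base = [`${topic} history`, `${topic} risks`, `${topic} costs`];
  return planCalls % 2 === 1 ? base : [...base].reverse();
}

const answer = (subtopic: string): string => `findings on ${subtopic}`;

function resume(state: ResearchState): ResearchState {
  const findings = [...state.findings];
  // Iterate the stashed list, not a fresh plan() result.
  for (let i = state.step; i < state.subtopics.length; i++) {
    findings.push(answer(state.subtopics[i]));
  }
  return { subtopics: state.subtopics, step: state.subtopics.length, findings };
}

// First run: plan once, finish two subtopics, checkpoint, then "crash".
const planned = plan("fibers");
const stashed: ResearchState = {
  subtopics: planned,
  step: 2,
  findings: planned.slice(0, 2).map(answer),
};

// Recovery from the stashed plan covers each subtopic exactly once.
const finished = resume(stashed);

// Re-planning instead would put an already-answered subtopic at index 2.
const replanned = plan("fibers");
console.log(finished.findings[2], "|", replanned[2]);
```

With the stashed list, the third finding covers "fibers costs", the one subtopic that was never finished. With the fresh plan, index 2 is "fibers history", which was already answered before the crash, so a naive resume would duplicate it and skip the costs entirely.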
Keep long jobs alive past the limits
Workers reclaim idle execution. For a fiber that legitimately runs for minutes, wrap the long stretch in keepAliveWhile so the runtime does not collect it mid-task.

async deepResearch(topic: string) {
  return this.runFiber("deep-research", async (ctx) => {
    return this.keepAliveWhile(async () => {
      const subtopics = await this.plan(topic);
      const findings: string[] = [];
      for (let i = 0; i < subtopics.length; i++) {
        findings.push(await this.callModel(subtopics[i]));
        ctx.stash({ topic, step: i + 1, findings });
      }
      return findings;
    });
  });
}

A plain Worker request vs a Project Think fiber
| Property | Plain Worker request | Project Think fiber |
|---|---|---|
| State on crash at 80% | lost, rerun from 0 | recovered from last ctx.stash |
| Where progress lives | in-memory only | co-located SQLite (Durable Object) |
| Restart behavior | new invocation, no memory | onFiberRecovered with backoff |
| Long-task survival | subject to idle reclaim | held by keepAliveWhile |
| Extra infra to set up | a queue and a database | none |
What this does not do
A fiber checkpoints state, not side effects. If a model call already sent an email, posted a PR, or charged a card before the crash, recovery will not un-send it, and naive resume logic will do it twice. Idempotency is still your job, not the runtime's. The honest version of durable execution is this: it guarantees your state survives, and it guarantees nothing about whether the outside world can tell the difference between your first attempt and your second.
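
A common way to take on that job is an idempotency key per side effect. The following is a minimal sketch under stated assumptions: sentKeys stands in for a durable table (it would have to live in storage that survives the restart), and sendEmail, sendOnce, and the key format are hypothetical.

```typescript
// Sketch of an idempotency guard. `sentKeys` stands in for durable
// storage; `sendEmail` is a hypothetical side effect.
const sentKeys = new Set<string>();
const outbox: string[] = [];

function sendEmail(to: string, body: string): void {
  outbox.push(`${to}: ${body}`);
}

// Performs the side effect at most once per key. Returns true if it ran.
function sendOnce(key: string, to: string, body: string): boolean {
  if (sentKeys.has(key)) return false; // done on an earlier attempt
  sendEmail(to, body);
  sentKeys.add(key);
  return true;
}

// Derive the key from the fiber name and step so a resumed run reuses it.
const key = "research:step-3:notify";
const firstAttempt = sendOnce(key, "team@example.com", "Step 3 complete");
const afterRecovery = sendOnce(key, "team@example.com", "Step 3 complete");
console.log(firstAttempt, afterRecovery, outbox.length);
```

This guard still has a gap: a crash between sendEmail and sentKeys.add would send twice. Closing it usually means passing the key to the receiving service so that it deduplicates, which is a property of that service rather than of the fiber.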
There is also no magic in the granularity. You only recover to the last ctx.stash you wrote. Stash too rarely and recovery is cheap to write but expensive to run. Stash on every token and you have built a slow agent that is very good at remembering how slow it is.
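
The trade-off can be put in numbers. This sketch assumes equal-cost steps and a checkpoint written after every interval-th completed step; stepsRedone and checkpointWrites are hypothetical helpers, not part of any SDK.

```typescript
// Steps that must be repeated if the process dies right after completing
// `completed` steps, with a checkpoint every `interval` steps.
function stepsRedone(completed: number, interval: number): number {
  if (interval < 1) throw new RangeError("interval must be >= 1");
  return completed % interval;
}

// Checkpoint writes paid over a full run of `totalSteps` steps.
function checkpointWrites(totalSteps: number, interval: number): number {
  if (interval < 1) throw new RangeError("interval must be >= 1");
  return Math.floor(totalSteps / interval);
}

// A 100-step job that crashes right after step 47.
for (const interval of [1, 10, 50]) {
  console.log(interval, stepsRedone(47, interval), checkpointWrites(100, interval));
}
```

For that crash, an interval of 1 repeats nothing but pays 100 writes, an interval of 10 repeats 7 steps for 10 writes, and an interval of 50 repeats all 47 steps for 2 writes. Where the right interval sits depends on how the cost of one write compares with the cost of one step.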
The full working agent
// src/agent.ts
import { Think } from "@cloudflare/think";
import { generateText } from "ai";
import { createWorkersAI } from "workers-ai-provider";

interface Env {
  AI: Ai;
}

export class ResearchAgent extends Think<Env> {
  getModel() {
    const workersai = createWorkersAI({ binding: this.env.AI });
    return workersai("@cf/meta/llama-3.3-70b-instruct");
  }

  // Ask the model for up to four subtopics, one per line.
  async plan(topic: string): Promise<string[]> {
    const { text } = await generateText({
      model: this.getModel(),
      prompt: `List 4 research subtopics for "${topic}". One per line, no numbering.`,
    });
    return text.split("\n").map((s) => s.trim()).filter(Boolean).slice(0, 4);
  }

  // One model call per subtopic: the expensive step worth checkpointing.
  async callModel(subtopic: string): Promise<string> {
    const { text } = await generateText({
      model: this.getModel(),
      prompt: `Write one tight paragraph of findings on: ${subtopic}`,
    });
    return text.trim();
  }

  async research(topic: string) {
    return this.runFiber("research", async (ctx) => {
      return this.keepAliveWhile(async () => {
        const subtopics = await this.plan(topic);
        const findings: string[] = [];
        for (let i = 0; i < subtopics.length; i++) {
          findings.push(await this.callModel(subtopics[i]));
          // persist after each completed unit, before the next model call
          ctx.stash({ topic, step: i + 1, findings });
        }
        return findings;
      });
    });
  }

  async onFiberRecovered(ctx) {
    const state = ctx.stashed as {
      topic: string;
      step: number;
      findings: string[];
    };
    const subtopics = await this.plan(state.topic);
    const findings = state.findings;
    // resume at the step after the last checkpoint
    for (let i = state.step; i < subtopics.length; i++) {
      findings.push(await this.callModel(subtopics[i]));
      ctx.stash({ topic: state.topic, step: i + 1, findings });
    }
    return findings;
  }
}

Deploy it with wrangler deploy, bind Workers AI in wrangler.toml, and trigger research() from a fetch handler. Kill the worker mid-run and watch it pick up where it left off.
When to reach for it
Use fibers when the cost of redoing work is real: multi-step research, batch processing, anything where a single run spans more model calls than you want to pay for twice. For a request that finishes in 200 milliseconds, this is overhead you do not need; a plain handler is the right tool and a fiber is a costume. The line is simple. If losing the run halfway through would make you wince, the work wanted to be durable before it crashed, and now it can be.