<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet href="/rss.xsl" type="text/xsl"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Florian Sabani</title><description>Software Engineer | Tech Entrepreneur | Cloud Specialist</description><link>https://floriansabani.com</link><item><title>Give Your Coding Agent Eyes: Cloudflare Skills, Observability MCP, and Local-First TDD</title><link>https://floriansabani.com/en/posts/give-your-coding-agent-eyes</link><guid isPermaLink="true">https://floriansabani.com/en/posts/give-your-coding-agent-eyes</guid><pubDate>Sat, 04 Jul 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Coding agents are tireless and fast — and blind by default. This post is about the two feedback loops I wired into my Cloudflare Workers project so that Claude Code can &lt;em&gt;see&lt;/em&gt; what my code does: production logs it can query itself, and a local test suite that simulates the entire platform — Durable Objects, SQLite, R2, third-party APIs — in seconds. It&apos;s the closest thing to a silver bullet I&apos;ve found for agentic coding.&lt;/p&gt;
&lt;h2&gt;The blind sculptor&lt;/h2&gt;
&lt;p&gt;I recently watched a video by Salvatore Sanfilippo (antirez) — &lt;a href=&quot;https://youtu.be/TJ6ruN-o0PA&quot;&gt;&lt;em&gt;&quot;Il trucco decisivo (davvero) per lavorare coi coding agent&quot;&lt;/em&gt;&lt;/a&gt; — that puts into words something I had been circling around for months. Full credit to him for the framing; if you understand Italian, go watch it.&lt;/p&gt;
&lt;p&gt;His argument goes like this. You&apos;ve heard all the standard advice about coding agents: write precise specs, share your design intuitions in non-binding language, keep the codebase clean, comment the &lt;em&gt;tensions&lt;/em&gt; in the code and not just the mechanics. All true, all useful. But there&apos;s one property of LLM agents that almost nobody talks about, and it&apos;s the one that changes everything: &lt;strong&gt;tenacity&lt;/strong&gt;. An agent will try, and retry, and retry again, at a speed no human can match. Each failed attempt costs it seconds, not an afternoon of motivation.&lt;/p&gt;
&lt;p&gt;Then comes his metaphor, which I can&apos;t stop thinking about. Imagine a tireless worker in front of a block of marble. He can even travel back in time: chip the marble wrong, rewind, try again, forever. His tools are crude — he can&apos;t carve like Michelangelo, he can only throw stones — but he never stops and never gets tired. Given enough attempts, he&apos;ll get somewhere remarkable.&lt;/p&gt;
&lt;p&gt;Unless he&apos;s blind.&lt;/p&gt;
&lt;p&gt;If the worker can&apos;t &lt;em&gt;see&lt;/em&gt; the marble, no amount of tenacity or time travel helps. His attempts aren&apos;t informed by the results of the previous ones. He&apos;s just throwing stones into the dark.&lt;/p&gt;
&lt;p&gt;That&apos;s your coding agent without feedback loops. And that&apos;s why I&apos;ve stopped optimizing my prompts and started optimizing my agent&apos;s &lt;em&gt;senses&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;Two kinds of sight&lt;/h2&gt;
&lt;p&gt;A coding agent needs to see two different things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;What the code actually did&lt;/strong&gt; — production behavior: errors, logs, timelines, the request that failed at 11:51 and everything that happened around it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What the code will do&lt;/strong&gt; — the consequences of the change it just made, before it ships: does the flow still work, did the database end up in the right state, did we call the third-party API the way we think we did.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;On Cloudflare, both of these are now things the agent can operate &lt;em&gt;by itself&lt;/em&gt;, without me clicking through dashboards or babysitting a staging environment. The first comes from &lt;a href=&quot;https://developers.cloudflare.com/agent-setup/claude-code/&quot;&gt;Cloudflare&apos;s skills and MCP servers&lt;/a&gt;; the second from &lt;code&gt;@cloudflare/vitest-pool-workers&lt;/code&gt; and a deliberately local-first test architecture.&lt;/p&gt;
&lt;p&gt;Let me show you both, with real (lightly anonymized) material from my project: a multi-tenant platform on Workers that integrates with crypto exchanges — Hono API, Durable Objects with SQLite, R2, D1, drizzle-orm, the works.&lt;/p&gt;
&lt;h2&gt;Part 1: Let the agent read production&lt;/h2&gt;
&lt;h3&gt;Setup&lt;/h3&gt;
&lt;p&gt;Cloudflare ships official skills for Claude Code — contextual guidance modules for Workers, Durable Objects, wrangler, the Agents SDK and more. They follow a retrieval-first philosophy: instead of trusting what the model memorized about the platform in 2024, the skill tells it to go look things up.&lt;/p&gt;
&lt;p&gt;Installing the skills takes two commands inside Claude Code:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/plugin marketplace add cloudflare/skills
/plugin install cloudflare@cloudflare
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The skills live in the &lt;a href=&quot;https://github.com/cloudflare/skills&quot;&gt;cloudflare/skills&lt;/a&gt; repository on GitHub.&lt;/p&gt;
&lt;p&gt;Then there&apos;s the part that gave my agent actual eyes on production: the &lt;strong&gt;Workers Observability MCP server&lt;/strong&gt;. One command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;claude mcp add cloudflare-observability --transport http https://observability.mcp.cloudflare.com/mcp
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Authenticate via &lt;code&gt;/mcp&lt;/code&gt; (it runs its own OAuth flow against your Cloudflare account), and your agent can now query every log line your Workers emitted in the last seven days: filters, full-text needles, group-bys, percentile calculations. Not &lt;code&gt;wrangler tail&lt;/code&gt; and hope the bug happens again — &lt;em&gt;historical&lt;/em&gt; production telemetry, queryable in structured form.&lt;/p&gt;
&lt;h3&gt;The war story&lt;/h3&gt;
&lt;p&gt;Here&apos;s what sold me. Our OAuth flow for connecting a user&apos;s exchange account started failing in production with:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;OAuth completion failed: &amp;lt;Exchange&amp;gt; API error: Temporary lockout
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;HTTP 400, connection reported as failed. Except… the API key &lt;em&gt;was&lt;/em&gt; created on the exchange, with the correct scopes. The user could see it in their account. Something was claiming failure on a success.&lt;/p&gt;
&lt;p&gt;Old me would have spent the evening in the dashboard: filter by URL, squint at timestamps, open fifteen log entries, correlate by hand. Instead I pasted one sample log line into Claude Code and asked it to investigate.&lt;/p&gt;
&lt;p&gt;What it did, autonomously, was the interesting part:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;First, it read the code before touching the logs.&lt;/strong&gt; It traced the exact throw site: our callback created the API key, then immediately called the exchange&apos;s private balance endpoint as a &quot;verification&quot; step — and treated &lt;em&gt;any&lt;/em&gt; error as fatal. The wallet was never persisted. The key existed on the exchange; we just threw it away and told the user it failed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Then it went to the logs to test the hypothesis.&lt;/strong&gt; My sample log had a ULID for an ID. The agent decoded the timestamp out of it (the first ten characters of a ULID encode a millisecond timestamp, which I honestly didn&apos;t know), got the exact failure moment, and queried a window around it:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;view&quot;: &quot;events&quot;,
  &quot;timeframe&quot;: { &quot;from&quot;: &quot;…T10:30:00Z&quot;, &quot;to&quot;: &quot;…T12:10:00Z&quot; },
  &quot;parameters&quot;: {
    &quot;filters&quot;: [
      { &quot;key&quot;: &quot;$metadata.service&quot;, &quot;operation&quot;: &quot;eq&quot;, &quot;value&quot;: &quot;workers-prod&quot; },
      { &quot;key&quot;: &quot;$metadata.level&quot;,   &quot;operation&quot;: &quot;eq&quot;, &quot;value&quot;: &quot;error&quot; }
    ],
    &quot;needle&quot;: { &quot;value&quot;: &quot;lockout&quot; }
  }
}
&lt;/code&gt;&lt;/pre&gt;
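&lt;p&gt;The ULID trick is easy to reproduce. Here is a minimal TypeScript sketch of that decoding step, assuming the standard ULID layout: 26 Crockford base32 characters, of which the first ten hold a 48-bit Unix timestamp in milliseconds. The function name &lt;code&gt;ulidToDate&lt;/code&gt; is mine, not something from the project or the MCP server, and the sketch skips Crockford&apos;s lenient aliases (such as reading the letter O as zero):&lt;/p&gt;

```typescript
// Crockford base32 alphabet used by ULIDs (it omits I, L, O and U).
const CROCKFORD = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// Decode the millisecond timestamp stored in the first 10 characters of a ULID.
function ulidToDate(ulid: string): Date {
  if (ulid.length !== 26) {
    throw new Error('a ULID is 26 characters long');
  }
  let ms = 0;
  for (const ch of ulid.slice(0, 10).toUpperCase()) {
    const digit = CROCKFORD.indexOf(ch);
    if (digit === -1) {
      throw new Error('invalid ULID character: ' + ch);
    }
    // Ten base32 digits stay below 2^50, safely inside Number's integer range.
    ms = ms * 32 + digit;
  }
  return new Date(ms);
}

// Example ID taken from the public ULID documentation, not from my logs.
console.log(ulidToDate('01ARYZ6S41TSV4RRFFQ69G5FAV').toISOString()); // 2016-07-30T22:36:16.385Z
```

&lt;p&gt;With the failure moment in hand, picking the &lt;code&gt;from&lt;/code&gt; and &lt;code&gt;to&lt;/code&gt; bounds of a query window is just a matter of subtracting and adding a margin.&lt;/p&gt;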
&lt;p&gt;&lt;strong&gt;Then it zoomed out and grouped.&lt;/strong&gt; Instead of staring at single events, it ran a count grouped by &lt;code&gt;$metadata.trigger&lt;/code&gt; across the whole week. The result was the smoking gun: the &quot;Temporary lockout&quot; error wasn&apos;t an OAuth problem at all. It showed up in &lt;em&gt;four unrelated subsystems&lt;/em&gt;: the balance-refresh endpoint, a deposit-address endpoint, a cron job, and a Durable Object alarm doing withdrawal polling. It was account-level throttling state on the exchange&apos;s side, already in place before the OAuth callback even ran. A brand-new, perfectly valid API key walked into a locked room.&lt;/p&gt;
&lt;p&gt;The reconstructed timeline read like a detective&apos;s whiteboard:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;11:39  burst of &quot;Invalid key&quot; errors    (a stored wallet with a dead key, hammered by balance refresh)
11:45  cron job hits &quot;Temporary lockout&quot;  ← account already locked, before any OAuth
11:51  OAuth connect: key created OK → balance verification → &quot;Temporary lockout&quot; → 400
11:56  user retries → 500 &quot;Missing idempotency key&quot;   ← a *second*, unrelated bug
11:57  user retries → 500
11:57  user retries → 500
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Along the way it found two bonus bugs I hadn&apos;t asked about: the retry path 500&apos;d because a cookie was missing and the error handler didn&apos;t cover it (so the widget never even got a failure message), and a &lt;code&gt;* * * * *&lt;/code&gt; cron was flooding the logs with hundreds of harmless warnings per minute — which matters more than it used to, because log noise now degrades &lt;em&gt;the agent&apos;s&lt;/em&gt; queries too, not just mine.&lt;/p&gt;
&lt;p&gt;The final root cause turned out to be even better: the exchange applies a ~15-minute security cooldown on private API calls whenever an account connects from a new device or IP — which is &lt;em&gt;literally what an OAuth connect is&lt;/em&gt;. Our synchronous verify-right-after-create design was structurally guaranteed to fail on first connects. The fix wasn&apos;t retry logic; it was persisting the key immediately and deferring the balance check past the cooldown.&lt;/p&gt;
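&lt;p&gt;The shape of that fix can be sketched in a few lines. This is a simplified, synchronous model rather than the project&apos;s code: the &lt;code&gt;Wallet&lt;/code&gt; type, &lt;code&gt;persistNewKey&lt;/code&gt;, &lt;code&gt;verifyDue&lt;/code&gt;, and the plain array standing in for the Durable Object&apos;s storage are all hypothetical, and a real balance check would be an awaited network call.&lt;/p&gt;

```typescript
// Hypothetical wallet record; the field names are illustrative only.
type Wallet = {
  id: string;
  apiKey: string;
  status: 'pending' | 'verified';
  verifyAfter: number; // epoch milliseconds
};

// Roughly the 15-minute cooldown described above.
const COOLDOWN_MS = 15 * 60 * 1000;

// Step 1, in the OAuth callback: store the key first, schedule the check for later.
function persistNewKey(store: Wallet[], id: string, apiKey: string, now: number): Wallet {
  const wallet: Wallet = { id, apiKey, status: 'pending', verifyAfter: now + COOLDOWN_MS };
  store.push(wallet);
  return wallet;
}

// Step 2, later (for example from an alarm): check only wallets whose cooldown has passed.
// checkBalance throws on failure; a failure postpones the check instead of discarding the key.
function verifyDue(store: Wallet[], now: number, checkBalance: (w: Wallet) => void): void {
  for (const wallet of store) {
    if (wallet.status !== 'pending' || wallet.verifyAfter > now) {
      continue;
    }
    try {
      checkBalance(wallet);
      wallet.status = 'verified';
    } catch {
      wallet.verifyAfter = now + COOLDOWN_MS;
    }
  }
}
```

&lt;p&gt;The property that matters is that a &quot;Temporary lockout&quot; during verification can no longer erase a key that already exists on the exchange: the worst case is a wallet that stays pending a little longer.&lt;/p&gt;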
&lt;p&gt;I never opened the Cloudflare dashboard. The agent formed hypotheses from the code, tested them against production telemetry, and revised. That&apos;s antirez&apos;s tireless sculptor — with eyes.&lt;/p&gt;
&lt;h2&gt;Gotchas from the trenches&lt;/h2&gt;
&lt;p&gt;Three things that bit me, so they don&apos;t have to bite you:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Your wrangler login can&apos;t query the observability API.&lt;/strong&gt; Before installing the MCP server, my agent tried the REST endpoint directly with the OAuth token from &lt;code&gt;wrangler login&lt;/code&gt; and got a bare &lt;code&gt;code: 10000, Authentication error&lt;/code&gt;. That&apos;s Cloudflare&apos;s confusing way of saying &quot;valid token, missing permission&quot;: the wrangler token only carries the scopes wrangler asks for (&lt;code&gt;workers:write&lt;/code&gt;, &lt;code&gt;workers_tail:read&lt;/code&gt;, …), and the telemetry query endpoint needs &lt;strong&gt;Workers Observability: Read&lt;/strong&gt;. The MCP server sidesteps this entirely by running its own OAuth flow with the right scopes. If you want raw &lt;code&gt;curl&lt;/code&gt; access instead, create a dedicated API token.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;MCP servers added mid-session need a reconnect.&lt;/strong&gt; &lt;code&gt;claude mcp add&lt;/code&gt; updates the config, but a running Claude Code session won&apos;t see the new server&apos;s tools until you run &lt;code&gt;/mcp&lt;/code&gt; in &lt;em&gt;that&lt;/em&gt; session (or restart it). I lost ten confused minutes to this.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Log hygiene is now agent performance.&lt;/strong&gt; A needle search across a noisy service returns the noise. My first &quot;show me everything around the failure&quot; query came back 100% cron warnings. If you want agents to debug from your logs, treat log spam as a bug with a real cost.&lt;/p&gt;
&lt;h2&gt;Part 2: Local-first TDD is the agent&apos;s other eye&lt;/h2&gt;
&lt;p&gt;Production sight tells you what went wrong. The second loop — the one that makes the agent &lt;em&gt;productive&lt;/em&gt; rather than just diagnostic — is a test suite it can run itself, that answers truthfully, in seconds.&lt;/p&gt;
&lt;p&gt;The unlock on Cloudflare is &lt;a href=&quot;https://developers.cloudflare.com/workers/testing/vitest-integration/&quot;&gt;&lt;code&gt;@cloudflare/vitest-pool-workers&lt;/code&gt;&lt;/a&gt;: your tests don&apos;t run in Node with mocked platform APIs — they run inside &lt;strong&gt;workerd&lt;/strong&gt;, the actual Workers runtime, booted by Miniflare &lt;em&gt;from your real &lt;code&gt;wrangler.jsonc&lt;/code&gt;&lt;/em&gt;. Durable Objects, their SQLite storage, R2, D1, KV, rate limiters: all real implementations, all local, all in-process.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { defineWorkersConfig } from &apos;@cloudflare/vitest-pool-workers/config&apos;;

export default defineWorkersConfig({
  test: {
    sequence: { concurrent: false },
    poolOptions: {
      workers: {
        isolatedStorage: false,
        wrangler: { configPath: &apos;./wrangler.jsonc&apos; },  // ← the whole platform, in-process
        moduleRules: [{ type: &apos;Text&apos;, include: [&apos;**/*.sql&apos;] }],
      },
    },
  },
})
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here&apos;s what that enables in practice in my codebase.&lt;/p&gt;
&lt;h3&gt;The database in your tests &lt;em&gt;is&lt;/em&gt; the production database&lt;/h3&gt;
&lt;p&gt;Every tenant in my system is a Durable Object whose &lt;code&gt;ctx.storage&lt;/code&gt; SQLite is managed by drizzle-orm. Migrations run in the DO constructor:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { drizzle } from &apos;drizzle-orm/durable-sqlite&apos;;
import { migrate } from &apos;drizzle-orm/durable-sqlite/migrator&apos;;
import migrations from &apos;../generated-migrations&apos;;

constructor(ctx: DurableObjectState, env: Env) {
  this.db = drizzle(ctx.storage, { schema: tenantSchema });
  ctx.blockConcurrencyWhile(() =&amp;gt; migrate(this.db, migrations));
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because vitest boots the same DO class under Miniflare, the local test database has &lt;em&gt;exactly&lt;/em&gt; the production schema — same migrations, same engine, no &quot;SQLite-flavored mock of our Postgres&quot;. (One wrinkle: the Workers sandbox can&apos;t read files off disk, so a small build step code-gens the &lt;code&gt;.sql&lt;/code&gt; migration files into a JS string module before the suite runs. Ugly, effective.)&lt;/p&gt;
&lt;h3&gt;White-box assertions with &lt;code&gt;runInDurableObject&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;cloudflare:test&lt;/code&gt; exposes a magic escape hatch: reach &lt;em&gt;inside&lt;/em&gt; a Durable Object instance and run assertions against its private state.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;const identities = await runInDurableObject(orgDb, async (instance: TenantDurableObject) =&amp;gt; {
  const db = (instance as any).db;
  return db.select().from(cexIdentities).all();
});
expect(identities).toHaveLength(0);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is the difference between &quot;the endpoint returned 200&quot; and &quot;the row actually landed, with the secret encrypted at rest&quot;. My suite uses it in 46 test files.&lt;/p&gt;
&lt;h3&gt;Third-party APIs become hard assertions&lt;/h3&gt;
&lt;p&gt;The scariest part of an exchange integration is the outbound calls — the part agents most love to hallucinate. &lt;code&gt;fetchMock&lt;/code&gt; from &lt;code&gt;cloudflare:test&lt;/code&gt; turns that into a contract:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { fetchMock } from &apos;cloudflare:test&apos;;

beforeEach(() =&amp;gt; {
  fetchMock.activate();
  fetchMock.disableNetConnect();   // any unmocked outbound call = test failure
});

fetchMock.get(&apos;https://api.exchange.example&apos;)
  .intercept({ method: &apos;POST&apos;, path: &apos;/oauth/token&apos; })
  .reply(200, oauthTokenSuccessFixture);

// …run the flow…

fetchMock.assertNoPendingInterceptors();  // every expected call actually happened
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;disableNetConnect()&lt;/code&gt; means the agent &lt;em&gt;cannot&lt;/em&gt; accidentally test against the real internet, and a hallucinated extra API call fails loudly instead of silently working-ish. &lt;code&gt;assertNoPendingInterceptors()&lt;/code&gt; means a &lt;em&gt;missing&lt;/em&gt; call fails too. The mock isn&apos;t a stub; it&apos;s a spec.&lt;/p&gt;
&lt;h3&gt;The golden loop&lt;/h3&gt;
&lt;p&gt;Put together, one test exercises the entire vertical: mock the exchange&apos;s three endpoints → invoke the real Hono route → assert the HTTP response, the mock contract, &lt;em&gt;and&lt;/em&gt; the Durable Object&apos;s SQLite state:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;it(&apos;completes OAuth → API key → balance → wallet storage&apos;, async () =&amp;gt; {
  fetchMock.get(EXCHANGE).intercept({ path: &apos;/oauth/token&apos;, method: &apos;POST&apos; }).reply(200, tokenFixture);
  fetchMock.get(EXCHANGE).intercept({ path: &apos;/oauth/api-key&apos;, method: &apos;POST&apos; }).reply(200, keyFixture);
  fetchMock.get(EXCHANGE).intercept({ path: &apos;/private/Balance&apos;, method: &apos;POST&apos; }).reply(200, balanceFixture);

  const response = await app.request(callbackUrl, { headers }, env);

  expect(response.status).toBe(200);
  fetchMock.assertNoPendingInterceptors();

  const wallet = await runInDurableObject(orgDb, (i: TenantDurableObject) =&amp;gt; i.getWallet(&apos;wallet-123&apos;));
  expect(wallet).toMatchObject({ exchange: &apos;exchange&apos;, type: &apos;long-living&apos; });
  expect(wallet!.apiSecret).not.toBe(keyFixture.result.secret); // encrypted at rest
});
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Why does this matter &lt;em&gt;specifically for agents&lt;/em&gt;? Go back to the sculptor:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Speed feeds tenacity.&lt;/strong&gt; &lt;code&gt;npx vitest run test/oauth2/callback.test.ts&lt;/code&gt; gives the agent a red/green verdict on the full stack in seconds. Each stone thrown is instantly evaluated. Fifty iterations cost minutes, not days.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Determinism keeps the feedback truthful.&lt;/strong&gt; No flaky staging, no shared environment drift, no &quot;worked on my machine&quot;. Miniflare state is wiped at the start of each run.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Strictness catches hallucinations.&lt;/strong&gt; The combination of &lt;code&gt;disableNetConnect&lt;/code&gt; + &lt;code&gt;assertNoPendingInterceptors&lt;/code&gt; is an anti-hallucination device: the agent can&apos;t invent an API interaction that &quot;probably exists&quot; — the contract is executable.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;It&apos;s self-serve.&lt;/strong&gt; The agent doesn&apos;t ask me to click through a UI to verify. It writes the failing test, makes it pass, and shows me the output. TDD was always a feedback-loop discipline; agents are simply the first developers tenacious enough to exploit it fully.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;(An honest aside: being this local-first on a young platform has costs. I&apos;m currently shipping on patched community forks of drizzle-orm and better-auth to make the adapters behave. Early-adopter tax.)&lt;/p&gt;
&lt;h2&gt;The honest 100x&lt;/h2&gt;
&lt;p&gt;&quot;100x&quot; is a big claim, so let me locate it precisely. It&apos;s not typing speed. It&apos;s the product of &lt;em&gt;iteration count&lt;/em&gt; × &lt;em&gt;truthfulness of feedback&lt;/em&gt;, and it looks like this:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Me, manually&lt;/th&gt;
&lt;th&gt;Agent with eyes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&quot;Why did this 400 happen in prod?&quot;&lt;/td&gt;
&lt;td&gt;30–60 min of dashboard spelunking, if I&apos;m lucky&lt;/td&gt;
&lt;td&gt;One prompt; agent correlates code + a week of logs, returns a timeline and two bonus bugs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&quot;Did I just break the withdrawal flow?&quot;&lt;/td&gt;
&lt;td&gt;Deploy to staging, click through the widget&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vitest run&lt;/code&gt; — full stack verdict in seconds, DO state included&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&quot;Do we call the exchange API correctly?&quot;&lt;/td&gt;
&lt;td&gt;Read their docs again, hope&lt;/td&gt;
&lt;td&gt;&lt;code&gt;assertNoPendingInterceptors()&lt;/code&gt; — the contract is a test&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&quot;Is this platform API still shaped like I remember?&quot;&lt;/td&gt;
&lt;td&gt;Tab-switch to docs&lt;/td&gt;
&lt;td&gt;Cloudflare skill retrieves current docs instead of trusting training data&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The agent was always tenacious. It was always fast. Those were never the bottleneck — sight was. Wire up production telemetry it can query and a local world it can simulate, and the tireless worker in front of the marble finally watches where each stone lands.&lt;/p&gt;
&lt;p&gt;Now it sculpts.&lt;/p&gt;
&lt;h2&gt;Credits &amp;amp; links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Salvatore Sanfilippo (antirez), &lt;a href=&quot;https://youtu.be/TJ6ruN-o0PA&quot;&gt;&lt;em&gt;Il trucco decisivo (davvero) per lavorare coi coding agent&lt;/em&gt;&lt;/a&gt; — the blind-sculptor framing that inspired this post. Grazie.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/cloudflare/skills&quot;&gt;cloudflare/skills&lt;/a&gt; — official Agent Skills for Claude Code and other agents.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://developers.cloudflare.com/agent-setup/claude-code/&quot;&gt;Cloudflare agent setup guide for Claude Code&lt;/a&gt; — skills + MCP servers, including the &lt;a href=&quot;https://observability.mcp.cloudflare.com/mcp&quot;&gt;Observability MCP server&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://developers.cloudflare.com/workers/testing/vitest-integration/&quot;&gt;Vitest integration for Workers&lt;/a&gt; — &lt;code&gt;@cloudflare/vitest-pool-workers&lt;/code&gt;, &lt;code&gt;runInDurableObject&lt;/code&gt;, &lt;code&gt;fetchMock&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><author>Florian Sabani</author></item></channel></rss>