Applied AI Digest: Week of 4/27

Launches & Releases

OpenAI ships GPT-5.5 with a 1M-token context window
GPT-5.5 went live in the API and ChatGPT on April 24 at $5 / $30 per million input/output tokens, with a 1M-token context window and a smaller cache discount than the 5.4 line. OpenAI claims meaningful gains on agentic coding, computer use, and long-horizon work, and the model is already GA in GitHub Copilot. The price is a step up from 5.4, but Copilot's switch alone is reason enough to re-benchmark before you renew a 5.4 contract.
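
For a quick sense of what the list price means per request, here is a minimal sketch of the arithmetic. It uses only the $5 / $30 per-million figures quoted above and deliberately ignores the cache discount, since its size isn't given here. The function name and the example token counts are illustrative assumptions, not anything from OpenAI's documentation.

```python
# List prices quoted above, in USD per one million tokens.
INPUT_USD_PER_MILLION = 5
OUTPUT_USD_PER_MILLION = 30


def estimate_request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the uncached list-price cost of one request (hypothetical helper).

    Cache discounts are not modeled because the digest does not state their size.
    """
    total_micro_usd = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total_micro_usd / 1_000_000


# Example: a long-context call with 200k input tokens and 10k output tokens.
example_cost = estimate_request_cost_usd(200_000, 10_000)
print(f"${example_cost:.2f}")
```

Running the same token counts through your current 5.4 pricing gives a like-for-like number to weigh against whatever quality gains show up in your own benchmarks.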

ChatGPT Workspace Agents evolve Custom GPTs into cloud-running automations
OpenAI's evolution of Custom GPTs runs in the cloud on a schedule, ships with Slack, Google Workspace, and Salesforce integrations, and is powered by Codex. Free in research preview through May 6 on Business, Enterprise, Edu, and Teachers plans. Custom GPTs were a chat surface you opened on demand; Workspace Agents go and do work on a trigger, which gives OpenAI the same kind of agent footprint Anthropic and Google have already shipped.

Clod is a Claude clone that errs on purpose to train user skepticism
An indie chatbot that intentionally introduces errors so users practice spotting them before the bot does. Cute idea wrapped around a real point: every team shipping LLM-powered tooling wants users to review output critically, but few products actually train that skill. If you're building an internal AI app where review quality matters, this is worth five minutes of poking around.

Reads & Postmortems

Hamel Husain's field guide to rapidly improving AI products
Distilled lessons from 30+ production AI projects. The thesis: teams that succeed obsess over error analysis, custom data viewers, and synthetic test generation, not frameworks. The companion FAQ, LLM-as-judge, and evals-skills posts are worth the read alongside it. If you're standing up evals on a current project, start here.

Anthropic's April 23 postmortem details three Claude Code regressions
Anthropic owns up to three bugs spanning March and April: a reasoning-effort default that quietly dropped from high to medium, a thinking-cache bug that wiped prior reasoning every turn instead of once, and a verbosity prompt that cost 3% on coding evals. The thinking-cache bug ran for two weeks before detection. Worth reading in full if your team has been chasing "feels worse" anecdotes on Sonnet 4.6 or Opus 4.6.

Cursor agent wipes a production database and snapshots in nine seconds
A Cursor session deleted a company's production database and its snapshots after the agent decided cleanup was in scope. Root-cause detail is thin, but the lesson is the one everyone keeps failing to learn: do not give long-running agents write access to production data without a hard isolation boundary.
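
One way to picture a hard boundary is to enforce it at the connection level rather than in the agent's instructions. The sketch below uses SQLite's read-only URI mode from Python's standard library purely as a stand-in. A real production setup would more likely rely on a read-only database role or separate credentials. The function names, the `orders` table, and the file name are all hypothetical.

```python
import os
import sqlite3
import tempfile


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open a SQLite database so any write attempt fails at the driver level."""
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)


def demo() -> tuple[int, str]:
    """Show that an agent-style connection can read but cannot drop a table."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "prod.db")

        # Stand-in for production data, created with a normal read-write connection.
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
        setup.execute("INSERT INTO orders (id) VALUES (1)")
        setup.commit()
        setup.close()

        # The only handle the agent would ever receive.
        agent_conn = open_read_only(db_path)
        try:
            row_count = agent_conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
            try:
                agent_conn.execute("DROP TABLE orders")
                outcome = "write succeeded"
            except sqlite3.OperationalError:
                # SQLite rejects the write because the connection is read-only.
                outcome = "write blocked"
        finally:
            agent_conn.close()
        return row_count, outcome


print(demo())
```

The point of the design is that the refusal does not depend on the agent's judgment about scope: the handle it holds simply cannot write.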

About Fractional AI: We're an elite engineering team transforming businesses through custom AI software. We work hand in hand with companies to tackle their highest-value AI projects — driving efficiency, unlocking new revenue streams, and solving problems off-the-shelf tools can't. See more at fractional.ai!
