AI Coding News

May 27, 2026

Key Signals

  • Cognition raised over $1 billion at a $25 billion pre-money valuation, more than doubling its valuation in eight months. The round was led by Lux Capital, General Catalyst, and 8VC, and represents a major vote of confidence in independent AI coding startups at a time when model makers like Anthropic, OpenAI, and Google were expected to dominate the space. Cognition reports $492 million in annualized revenue run-rate with 50% month-over-month enterprise growth, counting Mercedes-Benz, NASA, and Goldman Sachs as customers. [1]

  • Managed AI agent runtimes have become table stakes after Google, Anthropic, and AWS shipped nearly identical products within six weeks. Google repositioned Antigravity as an agent platform at I/O, following Anthropic's Claude Managed Agents (April 8) and AWS Bedrock AgentCore (April 22). The Markdown-based config format — now in over 60,000 repositories and stewarded by the Linux Foundation — is quietly becoming the portable agent definition layer, meaning platform choice now hinges on cost, data residency, and model quality rather than runtime features. [2]

  • Security researchers warn of a growing accountability gap as AI coding agents autonomously install packages that nobody reviews. Aikido Security's CEO describes situations where "there is no accountability" when agents pull dependencies — a gap affecting enterprises where non-developer teams also use AI tools. The supply-chain attack surface is expanding rapidly: Snyk's audit found over a third of nearly 4,000 AI agent skills contain security flaws, while multiple startups (Socket at $1B valuation, Endor Labs, Chainguard, Arcjet) race to fill the gap. [3]

  • Claude Code v2.1.152 shipped /code-review --fix, which auto-applies review findings, and removed the opt-in consent requirement for Auto mode. Skills can now declare disallowed-tools in frontmatter, a new /reload-skills command enables hot-reloading without a session restart, and SessionStart hooks gained the ability to set session titles and trigger skill reloads. The release also adds a MessageDisplay hook for transforming assistant output in real time, signaling Anthropic's push toward fully extensible agent pipelines. [4]

  • Copilot SDK v1.0.0-beta.9 introduced CopilotClientMode.Empty for multi-tenant isolation, post-tool-use failure hooks, and per-message agentMode selection — with GA planned approximately one week out. The Empty mode prevents user-specific state from leaking across tenants, addressing a critical enterprise deployment concern. The agentMode API finally lets SDK consumers programmatically request plan, autopilot, or shell mode per message across all six language SDKs. [5]

  • Uber's CTO revealed their Claude Code budget was "blown away already," highlighting tokenmaxxing as an industry-wide problem. Lanai debuted Token Tuner to map token spend to actual workflow outcomes and generate productivity scores, while Uber's COO described a "head-exploding moment" forcing the company to justify token consumption vs. headcount. The shift from raw usage metrics to outcome measurement represents a maturation signal for enterprise AI coding adoption. [6]

  • Copilot CLI shipped six releases in a single day (v1.0.55-2 through v1.0.55-7), headlined by the new /autopilot command, cell-based terminal renderer GA, and hook progress streaming. The /autopilot command keeps agents focused on long-running objectives, while plugin directories on session RPC let SDK clients mount skills per session. The cell-based renderer becoming default for all users marks a significant UX milestone for terminal-based AI coding. [7][8][9][10][11]

AI Coding News

  • OpenAI and Cisco announced a partnership to redefine enterprise engineering with Codex, enabling AI-native development at scale. Cisco is using Codex to accelerate AI Defense work and automate defect remediation across their engineering organization. This represents another major enterprise validation of Codex's cloud-based coding agent capabilities following its general availability. [12]

  • Warp is using GPT-5.5 and other OpenAI models to coordinate coding agents across local, cloud, and open-source development workflows. The terminal company orchestrates agents across multiple execution environments, an emerging pattern in AI-assisted development in which the boundary between local and cloud coding dissolves. [13]

  • A third-party Claude Code skill called "ADHD" claims to make the agent "think 2x better" by fanning out parallel divergent thoughts under different cognitive frames. Built on Claude Agent SDK, ADHD uses tree-of-thought with cognitive-frame branching, generator-critic separation, and pruning. Experts are skeptical: Sean Robinson notes it's "a familiar parallel sampling and selection strategy" while Andrew Moore acknowledges "the genuinely new idea is finding another way to create diversity in a set of parallel thinkers." The benchmarks are based on only six problems. [14]

  • Pullfrog, an open-source AI-powered GitHub bot by Zod creator Colin McDonnell, entered beta as a model-agnostic alternative to CodeRabbit. Running entirely within GitHub Actions with a bring-your-own-key approach, Pullfrog handles PR reviews, issue triage, CI autofix, and merge conflict resolution. McDonnell describes it as "a harness over OpenCode & Claude Code intended to be run in CI." [15]

  • Microsoft added sandboxed code interpreters to Azure Logic Apps, enabling AI agents to generate and execute Python, JavaScript, C#, and PowerShell in Hyper-V isolated sessions. Each session runs in hardware-level isolation, with the LLM receiving natural-language instructions, generating code, and executing it within a single governed workflow. This positions Logic Apps as an integration-focused agent platform alongside Microsoft Foundry and Copilot Studio. [16]

  • AWS DevOps Agent detailed its multi-agent architecture for autonomous incident investigation, using parallel hypothesis generation and counter-evidence validation. The system decomposes operations into Triage, Investigation, Mitigation, and Prevention capabilities — all built on an application topology graph that provides architectural awareness. The approach mirrors how experienced SRE teams work but at machine speed, generating multiple competing theories and only converging on root cause when evidence conclusively supports it. [17]

  • OpenAI, Thrive, and Crete demonstrated building a self-improving tax agent with Codex that automates filings and improves accuracy over time. The case study showcases Codex's capabilities for autonomous agents in regulated domains where accuracy is critical, extending the tool's positioning beyond pure software engineering into domain-specific workflow automation. [18]
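
The parallel hypothesis generation and counter-evidence validation described in the AWS DevOps Agent item above can be illustrated with a minimal sketch. This is not AWS's implementation: the `Hypothesis` class, the `evaluate` and `investigate` functions, the observation labels, and the support threshold are all hypothetical, and real systems would weigh evidence far less crudely than set membership.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A candidate root cause (hypothetical structure for illustration)."""
    name: str
    expects: frozenset    # observations that should be present if it is true
    rules_out: frozenset  # observations that would contradict it


def evaluate(hypothesis, observations):
    """Return (name, support); support is None when counter-evidence exists."""
    if hypothesis.rules_out & observations:
        return hypothesis.name, None  # contradicted by at least one observation
    if not hypothesis.expects:
        return hypothesis.name, 0.0   # nothing to confirm, so no support
    seen = len(hypothesis.expects & observations)
    return hypothesis.name, seen / len(hypothesis.expects)


def investigate(hypotheses, observations, threshold=1.0):
    """Evaluate competing hypotheses in parallel.

    Converges on a root cause only when exactly one hypothesis is
    uncontradicted and meets the support threshold; otherwise returns None.
    """
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda h: evaluate(h, observations), hypotheses))
    survivors = [
        name for name, support in results
        if support is not None and support >= threshold
    ]
    return survivors[0] if len(survivors) == 1 else None


if __name__ == "__main__":
    observations = frozenset({"latency_spike", "db_cpu_high", "deploy_10m_ago"})
    candidates = [
        Hypothesis("bad_deploy",
                   frozenset({"deploy_10m_ago", "error_rate_up"}), frozenset()),
        Hypothesis("db_saturation",
                   frozenset({"latency_spike", "db_cpu_high"}),
                   frozenset({"db_cpu_normal"})),
        Hypothesis("network_partition",
                   frozenset({"latency_spike"}), frozenset({"db_cpu_high"})),
    ]
    print(investigate(candidates, observations))
```

In this toy run, "network_partition" is discarded because of counter-evidence and "bad_deploy" is only partially supported, so the sketch converges on "db_saturation". Returning None when zero or several hypotheses survive mirrors the item's point about not converging until the evidence is conclusive.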

Feature Update

  • Claude Code v2.1.152 shipped a major feature set including /code-review --fix, skills extensibility improvements, and Auto mode changes. The /code-review --fix command applies review findings directly to the working tree, and /simplify now invokes it as a shortcut. Skills can declare disallowed-tools in frontmatter to remove tools while the skill is active, and SessionStart hooks can now return reloadSkills: true and set session titles. Auto mode no longer requires opt-in consent. Additional improvements include / for reverse history search in Vim mode, a live thinking timer in fullscreen mode, and OpenTelemetry session entrypoint metrics. [4]

  • Copilot SDK v1.0.0-beta.9 added multi-tenant isolation, failure hooks, and programmatic mode selection across all six language SDKs. The CopilotClientMode.Empty starts sessions from a clean slate to prevent cross-tenant state leakage. The postToolUseFailure hook enables observing failed tool executions separately from success-only hooks. The agentMode field on MessageOptions resolves a gap where there was previously no correct way to request plan or autopilot mode from the SDK. The Rust SDK also received breaking error type refactoring to a struct-with-kind() pattern. [5]

  • Copilot CLI v1.0.55-6 added the /autopilot command and enabled the cell-based terminal renderer for all users. Extension log files are now captured per extension and surfaced in the extensions_manage tool, project extensions in .github/extensions work in non-git workspaces, and /statusline and /theme commands can run during agent execution. PowerShell 7 detection was fixed for Microsoft Store App Execution Aliases. [7]

  • Copilot CLI v1.0.55-3 introduced hook progress streaming, plugin directory mounting via RPC, and reasoning token visibility. Long-running hooks now show real-time status messages in the timeline, and SDK clients can mount Open Plugins-format directories per session via pluginDirectories on the session.create and session.resume RPCs. Progress indicators integrate natively with tmux 3.6b pane progress state, and the reasoning token count is now shown in the session token summary for all users. The skill precedence order is now: project > plugin-dir > personal > custom. [8]

  • Copilot CLI v1.0.55-7 fixed the exit_plan_mode tool appearing outside of plan mode and added a SIGSEGV crash fallback to the JavaScript runtime. Native binary crashes now fall through to the JavaScript runtime instead of silently exiting, which makes the CLI more resilient when the native binary fails. [9]

  • Copilot CLI v1.0.55-5 redesigned MCP configuration with a dedicated scrollable screen for server and tool management. The configuration now opens in its own screen, and the server and tool lists scroll when content exceeds the visible area, addressing usability issues for users with many configured MCP servers. [10]

  • Gemini CLI v0.44.0 merged Auto modes into a single unified Auto mode and added AgentSession invocations. The release includes agent registration with first-wins project priority, Sublime Text and Emacs editor support, gemini-3.1 model aliases and thinking config, ADK subagent flags, and PolicyEngine integration into ACP sessions. Security fixes include path traversal prevention in custom commands, NO_PROXY respect for MCP servers, and dependency vulnerability updates. Over 70 pull requests were merged in this release. [19]

  • Gemini CLI v0.45.0-preview.0 shipped as a preview release with A2A usage metadata exposure and context simplification. Fixes include Termux relaunch/resize loop prevention, sequential execution of the update_topic tool, routing classifier bypass for orphaned function responses, PTY resize error suppression, and MCP list blacklist bypass prevention. [20]

  • OpenCode v1.15.11 added experimental background agents with push-based updates and provider request timeout configuration. Background agents now push updates without polling, headerTimeout config defaults to 10s for OpenAI setups, and modalities.input/output can be set independently. The release also adds a dispose hook for plugins and fixes Google tool calling after an upstream tool ID regression. Resumed sessions no longer continue orphaned interrupted tools. [21]

  • Codex released 0.135.0-alpha.2, building on v0.134.0's conversation history search and profile-based configuration. The previous stable release (v0.134.0) added local conversation history search with case-insensitive content matches, made --profile the primary selector across CLI/TUI/sandbox, improved MCP setup with per-server environments and OAuth options, and enabled read-only MCP tools to run concurrently when they advertise readOnlyHint. [22]

  • Kiro added user email addresses to daily activity reports for enterprise admin visibility. The User_Email column now appears alongside existing fields, eliminating the need for admins to cross-reference user IDs with a separate directory to identify activity sources. [23]
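
The skill precedence order listed for Copilot CLI v1.0.55-3 above (project > plugin-dir > personal > custom) can be illustrated with a minimal sketch. This is not the CLI's implementation: the `resolve_skills` function, the source labels as Python strings, and the example skill names are hypothetical, and the sketch assumes that a skill from a higher-precedence source shadows a same-named skill from a lower one.

```python
# Hypothetical sketch: resolve same-named skills by source precedence.
# Order taken from the release note: project > plugin-dir > personal > custom.
PRECEDENCE = ["project", "plugin-dir", "personal", "custom"]


def resolve_skills(skills_by_source):
    """Return {skill_name: winning_source}.

    skills_by_source maps a source label to the skill names found there.
    Sources are visited from highest to lowest precedence, so the first
    source that defines a name wins.
    """
    resolved = {}
    for source in PRECEDENCE:
        for name in skills_by_source.get(source, []):
            # setdefault keeps the entry from the higher-precedence source.
            resolved.setdefault(name, source)
    return resolved


if __name__ == "__main__":
    found = {
        "personal": ["lint", "deploy"],
        "project": ["lint"],
        "plugin-dir": ["deploy", "changelog"],
    }
    print(resolve_skills(found))
```

With these example inputs, "lint" resolves to the project copy and "deploy" to the plugin-dir copy, since both outrank the personal versions.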