May 18, 2026
Key Signals
-
GitHub ships a coordinated wave of Copilot platform expansions, making its coding agent multi-surface and more autonomous. Remote control for Copilot CLI sessions reached GA across Mobile, web, VS Code, and JetBrains, allowing developers to start terminal sessions and steer them from any device. Simultaneously, GitHub introduced one-click fixes for failing Actions via Copilot cloud agent, added cost-efficient models (Claude Haiku 4.5, GPT-5.4-mini at 0.33x multiplier), and made the Copilot Spaces API generally available. This batch of releases positions GitHub Copilot as a persistent, always-available agent layer rather than a session-bound assistant. [1][2][3][4]
-
Cursor launches Composer 2.5 with novel RL training techniques, signaling a new frontier in custom-trained coding models. Built on Moonshot's Kimi K2.5 checkpoint, Composer 2.5 uses targeted RL with textual feedback—a technique that inserts hints at specific error points and distills corrected behavior—plus 25x more synthetic tasks than its predecessor. Cursor also revealed it is training a significantly larger model from scratch with SpaceXAI using 10x more total compute on Colossus 2's million H100-equivalents, suggesting that AI coding companies are now investing in proprietary model training at frontier scale. [5]
-
OpenAI Codex v0.131.0 introduces a Python SDK rebrand, plugin marketplace, and daemon-managed remote-control workflows. The Python SDK officially moved to openai-codex/openai_codex with pinned runtime-generated types and approval modes, while the TUI gained unified @mentions that search files, directories, plugins, and skills in one picker. The new codex doctor command provides comprehensive diagnostics, and plugin hooks are now enabled by default, reflecting Codex's evolution from a standalone tool into an extensible platform with first-class plugin infrastructure. [6]
-
GPT-5.3-Codex becomes the first long-term support model for GitHub Copilot Business and Enterprise, guaranteeing 12-month availability. Replacing GPT-4.1 as the base model for all Business/Enterprise organizations, GPT-5.3-Codex carries a 1x premium request multiplier and will remain available through February 4, 2027. GitHub reports a "significantly high code survival rate" with this model. The LTS designation gives enterprises the stability needed for internal security reviews—addressing a key barrier to enterprise adoption of AI coding tools. [7]
-
Copilot CLI v1.0.49 adds /rubber-duck critique, /chronicle search, persistent memory management, and Alpine Linux support. New experimental features include /mcp search for discovering and installing MCP servers from a registry, and tool search with deferred loading. The /rubber-duck command invokes an independent critique agent to review the current work, introducing a built-in adversarial review loop. Hook support now fires correctly for sub-agent tool calls, and the CLI runs on Alpine Linux for the first time, expanding deployment scenarios to lightweight containers. [8]
-
OpenAI and Dell partner to bring Codex to hybrid and on-premise enterprise environments, addressing data residency and security compliance requirements. This partnership allows organizations to run Codex within their own infrastructure using Dell's hardware and services, expanding beyond cloud-only deployments. For regulated industries that cannot send code to external APIs, on-premise Codex deployment removes a fundamental adoption barrier. [9]
-
Google's leaked "Remy" agent project has enterprise architects rethinking the AI stack, highlighting the need for durable workflow runtimes beneath autonomous agents. Industry experts argue that once agents move from request-response to continuously running delegated execution, the underlying infrastructure must handle retries, partial failure, state consistency, auth propagation, and policy enforcement—transforming AI apps into distributed systems. Deterministic policy engines and hardened runtime containment are emerging as critical infrastructure layers for production agents. [10]
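The infrastructure concerns listed above (retries, partial failure, state consistency) can be illustrated with a minimal sketch of a checkpointed step runner. This is not taken from any vendor's runtime: the function name `run_durable`, the JSON checkpoint file, and the fixed retry count are all assumptions made for illustration.

```python
import json
import os
from typing import Callable, Dict, List, Tuple

# A step is a (name, function) pair; the function takes and returns a dict.
Step = Tuple[str, Callable[[Dict], Dict]]


def run_durable(steps: List[Step], checkpoint_path: str, max_retries: int = 3) -> Dict:
    """Run steps in order, persisting state after each one.

    Hypothetical sketch: a restart with the same checkpoint file skips
    steps that already completed instead of re-executing them.
    """
    state = {"done": [], "data": {}}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    for name, fn in steps:
        if name in state["done"]:
            continue  # finished in an earlier run; do not repeat side effects

        for attempt in range(1, max_retries + 1):
            try:
                # Pass a copy so a failed attempt cannot corrupt saved data.
                state["data"] = fn(dict(state["data"]))
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up; the checkpoint still holds earlier steps

        state["done"].append(name)
        # Write to a temp file and rename so the checkpoint is never half-written.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return state["data"]
```

A production runtime would also need the auth propagation and policy enforcement the item mentions; this sketch covers only retry and resume.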
AI Coding News
-
Anthropic's "Code with Claude 2026" event revealed managed agents, an advisor/executor model strategy, and 80x annualized revenue growth. The San Francisco conference showcased Claude Code's new auto mode, worktrees for isolated branches, and routines running on cron/webhooks. GitHub CPO Mario Rodriguez disclosed that GitHub targets 94%+ cache hit rates when calling Claude, treating prompt assembly efficiency like high-frequency trading. Vercel CEO Guillermo Rauch reported that Opus tokens represent ~20% of their AI Gateway usage but 70%+ of spend. Claude moved from 62% to 87% on SWE-bench Verified (Sonnet 3.7 → Opus 4.7) over the past year. [11]
-
Google's reported "Remy" project represents a shift toward OpenClaw-style agents that perform actions autonomously on behalf of users. Described in an internal document as a "24/7 personal agent for work, school, and daily life, powered by Gemini," Remy is reportedly being tested inside a staff-only Gemini version. The project reinforces the emerging pattern of long-running autonomous agent workflows, with infrastructure requirements including durable execution graphs, asynchronous orchestration, and delegated permissions across multiple services. Banking, healthcare, and regulated sectors are responding with "military-grade containment" for critical workloads. [10]
-
OpenAI and Dell announce a partnership to deploy Codex in hybrid and on-premise enterprise environments. The collaboration enables organizations to run AI coding agents securely within their own infrastructure using Dell's enterprise hardware, targeting data residency requirements, security compliance, and network isolation constraints that have prevented some organizations from adopting cloud-only AI coding tools. [9]
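The Vercel figures in the Code with Claude item above (Opus at roughly 20% of AI Gateway usage but 70%+ of spend) imply a large per-token cost gap. The arithmetic can be sketched as follows, assuming "usage" means share of tokens; the function name is hypothetical, and because the source gives approximate shares the result is only a rough lower bound.

```python
def relative_cost_per_token(token_share: float, spend_share: float) -> float:
    """Cost per token of one model relative to all other traffic combined.

    Both arguments are fractions strictly between 0 and 1.
    """
    if not (0 < token_share < 1 and 0 < spend_share < 1):
        raise ValueError("shares must be strictly between 0 and 1")
    own_cost = spend_share / token_share                 # spend per unit of usage
    rest_cost = (1 - spend_share) / (1 - token_share)    # same for everything else
    return own_cost / rest_cost


# Shares quoted in the item above: ~20% of usage, 70%+ of spend.
ratio = relative_cost_per_token(0.20, 0.70)
print(f"Opus tokens cost about {ratio:.1f}x the blended rate of other traffic")
```

With those shares, the ratio works out to (0.70 / 0.20) / (0.30 / 0.80), or a little over 9x.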
Feature Update
-
Cursor Composer 2.5 launches with targeted RL, 25x more synthetic tasks, and dual pricing tiers. Composer 2.5 is a substantial improvement in intelligence and sustained long-running task performance over Composer 2. Training innovations include targeted textual feedback, feature-deletion synthetic tasks grounded in real codebases, and sharded Muon with dual mesh HSDP for MoE models. Pricing: $0.50/$2.50 per M tokens or $3.00/$15.00. Includes double usage for the first week. [5]
-
GitHub Copilot CLI remote control reaches general availability across Mobile, web, VS Code, and JetBrains. Developers can now start a Copilot CLI session in any terminal and monitor or steer it in real time from GitHub Mobile, github.com, VS Code, or JetBrains. New in GA: support for non-GitHub repositories, real-time session streaming, plan review and editing from mobile, permission request handling, and /keep-alive for persistent sessions. Start with copilot --remote or /remote on midsession. [1]
-
GitHub Copilot cloud agent gains one-click fixes for failing GitHub Actions jobs. Copilot Business and Enterprise subscribers see a "Fix with Copilot" button on workflow run logs. Clicking it triggers the cloud agent to investigate the failure, push a fix to the branch, and tag the user for review—all from its own cloud-based development environment. Designed for delegating test fixes and linter corrections. [2]
-
GitHub Copilot cloud agent expands to cost-efficient models: Claude Haiku 4.5 and GPT-5.4-mini at 0.33x multiplier. Users can now pick smaller, quicker models for straightforward changes, reserving more capable models for complex work. Both new models carry a 0.33x premium request multiplier, so each request costs about one-third as much as a request to a standard 1x model. [3]
-
Copilot Spaces API is now generally available for programmatic context management. The REST API allows applications to create, read, update, and delete Spaces, manage collaborators and resources, reducing manual overhead for enterprises managing multiple Spaces at scale. [4]
-
GPT-5.3-Codex becomes the base model for Copilot Business and Enterprise with LTS guarantees through February 2027. This is the first long-term support model in the Copilot ecosystem. GPT-4.1 remains temporarily available at 0x multiplier until deprecation on June 1, 2026 with the launch of usage-based billing. Applies only to Business and Enterprise plans. [7]
-
Copilot CLI v1.0.49-6 pre-release ships with /rubber-duck, /chronicle search, /memory management, and MCP registry search. Key additions: /rubber-duck for independent critique of agent work, /chronicle search for searching all session content, /memory on|off|show for persistent memory control, copilot plugin update --all for batch plugin updates, Alpine Linux support, and experimental /mcp search for discovering MCP servers from a registry. Hooks now fire correctly for sub-agent tool calls. [8]
-
OpenAI Codex v0.131.0 delivers unified @ mentions, Python SDK rebrand, plugin marketplace CLI, and codex doctor diagnostics. The TUI now shows blended token usage, permissions/approval mode, and responsive Markdown tables. Remote workflows gained daemon-managed codex remote-control with runtime enable/disable APIs. Windows sandbox hardening addresses deny-read rules, scoped write roots, and PowerShell edge cases. The extension system underwent major refactoring with shared tool contracts and memory extension plumbing. [6]
-
OpenCode v1.15.5 previews a native OpenAI runtime path and adds --replay for session history. The experimental flag enables the native OpenAI runtime path, while --replay and --replay-limit show recent history when resuming interactive runs. Bug fixes address plugin tool completion, event subscription races, and session list sorting. Desktop improvements include notifications, usage dialogs, and faster session timelines. [12]
-
Gemini CLI v0.44.0-nightly.20260518 adds ADK agent session subagent support. The nightly build introduces the adk.agentSessionSubagentEnabled flag, enabling Agent Development Kit session-level subagent capabilities. This is a single-feature nightly between v0.43.0-preview.0 and the upcoming stable release. [13]
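The premium request multipliers quoted in the Copilot items above (0.33x for Claude Haiku 4.5 and GPT-5.4-mini, 1x for GPT-5.3-Codex) can be turned into a quick budget estimate. The sketch below assumes each request consumes exactly its multiplier from a monthly allowance; the allowance of 300 is a made-up figure for illustration, not a number from the source, and both function names are hypothetical.

```python
def premium_requests_used(requests: int, multiplier: float) -> float:
    """Premium requests consumed by a number of requests at a given multiplier."""
    return requests * multiplier


def requests_per_allowance(allowance: float, multiplier: float) -> int:
    """Whole requests that fit in an allowance at a given multiplier."""
    if multiplier <= 0:
        raise ValueError("a 0x multiplier consumes no allowance")
    # Small epsilon guards against floating-point results like 299.99999.
    return int(allowance / multiplier + 1e-9)


HYPOTHETICAL_ALLOWANCE = 300  # illustrative only; not from the source

for model, multiplier in [("GPT-5.3-Codex", 1.0), ("Claude Haiku 4.5", 0.33)]:
    fits = requests_per_allowance(HYPOTHETICAL_ALLOWANCE, multiplier)
    print(f"{model}: {fits} requests per {HYPOTHETICAL_ALLOWANCE} premium requests")
```

Under these assumptions, the same allowance stretches to roughly three times as many requests on a 0.33x model as on a 1x model.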