AI Coding News

April 16, 2026

Key Signals

OpenAI ships a sweeping Codex overhaul that adds background computer use, an in-app browser, and 90+ plugins — the clearest sign yet that the company is building a unified AI "superapp." Codex can now control desktop applications with its own virtual cursor on Mac, running multiple agents in parallel without interfering with the user's workflow. An in-app browser lets developers annotate web pages for frontend feedback, while new heartbeat automations enable persistent agents that monitor Slack, triage inboxes, or wake themselves on a schedule. With 3 million weekly users and a million new users per month, Codex is rapidly expanding beyond coding into general knowledge work. [1][2][3]
Claude Opus 4.7 rolls out across all major platforms with improved instruction-following, vision, and memory — but a new tokenizer and heavier adaptive thinking will increase token consumption. Anthropic's latest Opus model is now available on the Anthropic API, Amazon Bedrock, Google Cloud Vertex AI, Microsoft Foundry, and GitHub Copilot (at a promotional 7.5× premium request multiplier until April 30). Early-access testers including Intuit, GitHub, and Notion report stronger multi-step task performance. However, the model's more literal instruction-following means existing prompts may produce unexpected results, and Anthropic cautions that users will see higher token usage. [4][5]
GitHub CLI launches gh skill, a cross-agent skill management command that works across Copilot, Claude Code, Cursor, Codex, Gemini CLI, and Antigravity. Agent skills — portable instruction sets that teach AI agents specific tasks — can now be discovered, installed, version-pinned, and published from a single CLI. The command includes supply-chain integrity features like content-addressed change detection, immutable releases, and provenance metadata in SKILL.md frontmatter, bringing package-manager-grade guarantees to the agent skills ecosystem. [6]
"Computer use" is emerging as the next competitive frontier, with Codex, HuggingFace HoloTab, and Claude Code all pushing toward AI agents that operate software through the UI rather than APIs. Codex's background desktop control, Anthropic's existing Mac-level Claude Code capabilities, and HuggingFace's new HoloTab Chrome extension (powered by the Holo3-35B-A3B model) all reflect a convergence on letting agents click, type, and navigate applications like a human would. This approach sidesteps the need for pre-built integrations and opens automation possibilities for legacy tools, internal dashboards, and web apps that lack APIs. [1][7]
Factory raises $150 million at a $1.5 billion valuation, underscoring sustained investor appetite for enterprise AI coding agents despite an increasingly crowded market. The round, led by Khosla Ventures with participation from Sequoia Capital, Insight Partners, and Blackstone, positions Factory alongside Anthropic, Cursor, and Cognition as serious contenders in the enterprise code generation space. Factory differentiates by dynamically switching between foundation models and counts Morgan Stanley, Ernst & Young, and Palo Alto Networks among its customers. [8]
Developer frustration with Claude Code's product trajectory is mounting as Anthropic's capacity constraints force a series of changes perceived as degradations. A widely circulated blog post catalogs multiple recent rollbacks: removal of the "clear context and execute" option in plan mode, blocking third-party tools from using Pro/Max subscription tokens, cache TTL reduction from 1 hour to 5 minutes, and — with Opus 4.7 — the outright removal of extended thinking budgets in favor of adaptive thinking only. These changes affect all users including API-paying customers, and the pattern is drawing comparisons to enshittification. [9]

AI Coding News

Claude Opus 4.7 delivers meaningful gains in instruction-following and vision but raises token-cost concerns and makes no safety breakthroughs. Anthropic's newest Opus model accepts images with more than 3× the pixels of its predecessor, features improved file-system-based memory, and achieves state-of-the-art scores on the GDPval-AA benchmark for finance and legal tasks. Safety metrics remain roughly equivalent to Opus 4.6, with a modest regression in harm-reduction advice for controlled substances. Anthropic frames Opus 4.7 as the first less-capable model to receive cyber safeguards originally developed for the unreleased Mythos-class models. [5]
Cursor 3 redesigns the developer interface around managing parallel agents rather than editing files, marking a sharp philosophical split in the AI coding tool space. Internal data shows that usage has inverted since March 2025: twice as many Cursor users now run autonomous agents as use tab completion, and 35% of the company's own PRs are written by cloud agents. The new workspace surfaces all agents in one sidebar, supports local-to-cloud handoff, and adds a plugin marketplace. Community reception is sharply divided: some users praise the agent-first direction while others report spending $2,000/week on premium models before switching to Claude Code Max at one-tenth the cost for comparable output. [10]
Cloudflare's Code Mode MCP server reduces token consumption by 99.9% when agents interact with large API surfaces, potentially changing how MCP servers are designed. Instead of exposing each of Cloudflare's 2,500+ API endpoints as individual MCP tools (costing 1.17 million tokens), Code Mode offers just two meta-tools — search() and execute() — backed by a type-aware SDK that lets the model write and run JavaScript in a secure V8 isolate against the OpenAPI spec. The fixed ~1,000-token footprint holds regardless of API surface size. Cloudflare has open-sourced a Code Mode SDK within its Agents SDK for third-party adoption. [11]
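The two-tool pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cloudflare's implementation: the real server runs model-written JavaScript in a V8 isolate, whereas this toy version searches a tiny made-up endpoint catalog and dispatches calls by operation name. The catalog entries, the echo_transport helper, and the token_reduction function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Endpoint:
    operation_id: str
    method: str
    path: str
    summary: str


# Hypothetical catalog standing in for a large OpenAPI spec.
CATALOG = [
    Endpoint("listItems", "GET", "/items", "List all items"),
    Endpoint("getItem", "GET", "/items/{item_id}", "Fetch one item"),
    Endpoint("deleteItem", "DELETE", "/items/{item_id}", "Delete one item"),
]


def search(query: str, limit: int = 5) -> list[Endpoint]:
    """Meta-tool 1: return only the endpoints whose text matches the query."""
    q = query.lower()
    hits = [
        e for e in CATALOG
        if q in e.operation_id.lower() or q in e.summary.lower()
    ]
    return hits[:limit]


def execute(
    operation_id: str,
    params: dict,
    send: Callable[[str, str, dict], dict],
) -> dict:
    """Meta-tool 2: resolve an operation and hand it to a transport function."""
    by_id = {e.operation_id: e for e in CATALOG}
    if operation_id not in by_id:
        raise KeyError(f"unknown operation: {operation_id}")
    endpoint = by_id[operation_id]
    # Fill path parameters such as {item_id}; unused params are ignored.
    path = endpoint.path.format(**params)
    return send(endpoint.method, path, params)


def echo_transport(method: str, path: str, params: dict) -> dict:
    """Stand-in for a real HTTP client: just report what would be sent."""
    return {"method": method, "path": path}


def token_reduction(before: int, after: int) -> float:
    """Fraction of tokens saved when the tool footprint shrinks."""
    return 1 - after / before


if __name__ == "__main__":
    print([e.operation_id for e in search("delete")])
    print(execute("getItem", {"item_id": "42"}, echo_transport))
    # Figures quoted in the item above: 1.17 million tokens versus ~1,000.
    print(f"{token_reduction(1_170_000, 1_000):.1%}")
```

The agent's context only ever holds the two tool definitions plus whatever search() returns, which is why the footprint stays flat as the catalog grows.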
Google releases Gemma 4 under Apache 2.0, giving developers unrestricted access to open-weight models with native function-calling and 256K context windows for agentic workflows. The family includes 2B and 4B edge models, a 26B MoE model (activating only 3.8B parameters at inference), and a 31B dense model scoring 84.3% on GPQA Diamond — nearly double the previous Gemma 3 result. Native video, image, and audio processing plus structured JSON output make these models viable drop-in options for agent pipelines. The Apache 2.0 license is a first for Google's best open model, removing all commercial restrictions. [12]
Amazon is deepening its investment in the Model Context Protocol, contributing Tasks and Elicitations to the MCP spec and using its managed MCP servers as a testing ground for draft features. AWS Senior Principal Engineer Clare Liguori, who is also an MCP core maintainer, highlighted the shift toward always-on agents at the MCP Summit in New York. Amazon has also expanded its Kiro AI development tool to all roles company-wide after discovering that non-engineers wanted access just as much as developers. [13]
Hugging Face launches HoloTab, a Chrome extension that navigates websites like a human, joining a growing field of "computer use" agents from Anthropic, OpenAI, and Google. Built on the Holo3-35B-A3B model, HoloTab handles tasks like form-filling, message replies, and professional networking outreach directly in the browser without requiring site-specific integrations. The approach complements rather than replaces MCP-style structured access: MCP adapts software for AI, while computer use adapts AI to existing software. [7]
Spotify has gone "agentic-first" in its development process, with its best engineers reportedly no longer writing code directly. In a webinar scheduled for April 29, the company's senior project manager and senior staff engineer will discuss how Spotify is reforming squads, redefining engineering roles around intent rather than implementation, and deploying agentic fleets into DevOps, security, and cloud management. The session will also address whether the approach translates to smaller enterprises without Spotify's scale and budget. [14]

Feature Update

OpenAI Codex receives its largest update to date with background computer use, in-app browser, image generation, memory, and 111 plugin integrations. The new Codex desktop app (backed by the v0.122.0 engine release) lets agents control Mac desktop applications with a virtual cursor in the background while the user continues working. An in-app browser based on Atlas supports inline commenting for frontend feedback. Image generation via gpt-image-1.5 is included at no extra cost. Heartbeat automations allow persistent threads that fire on a schedule, and a new memory system recalls preferences across sessions. Developer-specific additions include GitHub review comment handling, multiple terminal tabs, SSH to remote dev boxes, and a summary pane for tracking plans and artifacts. New pay-as-you-go pricing is available for ChatGPT Enterprise and Business customers. [1][2][3][17][23]
Claude Code v2.1.111 adds Opus 4.7 xhigh effort level, cloud-based /ultrareview, and auto mode for Max subscribers. The xhigh effort level sits between high and max and is available via /effort, --effort, and the model picker. The new /ultrareview command runs comprehensive code review using parallel multi-agent analysis in the cloud — invoke with no arguments to review your current branch or pass a GitHub PR URL. Auto mode no longer requires the --enable-auto-mode flag. Other additions include an /effort interactive slider, /less-permission-prompts skill for proposing read-only allowlists, and progressive PowerShell tool rollout on Windows. Numerous fixes address terminal tearing in iTerm2+tmux, LSP diagnostic ordering, plugin error handling, and rate-limit error messages on Bedrock/Vertex. [15]
Claude Code v2.1.112 hotfixes the "claude-opus-4-7 is temporarily unavailable" error that affected auto mode immediately after the Opus 4.7 rollout. [16]
Copilot CLI ships four releases (v1.0.28–v1.0.31) in a single day, adding Claude Opus 4.7 support, /statusline customization, and remote control session resumption. v1.0.28 adds remote control session connection from the --resume picker, COPILOT_DISABLE_TERMINAL_TITLE support, and improved MCP migration documentation. v1.0.29 adds Claude Opus 4.7 model support, the --list-env flag for CI pipeline debugging, and COPILOT_AGENT_SESSION_ID as an environment variable for shell commands and MCP servers. v1.0.30 introduces the /statusline command for customizing status bar items and restores clipboard image paste. v1.0.31 fixes prompt frame rendering on Windows and Ubuntu terminals. [18][19][20][21]
GitHub CLI gh skill launches in public preview, bringing package-manager-grade integrity to the agent skills ecosystem. Skills are portable instruction sets following the open Agent Skills specification at agentskills.io. The gh skill install command auto-detects the correct directory for each agent host, while gh skill publish validates skills against the spec and checks repository security settings. Version pinning supports both release tags and commit SHAs, and provenance metadata written into SKILL.md frontmatter travels with the skill wherever it is copied. [6]
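Content-addressed change detection, mentioned in the gh skill coverage, can be illustrated with a short Python sketch. This shows the general technique only and is an assumption, not a description of how gh skill works internally: it hashes a skill's file paths and contents with SHA-256 and treats any difference from a pinned digest as a change. The function names and the sample SKILL.md text are hypothetical.

```python
import hashlib


def content_digest(files: dict[str, bytes]) -> str:
    """Hash paths and contents in sorted order so the digest is stable."""
    h = hashlib.sha256()
    for path in sorted(files):
        # Length-prefix each field so different path/content splits
        # cannot produce the same byte stream.
        for field in (path.encode("utf-8"), files[path]):
            h.update(len(field).to_bytes(8, "big"))
            h.update(field)
    return h.hexdigest()


def has_changed(files: dict[str, bytes], pinned_digest: str) -> bool:
    """True when the current contents no longer match the pinned digest."""
    return content_digest(files) != pinned_digest


if __name__ == "__main__":
    # Hypothetical skill contents, for illustration only.
    installed = {"SKILL.md": b"---\nname: example-skill\n---\nDo the task.\n"}
    pinned = content_digest(installed)

    tampered = dict(installed)
    tampered["SKILL.md"] += b"Extra instruction.\n"

    print(has_changed(installed, pinned))
    print(has_changed(tampered, pinned))
```

Because the digest depends only on contents, a copied or re-hosted skill can be checked against its pin regardless of where it was downloaded from.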
Claude Opus 4.7 is now available in the GitHub Copilot model picker for Pro+, Business, and Enterprise users across VS Code, Visual Studio, Copilot CLI, Cloud Agent, github.com, Mobile, JetBrains, Xcode, and Eclipse. Opus 4.7 will replace Opus 4.5 and 4.6 for Copilot Pro+ users over the coming weeks. Promotional pricing sets the premium request multiplier at 7.5× until April 30. Enterprise and Business administrators must enable the Claude Opus 4.7 policy in Copilot settings. [4]
OpenCode v1.4.7 adds Claude Opus 4.7 xhigh adaptive reasoning, fixes Cloudflare AI Gateway compatibility for OpenAI reasoning models, and passes auth context to workspaces. GitHub Copilot gpt-5-mini now uses low reasoning effort for better request compatibility. Azure models default to store=true to fix stored-response requirements. The bash tool uses less memory on large command output, and sessions now retry provider 5xx errors even when the provider SDK does not mark them retryable. [22]
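The retry change in the last sentence can be sketched as follows. This is a minimal Python illustration, not OpenCode's code: the ProviderError class, its retryable flag, and the backoff parameters are hypothetical. The point it shows is that the HTTP status code, rather than the SDK's own hint, decides whether a request is retried.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ProviderError(Exception):
    """Hypothetical error type carrying the status and the SDK's retry hint."""

    def __init__(self, status: int, retryable: bool = False):
        super().__init__(f"provider returned HTTP {status}")
        self.status = status
        self.retryable = retryable


def should_retry(err: ProviderError) -> bool:
    """Retry when the SDK says so, or on any 5xx even if it does not."""
    return err.retryable or 500 <= err.status <= 599


def call_with_retry(
    request: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run request(), retrying retryable failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ProviderError as err:
            # Give up on the last attempt or on non-retryable errors.
            if attempt == max_attempts or not should_retry(err):
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing sleep as a parameter keeps the sketch easy to exercise without real delays, for example with sleep=lambda seconds: None.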