AI Coding News

May 29, 2026

Key Signals

GitHub now tracks developer AI adoption maturity through a four-phase cohort model in its Copilot usage metrics API. The new ai_adoption_phase field assigns each user to a phase, including "Code first", "Agent first", and "Multi-agent", based on rolling 28-day activity. This gives enterprise admins their first structured way to measure not just whether developers use Copilot, but how deeply they've adopted agentic workflows, so enablement programs can target the largest adoption gaps. [1]
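To make the cohort idea concrete, here is a minimal sketch of how an admin script might summarize the share of users in each phase. Only the ai_adoption_phase field name and the three phase labels come from the item above; the record shape (a list of dicts with a "user" key) and the phase_distribution helper are assumptions for illustration, not the documented API response.

```python
from collections import Counter


def phase_distribution(records):
    """Return the share of users in each adoption phase.

    `records` is a hypothetical list of per-user dicts; only the
    `ai_adoption_phase` key is taken from the announcement.
    """
    counts = Counter(r.get("ai_adoption_phase", "unknown") for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {phase: n / total for phase, n in counts.items()}


# Hypothetical sample data, not real API output.
sample = [
    {"user": "alice", "ai_adoption_phase": "Code first"},
    {"user": "bob", "ai_adoption_phase": "Code first"},
    {"user": "carol", "ai_adoption_phase": "Agent first"},
    {"user": "dave", "ai_adoption_phase": "Multi-agent"},
]

for phase, share in sorted(phase_distribution(sample).items()):
    print(f"{phase}: {share:.0%}")
```

A team-level version of the same tally would show where a phase is underrepresented, which is the kind of gap the enablement programs are meant to address.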
Cursor ships "Auto-review," a new run mode that uses a classifier subagent to autonomously approve, sandbox, or escalate tool calls. Allowlisted Shell, MCP, and Fetch calls execute immediately; sandboxable calls run in isolation; everything else goes to a classifier that decides whether to proceed, try a different approach, or ask the user. This represents a meaningful step toward longer-running autonomous agents with safety guardrails that don't require constant human approval. [2]
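The three-tier routing described above can be sketched roughly as follows. This is an illustration of the decision order only (allowlist, then sandbox, then classifier), not Cursor's implementation; ToolCall, Route, route_tool_call, and the toy classifier are all hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Set, Tuple


class Route(Enum):
    EXECUTE = "execute"                 # proceed with the call as-is
    SANDBOX = "sandbox"                 # run the call in isolation
    RETRY = "try_different_approach"    # classifier asks the agent to re-plan
    ASK_USER = "ask_user"               # escalate to the human


@dataclass(frozen=True)
class ToolCall:
    kind: str                # "shell", "mcp", or "fetch"
    target: str              # command line, tool name, or URL
    sandboxable: bool = False


def route_tool_call(
    call: ToolCall,
    allowlist: Set[Tuple[str, str]],
    classifier: Callable[[ToolCall], Route],
) -> Route:
    """Apply the tiers in order: allowlist, sandbox, then classifier."""
    if (call.kind, call.target) in allowlist:
        return Route.EXECUTE
    if call.sandboxable:
        return Route.SANDBOX
    return classifier(call)


def toy_classifier(call: ToolCall) -> Route:
    # Stand-in for the classifier subagent: escalate anything that deletes.
    return Route.ASK_USER if "rm " in call.target else Route.EXECUTE


allow = {("shell", "git status")}
print(route_tool_call(ToolCall("shell", "git status"), allow, toy_classifier))
print(route_tool_call(ToolCall("shell", "npm test", sandboxable=True), allow, toy_classifier))
print(route_tool_call(ToolCall("shell", "rm -rf build"), allow, toy_classifier))
```

Putting the cheap, deterministic checks first means the classifier only sees calls that neither rule could settle.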
Claude Code v2.1.157 introduces an auto-loading plugin system and claude plugin init scaffolding, lowering the barrier to extending the CLI with custom skills. Plugins in .claude/skills directories are now automatically loaded without a marketplace, and the new init command scaffolds a plugin in place. Combined with worktree management improvements and tool_decision telemetry, this release pushes Claude Code further toward a customizable agentic development platform. [3]
Developers refuse to work without AI despite mounting evidence that AI-generated code increases maintenance costs. METR couldn't repeat its productivity study because developers wouldn't participate without AI tools. Meanwhile, Amazon shut down its internal "Kirorank" token-tracking leaderboard after employees gamed it, Uber blew its 2026 AI budget in four months without measurable productivity gains, and Singapore Management University researchers warn AI code introduces long-term maintenance costs. The gap between perceived and actual productivity remains a critical industry challenge. [5]
GitHub achieves up to 62% token cost reduction in agentic CI workflows by pruning unused MCP tools and deploying daily auditor/optimizer agents. The team found that a GitHub MCP server with 40 tools adds 10–15 KB of schema per turn; removing unused entries cut per-call context by 8–12 KB. They also replaced MCP calls with gh CLI commands and introduced an "Effective Tokens" metric that normalizes cost across model tiers. The audit-and-optimize loop ships in the gh-aw CLI. [7]
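The pruning step is easy to picture: tool schemas are serialized into the context on every turn, so dropping tools the workflow never calls shrinks each request. The sketch below measures that effect on made-up data. The tool entries, the used-tool set, and the helper names are hypothetical, and the byte counts it prints will not match GitHub's reported figures.

```python
import json


def schema_bytes(tools):
    """Size of the serialized tool list, as it would be sent in context."""
    return len(json.dumps(tools).encode("utf-8"))


def prune_tools(tools, used_names):
    """Keep only the tools the workflow has actually been observed to call."""
    return [tool for tool in tools if tool["name"] in used_names]


# Hypothetical stand-in for a 40-tool MCP server listing.
tools = [
    {
        "name": f"tool_{i}",
        "description": "placeholder description " * 8,
        "inputSchema": {"type": "object", "properties": {}},
    }
    for i in range(40)
]

# Assumed to come from an audit of past runs.
used = {"tool_0", "tool_3", "tool_7"}

pruned = prune_tools(tools, used)
saved = schema_bytes(tools) - schema_bytes(pruned)
print(f"kept {len(pruned)} of {len(tools)} tools, saved {saved} bytes per turn")
```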
Linus Torvalds argues AI is a productivity tool comparable to compilers — not a replacement for understanding systems — while warning that "drive-by" AI bug reports are burning out open-source maintainers. At Open Source Summit NA, Torvalds noted a 20% increase in Linux kernel submissions due to AI but emphasized that companies are using AI to flag bugs for publicity without providing patches. He calculates AI boosts productivity ~10x, still 100x less than the gains compilers historically delivered. [8]
Cognition raises $1B at $26B valuation for Devin, but CEO Scott Wu insists AI coding agents shouldn't replace human programmers. Wu rates Devin's current ability at "somewhere between a junior and mid-level engineer" and says 89% of Cognition's own code commits come from Devin — primarily on long-tail maintenance tasks like platform migrations. He frames agents as another abstraction layer, freeing developers for creative architecture work rather than eliminating their roles. [6]

AI Coding News

Snyk launches Evo Continuous Offensive Security, an AI-powered penetration testing product addressing the security gap created by AI-generated code shipping faster than traditional testing cycles. The system uses deterministic scanning for known vulnerability classes and reserves LLM reasoning for context-dependent business logic flaws and authorization bypasses. Forrester analyst Janet Worthington notes enterprises are compressing development cycles from weeks to hours with AI coding agents, making continuous AI pentesting "a critical solution." The product includes Agent Red Teaming for LLM-integrated applications and delivers results as exploit chains rather than alert lists. GA is targeted for Black Hat USA in August 2026. [9]
OpenAI publishes a case study on how Braintrust engineers use Codex with GPT-5.5 to turn customer requests into code and run experiments faster. The case study details integration of OpenAI's Codex coding agent into development workflows to accelerate feature delivery and experimentation cycles, continuing OpenAI's push to showcase enterprise Codex adoption alongside recent Endava and Cisco partnerships. [10]

Feature Update

GitHub Copilot CLI v1.0.56 delivers a major update with model picker access for Free/Student tiers, a redesigned diff view with theme-aware colors, and smarter MCP tool handling. Free and Student users can now select models beyond Auto; the diff view gains a continuous scroll layout with sticky file and hunk headers; web_fetch prefers markdown via content negotiation; the code review agent inherits the session's model; and the GitHub MCP server automatically omits gh-replaceable tools to reduce token usage. Atomic config writes prevent data loss when multiple CLI processes run concurrently. [11]
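For readers unfamiliar with atomic config writes, one common pattern is to write to a temporary file in the same directory and then rename it over the target, so a concurrent reader sees either the old file or the new one and never a partial write. The sketch below shows that general pattern in Python. It is not taken from the Copilot CLI source, and write_config_atomically is a hypothetical helper.

```python
import json
import os
import tempfile


def write_config_atomically(path, config):
    """Write `config` as JSON to `path` via temp file plus atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target
    # for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".config-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(config, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    demo_path = os.path.join(demo_dir, "config.json")
    write_config_atomically(demo_path, {"model": "auto"})
    with open(demo_path, encoding="utf-8") as f:
        print(json.load(f))
```

This protects against torn files, but two processes can still overwrite each other's changes; preventing that would need locking or a read-modify-write check on top.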
GitHub Copilot CLI v1.0.57-0 improves /diff to default to branch diff when no unstaged changes exist, and surfaces the real reason behind auth-token validation failures. Previously, SDK auth failures showed a misleading "Session was not created with authentication info" message; now the underlying cause is displayed. [12]
GitHub Copilot SDK v1.0.0-beta.10 fixes a .NET stderr pump race condition and exposes install_bundled_cli / HAS_BUNDLED_CLI in the Rust SDK. The .NET fix prevents TaskScheduler.UnobservedTaskException during shutdown by coordinating stderr pump cleanup with process disposal. The Rust APIs let consumers access the bundled CLI path before a Client exists, eliminating duplicated cache-path resolution logic. The release also adds displayPrompt support to session.send across all SDKs, MCP Apps (SEP-1865) support, mcpOAuthTokenStorage, and granular per-session flags for multitenancy hardening. [13][14]
Cursor introduces the "Auto-review" run mode for longer autonomous agent sessions with fewer interruptions. Shell, MCP, and Fetch tool calls are routed through allowlists, sandboxing, and a classifier subagent that decides whether to approve, reroute, or escalate each action. Users can configure the run mode in Settings and provide custom instructions to steer the classifier's behavior. [2]
Claude Code v2.1.157 adds plugin auto-loading from .claude/skills, claude plugin init scaffolding, and autocomplete for /plugin arguments. The agent field in settings.json is now honored for dispatched sessions, EnterWorktree can switch between Claude-managed worktrees mid-session, and tool_decision telemetry events include tool parameters when OTEL_LOG_TOOL_DETAILS=1. Bug fixes address sandbox permission prompts in auto mode, background agent worktree orphaning, and clipboard issues in VS Code/Cursor/Windsurf integrated terminals. [3]
Claude Code v2.1.156 is a hotfix for API errors with Opus 4.8 that were caused by thinking blocks being incorrectly modified. This one-line fix ensures compatibility with Anthropic's latest Opus 4.8 model, which became the default for Claude Code in the previous day's v2.1.154 release. [4]
OpenAI Codex v0.135.0 adds codex doctor diagnostics, Vim text-object editing, named permission profiles, Python SDK sandbox presets, and non-interactive installation. codex doctor now reports environment, Git, terminal, app-server, and thread inventory diagnostics for support cases; /status shows remote connection details; /permissions understands named profiles and custom configurations; and install.sh/install.ps1 support CODEX_NON_INTERACTIVE=1. TUI improvements include app-style markdown table rendering and stability fixes for macOS and Zellij. [15]
Gemini CLI ships a nightly build (v0.45.0-nightly.20260529) hardening PTY resize against native crashes. The fix in #27496 prevents crashes when the terminal is resized during active sessions. A separate fix prevents spam loops when preferredEditor is configured to an invalid value. [16]