AI Coding News

May 24, 2026

Key Signals

Copilot SDK v1.0.0-beta.7 delivers canvas runtime, Java SDK, and cloud-backed remote sessions, marking the SDK's most feature-dense release to date. Applications can now declare interactive UI surfaces hosted by the Copilot runtime, enable remote sessions at the client or session level without a local CLI process, and intercept MCP tool invocations via a new preMcpToolCall hook. The addition of a full Java SDK brings Copilot's programmatic agent story to five languages, significantly broadening enterprise reach. Cloud session config decouples agent-building from local tooling, signaling GitHub's push toward hosted agentic workflows. [1]
AWS MCP Server reaches general availability, giving AI coding agents authenticated, auditable access to every AWS API through a single Model Context Protocol endpoint. Since its re:Invent preview, AWS expanded coverage to all APIs including long-running operations and file uploads, added sandboxed Python execution for multi-step tasks, and wrapped everything in IAM/SigV4 authentication with CloudTrail logging. The server integrates with Claude Code, Kiro, Cursor, and Codex, and is part of the new open-source Agent Toolkit for AWS. This positions AWS as a first-party infrastructure layer for agentic development, though availability is currently limited to two regions. [2]
ClickHouse engineering leadership shares a year of deploying AI coding agents on a large C++ codebase, reporting 700 agent-assisted PRs in two months that reduced flaky test findings from ~200/day to 3–5 per 10M runs. The team found agents became production-viable for C++ after Claude Opus 4.5 (November 2025), built their own code review bot using Copilot CLI, and now deploy autonomous agents for PR creation and edge-case discovery. This is one of the most detailed public accounts of systematic AI agent adoption at scale in a non-trivial codebase. [3]
Google introduces a middleware architecture for Genkit that adds programmable interception around model calls, tool execution, and generation loops — reflecting an industry-wide move toward runtime controls for autonomous systems. Prebuilt components include retry with exponential backoff, model fallbacks, approval gates for sensitive tool calls, and filesystem access controls. Google positions Genkit for adding agentic features to existing apps, while ADK targets standalone multi-agent orchestration. [4]
Security researchers demonstrate that AI chatbot vulnerabilities are shifting from technical exploits to psychological manipulation, with implications for the AI coding agents now handling real-world tasks. Mindgard "gaslit" Claude into producing prohibited material using conversational pressure rather than code injection. As agents increasingly book meetings, manage infrastructure, and handle customer service, the same social engineering techniques could compromise agentic systems that respond to natural language instructions. [5]
George Hotz publishes a contrarian essay arguing AI coding agents produce "slop" — code that is statistically similar to good code but broken in undetectable ways — and predicts large organizations will be most damaged. After six months of serious agent usage including writing parts of tinygrad, he concludes agents "frontload progress then give you a slot machine lever," and that weaker engineers at large orgs lack the error-correction instinct to distinguish plausible output from correct output. The piece crystallizes the growing tension between productivity-gain narratives and quality-skeptic perspectives in the AI coding community. [6]

AI Coding News

The operational observability gap for multi-agent systems in production is widening: teams deploying AI agents have less visibility than they had into microservices a decade ago. Requests that should take one or two steps turn into dozens of model calls as agents bounce off each other, retrying and rephrasing without triggering alerts. Data propagation across agent boundaries (one agent reads sensitive information, another summarizes it, a third sends it to an external model) creates security risks that no single component reveals. The article argues monitoring must shift from static rules to behavioral baselines that detect when an agent's execution graph deviates from its typical patterns. [7]
Google Cloud's own AI security gaps undercut its platform-security messaging, with API keys remaining exploitable for up to 23 minutes after deletion and automated billing tier upgrades exposing developers to five-figure unauthorized charges. Security firm Aikido found that attackers can continue authenticating with deleted Google API keys during a propagation window, with success rates above 90% during some minutes of that window. Multiple developers reported bills exceeding $10,000 within minutes after API keys originally scoped to Google Maps were silently expanded to cover Gemini. Google Cloud COO Francis de Souza separately warned that AI agents roaming enterprise systems will surface forgotten data repositories with outdated access controls. [8]
AI chatbot security is evolving into a "psychocybersecurity" discipline where hackers use conversational psychology — flattery, gaslighting, sustained pressure — rather than code exploits to break AI systems. Mindgard now profiles models "like interrogators profile suspects," noting different susceptibilities across models: some cave to flattery, others to sustained pressure. The attack surface extends beyond chatbots to AI agents handling real-world tasks — booking systems, customer service, internal copilots — where social manipulation could cause agents to cross authorization boundaries. [5]
George Hotz argues the industry-wide adoption of AI coding agents constitutes "one of the most costly mistakes in the field's history," positioning himself in the LeCun/Marcus camp that LLMs fundamentally cannot program. After extensive personal experimentation including reverse-engineering hardware with agents, he concludes that agent output mimics the statistical distribution of programming without reproducing its underlying logic. He predicts that bottom-performing engineers at large organizations — who lack error-correction instincts — will produce the most agent-assisted code, systematically degrading average output quality across the industry. [6]
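The behavioral-baseline idea from the observability item above [7] can be illustrated with a minimal sketch. It is not taken from the article: it assumes the number of model calls per request is a usable proxy for the shape of an agent's execution graph, and the function name, the threshold factor k, and the minimum sample count are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev
from typing import List


def is_anomalous(history: List[int], observed: int,
                 k: float = 3.0, min_samples: int = 5) -> bool:
    """Flag a request whose model-call count deviates from its own baseline.

    history  -- model-call counts from earlier requests of the same kind
    observed -- model-call count for the request being checked
    k        -- hypothetical sensitivity factor (standard deviations allowed)
    """
    if len(history) < min_samples:
        # Too little data to form a baseline, so do not alert yet.
        return False
    baseline = mean(history)
    spread = pstdev(history)
    # Floor the spread at 1.0 so a very steady history still tolerates
    # a small amount of jitter instead of alerting on every extra call.
    threshold = baseline + k * max(spread, 1.0)
    return observed > threshold


# A request type that normally needs one or two model calls.
typical = [1, 2, 2, 1, 2, 2, 1, 2]
print(is_anomalous(typical, 3))   # a small deviation
print(is_anomalous(typical, 24))  # agents bouncing off each other
```

Unlike a static rule such as "alert above 10 calls", the threshold here is derived from each request type's own history, so a workflow that legitimately needs many calls does not trip it, while a normally short workflow that suddenly loops does.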

Feature Update

GitHub Copilot SDK v1.0.0-beta.7 ships canvas runtime support, remote sessions, cloud session config, Java SDK, preMcpToolCall hook, SDK tracing diagnostics, and a C# CopilotTool helper, with updates across all five language SDKs. The canvas runtime lets applications declare interactive UI surfaces with canvas.open, canvas.close, and canvas.action.invoke events. Remote sessions can be enabled always-on at the client level or on-demand mid-session. Cloud session config allows creating cloud-backed sessions with repository metadata without a local CLI process. The Rust SDK now bundles the Copilot CLI binary by default. All hook input types now include sessionId for distinguishing parent sessions from sub-agents. Additional improvements include enableSessionTelemetry, runtime_instructions system message sections, SessionFs SQLite support, and API review fixes across TypeScript, C#, Go, Rust, and Python. [1]
GitHub Copilot CLI v1.0.53 fixes multiline prompt display, /skills preference persistence, and Bash session hangs. Multiline prompts now display fully without content clipping or selection offset. The /skills picker correctly honors --config-dir when saving skill preferences. Bash shell sessions no longer hang when PS0 or PROMPT_COMMAND is set in the user's environment. [9]
GitHub Copilot CLI v1.0.54 ships as a follow-up patch with additional fixes. Released the same day as v1.0.53, this patch release includes unspecified fixes and changes. Three prerelease builds (v1.0.53-0, v1.0.53-1, v1.0.53-2) preceded the stable releases on the same day. [10]
AWS MCP Server reaches general availability with full AWS API coverage, sandboxed Python execution, and IAM-based governance for AI coding agents. The server is part of the new open-source Agent Toolkit for AWS and integrates with Claude Code, Kiro, Cursor, and Codex via standard MCP configuration. Documentation search and skill discovery work without AWS credentials; API access requires IAM/SigV4 authentication through the MCP Proxy for AWS. It is available in the Northern Virginia and Frankfurt regions and is free to use, with standard resource charges. [2]
Google Genkit adds a middleware architecture with programmable interception at three levels: generation, model calls, and tool execution. Prebuilt middleware components include retry handling with exponential backoff, automatic model fallbacks, approval gates for sensitive tool calls, filesystem access controls, and a "skills" system for dynamic instruction injection. Middleware components stack in a defined execution order and integrate with the Genkit Developer UI for tracing and debugging. Genkit supports TypeScript, Go, and Dart, with Python support coming soon. [4]
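To make the middleware pattern concrete, here is a minimal, framework-agnostic sketch of three of the component types named above (retry with exponential backoff, fallback, and an approval gate) stacked in a defined order. It does not use or reproduce the Genkit API: every name in it (with_retry, with_fallback, with_approval, compose, and the dictionary-shaped request) is hypothetical, and Python was chosen only for illustration.

```python
import time
from typing import Any, Callable, Dict

Request = Dict[str, Any]
Handler = Callable[[Request], Any]
Middleware = Callable[[Handler], Handler]


def with_retry(max_attempts: int = 3, base_delay: float = 0.1,
               sleep: Callable[[float], None] = time.sleep) -> Middleware:
    """Retry the wrapped handler, doubling the delay after each failure."""
    def middleware(next_handler: Handler) -> Handler:
        def handler(request: Request) -> Any:
            for attempt in range(max_attempts):
                try:
                    return next_handler(request)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts, surface the error
                    sleep(base_delay * (2 ** attempt))
        return handler
    return middleware


def with_fallback(fallback: Handler) -> Middleware:
    """Route the request to a backup handler if the primary one fails."""
    def middleware(next_handler: Handler) -> Handler:
        def handler(request: Request) -> Any:
            try:
                return next_handler(request)
            except Exception:
                return fallback(request)
        return handler
    return middleware


def with_approval(is_sensitive: Callable[[Request], bool],
                  approve: Callable[[Request], bool]) -> Middleware:
    """Block sensitive calls unless the approve callback returns True."""
    def middleware(next_handler: Handler) -> Handler:
        def handler(request: Request) -> Any:
            if is_sensitive(request) and not approve(request):
                raise PermissionError(
                    f"call to {request.get('tool')!r} was not approved")
            return next_handler(request)
        return handler
    return middleware


def compose(*middlewares: Middleware) -> Middleware:
    """Stack middleware so the first argument is the outermost layer."""
    def apply(handler: Handler) -> Handler:
        for mw in reversed(middlewares):
            handler = mw(handler)
        return handler
    return apply
```

In this sketch, compose(with_fallback(backup), with_retry())(primary) retries the primary handler first and only then falls back, because the first middleware passed to compose is the outermost layer. Ordering matters in the same way for the approval gate: placing it outermost means a rejected call is never retried.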