AI Coding News

February 19, 2026

Key Signals

Google's Gemini 3.1 Pro arrives with record reasoning benchmarks and immediately lands in GitHub Copilot. The model more than doubles the previous Gemini Pro's ARC-AGI-2 score (31.1% → 77.1%), beating Opus 4.6 (68.8%) and GPT-5.2 (52.9%), while scoring a record 44.4% on Humanity's Last Exam. At $2/$12 per million input/output tokens—less than half the price of Anthropic's Opus 4.6—it is now the best-value frontier coding model. GitHub is rolling it out across VS Code, Visual Studio, github.com, and GitHub Mobile for all paid Copilot tiers. [1][2][3]
GitHub Copilot's coding agent model picker expands to Business and Enterprise, giving teams control over which frontier model powers autonomous PR generation. Users can now choose from seven models including Claude Opus 4.5/4.6, Claude Sonnet 4.5/4.6, GPT-5.1-Codex-Max, GPT-5.2-Codex, and GPT-5.3-Codex when delegating tasks to Copilot's background agent. The "Auto" mode optimizes for speed and availability. This is a significant shift toward model transparency in enterprise AI coding workflows. [4]
Copilot CLI v0.0.412 introduces cross-session memory and enhanced fleet orchestration, signaling a move toward persistent, multi-agent terminal workflows. The experimental cross-session memory feature lets developers query past work, files, and PRs across sessions, addressing one of the most requested capabilities for CLI-based AI agents. The /fleet orchestrator now validates subagent work and dispatches more subagents in parallel, while the new exit_plan_mode tool adds a plan approval dialog for reviewing and accepting plans before execution. [7]
Claude Code v2.1.49 completes the Sonnet 4.5 → 4.6 migration on the Max plan and fixes critical memory leaks in long-running sessions. Anthropic is removing Sonnet 4.5 with 1M context in favor of Sonnet 4.6 (which now has 1M context), nudging users toward their latest frontier model. Two separate unbounded memory growth bugs—in the tree-sitter parser and Yoga layout engine—were fixed, directly addressing stability complaints from developers running extended coding sessions. A new ConfigChange hook event enables enterprise security teams to audit and optionally block settings changes mid-session. [8]
OpenClaw security concerns escalate as Meta and other tech firms issue workplace bans, highlighting the trust gap in agentic AI tooling. A Meta executive reportedly threatened termination for employees using OpenClaw on work laptops, while Valere's research team found the tool "pretty good at cleaning up some of its actions, which also scares me." The bans underscore a fundamental tension: OpenClaw is highly capable but wildly unpredictable, and its ability to access cloud services, GitHub codebases, and sensitive data makes it a serious enterprise security risk. Security researchers recommend strict sandboxing and accepting that "the bot can be tricked." [9]
OpenCode ships three releases in one day (v1.2.6–v1.2.8), adding Gemini 3.1 reasoning support and Claude Sonnet 4.6 adaptive thinking. The rapid iteration pace—driven by 25+ community contributors in v1.2.7 alone—reflects the open-source AI coding tool's aggressive posture against proprietary competitors. The v1.2.7 release is particularly notable for its wholesale migration from Bun.file() to a centralized Filesystem module, dramatically improving Node.js compatibility and signaling a shift away from Bun-specific APIs. [10][11]
GitHub deprecates Claude Opus 4.1, GPT-5, and GPT-5-Codex across all Copilot experiences, accelerating the model freshness cycle. The retirement of three models simultaneously—affecting Chat, inline edits, ask/agent modes, and code completions—pushes users toward Claude Opus 4.6, GPT-5.2, and GPT-5.2-Codex respectively. Enterprise administrators must proactively enable replacement models through Copilot settings policies, making this a non-trivial operational change for large organizations. [6]

AI Coding News

Google launches Gemini 3.1 Pro with dramatically improved reasoning, but the Arena leaderboard tells a more nuanced story. While Gemini 3.1 Pro dominates most benchmarks—including ARC-AGI-2 at 77.1% and a record 44.4% on Humanity's Last Exam—Claude Opus 4.6 still leads the Arena leaderboard for both text and code by a notable margin. The model's core intelligence derives directly from Gemini 3 Deep Think, and it leads Terminal-Bench 2.0 for agentic coding, though OpenAI's 5.3-Codex reports a higher score when using its own harness. Google prices it at $2/$12 per million tokens, making it significantly cheaper than Opus 4.6 at $5/$25, while offering a 1M-token context window. [2][3]
OpenClaw's security risks trigger corporate bans at Meta and other firms, spotlighting the governance vacuum around agentic AI tools. Multiple companies have moved to ban or restrict OpenClaw after cybersecurity professionals warned it can access cloud services, GitHub codebases, and sensitive data on developers' machines. Valere's research team tested the tool in an isolated environment and concluded that users must "accept that the bot can be tricked"—for example, a malicious email could instruct the AI to exfiltrate files. The bans represent a growing pattern of companies prioritizing security over experimentation with novel agentic tools. [9]

Feature Update

GitHub Copilot in Zed reaches general availability through a formal partnership, expanding Copilot's editor ecosystem beyond VS Code and Visual Studio. All developers with paid Copilot subscriptions can now authenticate into the high-performance Rust-based editor using their existing Copilot credentials—no additional AI license required. Zed, built by the creators of Atom and Tree-sitter, offers Copilot Chat integration that can be configured directly from Zed's settings. [5]
GitHub Copilot coding agent model picker now available for Business and Enterprise users, offering seven model choices for autonomous task delegation. Previously limited to Pro and Pro+ users, the model picker lets organizations choose from Claude Opus 4.5/4.6, Claude Sonnet 4.5/4.6, GPT-5.1-Codex-Max, GPT-5.2-Codex, and GPT-5.3-Codex when assigning issues to Copilot on github.com, GitHub Mobile, or the Raycast launcher. If administrators haven't enabled any models, Claude Sonnet 4.6 is used automatically as the default. [4]
GitHub deprecates Claude Opus 4.1, GPT-5, and GPT-5-Codex across all Copilot experiences effective February 17. Users are directed to migrate to Claude Opus 4.6, GPT-5.2, and GPT-5.2-Codex respectively. Enterprise administrators may need to enable access to alternative models through their Copilot model policies—no action is needed to remove the deprecated models, but the replacements must be explicitly enabled. [6]
Gemini 3.1 Pro is now in public preview in GitHub Copilot, available across VS Code, Visual Studio, github.com, and GitHub Mobile. Described as Google's latest agentic coding model, it excels at edit-then-test loops with high tool precision and achieves strong resolution success with fewer tool calls per benchmark. Copilot Business and Enterprise administrators must enable the Gemini 3.1 Pro policy in settings; rollout is gradual. [1]
GitHub adds PR throughput and time-to-merge metrics to the Copilot usage metrics API, giving enterprises visibility into Copilot's impact on development velocity. The new enterprise-level API metrics cover pull request review suggestion acceptance, Copilot coding agent-created PRs that got merged, and PR cycle time. This lets engineering leaders quantify Copilot's contribution from code suggestion through merged pull request. [13]
Copilot CLI v0.0.412 ships with 35+ changes including experimental cross-session memory, fleet orchestration improvements, and plan editing. The headline feature is cross-session memory, letting developers ask about past work, files, and PRs across sessions. The /fleet orchestrator now validates subagent work and dispatches more subagents in parallel. Other notable additions include /mcp reload, /update command, exit_plan_mode with a plan approval dialog, user-level instructions via ~/.copilot/instructions/, and terminal editor support on Windows. GPT-5 model is deprecated. [7][14]
Claude Code v2.1.49 migrates Max plan users from Sonnet 4.5 to Sonnet 4.6 (now with 1M context) and fixes critical memory leaks. Two unbounded WASM memory growth bugs were patched—one in the tree-sitter parser and another in the Yoga layout engine—resolving stability issues in long-running sessions. Startup performance improves via cached MCP auth failures and batched token counting. New SDK fields enable consumers to discover model capabilities programmatically. The new ConfigChange hook event supports enterprise security auditing of settings changes. [8]
OpenAI Codex releases three alpha builds (v0.105.0-alpha.4/5/6), continuing rapid iteration on the Rust-based agent. These pre-release builds carry no detailed changelogs, following the v0.104.0 stable release from February 18 that introduced websocket proxy support, thread archive/unarchive notifications for the app-server, and distinct approval IDs for command approvals in multi-step shell flows. [15]
OpenCode v1.2.6–v1.2.8 deliver Gemini 3.1 reasoning support, Sonnet 4.6 adaptive thinking, and a major Bun-to-Node migration. v1.2.7 is the largest release with 100+ changes and 25 community contributors, including a wholesale migration from Bun.file() to a centralized Filesystem module, Kilo as a native provider, and Julia language server support. v1.2.8 adds adaptive thinking for Claude Sonnet 4.6 and collapsible MCP tool responses in the TUI. v1.2.6 adds D and Clojure formatter support, OpenAI-compatible endpoints for Google Vertex, and JSON-to-SQLite storage migration. [10][11][16]
Gemini CLI releases seven versions on February 19 (v0.29.2–v0.29.5 stable patches, v0.30.0-preview.1–3), building on the major v0.30.0 preview series. The v0.30.0 preview line introduces a Gemini CLI SDK with SessionContext and dynamic system instructions, a policy engine replacing --allowed-tools, a formalized 5-phase sequential planning workflow, Ctrl-Z suspension support, custom reasoning models by default, and 30-day session retention. The February 19 releases are cherry-pick patches stabilizing these features. [12][17]