AI Coding News

March 2, 2026

Key Signals

GitHub is deprecating five AI models across all Copilot experiences, including Gemini 3 Pro and the entire GPT-5.1 family. Gemini 3 Pro will be removed on March 26 due to an accelerated provider deprecation from Google, while GPT-5.1, GPT-5.1-Codex, GPT-5.1-Codex-Mini, and GPT-5.1-Codex-Max are all scheduled for retirement on April 1. The recommended migration path points to Gemini 3.1 Pro and GPT-5.3-Codex respectively. Enterprise administrators should enable the alternative models in their Copilot policies before these dates to avoid disruption. [1]
OpenAI Codex v0.107.0 introduces thread forking into sub-agents, marking a meaningful step toward multi-agent workflows inside the terminal. This release also makes memories configurable with a new codex debug clear-memories command, and enables custom tools to return multimodal output including images. The thread/start path is now non-blocking, and sandbox security has been tightened by restricting read access to sensitive directories like ~/.ssh on Windows. [2]
Copilot CLI v0.0.421 ships two pre-releases adding repo-level configuration and ACP reasoning effort control. Teams can now share project settings like marketplaces and launch messages via .github/copilot/config.json, making Copilot CLI behavior consistent across collaborators. The release also adds clickable terminal links, improved Markdown table rendering with Unicode borders, and fixes for streaming truncation and Windows paste issues. [3][4]
Anthropic brings Claude's memory feature to free users and launches a dedicated import tool targeting ChatGPT and Gemini switchers. The Claude iOS app has reached #1 on the U.S. App Store, with free-plan users up 60% since January and paid subscribers doubling. Anthropic provides a specific prompt that users can paste into their current chatbot to export their memory data for import into Claude, lowering the switching barrier at a moment when anti-OpenAI sentiment is driving user migration. [5][6]
GitHub Copilot metrics now includes plan mode telemetry, closing an enterprise observability gap. Organizations can now track plan mode adoption and engagement trends across JetBrains, Eclipse, Xcode, and VS Code Insiders, with general VS Code support expected soon. The API exposes plan mode usage under the chat_panel_plan_mode key, and the dashboard UI shows breakdowns for request and model usage. Previously, plan mode telemetry was mixed into the "Custom" category. [7]

AI Coding News

Anthropic expands Claude's memory to free-tier users and adds a chatbot import tool, capitalizing on a surge of new users. Memory was previously limited to Pro, Max, Team, and Enterprise plans starting at $20/month, but is now available to all Claude users through the settings menu. The new import tool lets users copy a pre-written prompt into ChatGPT or Gemini to extract their stored preferences, personal details, and behavioral instructions, then paste the output into Claude's importer. This move comes as Claude sees unprecedented consumer momentum: its iOS app hit #1 on the U.S. App Store, and the company recently moved connectors, file creation, and skills to the free tier as well. [5][6]
InfoQ analyzes the real-world performance claims of GPT-5.3-Codex-Spark, OpenAI's first model deployed on Cerebras wafer-scale chips. The model runs at roughly 1,000 tokens per second, a claimed 15x speed-up. OpenAI also reduced roundtrip overhead by 80%, per-token processing time by 30%, and time-to-first-token by 50% through persistent WebSocket connections and inference stack rewrites. However, independent benchmarking by Nicholas Van Landschoot measured actual improvements closer to 1.37x in practical use, with the 15x figure derived from a comparison against Codex's high-reasoning x-high configuration. On SWE-Bench Pro, Codex-Spark scores between GPT-5.1-Codex-Mini and GPT-5.3-Codex, trading depth for speed. [8]
Inception Labs ships Mercury 2, the only production diffusion-based LLM, claiming 5–10x faster inference than autoregressive models from OpenAI, Anthropic, and Google. Unlike traditional LLMs that generate tokens sequentially, Mercury 2 starts with a rough answer and refines it in parallel, leveraging GPU parallel computation. CEO Stefano Ermon acknowledges that the model currently matches Claude Haiku and Google Flash-class quality rather than Opus-tier models, but argues the economics will improve with scale. Mercury 2 is available via an OpenAI-compatible API with AWS Bedrock integration coming soon, and Nvidia is an investor helping optimize the serving engine. [9]

Feature Update

OpenAI Codex v0.107.0 adds multi-agent thread forking, configurable memories, and multimodal custom tool outputs. Users can now fork a conversation thread into sub-agents, enabling branching work without leaving the current context. Custom tools can return structured content including images instead of just plain text. The TUI now shows richer model availability metadata with tooltips explaining plan-gated models. Bug fixes address thread/resume sync failures, duplicate assistant output in interactive sessions, and diff rendering issues in Windows Terminal. Sandbox hardening restricts read access to sensitive directories and ensures escalated shell commands retain their sandbox configuration on rerun. [2]
GitHub Copilot CLI v0.0.421-0 and v0.0.421-1 add repo-level config, ACP reasoning effort, and terminal UX improvements. The v0.0.421-0 pre-release introduces .github/copilot/config.json for shared project settings, ACP reasoning effort configuration via session options, and clickable links in the terminal. Markdown tables now render with proper column widths, word wrap, and Unicode borders that adapt to terminal width. The v0.0.421-1 follow-up improves the AUTO theme to read the terminal's ANSI color palette directly, so colors match any terminal theme automatically. Multiple Windows-specific fixes address garbled right-click paste, incorrect shell command output, and streaming truncation in alt-screen mode. [3][4]
GitHub announces upcoming deprecation of Gemini 3 Pro (March 26) and all GPT-5.1 models (April 1) across Copilot Chat, inline edits, ask and agent modes, and code completions. The Gemini 3 Pro timeline is accelerated due to Google's own provider deprecation schedule, and Gemini 3.1 Pro is the recommended replacement. GPT-5.1, GPT-5.1-Codex, GPT-5.1-Codex-Mini, and GPT-5.1-Codex-Max should all be replaced with GPT-5.3-Codex. Enterprise administrators need to enable the alternative models through Copilot model policies before the deprecation dates. No action is needed to remove the deprecated models, which will be removed automatically. [1]
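For teams that want to check their own setup against this schedule, a small lookup table is one possible approach. The sketch below encodes only the dates and replacements reported in this issue, with the year taken from the newsletter date. The model strings are display names rather than confirmed API identifiers, and the `configured` list and `migration_plan` helper are hypothetical.

```python
from datetime import date

# Retirement dates and replacements as reported in this issue (year assumed
# from the newsletter date). Keys are display names, not confirmed API IDs.
DEPRECATIONS = {
    "Gemini 3 Pro": (date(2026, 3, 26), "Gemini 3.1 Pro"),
    "GPT-5.1": (date(2026, 4, 1), "GPT-5.3-Codex"),
    "GPT-5.1-Codex": (date(2026, 4, 1), "GPT-5.3-Codex"),
    "GPT-5.1-Codex-Mini": (date(2026, 4, 1), "GPT-5.3-Codex"),
    "GPT-5.1-Codex-Max": (date(2026, 4, 1), "GPT-5.3-Codex"),
}


def migration_plan(configured, today):
    """Return (model, replacement, days_left) for each deprecated model in use."""
    plan = []
    for model in configured:
        if model in DEPRECATIONS:
            retire_on, replacement = DEPRECATIONS[model]
            plan.append((model, replacement, (retire_on - today).days))
    return plan


# Hypothetical list of models a team currently has enabled.
configured = ["GPT-5.1-Codex", "Gemini 3 Pro", "GPT-5.3-Codex"]

for model, replacement, days_left in migration_plan(configured, date(2026, 3, 2)):
    print(f"{model}: switch to {replacement} ({days_left} days left)")
```

Models that are not on the deprecation list, such as GPT-5.3-Codex in the sample, are simply skipped.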
Network configuration changes for GitHub Copilot coding agent are now in effect. Originally announced on February 13 with a February 27 effective date, these changes affect teams running Copilot coding agent in environments with firewalls or proxy restrictions. Organizations that have not updated their network configuration may experience connectivity issues with the coding agent. [10]
GitHub Copilot metrics now includes plan mode telemetry for enterprise, organization, and user-level tracking. The usage data is exposed via the API under the chat_panel_plan_mode key in totals_by_feature, totals_by_language_feature, and totals_by_model_feature fields. Supported IDEs include JetBrains, Eclipse, Xcode, and VS Code Insiders, with general VS Code support expected soon. Enterprises may notice a dip in "Custom" usage in dashboard charts since plan mode telemetry was previously mixed into that category. [7]
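As a rough illustration of how a team might pull plan mode numbers out of a downloaded metrics report, the sketch below sums one counter for the chat_panel_plan_mode key. Only that key and the totals_by_feature field name come from the announcement. The entry layout (a "feature" field plus a numeric counter named user_initiated_interaction_count), the plan_mode_total helper, and the sample data are assumptions for illustration and should be checked against the actual report schema.

```python
import json

PLAN_MODE_KEY = "chat_panel_plan_mode"  # key name from the announcement


def plan_mode_total(report, counter="user_initiated_interaction_count"):
    """Sum `counter` over the totals_by_feature entries for plan mode.

    Assumes each entry is a dict with a "feature" field and numeric
    counters. That layout is an assumption, not confirmed by the source.
    """
    total = 0
    for entry in report.get("totals_by_feature", []):
        if entry.get("feature") == PLAN_MODE_KEY:
            total += int(entry.get(counter, 0))
    return total


# Made-up sample data shaped according to the assumption above.
sample_report = json.loads("""
{
  "totals_by_feature": [
    {"feature": "chat_panel_plan_mode", "user_initiated_interaction_count": 12},
    {"feature": "some_other_feature", "user_initiated_interaction_count": 40},
    {"feature": "chat_panel_plan_mode", "user_initiated_interaction_count": 3}
  ]
}
""")

print(plan_mode_total(sample_report))
```

The same filter could be applied to totals_by_language_feature and totals_by_model_feature if their entries carry a comparable feature field, which is also an assumption here.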
GitHub Copilot metrics reports now return consistent user_login values for Enterprise Managed Users. Previously, some reports included a suffix in the user_login field, making it harder to correlate data across different reports and analytics systems. This fix ensures consistent usernames across all Copilot metrics reports for EMU accounts. [11]