AI Summary
Anthropic has launched Claude Cowork, a direct answer to OpenClaw that figures such as Simon Willison and Ethan Mollick are reviewing favorably. OpenAI simultaneously shipped GPT-5.4 mini and nano, compact models 2x faster than GPT-5 mini, with a 400k-token context window, aimed at coding, agents, and multimodal use — but at higher prices ($0.75/M input tokens for mini). Agentic infrastructure is emerging as the industry's new center of gravity, with a race around secure sandboxes, orchestration, and deployment tooling beyond base models alone.
Note: AIE Europe is ~sold out! Tickets and limited sponsorships for AIE Miami are next — as you can see from online buzz, speakers are excited and prepping. We'll be there!

By total coincidence, today's main pod guest also released today's title story:

swyx: Does remote control work for Claude Cowork yet? No. Right.
Felix: Excellent question.
swyx: Coming soon.

And today, here it is: Multiple people, from SimonW to Ethan Mollick, are comparing it (favorably) to OpenClaw. As Jensen said yesterday, every company needs an OpenClaw strategy. Now Anthropic, which famously "fumbled" the Clawdbot relationship, has an answer, and it's a pretty, pretty good one. Tune in to today's pod for the full origin story, usecases, and design thinking (particularly around the technical choices of sandboxing and Electron).

AI News for 3/14/2026-3/16/2026. We checked 12 subreddits, 544 Twitters and no further Discords. AINews' website lets you search all past issues. As a reminder, AINews is now a section of Latent Space. You can opt in/out of email frequencies!

AI Twitter Recap

OpenAI's GPT-5.4 Mini/Nano Release and the Shift to Small, Coding-Optimized Models

GPT-5.4 mini and nano shipped across API, ChatGPT, and Codex: OpenAI launched GPT-5.4 mini and GPT-5.4 nano, positioning them as its most capable small models yet. Per @OpenAIDevs, GPT-5.4 mini is more than 2x faster than GPT-5 mini, targets coding, computer use, multimodal understanding, and subagents, and offers a 400k context window in the API. OpenAI also claims mini approaches larger GPT-5.4 performance on evaluations including SWE-Bench Pro and OSWorld-Verified, while using only 30% of GPT-5.4 Codex quota, making it the new default for many background coding workflows and subagent fan-out.
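Subagent fan-out, as used above, just means dispatching many small-model calls in parallel and collecting the results. A minimal sketch under stated assumptions: `call_subagent` is a hypothetical stub standing in for a real small-model API call (e.g. to GPT-5.4 mini), and the task strings are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def call_subagent(task: str) -> str:
    # Stub: a real implementation would send `task` as the prompt
    # of a small, fast model and return its completion.
    return f"result for {task!r}"

def fan_out(tasks: list[str], max_workers: int = 8) -> list[str]:
    # Dispatch each task to its own subagent call in parallel;
    # pool.map preserves the input order in the returned results.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call_subagent, tasks))

results = fan_out(["lint src/", "run tests", "summarize diff"])
```

The appeal of a cheap, fast mini tier for this pattern is that cost and latency scale with the width of the fan-out, so the per-call price dominates.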
Early reception focused on coding value, but also on pricing and truthfulness tradeoffs: Developers immediately highlighted mini's utility for subagents in Codex, computer-use workloads, and external products such as Windsurf. However, commentary also converged on a familiar OpenAI pattern: better performance but higher price. Posts from @scaling01 note $0.75/M input and $4.50/M output for mini, with nano likewise priced above prior nano tiers. Third-party evals were mixed: Mercor's APEX-Agents result reported 24.5% Pass@1 for mini with xhigh reasoning, ahead of some lightweight and midweight competitors on that benchmark, while BullshitBench placed the new small models relatively low on resistance to false-premise/jargon traps. OpenAI also quietly acknowledged behavior-tuning issues, with @michpokrass saying a recent 5.3 instant update reduced "annoyingly clickbait-y" behavior.

Agent Infrastructure: Sandboxes, Subagents, Open SWE, and the Harness Wars

Code-executing agents are becoming the center of product architecture: Several launches point to a stack maturing around secure execution, orchestration, and deployment ergonomics rather than just better base models. LangChain introduced LangSmith Sandboxes for secure ephemeral code execution, with @hwchase17 explicitly arguing that "more and more agents will write and execute code." In parallel, LangChain open-sourced Open SWE, a background coding agent patterned after internal systems reportedly used at Stripe, Ramp, and Coinbase. The system integrates with Slack, Linear, and GitHub, uses subagents plus middleware, and separates harness, sandbox, invocation layer, and validation. This is a notable step from "chat copilots" toward deployable internal engineering agents.

Subagents and secure execution are now first-class product features across the ecosystem: OpenAI's Codex now supports subagents, and GPT-5.4 mini was framed by OpenAI as especially good for that use case.
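For readers weighing the mini rates @scaling01 quotes above ($0.75/M input, $4.5/M output), per-request cost is simple arithmetic. A quick sketch, with illustrative token counts:

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     in_per_m: float = 0.75, out_per_m: float = 4.5) -> float:
    # Rates are charged per million tokens in each direction.
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# e.g. one coding-subagent turn: 10k tokens in, 2k tokens out
cost = request_cost_usd(10_000, 2_000)  # 0.0075 + 0.009 = $0.0165
```

At those rates, output tokens dominate cost for verbose completions, which is why pricing complaints center on the $4.5/M output figure.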
Hermes Agent's v0.3.0 release is another strong signal: 248 PRs in 5 days, first-class plugin architecture, live Chrome control via CDP, IDE integrations, local Whisper-based voice mode, PII redaction, and provider integrations like Browser Use. The resulting direction is consistent across vendors: agent value increasingly depends on safe execution environments, composable skills/plugins, and workflow-native surfaces rather than raw benchmark gains alone.

Architecture Research: Attention Residuals, Vertical Attention, and Mamba-3

Attention over depth is having a moment: Moonshot's Attention Residuals paper on arXiv triggered substantial technical discussion around "vertical attention," or attention across layers. A detailed explainer from @ZhihuFrontier frames the idea as each layer querying prior-layer states, effectively extending attention from horizontal sequence interactions to inter-layer memory. Community reactions emphasized that this is not entirely isolated: @rosinality noted ByteDance also implemented attention over depth, and @arjunkocher published an implementation walkthrough. The interesting systems claim here is that because number of layers << sequence length, some forms of vertical attention may be