AI Summary
At NVIDIA's GTC, Jensen Huang presented the fast-growing Blackwell and Rubin architectures, unveiled the Vera CPU, and announced an order book estimated at $1 trillion for 2027, while launching NemoClaw as an answer to OpenClaw's security flaws. Meanwhile, Moonshot (Kimi) published a paper on "Attention Residuals" promising a 1.25x compute advantage with under 2% inference overhead, validated on the Kimi Linear 48B model, though the novelty of the approach is debated. On OpenAI's side, Codex has passed 2 million weekly active users (up 4x since January), while GPT-5.4 reached 5 trillion tokens per day and $1 billion in annualized revenue within a week.
France/EU Impact
European labs and companies deploying AI infrastructure will need to fold NVIDIA's new architectures (Vera CPU, Rubin) into their hardware roadmaps, with major budget implications for their next investment cycles.
It is NVIDIA GTC day again, and over his signature 2-hour unrehearsed keynote, Jensen gave updates on the entire NVIDIA universe and ecosystem and celebrated his InferenceMAX champions belt. As one might expect, Blackwell and Rubin are selling very, very well (some accounting is necessary), and now Vera. The final section of the keynote focused on OpenClaw, where Jensen went extremely hard in complimenting it, then pointed out its security issues, then pitched his solution, NemoClaw. NVIDIA moves at impressive speed for a $4T company, and we had some of their next-generation leaders on the pod to give more insight on how NVIDIA works this fast.

AI News for 3/14/2026-3/16/2026. We checked 12 subreddits, 544 Twitters and no further Discords. AINews' website lets you search all past issues. As a reminder, AINews is now a section of Latent Space. You can opt in/out of email frequencies!

AI Twitter Recap

Architecture Research: Moonshot's Attention Residuals and the Debate Around Prior Art

Moonshot's Attention Residuals paper was the clearest technical story in the feed: @Kimi_Moonshot introduced a replacement for fixed residual accumulation with input-dependent attention over prior layers, plus Block AttnRes to keep cross-layer attention practical. Claimed results: a 1.25x compute advantage and <2% inference latency overhead, validated on Kimi Linear (48B total / 3B active); follow-up posts highlighted improved hidden-state magnitude control and more uniform gradients across depth (paper thread, paper link). The release triggered strong positive reactions from practitioners and researchers including @Yuchenj_UW, @elonmusk, @nathancgy4, and multiple visual explainers such as @eliebakouch and @tokenbender.
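To make the core idea concrete, here is a minimal sketch of "input-dependent attention over prior layers" as described above, replacing the standard fixed residual `h_l = h_{l-1} + f(h_{l-1})` with a learned, token-dependent mix over all earlier layer outputs. This is my reading of the claim, not Moonshot's actual implementation; the class name, scoring projection, and shapes are all assumptions.

```python
import torch
import torch.nn as nn


class AttentionResidual(nn.Module):
    """Hypothetical sketch: instead of adding a fixed residual stream,
    attend over the outputs of all prior layers with input-dependent
    weights, then add the current layer's update."""

    def __init__(self, d_model: int):
        super().__init__()
        # Projection that turns the current update into a query over history.
        self.score = nn.Linear(d_model, d_model, bias=False)

    def forward(self, history: list[torch.Tensor], update: torch.Tensor) -> torch.Tensor:
        # history: prior layer outputs, each of shape (batch, seq, d_model)
        stacked = torch.stack(history, dim=0)                # (L, B, S, D)
        q = self.score(update)                               # (B, S, D)
        # One scalar weight per prior layer, per token (scaled dot product).
        logits = torch.einsum("bsd,lbsd->lbs", q, stacked) / stacked.shape[-1] ** 0.5
        weights = torch.softmax(logits, dim=0)               # (L, B, S)
        # Input-dependent mix of prior layers replaces the fixed residual.
        mixed = torch.einsum("lbs,lbsd->bsd", weights, stacked)
        return mixed + update
```

The paper's Block AttnRes variant presumably restricts this attention to blocks of layers to keep the cost linear in depth; the sketch above attends over the full history for clarity.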
The interesting second-order discussion was whether this is new, or "new at scale": @behrouz_ali argued the idea substantially overlaps with prior work like DeepCrossAttention, criticizing missing citations and broader ML novelty inflation; @cloneofsimo made a similar point that Google had explored related ideas earlier, while others countered that the systems work and scaling evidence matter as much as the core intuition (context, more context). Net: the paper mattered both as an architectural proposal and as a live example of the field's ongoing tension between idea novelty, citation quality, and frontier-scale validation.

Coding Agents, Harnesses, and Skills Infrastructure

OpenAI's Codex momentum showed up repeatedly: OpenAI Devs promoted a Codex x Notion event, while company posts and leadership commentary emphasized fast adoption. @fidjissimo said Codex is at 2M+ weekly active users, up nearly 4x YTD, with OpenAI also building a deployment arm for enterprise rollout. @sama added that "hardcore builders" are switching to Codex, and @gdb said GPT-5.4 reached 5T tokens/day within a week and a $1B annualized run-rate in net-new revenue. Product-wise, Codex also added subagents, reinforcing the shift toward multi-agent coding workflows.

The infrastructure layer around coding agents is maturing fast: @AndrewYNg expanded Context Hub / chub, an open CLI for current API docs that now supports agent feedback loops on documentation. @AssemblyAI shipped a maintained skill for Claude Code, Codex, Cursor, and compatible agents so they can use current API patterns rather than stale training priors. @dair_ai highlighted a paper on automated extraction of agent skills from GitHub repos into standardized SKILL.md files, with claimed 40% knowledge-transfer gains. Together these point toward a new agent tooling stack: skills files, up-to-date docs, feedback channels, and repo-mined procedural knowledge.
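For readers unfamiliar with the skills-file pattern mentioned above, a SKILL.md is typically a short markdown document with frontmatter telling an agent when to load it and a body describing current usage patterns. The example below is purely illustrative; the field names and contents are assumptions, not the schema from the cited paper or any specific vendor.

```markdown
---
name: example-api-skill          # hypothetical skill name
description: Load when generating calls against the Example API.
---

# Example API skill

1. Read the current endpoint docs before generating any request code.
2. Prefer the patterns in this file over patterns from training data,
   which may be stale.
3. After a failed call, record the error here so future runs avoid it.
```

The point of the format is that procedural knowledge lives in a versioned file the agent reads at run time, rather than in frozen model weights.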
LangChain pushed further into "agent harness engineering": @LangChain launched LangGraph CLI for terminal-based deploy/dev flows, and the ecosystem open-sourced Deep Agents, framed by @itsafiz and @simplifyinAI as an MIT-licensed recreation of the workflow behind top coding agents: planning/todos, filesystem ops, shell access, sub-agents, and context management. Internally, @Vtrivedy10 said this is also the base for production agent work and evals. The notable pattern is that teams are no longer just shipping models; they're shipping reference harnesses.

Open-Source Agents: Hermes' Breakout, OpenClaw Integrations, and Agent UX

Hermes Agent had a strong community cycle: hackathon projects spanned home media automation (@rodmarkun's anime server tool), cyber tooling (@aylacroft), geopolitics/OSINT forecasting (@WeXBT), and research visualization (@t105add4_13). User sentiment was consistently that Hermes is easier to set up and more robust than OpenClaw: see @Zeneca, @fuckyourputs, @austin_hurwitz, and @0xMasonH. @Teknium also posted setup guides like enabling Honcho memory.

OpenClaw still expanded its ecosystem despite the Hermes comparisons: @ollama announced Ollama as an official provider for OpenClaw; Comet launched an observability plugin for tracing calls/tools/costs; and there