What's New in AI | SimpleAI

July 2026

Google DeepMind

Jul 2026

New Model

Please don't discontinue Gemini 2.5 Flash

A

AI

Jul 2026

Open Source

China may restrict foreign access to Chinese open-source AI models

Anthropic

Jul 2026

New Model

New serious vulnerabilities spiked around release of Claude Mythos Preview

A

AI

Jul 2026

New Feature

Launch HN: Context.dev (YC S26) – API to get structured data from any website

June 2026

A

AI

Jun 2026

New Model

General-purpose large language models outperform specialized clinical AI

Anthropic

Jun 2026

New Feature

Coding is solved: one guy reverse-engineered Claude Desktop for Linux via Claude

Anthropic

Jun 2026

New Model

If Claude Fable stops helping you, you'll never know

Anthropic

Jun 2026

New Model

Anthropic releases Claude Fable 5

OpenAI

Jun 2026

New Model

GPT-2: Too Dangerous To Release (2019)

A

AI

Jun 2026

New Feature

Launch HN: Intuned (YC S22) – Build and run reliable browser automations as code

Anthropic

Jun 2026

New Model

Anthropic, please ship an official Claude Desktop for Linux

Meta

Jun 2026

New Feature

Meta Keeps Delaying the Release of Its New AI Model to Developers

Google DeepMind

Jun 2026

New Feature

Microsoft releases search engine for use by ML agents

A

AI

Jun 2026

New Model

President signs executive order to give Gov. Early Access to AI Models

A

AI

Jun 2026

New Model

Nvidia Announces RTX Spark

A

Alibaba

Jun 2026

New Model

Alibaba DAMO Academy Releases GPU Version of Solver

May 2026

Anthropic

May 2026

New Feature

Claude Code Degraded Before Opus 4.8 Release

Google DeepMind

May 2026

New Feature

Claude, GPT, Gemini Agents Fail 72% of U.S. Healthcare Workflows

DeepSeek

May 2026

New Model

DeepSeek makes the V4 Pro price discount permanent

A

AI

May 2026

Open Source

Airbnb's Chesky Says US 'Misunderstanding' Use of Chinese Open-Source AI Models

Anthropic

May 2026

New Model

Claude doesn't know what time it is

A

AI

May 2026

Open Source

Models.dev: open-source database of AI model specs, pricing, and capabilities

A

AI

May 2026

New Feature

Launch HN: Runtime (YC P26) – Sandboxed coding agents for everyone on a team

Meta

May 2026

New Model

WebGPU support in llama.cpp

Google DeepMind

May 2026

New Model

Gemini 3.5 Flash Released

A

AI

May 2026

New Feature

Liquid AI releases fine-tuning harness for AI agents

A

AI

May 2026

New Feature

AWS releases Semantic Entropy for AI and parallel agents

A

AI

May 2026

New Model

Arena AI Model ELO History

A

AI

May 2026

Open Source

Company behind GLiNER model released open source model for running LLM guardrail

Anthropic

May 2026

New Feature

Using Claude Code: The unreasonable effectiveness of HTML

A

AI

May 2026

New Model

After banning foreign routers, FCC says existing ones can get updates until 2029

A

AI

May 2026

New Model

US Government releases first batch of UAP documents and videos

Anthropic

May 2026

New Feature

Kimi K2.6 just beat Claude, GPT-5.5, and Gemini in a coding challenge

April 2026

Meta

Apr 2026

Open Source

Llama 4 Scout & Maverick. Meta's multimodal MoE models

Meta released two new open-weight models built on a Mixture-of-Experts architecture with native multimodal support — text and images in a single model. Scout has a 10 million token context window (the largest of any open model) and runs on a single server GPU. Maverick handles 1 million tokens and benchmarks alongside GPT-4o and Gemini Flash. Both are free to download.

Why it matters

The 10M context window in Scout is a genuine breakthrough for processing long documents, entire codebases, or extended conversations. Free and open-weight makes it accessible to anyone.

Read guide

Google DeepMind

Apr 2026

Open Source

Gemma 4. Google's most capable open model yet

Google released Gemma 4 as a family of four open-weight models (2B, 4B, 26B MoE, 31B Dense) under an Apache 2.0 licence — free for personal and commercial use. The 31B model ranks near the top of the open-model leaderboard, outperforming models many times its size on maths, science, and coding benchmarks. The smaller models are designed to run on phones and low-cost hardware.

Why it matters

Gemma 4 is the most powerful free, runnable model at its size. The 31B punches well above its weight, and the tiny models bring capable AI to devices without internet access.

Anthropic

Apr 2026

New Model

Claude Opus 4.6 & Sonnet 4.6

Anthropic's latest generation raises the bar on reasoning, coding, and long-document work. Opus 4.6 sits at the top of most benchmarks for complex tasks. Sonnet 4.6 is the everyday workhorse. Faster and more affordable while remaining highly capable.

Why it matters

The best all-round models for most professional workflows right now.

Read guide

November 2025

Anthropic

Nov 2025

New Model

Claude Opus 4.5. Built for long autonomous work

Anthropic released Opus 4.5 with a focus on sustained, autonomous task completion — it can work independently for extended sessions across coding, research, and document analysis without losing context. It added native integrations with Chrome and Excel, letting it operate directly inside a browser or spreadsheet. Benchmark scores placed it at the top of the coding leaderboard with 96.1% on SWE-bench.

Why it matters

Knowledge workers could now hand Claude a complex spreadsheet or live web session and have it work through the whole thing — not just answer questions about it.

Read guide

September 2025

Anthropic

Sep 2025

New Model

Claude Sonnet 4.5. Agentic mid-tier model

Sonnet 4.5 was Anthropic's upgrade targeted at long-horizon agentic tasks — coding, workplace automation, and multi-step computer use. It can maintain reliable operation across 30+ step task chains and self-correct far more consistently than previous versions. Anthropic positioned it as the go-to model for developers building autonomous assistants.

Why it matters

Users building AI tools for real business workflows got a model that could finish multi-step jobs without getting stuck and asking for help at every turn.

Read guide

OpenAI

Sep 2025

Platform

Sora 2. AI video with audio and physics

OpenAI released Sora 2 with a standalone iOS app, adding synchronised audio — dialogue, sound effects, and ambient noise — for the first time. Physics consistency and motion quality improved dramatically over the original. A new 'Characters' feature allows a scanned real person to appear in any generated scene with accurate appearance and voice.

Why it matters

Sora 2 made short AI films with sound and realistic motion accessible from a phone, moving AI video from a novelty into a practical creative tool.

August 2025

OpenAI

Aug 2025

New Model

GPT-5. OpenAI's unified reasoning model

GPT-5 was OpenAI's first model trained natively on multiple modalities from the ground up, rather than bolting capabilities together. It integrated deep step-by-step reasoning directly into the main ChatGPT interface, automatically routing easy questions for speed and hard ones for deeper thinking. OpenAI made GPT-5 free for all ChatGPT users, with higher usage on paid plans.

Why it matters

For the first time, free ChatGPT users got genuine multi-step reasoning without having to switch between 'standard' and 'thinking' modes — it just works.

Read guide

May 2025

Anthropic

May 2025

New Model

Claude Opus 4 & Sonnet 4. Top coding benchmark

Anthropic released two new flagship models simultaneously. Claude Opus 4 debuted as the top-ranked coding model on SWE-bench at 72.5%, capable of sustained multi-hour autonomous coding sessions. Both models introduced extended thinking with tool use — Claude can now reason step-by-step while simultaneously browsing the web or running code.

Why it matters

Claude became capable of tackling full software projects with far fewer errors — behaving more like a tireless expert collaborator than a Q&A tool.

Read guide

Google DeepMind

May 2025

Platform

Veo 3. AI video with native synchronised audio

Announced at Google I/O 2025, Veo 3 became the first AI video model to generate synchronised audio alongside video natively — including character dialogue, sound effects, and ambient noise from a single text prompt. No separate audio step required. DeepMind described it as the end of the 'silent film era' for AI-generated video.

Why it matters

For creators, the jump from silent AI video to fully voiced, sound-designed clips from one prompt removes the biggest friction in AI video production.

Read guide

Google DeepMind

May 2025

New Model

Gemini 2.5 Flash. Fast reasoning at low cost

Also unveiled at Google I/O, Gemini 2.5 Flash is a compact, fast 'thinking' model that reasons step-by-step before responding — the same approach as Pro, at a fraction of the cost. It targets applications where speed and affordability matter more than maximum capability: coding assistants, chatbots, and high-volume API use.

Why it matters

Developers building AI-powered products got a capable reasoning model they could afford to run at scale, not just in demos.

Read guide

OpenAI

May 2025

New Feature

OpenAI Codex. Autonomous software engineering agent

OpenAI launched a new version of Codex as a fully autonomous cloud-based coding agent — distinct from the older code-completion product of the same name. Powered by a version of o3 optimised for software tasks, it runs in an isolated sandbox, writes full features, fixes bugs, runs tests, and opens pull requests without a developer watching. Available initially to paid ChatGPT subscribers.

Why it matters

Non-developers could describe a change in plain English and have an AI handle the full technical implementation end-to-end — not just suggest snippets.

Read guide

Moonshot AI

May 2025

Open Source

Kimi K2. Moonshot AI's trillion-parameter open model

Chinese AI lab Moonshot AI released Kimi K2, an open-source Mixture-of-Experts model with approximately 1 trillion total parameters and 32 billion active. It posted top scores on LiveCodeBench and SWE-bench, outperforming GPT-4.1 and matching Claude Sonnet on coding tasks. The weights are freely available to download and run locally.

Why it matters

The first open-source model to seriously challenge frontier coding performance. Free weights mean developers can run a top-tier coding model without API costs.

April 2025

Meta

Apr 2025

Open Source

Llama 4. Meta's multimodal open-source model

Meta released Llama 4 with native multimodal capabilities. It handles text, images, and documents. Competitive with GPT-4o on benchmarks. Free to download and run locally, making frontier-level AI accessible without API costs.

Why it matters

The most capable open-source model to date. Runnable on decent hardware via Ollama.

Read guide

OpenAI

Apr 2025

New Model

GPT-4.1. OpenAI's coding-focused release

OpenAI released GPT-4.1 with a focus on code generation, instruction-following, and long-context tasks. Outperforms GPT-4o on coding benchmarks and handles 1 million tokens of context.

Why it matters

Significant upgrade for developers using ChatGPT or the OpenAI API for code.

Read guide

March 2025

Google DeepMind

Mar 2025

New Model

Gemini 2.5 Pro. Google's reasoning model

Gemini 2.5 Pro topped multiple benchmarks on release with a 1-million-token context window and native reasoning mode. Particularly strong at maths, science, and multi-step analysis.

Why it matters

Competitive with o3 and Claude 3.7 on hard reasoning tasks. Available in Gemini Advanced.

Read guide

February 2025

xAI

Feb 2025

New Model

Grok 3. XAI's most powerful model yet

Elon Musk's xAI released Grok 3, trained on a 200,000 GPU cluster. Strong at reasoning and STEM. Includes a 'Think' mode for step-by-step reasoning. Integrated into X Premium+.

Why it matters

First Grok model to seriously compete with GPT-4o and Claude on general tasks.

Read guide

OpenAI

Feb 2025

New Model

GPT-4.5. OpenAI's emotionally intelligent model

OpenAI described GPT-4.5 as their most 'human' model. Better at nuanced conversation, emotional intelligence, and understanding context beyond the literal words. Less focused on raw reasoning than o1.

Why it matters

Best ChatGPT model for writing, coaching, and conversational tasks.

Read guide

Anthropic

Feb 2025

New Model

Claude 3.7 Sonnet. Extended Thinking

Claude 3.7 introduced 'Extended Thinking'. A reasoning mode where Claude works through problems step by step before responding, similar to OpenAI's o1. Dramatically improves performance on maths, coding, and complex analysis.

Why it matters

First Claude model with visible chain-of-thought reasoning. Major upgrade for hard problems.

Read guide

January 2025

DeepSeek

Jan 2025

Open Source

DeepSeek R1. Open-source reasoning model

Chinese lab DeepSeek released R1, an open-source reasoning model that matched OpenAI's o1 on benchmarks. At a fraction of the training cost. The weights are free to download. Shocked the AI industry and briefly crashed Nvidia's stock price.

Why it matters

Proved frontier reasoning is achievable open-source. Free to run via Ollama.

Read guide

December 2024

DeepSeek

Dec 2024

Open Source

DeepSeek V3. Challenges GPT-4o

DeepSeek released V3, a 671-billion parameter Mixture-of-Experts model trained for roughly $6 million. Compared to hundreds of millions for comparable Western models. Matched GPT-4o on most benchmarks.

Why it matters

The moment the AI world realised frontier models don't require frontier budgets.

Read guide

OpenAI

Dec 2024

Platform

Sora. OpenAI's video model goes public

After a year of limited previews, OpenAI launched Sora publicly for ChatGPT Plus and Pro subscribers. Generates up to 20-second cinematic video clips from text or image prompts. Includes a Storyboard mode for longer narratives.

Why it matters

First time frontier-quality AI video became available to mainstream users.

OpenAI

Dec 2024

New Model

o3 & o3-mini. OpenAI's best reasoning models

OpenAI's o3 set new records on the ARC-AGI benchmark. A test designed to be difficult for AI. o3-mini offers a faster, cheaper version of the same reasoning approach. Both use extended internal reasoning before responding.

Why it matters

The most capable reasoning models available, especially for maths and science.

Read guide

Google DeepMind

Dec 2024

New Model

Gemini 2.0 Flash. Fast and multimodal

Gemini 2.0 Flash launched as Google's fast, affordable frontier model with native multimodal output. It can generate text, images, and audio. Significant performance improvements over 1.5 Flash.

Why it matters

Best option for Gemini-powered apps that need speed and multimodal capability.

Read guide

November 2024

Anthropic

Nov 2024

New Model

Claude 3.5 Haiku. Fast and affordable

Anthropic released Claude 3.5 Haiku. Their fastest, most affordable model. Outperforms the previous Claude 3 Opus on most tasks at a fraction of the cost. Designed for high-volume applications where speed matters.

Why it matters

Best Claude model for tasks that need rapid iteration. Coding autocomplete, chat, search.

Read guide

October 2024

OpenAI

Oct 2024

New Model

o1 Pro. OpenAI's most powerful reasoning model

OpenAI expanded the o1 family with o1 Pro. Using more compute at inference to tackle harder problems. Available to ChatGPT Pro subscribers. Particularly strong at frontier-level maths, science, and complex coding challenges.

Why it matters

The first model to pass professional-level qualifying exams across multiple domains.

Read guide

AI releases, explained.

Please don't discontinue Gemini 2.5 Flash

China may restrict foreign access to Chinese open-source AI models

New serious vulnerabilities spiked around release of Claude Mythos Preview

Launch HN: Context.dev (YC S26) – API to get structured data from any website

General-purpose large language models outperform specialized clinical AI

Coding is solved: one guy reverse-engineered Claude Desktop for Linux via Claude

If Claude Fable stops helping you, you'll never know

Anthropic releases Claude Fable 5

GPT-2: Too Dangerous To Release (2019)

Launch HN: Intuned (YC S22) – Build and run reliable browser automations as code

Anthropic, please ship an official Claude Desktop for Linux

Meta Keeps Delaying the Release of Its New AI Model to Developers

Microsoft releases search engine for use by ML agents

President signs executive order to give Gov. Early Access to AI Models

Nvidia Announces RTX Spark

Alibaba DAMO Academy Releases GPU Version of Solver

Claude Code Degraded Before Opus 4.8 Release

Claude, GPT, Gemini Agents Fail 72% of U.S. Healthcare Workflows

DeepSeek makes the V4 Pro price discount permanent

Airbnb's Chesky Says US 'Misunderstanding' Use of Chinese Open-Source AI Models

Claude doesn't know what time it is

Models.dev: open-source database of AI model specs, pricing, and capabilities

Launch HN: Runtime (YC P26) – Sandboxed coding agents for everyone on a team

WebGPU support in llama.cpp

Gemini 3.5 Flash Released

Liquid AI releases fine-tuning harness for AI agents

AWS releases Semantic Entropy for AI and parallel agents

Arena AI Model ELO History

Company behind GLiNER model released open source model for running LLM guardrail

Using Claude Code: The unreasonable effectiveness of HTML

After banning foreign routers, FCC says existing ones can get updates until 2029

US Government releases first batch of UAP documents and videos

Kimi K2.6 just beat Claude, GPT-5.5, and Gemini in a coding challenge

Llama 4 Scout & Maverick. Meta's multimodal MoE models

Gemma 4. Google's most capable open model yet

Claude Opus 4.6 & Sonnet 4.6

Claude Opus 4.5. Built for long autonomous work

Claude Sonnet 4.5. Agentic mid-tier model

Sora 2. AI video with audio and physics

GPT-5. OpenAI's unified reasoning model

Claude Opus 4 & Sonnet 4. Top coding benchmark

Veo 3. AI video with native synchronised audio

Gemini 2.5 Flash. Fast reasoning at low cost

OpenAI Codex. Autonomous software engineering agent

Kimi K2. Moonshot AI's trillion-parameter open model

Llama 4. Meta's multimodal open-source model

GPT-4.1. OpenAI's coding-focused release

Gemini 2.5 Pro. Google's reasoning model

Grok 3. XAI's most powerful model yet

GPT-4.5. OpenAI's emotionally intelligent model

Claude 3.7 Sonnet. Extended Thinking

DeepSeek R1. Open-source reasoning model

DeepSeek V3. Challenges GPT-4o

Sora. OpenAI's video model goes public

o3 & o3-mini. OpenAI's best reasoning models

Gemini 2.0 Flash. Fast and multimodal

Claude 3.5 Haiku. Fast and affordable

o1 Pro. OpenAI's most powerful reasoning model