Agentic AI

created: Sun, 12 Oct 2025 19:45:23 GMT, modified: Mon, 22 Dec 2025 03:36:13 GMT

State of AI Report 2025

Model Economics and Architecture

Capability per dollar doubles in 3-6 months (5x faster than Moore's law)
Model routing is a competitive advantage
- Smaller/dumber models to solve simple/specific tasks
- Bigger/smarter models for more complex tasks
Model release roadmap is tied to fundraising
Multi-model architecture

Browser as AI Operating System

Browser is an AI operating system by default to which agents are plugged in
Answering engines based on search
- Purchase intent with high conversion rate
- Dependency on search engine(s): e.g., Google

Compute Infrastructure

Datacenters, sovereignty, power consumption/constraint per token
- Scaling demand for tokens, to which models
- Hardware constraints are impacting model availability and evolution
  - Anthropic and OpenAI outages
- Custom chips and hardware makers

Model Evolution and Measurement

How we measure AI success, intelligence and reasoning gains
- e.g., Claude performance degradation in Aug, 2025
Models bigger, smarter, thinker
Closed/open models
- China is the leader in open weight models
  - Moonshot AI built a 1T-param MoE with 32B active trained using MuonClip
- Concentration of talents in ecosystem
  - Qwen models are used as a base for derivative work and fine-tuning
- Hybrid ecosystem (closed frontier models/open models for compliance/volume)

AI Sovereignty

Sovereignty in AI
Model access is evenly distributed, but very few are gaining from it

Emerging Paradigms

World models
Scaling paradigm shift from static pre-training to dynamic, on-the-fly adaptation
- Continuous learning
Superhuman AI systems could become "teachers" rather than just "tools"
- Extracted novel chess techniques and teach human grandmasters
AI is moving from answering questions to generating, testing, and validating new scientific knowledge
- AlphaEvolve: a coding agent for algorithm discovery and engineering impact
Computer Use Agents (CUA) have improved by leaps and bounds, and still fall short
Usage costs: Some users are costing upwards of $50k/month for a single seat of Claude Code

Agile in the Age of Agentic Engineering

Mon, 22 Dec 2025 03:22:16 GMT

The transition from manual coding to autonomous software engineering agentic systems necessitates a return to rigorous foundational practices.

Career

Mon, 22 Dec 2025 03:03:54 GMT

to me seeng job posts looks already like a blast from the past

Declarative agents

Mon, 22 Dec 2025 03:03:54 GMT

AI-assisted software development

Mon, 22 Dec 2025 03:03:54 GMT

Recently we've developed a software project using our agentic problem solving framework, one of a kind.

The Rising Tide

Mon, 22 Dec 2025 03:03:54 GMT

LLMs set a baseline for what's possible. Builders wrap products around them, squeezing out maybe 20% more through clever prompting, RAG pipelines, or fine-tuning. No amount of engineering tricks makes it fundamentally better -- you can't turn a 3.5 into a 4. But you can apply it in new domains, find novel use cases, build workflows that weren't possible before.

Agentic scaling

Mon, 22 Dec 2025 03:03:54 GMT

We need to find, enable and grow people who wants to work differently, both internally and by hiring.

Self-improving agents

Mon, 22 Dec 2025 03:03:54 GMT

Sources

State of AI

Sun, 23 Nov 2025 18:25:10 GMT

One million steps

Sun, 23 Nov 2025 18:25:10 GMT

Getting an AI agent to make a million sequential decisions without errors sounds impossible. The context window fills up, details get lost, and small mistakes compound into catastrophic failures.

Context Engineering

Mon, 27 Oct 2025 07:13:53 GMT

Every LLM interaction builds up the same way: system prompt, user message, assistant response, tool calls, more user input, more responses, and all of this accumulates into context, which grows with every turn.

RAG

Sun, 19 Oct 2025 22:38:43 GMT

RAG papers and tools.

Content

Sun, 19 Oct 2025 21:50:05 GMT

In the beginngin it was text.

Future

Sun, 19 Oct 2025 21:50:05 GMT

How the future might look like.

Problems

Sun, 19 Oct 2025 21:50:05 GMT

Open problems.

AI Math

Sun, 12 Oct 2025 21:38:10 GMT

AI use in Mathematics, using LLM with theorem provers.

Predictions

Sun, 12 Oct 2025 21:38:10 GMT

Predictions for the year 2026.

Coding agents

Sun, 12 Oct 2025 19:45:23 GMT

A list of CLI coding agents.

Papers

Sun, 12 Oct 2025 19:45:23 GMT

Collection of AI papers.