AI Systems, Not Agents

An overview of what AI systems actually are

I define how we understand AI systems (not agents) at Clous.

Álvaro Villalba Pérez, Dec 06, 2024

When we first started researching AI architectures, we called them Multi-LLM Architectures. But as we kept going, it became clear that AI systems are more complex. They aren't just agents. They are something broader and more nuanced. To explain that better, let’s start with the concept of an agent.

An agent is basically a mix of Large Language Models (LLMs), each with a different level of autonomy. Think of it as a collection of AI components, each doing its own job, but with some level of independence. This makes agents powerful, because their components can collaborate and handle tasks that need decision-making without constant human involvement.

But there's a common misconception. Many think an AI system is simply a bunch of agents working together. That’s not quite right. There are many ways to use AI, and an AI system goes beyond just agents.

The Real Scope of AI Systems

An AI system can be broken down into a few major functions:

Autonomy: This is when AI handles complex tasks without needing humans to step in. The system acts independently, especially when the task is complicated or demands deep intelligence.

Instant Processing: Sometimes, we need answers right now. AI can jump in and provide a response instantly, especially in situations where speed is crucial.

Asynchronous Processing: This is where things get interesting. Sometimes, data comes in pieces—a bit today, a bit tomorrow. The AI works on it all at a specific moment, processing and making sense of different types of data that come at different times. Think of it like retrieving insights from daily reports that trickle in from multiple sources.
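To make the asynchronous case concrete, here is a minimal sketch of collecting fragments as they arrive and processing them together at a chosen moment. Everything in it is an assumption for illustration: the ReportBuffer class, the source names, and fake_summarize (a stand-in for a model call) are hypothetical and not part of any system described here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ReportBuffer:
    """Collects report fragments as they arrive, then processes them together."""
    # Hypothetical structure: fragments grouped by the source that sent them.
    fragments: dict = field(default_factory=lambda: defaultdict(list))

    def receive(self, source: str, text: str) -> None:
        # Data arrives in pieces at different times; nothing is processed yet.
        self.fragments[source].append(text)

    def process(self, summarize) -> dict:
        # At a chosen moment, everything collected so far is processed at once.
        results = {
            source: summarize(" ".join(parts))
            for source, parts in self.fragments.items()
        }
        self.fragments.clear()
        return results


def fake_summarize(text: str) -> str:
    # Stand-in for a model call: it only reports the word count.
    return f"{len(text.split())} words"


buffer = ReportBuffer()
buffer.receive("sales", "Monday revenue up")
buffer.receive("support", "Two tickets open")
buffer.receive("sales", "Tuesday revenue flat")
insights = buffer.process(fake_summarize)
print(insights)
```

Instant processing would skip the buffer and call the summarizer as soon as each fragment arrives. The difference between the two modes is when the work happens.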

The level of intelligence each of these needs depends on the complexity of the task and the amount of data transformation required. If you’re condensing a 46-page document into a paragraph, that's a lot of transformation. AI systems are built to handle that complexity.

A key part of defining AI systems is understanding that data flows fluidly across the architecture. Data isn't static. It moves through different components at different times, depending on the needs of the user experience (UX). It's like a river, feeding into multiple branches, each leading to a different outcome depending on the timing and type of input.

The Age of Intelligence

We're in a new age of AI development. Here’s what that means:

Young Model: Imagine a child learning. Early AI models don’t have detailed instructions—they do simple tasks like quick summaries.

Function/Tool Calling: As the AI matures, it starts using tools and making decisions about when to use them. It's optional, but powerful.

Task Outline: This is where AI starts planning. It takes a task, outlines what needs to be done, and executes it step by step.
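The last two stages can be sketched together: a task is outlined into steps, and each step may or may not use a tool. This is a toy illustration under stated assumptions. The TOOLS registry, the outline function (which returns a fixed plan where a real system would ask a model), and the step names are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical tool registry; the names are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "upper": lambda text: text.upper(),
}


def outline(task: str) -> list[dict]:
    # Stand-in for a model producing a plan for the task.
    return [
        {"step": "count words", "tool": "word_count"},
        {"step": "restate task", "tool": None},  # tool use is optional
        {"step": "emphasize", "tool": "upper"},
    ]


def execute(task: str) -> list[str]:
    results = []
    for item in outline(task):
        tool: Optional[Callable[[str], str]] = (
            TOOLS.get(item["tool"]) if item["tool"] else None
        )
        # Without a tool, the step just passes the task text through.
        output = tool(task) if tool else task
        results.append(f"{item['step']}: {output}")
    return results


steps = execute("summarize the report")
print(steps)
```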

Cognitive Workload

Another piece to understand is Cognitive Workload. This is the amount of effort (or intelligence) needed to create the best possible output. A simple task like calculating 2+2 requires very little cognitive workload. But something like predicting the impact of a new company policy requires much more.

NL Gap Closing and the One-Constraint Thesis

We also talk about NL Gap Closing, which is all about making a foundational model more responsive to human needs. It’s part of what we call the One-Constraint Thesis. This idea says that if you focus on just one clear limitation, the AI becomes much better at that specific task. Instead of making a general model, you make something specialized. Google’s chatbot, for example, works extremely well for search because it’s optimized for that exact use case.

AI Architectures: It’s Not Just About Models

Within the broader AI system, we have different types of architectures:

Multi-LLM Architectures: This is what we started with, where multiple language models work together.

Compounded Systems: This is when we combine different models and methods to get a more nuanced result.

AI Systems: A larger ecosystem where everything—the LLMs, the data, the functions—works together as one integrated whole.

Model and Weight Governance

When we talk about Model Governance, we mean managing computing resources effectively. As with data governance, the goal is to make sure that no resources are wasted and everything is used as efficiently as possible.

Weight Governance, on the other hand, is about how we distribute decision-making authority between different models. In complex AI systems, we need balance—each model needs to do its part without overstepping. This helps keep everything in sync, enhancing user experience while minimizing resource wastage.
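One way to picture distributing decision-making authority is a weighted vote, where each model's answer counts in proportion to the authority it has been given. This is only one possible reading of the idea, not a description of how it is implemented at Clous. The model names and weights below are made up for illustration.

```python
from collections import defaultdict


def weighted_decision(votes: dict[str, str], authority: dict[str, float]) -> str:
    """Pick the answer with the highest total authority behind it."""
    totals: dict[str, float] = defaultdict(float)
    for model, answer in votes.items():
        # A model with no assigned authority contributes nothing.
        totals[answer] += authority.get(model, 0.0)
    return max(totals, key=totals.get)


# Hypothetical model names and weights, chosen only for illustration.
authority = {"planner": 0.4, "critic": 0.35, "retriever": 0.25}
votes = {"planner": "approve", "critic": "reject", "retriever": "reject"}

decision = weighted_decision(votes, authority)
print(decision)
```

Here the planner carries the most authority on its own, but the other two models together outweigh it, so no single model oversteps.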

There’s also a practice we call Weight References. When fine-tuning models, we monitor the weight each model carries, ensuring that the decision-making is balanced and efficient.

Agentic Workflows

Now, let’s dive into Agentic Workflows. These are workflows managed by agents, and they can have multiple layers:

Initial Layer: This is where we start, often breaking down tasks or adding necessary knowledge.

Main Layer: The core content creation or main response comes from this layer.

Secondary and Further Layers: These layers assist the main agent, refining or adding context.

There’s a concept we use called Instruction Forwarding. Basically, an instruction is passed along through the agents to make sure each one knows what it’s supposed to do. This keeps everything aligned.

We also have Context Forward-Propagation, which means making sure that context (like reasoning or assumptions) is carried from one model call to the next. It’s like handing over a baton in a relay race.
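The layers, Instruction Forwarding, and Context Forward-Propagation can be sketched as a single object handed from layer to layer. This is a minimal sketch with stand-in functions instead of model calls. The Baton class and the layer functions are hypothetical names, introduced only to show the shape of the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Baton:
    """What gets handed from one layer to the next."""
    instruction: str  # forwarded unchanged to every layer
    context: list[str] = field(default_factory=list)  # grows as layers add notes
    draft: str = ""


def initial_layer(b: Baton) -> Baton:
    # Breaks the task down and records it as context for later layers.
    b.context.append("subtasks: define terms, give example")
    return b


def main_layer(b: Baton) -> Baton:
    # Produces the core response and records the assumption it made.
    b.draft = f"Draft answering '{b.instruction}'"
    b.context.append("assumption: reader is non-technical")
    return b


def secondary_layer(b: Baton) -> Baton:
    # Refines the draft using everything propagated so far.
    b.draft += " (refined using: " + "; ".join(b.context) + ")"
    return b


def run(instruction: str) -> Baton:
    baton = Baton(instruction=instruction)
    for layer in (initial_layer, main_layer, secondary_layer):
        baton = layer(baton)
    return baton


result = run("Explain AI systems")
print(result.draft)
```

The instruction field never changes, which is the forwarding part. The context list only grows, which is the forward-propagation part.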

Agentic Chaining and Context Partitioning

We often chain models together. This is called Agentic Chaining and we have our own protocol—AAP (Agentic Architecture Protocol)—that allows agents to work in parallel or series, depending on what’s needed.
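The difference between series and parallel chaining can be shown with plain functions standing in for agents. This sketch does not implement AAP, whose details are not given here. It only illustrates the two arrangements, and the agent functions shout and tag are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Agent = Callable[[str], str]


def run_in_series(agents: list[Agent], text: str) -> str:
    # Each agent receives the previous agent's output.
    for agent in agents:
        text = agent(text)
    return text


def run_in_parallel(agents: list[Agent], text: str) -> list[str]:
    # Every agent receives the same input; results keep the agents' order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda agent: agent(text), agents))


# Stand-in agents; real ones would call a model.
def shout(text: str) -> str:
    return text.upper()


def tag(text: str) -> str:
    return f"[{text}]"


series_out = run_in_series([shout, tag], "hello")      # shout first, then tag
parallel_out = run_in_parallel([shout, tag], "hello")  # both see "hello"
print(series_out, parallel_out)
```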

Sometimes, we need to split the context into parts to manage it better. This is called Continuous Context Partitioning. It lets us break things down and reassemble them as needed, especially in complex tasks.
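A rough sketch of the split-and-reassemble idea, assuming the simplest possible rule: cut the context into fixed-size groups of words. Real partitioning would likely follow meaning rather than word counts. The function names and the max_words parameter are illustrative assumptions.

```python
def partition(text: str, max_words: int) -> list[str]:
    # Split the context into consecutive parts of at most max_words words.
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def reassemble(parts: list[str]) -> str:
    # Join the parts back together in their original order.
    return " ".join(parts)


context = "one two three four five six seven"
parts = partition(context, max_words=3)
processed = [part.upper() for part in parts]  # stand-in for a per-part model call
rebuilt = reassemble(parts)
print(parts, rebuilt)
```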

Beyond Foundational Models

Finally, sometimes we have to go against the foundational model. This might mean using Literal Reasoning or Subsequent Reasoning to fill in gaps. If the foundational model doesn’t give a perfect answer, subsequent reasoning often doesn’t help much, but literal chains of thought might.

Another technique we use is Weight Cleanse during Continuous Training, using Reinforcement Learning from Human Feedback (RLHF). This helps us adjust model weights in an ongoing way, keeping accuracy high and resource use low.

Wrapping Up

AI systems are much more than just a group of agents. They’re a complex network of neural networks (somewhat ironic): architectures, workflows, governance, and intelligence, all working together. Data flows through it all, shifting and changing to meet user needs in real time. Autonomy, instant processing, and asynchronous processing are all key aspects, each playing a role depending on the situation. By understanding these building blocks, we can build AI systems that are smarter, faster, and more efficient.

Future topics I’ll explore

Data Condensation for Production-Level Production Readiness

Current and future impact of AI in the labor market

Context is The Way to Learn