Towards a Framework for AI Agent Design, Part 3
In Part 1 of this series on a Framework for AI Agent Design, we looked at a life cycle planning model for your agentic projects. Part 2 looked at the different organizational elements that you have to understand and orchestrate in order to create successful AI Agent projects.
"Ok," you might say. "I'm ready to get some work done. How do I design AI agents to do the work?"
This is where the rubber meets the road. Since this is a framework discussion, we won't be going into the individual steps that you design and build as you create your AI agent in your platform of choice. For example, there is no right answer to questions like "How many steps should I have in my email response agent workflow?"
But what is useful is to think about what the goal is in developing your agentic AI solution. There are three sub-frameworks that can help guide you as you build your AI agents.
Outcomes are Great…
But You Have to Break Them Down into Deliverables and Tasks to Get There
Every company has a big collective goal they're trying to achieve on some time frame. There is some outcome they are hoping to bring about. Usually this is done quarterly in order to remain focused and "doable". Many people, especially in technology companies, adopt the popular OKR (Objectives and Key Results) framework to assist in exactly this kind of bigger-picture planning and performance tracking.
Wouldn't it be great if we could spin up an AI Agent that would just do all our OKRs for us?
Not going to happen.
Not for a while, anyway. But we can be sure of one thing: When this Mega AI Agent arrives, it will be composed of dozens, hundreds, maybe thousands of small, discrete AI Agents performing specific workflows. These will be coordinated to achieve the higher order outcomes.
So this gives us our place to start in our AI Agent planning: Not with the largest outcome, but with the smallest.
In other words, AI Agent builders need to carry out research and analysis of the domain, the department, and the workers they seek to help. The analysis consists of identifying their workflows and deliverables, then breaking them down into smaller and smaller component parts.
A good model of this breakdown comes from basic physical science: What is a thing made up of? We decompose it like this:
1. An object
2. Materials and aggregates
3. Compounds
4. Elements
5. Atoms
Fortunately, in the world of work, we have a similar breakdown structure. This is our guide in analyzing the workflows we want to automate. It goes from large to small, from long-term to short-term, from slow to fast.
The Taxonomy of Work
Here is the structure you can follow in your workflow analysis. Start at the top, break it down into its component parts, and keep going down through the stack. Sometimes these components are workflows; often they are some form of documentation. Either way, the goal is to map how smaller steps roll up into the bigger task.
Outcome or Objective from your OKR
The big achievement that requires a department of people and multiple projects to bring about
Projects
The big efforts carried out by a team
Documents / Deliverables
The information products, reports, and status/communication messages that mark progress and guide the team
Pro tip: Think broadly. Everything is a deliverable. It can be a formal report. It can also be a regular status Slack message.
Tasks
The activity an individual carries out, usually contributing to a deliverable, or being tracked by one
What we usually call "the daily work" for a specific job
Triggers / Inputs / Actions
The smallest steps or micro-tasks that make up a task
These are usually not thought about much. But they are the atomic foundation of a workflow. This is what we want to pull out and work on.
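To make the stack concrete, here is a minimal Python sketch of the taxonomy as nested data, with a walk from the objective at the top down to the atomic actions at the bottom. All class and function names here are illustrative, not taken from any agent platform.

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    """Trigger / input / action: the atomic step an agent can automate."""
    name: str

@dataclass
class Task:
    """The daily work of a specific job, built from atomic actions."""
    name: str
    actions: list[Action] = field(default_factory=list)

@dataclass
class Deliverable:
    """An information product: a report, a status Slack message, etc."""
    name: str
    tasks: list[Task] = field(default_factory=list)

@dataclass
class Project:
    """A big effort carried out by a team."""
    name: str
    deliverables: list[Deliverable] = field(default_factory=list)

@dataclass
class Objective:
    """The OKR-level outcome at the top of the stack."""
    name: str
    projects: list[Project] = field(default_factory=list)

def atomic_actions(obj: Objective) -> list[Action]:
    """Walk the stack top to bottom and collect the atomic actions --
    the layer where agent-building starts."""
    return [a
            for p in obj.projects
            for d in p.deliverables
            for t in d.tasks
            for a in t.actions]
```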
When we break work down into tasks and inputs/actions, we find we have defined:
clearly scoped activities
with known inputs and outputs
expected behaviors
definitions of good and bad
a definition of done
These are ideal workflows for AI Agents.
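The checklist above can itself be written down as a task spec. A hypothetical sketch, assuming nothing about any particular platform's schema:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class TaskSpec:
    name: str                          # clearly scoped activity
    inputs: list[str]                  # known inputs
    outputs: list[str]                 # known outputs
    expected_behavior: str             # what doing it right looks like
    is_good: Callable[[dict], bool]    # definition of good vs. bad output
    is_done: Callable[[dict], bool]    # definition of done

def ready_for_agent(spec: TaskSpec) -> bool:
    """A task is a candidate for automation only when every
    element of the checklist is actually filled in."""
    return bool(spec.name and spec.inputs and spec.outputs
                and spec.expected_behavior)
```

If you cannot fill in every field for a task, that is a sign the task is not yet broken down far enough.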
In other words, when you start to implement AI Agents in your work, start at the bottom of the stack, and build those first. This is where you will get your fastest, earliest efficiencies.
Agents Transform Data and Documents from One State to Another
When designing an individual AI Agent, it's useful to think of the agent as sitting in the middle of a process or structure that has three parts. It's shaped like a barbell:
Starting data set or document, the inputs at rest
The AI Agent, the transformer, the machine with moving parts
Ending data set or document, the outputs at rest
The AI Agent's job is to change the data from one state to another. The output is a more valuable state. It's the thing we want, so we can take it and do more work with it.
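The barbell shape reduces to a very small sketch in code. The transform stands in for the agent itself; in a real build it would wrap one or more LLM calls, and the names here are illustrative only.

```python
from typing import Callable

def barbell(inputs: dict, transform: Callable[[dict], dict]) -> dict:
    """Data at rest in, data at rest out; the transform is the
    machine with moving parts in the middle."""
    return transform(inputs)

# A trivially simple "summarize" transform as the moving part.
def summarize(data: dict) -> dict:
    text = data["text"]
    return {"summary": text.split(".")[0] + "."}
```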
So what kinds of data transformations can AI Agents do efficiently?
Organize and categorize
Aggregate
Analyze
Synthesize
Standardize
Summarize
Expand and enrich
Check and validate
Group and ungroup
Find patterns
Find missing pieces
When we break down tasks this way, we have a formula that's not complicated, but that can be mixed and matched to define almost any AI Agent we need at the task level.
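The mix-and-match idea can be sketched as composable primitives. The real versions of these verbs would be LLM-backed; the stubs below only show the shape of chaining them into a task-level agent.

```python
from functools import reduce
from typing import Callable

Transform = Callable[[list[str]], list[str]]

def standardize(items: list[str]) -> list[str]:
    """Standardize: normalize whitespace and case."""
    return [s.strip().lower() for s in items]

def validate(items: list[str]) -> list[str]:
    """Check and validate: drop empty entries."""
    return [s for s in items if s]

def organize(items: list[str]) -> list[str]:
    """Organize: put items in a predictable order."""
    return sorted(items)

def compose(*steps: Transform) -> Transform:
    """Chain primitives left to right to define an agent at the task level."""
    return lambda items: reduce(lambda acc, step: step(acc), steps, items)

clean_and_sort = compose(standardize, validate, organize)
```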
So send your AI Agents to the work gym and have them start lifting barbells.
Design "Instructional Pressures" on the Agent to Keep It on Task
A third design pattern for creating effective AI Agents is to borrow a concept from instructional design: Corrective feedback.
When students learn, they go through a process much like the one described above. How do we help them learn a thing correctly? This is where comprehension checks, tests, demonstration activities, questioning, and coaching come into play. If a student is drifting into misunderstanding or heading in the wrong direction, feedback applies instructional pressure to keep them on the right learning path. Feedback helps them get to the desired outcome: they learned the thing. They did the work.
We want to design checkpoints in our agentic workflows where there is the same opportunity for checking and corrective feedback if necessary.
Algorithmic Checkpoints
We can program algorithmic error checking into the AI Agent itself. That is, we can add logic to the agent where it will check its own progress and take corrective action. This is becoming easier and more robust with the latest LLMs that have advanced reasoning capabilities.
We see this when we submit a question and watch the agent play back the work-planning steps it creates for itself as it works towards the final answer. Some examples of built-in algorithmic correction are asking the agent to compare its work to an example, to reason backwards from an outcome, or to generate multiple options for itself and choose the "best" one.
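One of those patterns, generate-then-self-check with retries, can be sketched as a small loop. `generate` and `score` are hypothetical stand-ins for LLM calls; the threshold and attempt count are illustrative.

```python
def with_self_check(generate, score, threshold=0.8, max_attempts=3):
    """Run the agent step, check its own output against a quality
    score, and retry until it passes or attempts run out."""
    best_output, best_score = None, float("-inf")
    for _ in range(max_attempts):
        output = generate()
        s = score(output)
        if s >= threshold:
            return output          # passed its own checkpoint
        if s > best_score:
            best_output, best_score = output, s
    return best_output             # best effort after retries
```

The same loop shape also covers "compare to an example": make `score` a similarity check against the exemplar.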
Human in the Loop
The second way to get corrective feedback to an AI Agent is to build in checkpoints for human oversight. This is HITL: human in the loop. These are points in the workflow where the agent pauses and a human has the chance to review, catch mistakes, offer corrections, and ultimately grant approval.

For long or more complicated agentic workflows, these human-in-the-loop moments are scheduled to happen every time the agent runs. For example, a worker may build an AI Agent that prepares a draft document for review, and will not send out the document until a human has approved it.

Another pattern to use in AI Agent design is to define escalation scenarios. That is, define a knowledge, skill, or confidence threshold in the agent's instructions. When the agent self-assesses that it has hit that limit, it stops and seeks human input.

Both of these are examples of what I call "instructional pressure" that will help keep your AI Agents from making critical mistakes or hallucinating.
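Both HITL patterns reduce to small gates in code. This is a hedged sketch only: `approve` and `ask_human` are hypothetical callbacks standing in for whatever pause-and-review mechanism your platform provides, and the threshold value is arbitrary.

```python
def approval_gate(draft, approve):
    """Every-run HITL: the draft never goes out without approval."""
    return draft if approve(draft) else None

def escalation_gate(answer, confidence, ask_human, threshold=0.7):
    """Threshold HITL: the agent escalates to a human when it
    self-assesses that it has hit its confidence limit."""
    return ask_human(answer) if confidence < threshold else answer
```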
So there you have it: Three sub-frameworks for designing individual AI Agents:
Break the work down into its smallest tasks and actions, and start building your army of agents there.
Design your AI agent's work to consist of taking data and documents and changing them into new, more valuable forms.
Build in ways for corrective feedback to the agent to be part of the standard logic and workflow so the agent can do the work, and not you.
Combined with Framework Part 1, the life cycle planning model, and Framework Part 2, the component and stakeholder model, this is a draft of a complete framework for guiding AI agent design at a high level.
What do you think? Do you agree? Disagree?
Comments and feedback welcome, and thanks for reading.
https://get.mindstudio.ai/t3p8e2ypz6tp