Blog 1.4 AI Agent Basics: Building Blocks Part 2
This blog focuses on execution, how agents connect to tools and data, how workflows are structured, and how testing, monitoring, and refinement ensure ongoing reliability.
Step Four: Connect Tools and Data
The reach and capabilities of the agentic AI system is directly related to the data and the tools the system can reach. Does the agentic system need access to APIs, databases, cloud storage, or ticketing systems? Will it need to integrate with SaaS platforms or repositories? While an LLM can’t have been trained on what the temperature is right now at the top of Mount Washington, in a connected agentic system, it can call a weather API to get that information at any time.
Secrets management requires special consideration in agentic systems. Static API keys are a poor fit because they undermine the principles of zero trust and least privilege when applied to autonomous AI agents. For example, if multiple people rely on the same agentic system, the agent would need broad privileges covering all possible users, which breaks the principle of least privilege.
The challenge deepens with hybrid agent access. An agent might act with delegated human authority to manage email or calendar functions, while also holding its own non-human identity credentials to access systems the human cannot, such as a sensitive database. Managing these blended rights without leaking or overextending access is a shift in secrets management toward dynamic access control models that grant scoped permissions based on the specific task, user, and time.
Connection protocols like Model Context Protocol (MCP) or agent-to-agent (A2A) communication are also important here to ensure your agent can interact easily with other systems without brittle custom connectors.
Step Five: Workflow Considerations
It is tempting to imagine that agentic AI can handle every task in a workflow on its own, but in practice these systems still need structure and sometimes a human in the loop. Agentic workflows are more adaptive than traditional automation, but still need high-level guardrails even while dynamically adjusting to what the situation requires.
Consider an airline’s customer service process. In a traditional automated workflow, the system follows predetermined paths. If you want to check your departure time or change your entire itinerary, it can handle that because those cases are hard coded. But if you ask for something unusual, like changing only one leg of a multi-leg journey, the automation stalls and you’re left repeating “representative” trying to reach a human agent.
With an agentic workflow, the objective remains the same, solve the customer’s problem, but the approach changes. The agent can interpret the nuance of “change just one leg,” reason through the request, and take the steps needed to access the booking system. In many cases it can complete the work end-to-end without help. Yet when the request bumps into something outside its authority, perhaps a change that triggers a fare rule override, the agent can bring a human into the loop at exactly the right moment rather than escalating everything by default.
The final safeguard is verification and validation. Even when the agent acts independently, it should be designed to check its work. For example, before closing a support ticket it might validate the transaction against a checklist or compliance rule set.
Step 6: Test, Monitor, Refine
Every agent should be tested in a safe sandbox before it touches production data. Monitor telemetry from logs and human-in-the-loop testing. Look for both functional errors and more subtle signs of misalignment, such as the agent gradually “drifting” from its intended task.
Red teaming is invaluable here because it tests agentic AI systems under the same pressure they will face in the real world, assuming the role of an adversary. to uncover failure points and exploit weaknesses. In an agentic AI context, that might mean injecting prompts designed to override instructions, chaining functions together to escalate privileges, or feeding malicious data from an external source. In future blogs we’ll do some deep dives into agentic AI threat modeling guidance from OWASP and CSA.
Don’t forget that testing doesn’t end at launch. These are living systems. Monitor, measure, refine. Runtime monitoring, maintenance, and optimization should be continuous.
By defining your goals, putting guardrails in place, picking the right tools, wiring up the right data, and committing to ongoing testing, you set yourself up for reliability.