Overview of AI Agents
This blog provides a comprehensive overview of AI agents, exploring their structure, capabilities, and real-world applications. It highlights the differences between traditional agents and large language models (LLMs), examining the strengths and limitations of each approach. The post compares a range of frameworks and tools used to build and orchestrate agentic systems, including LangChain, LangGraph, CrewAI, OpenAI’s Swarm, and AutoGen. These tools are evaluated in terms of their design philosophies, flexibility, and suitability for various tasks. By walking through practical examples and contrasting architectures, the blog offers insight into how these technologies can be applied to create intelligent, modular, and coordinated multi-agent workflows.
Table of Contents
Introduction to AI Agents¶
Applications of AI Agents¶
AI Agents: Autonomous Systems That Act on Behalf of Users¶
AI agents are (semi-) autonomous systems that interact with their environments, make decisions, and perform tasks on behalf of users. These systems are designed to operate independently or with minimal human intervention and can adapt their behavior over time.
Key Characteristics:
- Autonomy: Capable of carrying out tasks without continuous human oversight.
- Decision-Making: Analyze data to make informed choices and select appropriate actions.
- Adaptation: (Ideally) learn from experience and feedback to improve performance over time.
Agent vs. Large Language Models¶
Agents
- Autonomous systems capable of performing tasks and making decisions based on their environment.
- Often integrate multiple components such as sensors, memory, reasoning, and planning.
- Can use Large Language Models (LLMs) as one part of their decision-making or communication abilities.
- Goal-oriented: they act to achieve specific objectives or goals.
Large Language Models (LLMs)
- AI models trained on large amounts of text data to understand and generate human-like language.
- Not autonomous by themselves—they require prompts or input to generate responses.
- Serve as the "engine" for language understanding and generation in many applications (e.g., chatbots, summarization tools, code generation).
- Can be embedded inside agents to provide language-based reasoning or communication.
Retrieved from https://www.datacamp.com/fr/tutorial/crew-ai
ChatGPT is an agent built on top of an OpenAI large language model (such as GPT-4o). As an agent, it can switch between different LLM versions (like GPT-3.5, GPT-4, or GPT-4o) depending on the task or configuration.
For the most part, agents are structured prompts built on top of LLMs, specifically designed to accomplish a task or goal. They often use tools, follow predefined rules, and operate within a given context, description, or backstory.
Here's a summary of the four main components of an agent, with a small example for each:
Task/Goal
- ➤ The agent’s main objective is to find and book a flight.
- Example: "Book a flight from Toronto to New York for next Monday."
Tools
- ➤ The agent uses its tools, such as a flight search API, to look up available flights and book them.
- Example: Access to a flight search API or calendar tool.
Rules / Descriptions / Backstories
- ➤ Shapes how the agent interacts and what options it presents.
- Example: "You are a polite travel assistant that only recommends economy class flights."
Prompt / Prompt Template
- ➤ This structured prompt guides the LLM’s behavior and decisions.
- Example:
You are a travel assistant. Your goal is to help users book flights. Use the flight search tool when needed. Always confirm details with the user before proceeding.
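The four components above can be wired together in a few lines of plain Python. This is a minimal sketch, not a real agent: `search_flights` is a hypothetical stand-in for a flight-search API, and the LLM call is elided so the example stays self-contained.

```python
# Minimal sketch of the four agent components (task/goal, tools,
# rules/backstory, prompt template). All names are illustrative.

def search_flights(origin, destination, date):
    """Tool: a stand-in for a real flight-search API call."""
    return [{"flight": "AC123", "from": origin, "to": destination,
             "date": date, "class": "economy"}]

PROMPT_TEMPLATE = (
    "You are a polite travel assistant that only recommends economy class flights.\n"  # rules/backstory
    "Your goal is to help users book flights.\n"                                       # task/goal
    "Use the flight search tool when needed. Always confirm details with the user.\n"
    "User request: {request}"
)

class TravelAgent:
    def __init__(self, tools):
        self.tools = tools  # the agent's available tools

    def run(self, request):
        prompt = PROMPT_TEMPLATE.format(request=request)  # structured prompt for the LLM
        # In a real agent, the LLM would read `prompt` and decide to call a tool.
        # Here we call the tool directly to keep the sketch runnable.
        flights = self.tools["search_flights"]("Toronto", "New York", "next Monday")
        return f"Found {len(flights)} economy option(s); shall I book {flights[0]['flight']}?"

agent = TravelAgent(tools={"search_flights": search_flights})
print(agent.run("Book a flight from Toronto to New York for next Monday"))
```

The point of the sketch is the separation of concerns: the prompt template carries the goal and rules, while tools stay outside the prompt and are invoked on demand.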
🔍 Agents Beyond ChatGPT¶
Although ChatGPT is currently one of the most well-known and powerful agents, it's not the only one. Many other platforms have developed their own agents, often tailored to specific use cases, industries, or ecosystems.
🤖 1. Replit Agent¶
Replit uses its own homegrown language model (not OpenAI's GPT) and tools to power an agent designed for software development and coding assistance.
Key Features:
- Integrated coding environment
- Code generation and debugging
- Context-aware based on the developer's current files or project
Encapsulated Ecosystem:
Everything (the LLM, tools, and task-specific logic) is tightly integrated within Replit’s own platform.
🧑‍⚖️ 2. Thomson Reuters CoCounsel¶
CoCounsel is a legal AI agent by Thomson Reuters, likely powered by a language model (possibly their own or a customized version) and heavily integrated with Westlaw legal rules and databases.
Key Features:
- Legal research and summarization
- Drafting legal documents
- Ensures responses follow legal guidelines, citations, and jurisdictional rules
Domain-Specific Agent:
Built to operate within the legal domain with deep integration into legal knowledge bases.
LangChain¶
LangChain is one of the most popular AI agent frameworks and one of the fastest-growing open-source projects. Key features that make LangChain powerful include its ability to connect data to language models (such as OpenAI’s GPT via API) and create agent workflows.
Why does LangChain exist? 🤔
The landscape of language models is still evolving, and developers face challenges due to a lack of production-grade tooling. LangChain addresses these gaps by offering a model-agnostic toolkit: developers can experiment with multiple LLMs and identify the best fit for their needs through a unified interface, without extensive code changes as more providers are integrated.
Agents in LangChain 🤖
A popular concept in the LLM space is the use of agents—programmatic entities capable of executing goals and tasks. LangChain simplifies agent creation using its agents API. Developers can leverage OpenAI functions and other task execution tools, allowing agents to act autonomously. LangChain stands out by providing access to multiple tools within a single interface. The "plan and execute" functionality enables agents to autonomously set goals, plan, and perform tasks with minimal human input. Though current models struggle with long-term autonomy, these capabilities will improve over time.
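The loop that frameworks like LangChain automate behind their agents API can be sketched in plain Python: the model chooses a tool, the tool's result is fed back as an observation, and the cycle repeats until the model finishes. Here `fake_llm` is a scripted stand-in for a real model call, and the tool names are illustrative, not LangChain's API.

```python
# Stripped-down "choose tool -> observe -> repeat" agent loop.
# `fake_llm` stands in for a real LLM call; tool names are illustrative.

def calculator(expression):
    """Tool: evaluate a simple arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_llm(history):
    """Stub model: asks for the calculator once, then answers."""
    if not any(step[0] == "observation" for step in history):
        return ("call", "calculator", "6 * 7")
    return ("finish", history[-1][1])

def run_agent(question, max_steps=5):
    history = [("question", question)]
    for _ in range(max_steps):
        decision = fake_llm(history)
        if decision[0] == "finish":
            return decision[1]
        _, tool_name, tool_input = decision
        observation = TOOLS[tool_name](tool_input)    # execute the chosen tool
        history.append(("observation", observation))  # feed the result back
    return "gave up"

print(run_agent("What is 6 times 7?"))  # → 42
```

A real framework replaces `fake_llm` with an actual model and adds prompt formatting, output parsing, and error handling around the same loop.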
Memory with Language Models 🧠
One challenge with LLMs, such as OpenAI’s API, is that they are stateless. Every new request requires sending back the necessary context to generate a response. While developers can manage this by saving message histories in Python lists or text files, this approach doesn't scale efficiently. LangChain helps address this limitation.
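The statelessness problem can be made concrete with a small sketch: every call must resend the prior conversation, so some component has to accumulate and cap the history. This is a toy illustration, assuming a hypothetical `fake_chat_api` in place of a real provider call; it is not LangChain's memory API.

```python
# Because the LLM API is stateless, each request must include the prior
# conversation. This sketch keeps a rolling history and replays it on
# every call; `fake_chat_api` stands in for a real provider call.

class ConversationMemory:
    def __init__(self, max_messages=20):
        self.messages = []
        self.max_messages = max_messages  # crude cap so context doesn't grow forever

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        self.messages = self.messages[-self.max_messages:]

def fake_chat_api(messages):
    """Stub: reports how much context it received."""
    return f"(model saw {len(messages)} messages)"

memory = ConversationMemory()
for turn in ["Hi", "Book a flight", "Make it Monday"]:
    memory.add("user", turn)
    reply = fake_chat_api(memory.messages)   # full history resent every call
    memory.add("assistant", reply)

print(reply)  # → (model saw 5 messages)
```

Frameworks generalize this pattern with smarter truncation and summarization, but the underlying mechanic is the same replay of history.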
How it works
It takes a document and transforms it into a VectorStore, which stores the chunks of data.
The VectorStore holds embeddings, i.e., vector representations of the text. The point of embeddings is that search becomes easy: we look for the pieces of text that are most similar in the vector space.
- How a vector store works (image retrieved from www.langchain.com)
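The similarity search a vector store performs can be sketched in a few lines. In this toy version, crude word-count vectors stand in for real learned embeddings, and the "store" is just a list of (chunk, vector) pairs; all names are illustrative.

```python
# Toy vector-store search: texts become vectors (bag-of-words counts
# instead of real embeddings), and a query returns the stored chunk
# closest to it in that vector space.
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

chunks = ["flights from toronto", "weather in paris", "hotel booking tips"]
store = [(chunk, embed(chunk)) for chunk in chunks]   # the "vector store"

def search(query):
    q = embed(query)
    return max(store, key=lambda item: cosine(q, item[1]))[0]

print(search("cheap toronto flights"))  # → flights from toronto
```

Real embeddings capture meaning rather than word overlap, so "inexpensive airfare" would also land near the flights chunk, which is exactly why vector search beats keyword search for retrieval.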
LangGraph¶
One of the main reasons people are drawn to LangChain isn’t just because of its agents, but because of its emphasis on reproducibility, standardization, and observability in working with LLMs. LangChain provides a structured way to build with language models, making it easier to track, debug, and visualize workflows.
A concept that keeps coming up in this space is the idea of a workflow. When we use LangGraph to build an agent workflow, we’re essentially creating a visual and traceable graph of how the agent operates. This means we can observe each step the agent takes in real-time—what decisions it makes, what tools it uses, and how it moves from one state to another.
LangGraph is a powerful library designed for building stateful, multi-actor applications powered by LLMs.
- Stateful means the system has built-in memory—so as the agent performs tasks (like step A, B, and C), it remembers previous actions and uses that context moving forward.
🔧 Other Key Components of LangGraph:
- Building stateful, multi-actor LLM applications
- Features like human-in-the-loop interactions
- Deep integration with LangChain as the underlying framework
- Built-in observability and traceability for debugging workflows
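The stateful-graph idea behind LangGraph can be sketched without the library: nodes are functions that read and update a shared state dict, edges decide which node runs next, and a trace records every step for observability. The node names here are illustrative, not LangGraph's API.

```python
# Minimal stateful graph: nodes mutate a shared state and name the next
# node; the runner records a trace of every step taken.

def research(state):
    state["notes"] = f"notes on {state['topic']}"
    return "write"             # edge: next node to run

def write(state):
    state["draft"] = f"draft using {state['notes']}"
    return "done"

NODES = {"research": research, "write": write}

def run_graph(state, start="research"):
    node = start
    trace = []                 # observability: record each step taken
    while node != "done":
        trace.append(node)
        node = NODES[node](state)
    state["trace"] = trace
    return state

result = run_graph({"topic": "AI agents"})
print(result["trace"])  # → ['research', 'write']
```

Because all context lives in the state dict, step B sees what step A produced, and the trace makes the agent's path through the graph debuggable after the fact.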
CrewAI¶
👥 CrewAI: Collaborative Role-Based Agent Workflows
- CrewAI is built around the concept of collaborative, role-based AI agents—hence the name “Crew,” emphasizing teamwork.
- The core idea is that multiple agents, each assigned a specific role, can work together like a team to tackle complex tasks.
- Unlike single-agent frameworks, CrewAI enables inter-agent communication and dynamic task delegation, allowing tasks to be split and passed between agents intelligently.
🔑 Key Features of CrewAI
Role-Based Architecture:
Each agent is assigned a distinct role (e.g., researcher, planner, writer) tailored to the overall task.
Multi-Agent Orchestration:
Coordinates the execution of multiple agents working collaboratively.
Dynamic Task Delegation:
Tasks can be passed between agents based on their role or expertise—no need for one agent to do everything.
Hierarchical & Sequential Workflows:
Supports both high-level "manager" agents and low-level "worker" agents in a structured task flow.
Retrieved image from https://ai.plainenglish.io/mastering-crewai-chapter-1-your-first-intelligent-workflow-06b58aed1b3d
🧠 ChatGPT vs. CrewAI: Single vs. Multi-Agent Systems
- ChatGPT operates as a single agent, handling tasks sequentially from start to finish on its own.
- While single-agent systems like ChatGPT are powerful, CrewAI explores what happens when multiple specialized agents work together.
🤝 CrewAI’s Multi-Agent Approach
Role-Based Design
- Each agent is assigned a specific responsibility, such as research, analysis, writing, or summarization.
- This mirrors how teams operate in the real world, with each member focusing on their strength.
Autonomous Inter-Agent Delegation
- Agents can dynamically delegate tasks to one another based on the needs of the workflow.
- This enables a more flexible and scalable approach to complex tasks.
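The role-based crew pattern can be sketched in plain Python (this is not the CrewAI API): each agent owns a role, and the crew routes each subtask to the agent whose role matches, mirroring the manager/worker split described above. All class and role names are hypothetical.

```python
# Sketch of role-based delegation: a crew routes each subtask to the
# agent whose role matches. Names are illustrative, not CrewAI's API.

class RoleAgent:
    def __init__(self, role):
        self.role = role

    def work(self, task):
        return f"[{self.role}] completed: {task}"

class Crew:
    def __init__(self, agents):
        self.agents = {agent.role: agent for agent in agents}

    def run(self, plan):
        # plan: ordered list of (role, subtask) pairs; delegate each in turn
        return [self.agents[role].work(task) for role, task in plan]

crew = Crew([RoleAgent("researcher"), RoleAgent("writer")])
outputs = crew.run([
    ("researcher", "gather sources on AI agents"),
    ("writer", "draft the blog post"),
])
print(outputs[-1])  # → [writer] completed: draft the blog post
```

In a real crew, each `work` call would be an LLM invocation with a role-specific prompt, and delegation could be decided by the agents themselves rather than a fixed plan.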
OpenAI Swarm¶
OpenAI Swarm is an experimental, open-source project by OpenAI that explores how multiple AI agents can collaborate to solve complex tasks. Each agent can take on a specific role or subtask, coordinating like a "swarm" to reach a common goal.
Although still in its early stages, OpenAI Swarm reflects a growing interest in multi-agent systems—a shift from single-agent AI (like ChatGPT) to collaborative, role-based frameworks, similar in spirit to tools like CrewAI.
🧠 Core Concepts & Philosophy
Multi-Agent Collaboration
- The idea is that, if we can master effective collaboration between agents, we may either:
- Solve problems that a single agent cannot, or
- Determine when a single agent is sufficient—and simplify accordingly.
- OpenAI believes that figuring out robust multi-agent interactions could have the greatest impact on the future of agent frameworks.
Lightweight Framework
- Swarm is designed to orchestrate conversations and workflows between agents.
- For example: Agent A can pass a task to Agent B, which can then hand it back or forward it further—creating dynamic agent interactions.
Stateless by Design (for Now)
- In its current form, Swarm agents are stateless between calls—they don’t retain memory of past interactions.
- This makes it lightweight but also highlights a future opportunity for incorporating persistent state or memory.
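The handoff idea at Swarm's core can be sketched without the library: an agent either replies or returns another agent, and the loop continues with whichever agent is active. The agents are stateless between calls, seeing only the message passed in. Function names here are illustrative, not Swarm's actual API.

```python
# Sketch of the agent-handoff pattern: an agent either replies or hands
# the message to another agent. Stateless: each call sees only the message.

def triage_agent(message):
    if "refund" in message:
        return ("handoff", refund_agent)        # pass the task to a specialist
    return ("reply", "Triage: how can I help?")

def refund_agent(message):
    return ("reply", "Refund desk: your refund is being processed.")

def run(agent, message, max_handoffs=3):
    for _ in range(max_handoffs):
        kind, payload = agent(message)
        if kind == "reply":
            return payload
        agent = payload                          # handoff: swap the active agent
    return "too many handoffs"

print(run(triage_agent, "I need a refund"))  # → Refund desk: your refund is being processed.
```

In Swarm proper, the handoff is triggered by an LLM choosing a function whose return value is another agent; the control flow above is the same.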
AutoGen by Microsoft¶
AutoGen is a powerful, open-source framework developed by Microsoft, designed specifically for building multi-agent systems that are production-ready. What sets AutoGen apart is its focus on asynchronous communication, distributed deployment, and cross-language integration, making it ideal for complex, real-world enterprise applications.
🚀 Key Features of AutoGen
Enterprise-Grade & Production-Ready
- Built with scalability and reliability in mind, AutoGen is designed to support real-world deployments across distributed environments.
- It's suitable for enterprises that require robust agent interactions under real-time, asynchronous conditions.
Asynchronous Communication
- Unlike frameworks that assume synchronous, linear conversations, AutoGen supports asynchronous messaging between agents.
- This enables agents to operate independently, respond when ready, and collaborate across time and tasks—ideal for long-running or parallel workflows.
Distributed Architecture
- Agents can be deployed across different machines, containers, or services, enabling large-scale distributed systems.
- This architecture makes AutoGen especially suitable for cloud-native, microservice-based environments.
Cross-Language Integration
- AutoGen supports multi-language ecosystems, allowing agents written in different languages (e.g., Python, C#, .NET) to interoperate within the same workflow.
- This is particularly useful for enterprises with mixed tech stacks and legacy systems.
Flexible Agent Roles & Composition
- Developers can create agents with specific roles (e.g., planner, executor, validator) and compose them into custom workflows tailored to a business’s needs.
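The asynchronous style AutoGen emphasizes can be illustrated with plain `asyncio` (this is not AutoGen's API): agents communicate through queues, so each runs independently and responds when ready rather than blocking in a linear conversation. The agent names are hypothetical.

```python
# Sketch of asynchronous agent messaging: agents read from queues and
# respond when ready, so neither blocks the other. Not AutoGen's API.
import asyncio

async def worker(name, inbox, outbox):
    task = await inbox.get()               # wait until a message arrives
    await asyncio.sleep(0)                 # stand-in for real async work
    await outbox.put(f"{name} finished: {task}")

async def main():
    inbox, outbox = asyncio.Queue(), asyncio.Queue()
    # planner and executor run concurrently, communicating via queues
    executor = asyncio.create_task(worker("executor", inbox, outbox))
    await inbox.put("extract the data")    # planner sends a task...
    result = await outbox.get()            # ...and awaits the reply when ready
    await executor
    return result

print(asyncio.run(main()))  # → executor finished: extract the data
```

The queue decoupling is what makes long-running and parallel workflows possible: a planner can keep dispatching tasks while earlier ones are still in flight.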
🧠 Use Cases
- Automated report generation where different agents handle data extraction, summarization, and formatting
- Multi-stage customer support bots, where agents handle categorization, resolution, and escalation
- Workflow orchestration in finance, legal, or software development pipelines