Task-oriented Coordination Requirements for AI Agent Protocols

Introduction

Purpose

With the rapid advancement of AI technologies and their applications, AI agents built on Large Language Models (LLMs) have emerged as a pivotal direction in global technological evolution and market development. Because single-agent systems exhibit inherent limitations when addressing complex tasks in dynamic environments, efficient multi-agent collaboration for complex task completion has garnered increasing attention, and task-oriented coordination constitutes a critical component of standardized multi-agent systems. This document examines the requirements for standardizing AI Agent protocols to support task coordination in multi-agent systems.

Terminology

Task:: As defined in ISO/IEC 22989, a task is the set of actions required to achieve a specific goal. These actions can be physical or cognitive, for instance computing or creating predictions, translations, synthetic data or artefacts, or navigating through a physical space.
Shared Message Pool:: A pool where agents publish structured messages and subscribe to relevant messages based on their profiles.
Coordinator Agent:: An agent that receives tasks and decomposes or distributes tasks to other agents.
Execution Agent:: An agent responsible for executing tasks distributed by the Coordinator Agent.
Normative Language:: The key words "MUST", "REQUIRED", and "SHOULD" in this document are to be interpreted as described in .

Task Coordination Framework

                                   2.Task1 distributed +---------+
                                  +------------------->| Agent X |
                                  |                    +----+----+
+---------+ 1.Task submitted +----+----+ 3.Task1 completed  |
|  Task   |----------------->|         |<-------------------+
| Invoker |<-----------------| Agent A |<-------------------+
+---------+ 4.Task completed +----+----+ 3.Task2 completed  |
                                  |                    +----+----+
                                  +------------------->| Agent Y |
                                   2.Task2 distributed +---------+

The system operates as follows: when a task invoker submits a task to Agent A (the Coordinator Agent), Agent A decomposes the task and distributes task 1 and task 2 to Agent X and Agent Y (both Execution Agents), respectively. Upon receiving completion notifications from both agents, Agent A aggregates the results and delivers the final output back to the task invoker.
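The four numbered steps in the figure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a protocol definition: the function names `decompose`, `execute` and `coordinate`, and the way a task is split into two parts, are hypothetical and exist only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(task: str) -> list[str]:
    """Hypothetical decomposition: split one task into two subtasks."""
    return [f"{task}/part1", f"{task}/part2"]


def execute(agent: str, subtask: str) -> str:
    """Stand-in for an Execution Agent completing its subtask."""
    return f"{agent} completed {subtask}"


def coordinate(task: str, agents: list[str]) -> list[str]:
    """Coordinator Agent: distribute subtasks (step 2), collect the
    completion results (step 3) and return the aggregate (step 4)."""
    subtasks = decompose(task)
    with ThreadPoolExecutor() as pool:
        # Executor.map runs the calls concurrently and preserves input order.
        return list(pool.map(execute, agents, subtasks))


# Step 1: the task invoker submits a task to the Coordinator Agent.
final = coordinate("survey", ["Agent X", "Agent Y"])
print(final)
```

The thread pool only stands in for whatever transport actually carries tasks between agents; the point is that the Coordinator Agent waits for both completions before replying to the task invoker.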

Use Cases

Some typical use cases in which multiple agents work together to complete tasks are:

High throughput tasks: Many tasks have high bandwidth requirements. For example, in a collaborative framework for coordinating heterogeneous embodied agents (specifically, robot dogs and drones) over a wide-area public network, the drone is assigned wide-area surveillance and task delegation, while functionally specialized robot dogs perform ground-level operations such as video surveillance, material transport and obstacle clearance.
Low latency tasks: In collaborative multi-agent systems, control signal transmission tasks impose significantly more stringent latency requirements than routine model training data transfer. For example, a home robot remotely sends an alarm message to the end user.
High reliability tasks: Smart factory scenarios require highly reliable agent task execution and stable, fault-tolerant operation.

These categories of use cases (which may be further extended) demonstrate collaboration among agents spanning multiple distinct domains to achieve end-to-end task completion. Embodied agents (such as robots and unmanned aerial vehicles) interact with physical environments through embodied interfaces, while virtual agents (such as software applications and personal assistants) provide complementary capabilities. Together, they demonstrate the advantages of collaboratively completing complex tasks in various scenarios.

Necessity

Task Complexity

As task complexity increases, heterogeneous agents require multiple interaction rounds, precise planning, ordered execution, and efficient context sharing mechanisms to enhance resolution quality and robustness.

Resource Optimization

Through task coordination and resource consumption monitoring, multi-agent systems can dynamically allocate resources (for example, computing, storage and bandwidth) to optimize resource utilization efficiency.

Quality of Service

Task coordination may dynamically prioritize resource allocation based on, for example, task priorities, agent expertise and Quality of Service (QoS) requirements. This ensures the timeliness and accuracy of critical tasks, reduces service response latency, and maintains output stability and reliability.

Dynamic Adjustment

Agents may update or adjust a task during the task execution phase, based on the end user's inputs or contextual updates, to better meet the final task requirements.

Protocol Requirements

Task Description

Precise task descriptions or task templates are REQUIRED to ensure all agents maintain a consistent understanding of the objectives, operational constraints and criteria. A well-defined task description:

Reduces ambiguity: Minimizes misinterpretations and conflicting actions among agents.
Enables verifiability: Translates abstract goals into executable and measurable plans.
Improves robustness: Ensures collaboration remains coherent and efficient under dynamic conditions.

Task descriptions assigned to different agents MUST follow the minimization principle, i.e., agents SHOULD receive only the minimal, contextually necessary information required to fulfill their tasks, to prevent unauthorized access to sensitive information.
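A task description that follows the minimization principle might be sketched as below. This is an illustration only: the `TaskDescription` fields, the `minimize` helper and the sample values are hypothetical and are not defined by this document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskDescription:
    """Hypothetical task description: objective, constraints, criteria."""
    objective: str
    constraints: tuple[str, ...] = ()
    criteria: tuple[str, ...] = ()
    context: dict = field(default_factory=dict)


def minimize(full_context: dict, needed_keys: set[str]) -> dict:
    """Keep only the context entries this agent needs (allow-list)."""
    return {k: v for k, v in full_context.items() if k in needed_keys}


# Illustrative data held by the Coordinator Agent (assumed values).
full_context = {"destination": "Beijing", "passport_no": "X123", "budget": 500}

flight_task = TaskDescription(
    objective="book flight",
    constraints=("depart before noon",),
    criteria=("booking confirmed",),
    context=minimize(full_context, {"destination", "budget"}),
)
print(flight_task.context)
```

An allow-list is used here rather than a deny-list so that any context entry not explicitly needed by the agent is withheld by default.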

Task State

Upon receiving a task, the Coordinator Agent may decompose it into multiple tasks and delegate them to different Execution Agents via a dynamic capability discovery mechanism. The AI Agent protocol design should support comprehensive state descriptions throughout the task execution lifecycle. This includes, for example, the definition of potential task states (such as submitted, running, suspended (awaiting external input or output from other agents), completed, canceled, rejected and failed) and coordination operations (such as state queries and the retrieval and push of intermediate results).

Based on the time needed to complete them, tasks can be categorized into short-term tasks, which require a single request-response interaction, and long-term tasks, which may require multi-round interactions or extended waiting periods. The Coordinator Agent may dynamically adjust the target of a task according to the intermediate results of the Execution Agents and the context information. The AI Agent protocol design should support the coordination of both long-term and short-term tasks.
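The task states named above can be modeled as a small state machine. The state names come from the text; the allowed transitions in `ALLOWED` and the `Task` class are assumptions made for this sketch, since the document does not define them.

```python
from enum import Enum


class TaskState(Enum):
    SUBMITTED = "submitted"
    RUNNING = "running"
    SUSPENDED = "suspended"   # awaiting external input or other agents
    COMPLETED = "completed"
    CANCELED = "canceled"
    REJECTED = "rejected"
    FAILED = "failed"


# Assumed transition table; terminal states have no outgoing transitions.
ALLOWED = {
    TaskState.SUBMITTED: {TaskState.RUNNING, TaskState.REJECTED,
                          TaskState.CANCELED},
    TaskState.RUNNING: {TaskState.SUSPENDED, TaskState.COMPLETED,
                        TaskState.CANCELED, TaskState.FAILED},
    TaskState.SUSPENDED: {TaskState.RUNNING, TaskState.CANCELED,
                          TaskState.FAILED},
}


class Task:
    def __init__(self, task_id: str) -> None:
        self.task_id = task_id
        self.state = TaskState.SUBMITTED
        self.history = [self.state]  # supports state queries

    def transition(self, new_state: TaskState) -> None:
        """Move to new_state, rejecting transitions not in ALLOWED."""
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"{self.state.value} -> {new_state.value}")
        self.state = new_state
        self.history.append(new_state)


# A long-term task that is suspended once before completing.
task = Task("task-1")
for step in (TaskState.RUNNING, TaskState.SUSPENDED,
             TaskState.RUNNING, TaskState.COMPLETED):
    task.transition(step)
print([s.value for s in task.history])
```

A short-term task would pass through submitted, running and completed within a single request-response exchange, while a long-term task may cycle between running and suspended several times.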

Communication Mechanism

When multiple agents participate in coordinated tasks, they may need to share a common context or subscribe to identical message content. The AI Agent protocol may therefore need to support different communication mechanisms, such as request/response and broadcast. For example, when moving a large artifact in an intelligent factory, the multiple robots responsible for the transportation need to obtain consistent route and destination data to ensure operational coherence. The applicable communication mechanisms may vary across agent communication structures, which directly affects message delivery efficiency, implementation feasibility and system complexity. Typical agent communication structures include:

Layered communication: Agents at different layers in the hierarchical structure have different roles or capabilities. Communication occurs either horizontally (intra-layer) or vertically (adjacent layers only).
Decentralized communication: Multiple agents form a peer-to-peer network permitting direct communications between any two agents.
Centralized communication: Central agents coordinate the communications, with the other agents primarily interacting through the central node.
Shared message pool: Agents publish structured messages to the pool and subscribe to the message types matching their profiles.
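Of the structures listed above, the shared message pool is the simplest to sketch in code. The following is a minimal in-process illustration, assuming a hypothetical `SharedMessagePool` class with `subscribe` and `publish` methods; a real pool would sit behind a network transport and match on agent profiles rather than plain strings.

```python
from collections import defaultdict
from typing import Callable


class SharedMessagePool:
    """Hypothetical pool: agents subscribe by message type and receive
    every structured message published under that type."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, message_type: str,
                  handler: Callable[[dict], None]) -> None:
        self._subscribers[message_type].append(handler)

    def publish(self, message_type: str, payload: dict) -> int:
        """Deliver payload to all subscribers; return how many received it."""
        handlers = self._subscribers.get(message_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


# Two transport robots subscribe to route updates so that both
# work from the same route and destination data.
pool = SharedMessagePool()
robot1_inbox, robot2_inbox = [], []
pool.subscribe("route", robot1_inbox.append)
pool.subscribe("route", robot2_inbox.append)
delivered = pool.publish("route", {"destination": "dock-3"})
print(delivered, robot1_inbox, robot2_inbox)
```

Because every subscriber receives the same payload object, the robots in the factory example obtain consistent route data without contacting each other directly.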

The task collaboration mode may also affect the communication mechanism between agents:

In the primary/secondary mode, a Coordinator Agent decomposes a task into multiple tasks and distributes them to Execution Agents for processing. The Execution Agents return their execution results to the Coordinator Agent, which aggregates the results and delivers them to the end user.
In the peer negotiation mode, a Coordinator Agent distributes the task of the end user to the Execution Agents, which independently execute their assigned tasks. The Execution Agents return their results to the end user via the Coordinator Agent, which forwards them without further processing.
In the subscription mode, a Coordinator Agent may delegate subscription-based tasks to Execution Agents. For example, a task such as "book a ticket to Beijing one week before the New Year holiday" will be assigned by the Coordinator Agent to an Execution Agent, which will perform the task at the specified time.

Because different tasks may require long-lived or short-lived connections, the AI Agent protocol should provide mechanisms beyond simple request/response, including more complex interaction modes such as message multicast, publication/subscription (PUB/SUB) and asynchronous notifications. The AI Agent protocol design MUST consider support for relay nodes to facilitate task message forwarding. Relay nodes SHOULD prioritize message scheduling and forwarding based on task requirements to ensure efficient agent collaboration and meet transmission QoS objectives. For example, relay nodes MAY implement the following priority hierarchy (from highest to lowest):

Control signaling transmission tasks
Media stream transmission tasks
Training data stream transmission tasks

This prioritization scheme ensures that critical messages receive preferential treatment during congestion or resource contention scenarios.
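The priority hierarchy above could be realized at a relay node with a priority queue. The sketch below is one possible arrangement, not a mandated design: the `RelayQueue` class, the `PRIORITY` labels and the first-in-first-out tie-breaking within a class are all assumptions.

```python
import heapq
from itertools import count

# Lower number = higher priority, following the hierarchy in the text.
PRIORITY = {"control": 0, "media": 1, "training": 2}


class RelayQueue:
    """Hypothetical relay node queue: forwards by task class,
    first-in-first-out within the same class."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = count()  # arrival order, used as a tie-breaker

    def enqueue(self, kind: str, message: str) -> None:
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._seq), message))

    def forward_next(self) -> str:
        """Pop and return the highest-priority pending message."""
        _, _, message = heapq.heappop(self._heap)
        return message

    def __len__(self) -> int:
        return len(self._heap)


relay = RelayQueue()
relay.enqueue("training", "gradient-batch")
relay.enqueue("media", "video-frame")
relay.enqueue("control", "stop-signal")
order = [relay.forward_next() for _ in range(len(relay))]
print(order)
```

Strict priority as shown can starve training data under sustained load; a deployment might prefer a weighted scheme, which this document leaves open.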

Context Sharing

When delegating tasks to Execution Agents, the Coordinator Agent may include task-relevant contextual information, such as the contact information of the end user, the task itself, the historical preference information known to the Coordinator Agent, and other necessary conversation data, to facilitate task execution. For example, in a trip planning case, this may encompass historical flight and hotel booking preferences or dynamically perceived context such as recent user dialog. The AI Agent protocol design should consequently support context sharing mechanisms through standardized definitions of context types, length constraints, and encoding formats to enhance the effectiveness of task execution. Because context sharing may affect the privacy of the user, the scope of context sharing needs to be limited, especially for sensitive information such as the name, age and address of the user.
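The combination of context types, a length constraint, an encoding format and a limited sharing scope can be illustrated as follows. Everything specific here is an assumption for the sketch: the `encode_context` function, JSON over UTF-8 as the encoding, the 256-byte limit, and the treatment of `name`, `age` and `address` as context types that are never shared.

```python
import json

# Context types treated as sensitive, taken from the examples in the text.
SENSITIVE_TYPES = {"name", "age", "address"}
# Assumed length constraint for the encoded context.
MAX_CONTEXT_BYTES = 256


def encode_context(items: dict[str, str]) -> bytes:
    """Drop sensitive context types, encode the rest as UTF-8 JSON,
    and enforce the length constraint."""
    shareable = {k: v for k, v in items.items() if k not in SENSITIVE_TYPES}
    encoded = json.dumps(shareable, sort_keys=True).encode("utf-8")
    if len(encoded) > MAX_CONTEXT_BYTES:
        raise ValueError("context exceeds length constraint")
    return encoded


# Trip planning example: preferences and recent dialog are shared,
# the user's name is not.
encoded = encode_context({
    "name": "Alice",
    "hotel_preference": "non-smoking",
    "recent_dialog": "wants a window seat",
})
print(encoded.decode("utf-8"))
```

Rejecting oversized context outright, rather than truncating it, avoids silently handing an Execution Agent a partial and possibly misleading context.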

Exception Handling

Exception handling constitutes a critical mechanism for multi-agent collaborative task execution. If an Execution Agent cannot complete an assigned task because it lacks the required skills or is overloaded, the task execution failure may lead to actions such as releasing the connections.

Existing Protocol Analysis

Task-oriented coordination is compatible with multiple types of existing protocols, such as TCP and HTTP. The Transmission Control Protocol (TCP) provides reliable, ordered, connection-oriented data transmission services for end-to-end communication. In task-oriented coordination scenarios, some TCP capabilities can be directly reused (e.g., the retransmission mechanism and congestion control). The Hypertext Transfer Protocol (HTTP) is an application-layer protocol for distributed, collaborative, hypermedia information systems. Some HTTP features are also applicable to task-oriented coordination. However, task-oriented coordination needs to support finer-grained access control and context information anonymization mechanisms, which may require protocol enhancements. Defining the task-oriented coordination mechanism at the session layer has some advantages.

Conclusions

Task-oriented coordination constitutes a critical function for multi-agent collaboration. This document discusses the necessity of introducing task-oriented coordination to address complex tasks, optimize resource utilization, and guarantee service quality. It then analyzes the requirements imposed by task-oriented coordination on AI Agent protocol design, specifically concerning task descriptions, task states, communication mechanisms, context sharing, and exception handling.

IANA Considerations

This memo includes no request to IANA.

Security Considerations

When designing task-oriented coordination for AI agent communication, privacy should always be considered.