<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>  

<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="info"
  docName="draft-cui-ai-agent-task-00"
  ipr="trust200902"
  obsoletes=""
  updates=""
  submissionType="IETF"
  xml:lang="en"
  version="3">

  <front>
    <title abbrev="Agent Task Coordination">Task-oriented Coordination Requirements for AI Agent Protocols</title>
    <seriesInfo name="Internet-Draft" value="draft-cui-ai-agent-task-00"/>

    <author initials="Y." surname="Cui" fullname="Yong Cui">
      <organization>Tsinghua University</organization>
      <address>
        <postal>
          <region>Beijing</region>
          <code>100084</code>
          <country>China</country>
        </postal>
        <email>cuiyong@tsinghua.edu.cn</email>
        <uri>http://www.cuiyong.net/</uri>
      </address>
    </author>
    
    <author initials="C." surname="Du" fullname="Chenguang Du">
      <organization>Zhongguancun Laboratory</organization>
      <address>
        <postal>
          <region>Beijing</region>
          <code>100094</code>
          <country>China</country>
        </postal>
        <email>ducg@zgclab.edu.cn</email>
      </address>
    </author>
   
    <date year="2025" month="July" day="7"/>

    <area>General</area>
    <workgroup>Network Working Group</workgroup>
    
    <keyword>AI agent</keyword>
    <keyword>Agent communication</keyword>
    <keyword>Task coordination</keyword>

    <abstract>
      <t>AI agent communication requires intelligent task-level coordination to manage dynamic workloads across large-scale, heterogeneous networking environments. This draft proposes general requirements for an agent protocol to enable autonomous task coordination at scale, including dynamic task discovery, negotiation, and context-aware scheduling with real-time adaptability.</t>
    </abstract>
 
  </front>

  <middle>
    

    <section>
      <name>Introduction</name>
      <section>
        <name>Purpose</name>
        <t>With the rapid advancement of AI technologies and their applications, AI agents built on Large Language Models (LLMs) have become a major direction in technology and market development. Single-agent systems have inherent limitations when addressing complex tasks in dynamic environments. Efficient multi-agent collaboration for completing complex tasks has therefore attracted increasing attention, and task-oriented coordination is a critical component of standardized multi-agent systems.</t>
        <t>This document examines the requirements for standardizing AI Agent protocols to support task coordination in multi-agent systems.</t>
      </section>
      
      <section>
        <name>Terminology</name>
        <dl newline="true">
          <dt>Task:</dt>
          <dd>As defined in ISO/IEC 22989, a task is the set of actions required to achieve a specific goal. These actions can be physical or cognitive, for instance computing or creating predictions, translations, synthetic data or artefacts, or navigating through a physical space.</dd>
          <dt>Shared Message Pool:</dt>
          <dd>A pool where agents publish structured messages and subscribe to relevant messages based on their profiles.</dd>
          <dt>Coordinator Agent:</dt>
          <dd>An agent that receives tasks and decomposes or distributes tasks to other agents.</dd>
          <dt>Execution Agent:</dt>
          <dd>An agent responsible for executing tasks distributed by the Coordinator Agent.</dd>
          <dt>Normative Language:</dt>
          <dd>The key words "MUST", "REQUIRED", "SHOULD", and "MAY" in this document are to be interpreted as described in <xref target="RFC2119"/>.</dd>
        </dl>
      </section>
    <section>
    <name>Task Coordination Framework</name>
<figure anchor="fig-frame"><name>Task Coordination Framework</name><artwork><![CDATA[
                                                          +---------+
                                     +------------------->| Agent X |
                                     |2.Task1 distributed +---------+
                                     |                        |     
                                     |                        |  
                                     |                        |
  +---------+  1.Task submitted  +---------+ 3.Task1 completed|
  |   Task  |------------------->|         |<-----------------+      
  | Invoker |<------------------ | Agent A |<-----------------+
  +---------+  4.Task completed  +---------+ 3.Task2 completed|
                                     |                        |     
                                     |                        |  
                                     |                        |
                                     |2.Task2 distributed +---------+
                                     +------------------->| Agent Y |
                                                          +---------+
                                                 
]]></artwork></figure>
    <t>The system operates as follows: when a task invoker submits a task to Agent A (the Coordinator Agent), Agent A decomposes the task and distributes Task 1 and Task 2 to Agent X and Agent Y (both Execution Agents), respectively. Upon receiving completion notifications from both agents, Agent A aggregates the results and delivers the final output back to the task invoker.</t>
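    <t>The four steps in the figure can be sketched as follows. This is a minimal illustration only: the function names (coordinate, agent_x, agent_y), the naive one-subtask-per-agent decomposition, and the use of local threads in place of network messages are all assumptions made for this example and are not defined by this document.</t>
    <sourcecode type="python"><![CDATA[
from concurrent.futures import ThreadPoolExecutor

# Hypothetical Execution Agents: each maps a subtask to a result.
def agent_x(subtask):
    return f"X finished {subtask}"

def agent_y(subtask):
    return f"Y finished {subtask}"

def coordinate(task, agents):
    """Agent A: decompose, distribute, collect, aggregate."""
    # Step 2: naive decomposition, one subtask per agent.
    subtasks = [f"{task}/part{i + 1}" for i in range(len(agents))]
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(agent, sub)
                   for agent, sub in zip(agents, subtasks)]
        # Step 3: wait for every completion notification.
        results = [f.result() for f in futures]
    # Step 4: aggregate and return to the task invoker.
    return {"task": task, "results": results}

# Step 1: the task invoker submits a task.
report = coordinate("survey", [agent_x, agent_y])
]]></sourcecode>
    <t>In a real deployment the distribution and completion steps would be protocol messages between agents rather than local function calls.</t>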
    </section>
    </section>
      
    <section>
      <name>Use Cases</name>
      <t>Typical use cases in which multiple agents work together to complete tasks include:</t>
      <ul spacing="normal">
        <li>High-throughput tasks: Many tasks have high bandwidth requirements. For example, in a collaborative framework that coordinates heterogeneous embodied agents (specifically robot dogs and drones) over a wide-area public network, the drone is assigned wide-area surveillance and task delegation, while functionally specialized robot dogs perform ground-level operations such as video surveillance, material transport, and obstacle clearance.</li>
        <li>Low-latency tasks: In collaborative multi-agent systems, control signal transmission imposes significantly more stringent latency requirements than routine transfer of model training data. For example, a home robot remotely sends an alarm message to the end user.</li>
        <li>High-reliability tasks: Smart factory scenarios require highly reliable agent task execution and stable, fault-tolerant operation.</li>
      </ul>
      <t>These categories of use cases, which may be further extended, show agents collaborating across multiple distinct domains to achieve end-to-end task completion. Embodied agents (such as robots and unmanned aerial vehicles) interact with physical environments through embodied interfaces, while virtual agents (such as software applications and personal assistants) provide complementary capabilities. Together they have demonstrated the advantages of collaboratively completing complex tasks in various scenarios.</t>
    </section>
      
    <section>
      <name>Necessity</name>
      <section>
        <name>Task Complexity</name>
        <t>As task complexity increases, heterogeneous agents require multiple interaction rounds, precise planning, ordered execution, and efficient context sharing mechanisms to enhance resolution quality and robustness.</t>
      </section>
      <section>
        <name>Resource Optimization</name>
        <t>Through task coordination and resource consumption monitoring, multi-agent systems can dynamically allocate resources such as computing, storage, and bandwidth to improve resource utilization efficiency.</t>
      </section>
      <section>
        <name>Quality of Service</name>
        <t>Task coordination may dynamically prioritize resource allocation based on, for example, task priorities, agent expertise, and Quality of Service requirements. This ensures the timeliness and accuracy of critical tasks, reduces service response latency, and maintains output stability and reliability.</t>
      </section>
      <section>
        <name>Dynamic Adjustment</name>
        <t>Agents may update or adjust a task during its execution phase, based on the end user's inputs or contextual updates, to better meet the final task requirements.</t>
      </section>
    </section>
    
    <section>
      <name>Protocol Requirements</name>
      <section>
        <name>Task Description</name>
        <t>Precise task descriptions or task templates are REQUIRED to ensure all agents maintain a consistent understanding of the objectives, operational constraints and criteria.</t>
        <t>A well-defined task description:</t>
        <ul spacing="normal">
          <li>Reduces ambiguity: Minimizes misinterpretations and conflicting actions among agents.</li>
          <li>Enables verifiability: Translates abstract goals into executable and measurable plans.</li>
          <li>Improves robustness: Ensures collaboration remains coherent and efficient under dynamic conditions.</li>
        </ul>
        <t>Task descriptions assigned to different agents MUST follow the minimization principle, i.e., agents SHOULD receive only the minimal, contextually necessary information required to fulfill their tasks, in order to prevent unauthorized access to sensitive information.</t>
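        <t>One possible way to apply the minimization principle is a per-role allow-list of task description fields. The sketch below is illustrative only: the field names, the role names, and the REQUIRED_FIELDS table are hypothetical and are not defined by this document.</t>
        <sourcecode type="python"><![CDATA[
# Hypothetical full task description held by the Coordinator.
FULL_TASK = {
    "goal": "book flight",
    "destination": "Beijing",
    "traveler_name": "Alice",
    "payment_token": "tok-123",
}

# Hypothetical allow-list: the fields each agent role needs.
REQUIRED_FIELDS = {
    "search_agent": {"goal", "destination"},
    "payment_agent": {"goal", "payment_token"},
}

def minimal_description(task, role):
    """Return only the fields that the given role needs."""
    allowed = REQUIRED_FIELDS.get(role, set())
    return {k: v for k, v in task.items() if k in allowed}

search_view = minimal_description(FULL_TASK, "search_agent")
unknown_view = minimal_description(FULL_TASK, "unknown_agent")
]]></sourcecode>
        <t>Here the search agent never sees the traveler name or payment token, and a role that is not listed receives an empty description by default.</t>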
      </section>
      <section>
        <name>Task State</name>
        <t>Upon receiving a task, the Coordinator Agent may decompose it into multiple tasks and delegate them to different Execution Agents via a dynamic capability discovery mechanism. The AI Agent protocol design should support comprehensive state descriptions throughout the task execution lifecycle. For example, the protocol may define the potential task states (such as submitted, running, suspended while awaiting external input or output from other agents, completed, canceled, rejected, and failed) and the coordination operations (such as state queries and the retrieval and push of intermediate results).</t>
        <t>Based on the time needed to complete them, tasks can be categorized into short-term tasks, which require a single request-response interaction, and long-term tasks, which may require multi-round interactions or extended waiting periods. The Coordinator Agent may dynamically adjust the target of a task according to the intermediate results of the Execution Agents and the context information. The AI Agent protocol design should support the coordination of both long-term and short-term tasks.</t>
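        <t>The task states listed above can be modeled as a small state machine. The sketch below is a non-normative illustration: the state names follow the examples in this section, but the transition table, the Task class, and its methods are assumptions made for this example, since this document does not define which transitions are permitted.</t>
        <sourcecode type="python"><![CDATA[
from enum import Enum

class TaskState(Enum):
    SUBMITTED = "submitted"
    RUNNING = "running"
    SUSPENDED = "suspended"
    COMPLETED = "completed"
    CANCELED = "canceled"
    REJECTED = "rejected"
    FAILED = "failed"

S = TaskState
# Assumed transition table; terminal states have no entry.
TRANSITIONS = {
    S.SUBMITTED: {S.RUNNING, S.REJECTED, S.CANCELED},
    S.RUNNING: {S.SUSPENDED, S.COMPLETED, S.CANCELED, S.FAILED},
    S.SUSPENDED: {S.RUNNING, S.CANCELED, S.FAILED},
}

class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.state = S.SUBMITTED
        self.intermediate = []  # pushed intermediate results

    def transition(self, new_state):
        """Move to new_state, or raise if not permitted."""
        if new_state not in TRANSITIONS.get(self.state, set()):
            msg = f"{self.state.value} -> {new_state.value}"
            raise ValueError("transition not allowed: " + msg)
        self.state = new_state

    def query(self):
        """State query operation."""
        return {"id": self.task_id, "state": self.state.value,
                "intermediate": list(self.intermediate)}

task = Task("t1")
task.transition(S.RUNNING)
task.intermediate.append("50% done")
task.transition(S.SUSPENDED)   # awaiting external input
task.transition(S.RUNNING)
task.transition(S.COMPLETED)
]]></sourcecode>
        <t>Rejecting transitions out of terminal states (completed, canceled, rejected, failed) is one possible design choice; a protocol could equally allow, for example, a failed task to be resubmitted.</t>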
      </section>
      <section>
        <name>Communication Mechanism</name>
        <t>When multiple agents participate in coordinated tasks, they may need to maintain a shared common context or subscribe to identical message content. Different communication mechanisms, such as request/response and broadcast, may need to be supported in the AI Agent protocol. For example, when moving a large artifact in an intelligent factory, the multiple robots responsible for the transportation need to obtain consistent route and destination data to ensure operational coherence.</t>
        <t>The applicable communication mechanisms may vary for different agent communication structures, which directly affects message delivery efficiency, implementation feasibility, and system complexity. The typical agent communication structures <xref target="Multi-Agents"/> include:</t>
        <ul spacing="normal">
          <li>Layered communication: Agents at different layers in the hierarchical structure have different roles or capabilities. Communication occurs either horizontally (intra-layer) or vertically (adjacent layers only).</li>
          <li>Decentralized communication: Multiple agents form a peer-to-peer network permitting direct communications between any two agents.</li>
          <li>Centralized communication: Central agents coordinate the communications, and the other agents interact primarily through the central node.</li>
          <li>Shared message pool <xref target="MetaGPT"/>: Agents publish structured messages to the pool and subscribe to the message types matching their profiles.</li>
        </ul>
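        <t>As an illustration of the last structure in the list above, a shared message pool can be sketched as an in-memory publish/subscribe registry. The MessagePool class, its method names, and the message type strings are hypothetical and chosen for this example only; they are not taken from <xref target="MetaGPT"/> or from any existing protocol.</t>
        <sourcecode type="python"><![CDATA[
from collections import defaultdict

class MessagePool:
    """In-memory stand-in for a shared message pool."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, msg_type, callback):
        """Register interest in one message type."""
        self._subscribers[msg_type].append(callback)

    def publish(self, msg_type, payload):
        """Deliver to each subscriber; return delivery count."""
        message = {"type": msg_type, "payload": payload}
        receivers = self._subscribers[msg_type]
        for callback in receivers:
            callback(message)
        return len(receivers)

pool = MessagePool()
route_inbox, alarm_inbox = [], []
# Each agent subscribes to the types matching its profile.
pool.subscribe("route_update", route_inbox.append)
pool.subscribe("alarm", alarm_inbox.append)
delivered = pool.publish("route_update", {"dest": "dock-3"})
]]></sourcecode>
        <t>Only the agent subscribed to "route_update" receives the message; the agent subscribed to "alarm" is not disturbed.</t>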
        <t>The task collaboration mode may also affect the communication mechanism between agents:</t>
        <ul spacing="normal">
          <li>In the primary/secondary mode, a Coordinator Agent decomposes a task into multiple tasks and distributes them to Execution Agents for processing. The Execution Agents return their execution results to the Coordinator Agent, which aggregates the results and delivers them to the end user.</li>
          <li>In the peer negotiation mode, a Coordinator Agent distributes the task of the end user to the Execution Agents, which independently execute their assigned tasks. The Execution Agents return their results directly to the end user via the Coordinator Agent, without requiring further processing by the Coordinator Agent.</li>
          <li>In subscription mode, a Coordinator Agent may delegate subscription-based tasks to Execution Agents. For example, a task such as "book a ticket to Beijing one week before the New Year holiday" will be assigned by the Coordinator Agent to an Execution Agent, which will perform the task at the specified time.</li>
        </ul>
        <t>Different tasks may require long-lived or short-lived connections. The AI Agent protocol should therefore provide mechanisms beyond simple request/response, including more complex interaction modes such as message multicast, publish/subscribe (PUB/SUB), and asynchronous notifications.</t>
        <t>The AI Agent protocol design MUST consider support for relay nodes to facilitate task message forwarding. Relay nodes SHOULD prioritize message scheduling and forwarding based on task requirements to ensure efficient agent collaboration and meet transmission QoS objectives.</t>
        <t>For example, relay nodes MAY implement the following priority hierarchy (from highest to lowest):</t>
        <ul spacing="normal">
          <li>Control signaling transmission tasks</li>
          <li>Media stream transmission tasks</li>
          <li>Training data stream transmission tasks</li>
         </ul>
         <t>This prioritization scheme ensures that critical messages receive preferential treatment during congestion or resource contention scenarios.</t>
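          <t>The example hierarchy above could be realized at a relay node with a strict-priority queue. The sketch below is one possible approach and not a required design: the RelayQueue class, the class labels ("control", "media", "training"), and the numeric priority values are assumptions made for this example.</t>
          <sourcecode type="python"><![CDATA[
import heapq
from itertools import count

# Lower number means higher priority (assumed encoding).
PRIORITY = {"control": 0, "media": 1, "training": 2}

class RelayQueue:
    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: FIFO within a class

    def enqueue(self, kind, message):
        entry = (PRIORITY[kind], next(self._seq), message)
        heapq.heappush(self._heap, entry)

    def forward_next(self):
        """Pop the highest-priority pending message, or None."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

relay = RelayQueue()
relay.enqueue("training", "batch-17")
relay.enqueue("media", "frame-1")
relay.enqueue("control", "stop")
relay.enqueue("control", "resume")
order = [relay.forward_next() for _ in range(4)]
]]></sourcecode>
          <t>Control messages are forwarded first and keep their arrival order. Note that strict priority can starve lower classes under sustained load, so a deployment might prefer a weighted scheme instead.</t>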
      </section>
      <section>
        <name>Context Sharing</name>
        <t>When delegating tasks to Execution Agents, the Coordinator Agent may include task-relevant context, such as the contact information of the end user, details of the task itself, historical preference information known to the Coordinator Agent, and other necessary conversation data, to facilitate task execution. For example, in a trip planning case, this may encompass historically booked flight or hotel preferences, or dynamically perceived context such as recent user dialog. The AI Agent protocol design should consequently support context sharing mechanisms through standardized definitions of context types, length constraints, and encoding formats to enhance the effectiveness of task execution.</t>
        <t>Context sharing may affect the privacy of the user. It is therefore necessary to limit the scope of context sharing, especially for sensitive information such as the name, age, and address of the user.</t>
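        <t>A context item that carries a type, an encoding, and a length, and that excludes sensitive fields by default, might look like the sketch below. Everything specific here is an assumption for illustration: the build_context function, the 256-byte limit, the "json/utf-8" encoding label, and the set of sensitive keys are not defined by this document.</t>
        <sourcecode type="python"><![CDATA[
import json

# Assumed limit and sensitive keys; not defined by this draft.
MAX_CONTEXT_BYTES = 256
SENSITIVE_KEYS = {"name", "age", "address"}

def build_context(context_type, data, allow_sensitive=False):
    """Encode a context item as UTF-8 JSON within a size limit."""
    if not allow_sensitive:
        data = {k: v for k, v in data.items()
                if k not in SENSITIVE_KEYS}
    body = json.dumps(data, sort_keys=True).encode("utf-8")
    if len(body) > MAX_CONTEXT_BYTES:
        raise ValueError("context exceeds length constraint")
    return {"type": context_type, "encoding": "json/utf-8",
            "length": len(body), "body": body}

item = build_context(
    "user_preference",
    {"name": "Alice", "seat": "aisle", "hotel_stars": 4})
shared = json.loads(item["body"].decode("utf-8"))
]]></sourcecode>
        <t>The shared context contains the seat and hotel preferences but not the user's name, which is dropped unless the caller explicitly allows sensitive fields.</t>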
      </section>
      <section>
        <name>Exception Handling</name>
        <t>Exception handling constitutes a critical mechanism for multi-agent collaborative task execution. If an Execution Agent cannot complete an assigned task, for example because it lacks the required skills or is overloaded, the failure in task execution may lead to actions such as releasing the connections.</t>
      </section>
    </section>

    <section>
      <name>Existing Protocol Analysis</name>
      <t>Task-oriented coordination is compatible with multiple types of existing protocols, such as TCP and HTTP.</t>
      <t>The Transmission Control Protocol (TCP) provides reliable, ordered, connection-oriented data transmission for end-to-end communication. In task-oriented coordination scenarios, some TCP capabilities can be reused directly (e.g., the retransmission mechanism and congestion control).</t>
      <t>The Hypertext Transfer Protocol (HTTP) is an application-layer protocol for distributed, collaborative, hypermedia information systems. Some HTTP features are also applicable to task-oriented coordination.</t>
      <t>However, task-oriented coordination needs to support finer-grained access control and context information anonymization mechanisms, which may require protocol enhancements. Defining the task-oriented coordination mechanism at the session layer has some advantages.</t>
    </section>
    
    <section>
      <name>Conclusions</name>
      <t>Task-oriented coordination constitutes a critical function for multi-agent collaboration. This document discusses the necessity of introducing task-oriented coordination to address complex tasks, optimize resource utilization, and guarantee service quality. Consequently, it analyzes the requirements imposed by task-oriented coordination on AI Agent protocol design, specifically concerning task descriptions, task states, communication mechanisms, context sharing, and exception handling.</t>
    </section>
    
    <section anchor="IANA">
      <name>IANA Considerations</name>
      <t>This memo includes no request to IANA.</t>
    </section>
    
    <section anchor="Security">
      <name>Security Considerations</name>
      <t>When designing task-oriented coordination for AI agent communication, privacy should always be considered.</t>
    </section>
    
  </middle>

  <back>
    <references anchor="sec-combined-references">
      <name>References</name>
      <references anchor="sec-normative-references">
        <name>Normative References</name>
        <reference anchor="RFC2119">
          <front>
            <title>Key words for use in RFCs to Indicate Requirement Levels</title>
            <author fullname="S. Bradner" initials="S." surname="Bradner"/>
            <date month="March" year="1997"/>
            <abstract>
              <t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="2119"/>
          <seriesInfo name="DOI" value="10.17487/RFC2119"/>
        </reference>
      </references>
      <references anchor="sec-informative-references">
        <name>Informative References</name>
        <reference anchor="Multi-Agents">
          <front>
            <title>Large Language Model based Multi-Agents: A Survey of Progress and Challenges</title>
            <author initials="T." surname="Guo"/>
            <author initials="X." surname="Chen"/>
            <author initials="Y." surname="Wang"/>
            <author initials="R." surname="Chang"/>
            <author initials="S." surname="Pei"/>
            <author initials="N.V." surname="Chawla"/>
            <author initials="O." surname="Wiest"/>
            <author initials="X." surname="Zhang"/>
            <date year="2024"/>
          </front>
        </reference>      
        <reference anchor="MetaGPT">
          <front>
            <title>MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework</title>
            <author initials="S." surname="Hong"/>
            <author initials="M." surname="Zhuge"/>
            <author initials="J." surname="Chen"/>
            <author initials="X." surname="Zheng"/>
            <author initials="Y." surname="Cheng"/>
            <author initials="C." surname="Zhang"/>
            <author initials="J." surname="Wang"/>
            <author initials="Z." surname="Wang"/>
            <author initials="S." surname="Yau"/>
            <author initials="Z." surname="Lin"/>
            <author initials="L." surname="Zhou"/>
            <author initials="C." surname="Ran"/>
            <author initials="L." surname="Xiao"/>
            <author initials="C." surname="Wu"/>
            <author initials="J." surname="Schmidhuber"/>
            <date year="2023"/>
          </front>
        </reference>    
      </references>
    </references>
    
 </back>
</rfc>