<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
]>
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-yang-dmsc-ioa-task-protocol-01"
     ipr="trust200902">
  <front>
    <title abbrev="IoA Task Protocol">Internet of Agents Task Protocol (IoA
    Task Protocol) for Heterogeneous Agent Collaboration</title>

    <author fullname="Cheng Yang" initials="C" surname="Yang">
      <organization>Beijing University of Posts and
      Telecommunications</organization>

      <address>
        <postal>
          <street>10 Xitucheng Road, Haidian District</street>

          <city>Beijing</city>

          <region>Beijing</region>

          <code>100876</code>

          <country>China</country>
        </postal>

        <email>yangcheng@bupt.edu.cn</email>
      </address>
    </author>

    <author fullname="Zhiyuan Liu" initials="Z" surname="Liu">
      <organization>Tsinghua University</organization>

      <address>
        <postal>
          <street>30 Shuangqing Road, Haidian District</street>

          <city>Beijing</city>

          <region>Beijing</region>

          <code>100084</code>

          <country>China</country>
        </postal>

        <email>liuzy@tsinghua.edu.cn</email>
      </address>
    </author>

    <author fullname="Aijun Wang" initials="A" surname="Wang">
      <organization>China Telecom</organization>

      <address>
        <postal>
          <street>Beiqijia Town, Changping District</street>

          <city>Beijing</city>

          <region>Beijing</region>

          <code>102209</code>

          <country>China</country>
        </postal>

        <email>wangaj3@chinatelecom.cn</email>
      </address>
    </author>

    <date day="27" month="January" year="2026"/>

    <area>ART Area</area>

    <workgroup>Dispatch Working Group</workgroup>

    <keyword>IoA Task Protocol</keyword>

    <abstract>
      <t>This draft defines a new agent collaboration protocol, named the
      Internet of Agents Task Protocol (IoA Task Protocol), to support
      distributed, heterogeneous agent collaboration in intelligent systems.
      The IoA Task Protocol enables dynamic team formation, adaptive task
      coordination, and structured communication among agents with diverse
      architectures, tools, and knowledge sources. Through a layered
      architecture and extensible message format, it supports decentralized
      deployment across devices and can interoperate with existing frameworks.
      The protocol is particularly suited to large-scale intelligent
      collaboration scenarios&mdash;such as intelligent transportation, smart
      healthcare, and large-scale human&ndash;AI teaming&mdash;across
      heterogeneous network environments, including fixed networks,
      edge&ndash;cloud infrastructures, and emerging mobile networks such as
      6G.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>With the rapid advancement of large language models (LLMs) and
      multimodal autonomous agents, modern intelligent systems are
      increasingly constructed as collaborative networks of multiple agents.
      These agents are expected to work together to solve complex, open-ended
      tasks. However, they often differ in capabilities, tools, runtime
      environments, and communication patterns, leading to significant
      challenges in interoperability, dynamic coordination, and cross-device
      deployment. As a result, current multi-agent frameworks fall short of
      the flexibility and generality required in real-world applications.</t>

      <t>In the typical collaborative setting shown in Figure 1, on-device AI
      agents with specialized functions (Device A for conceptual planning,
      Device B for academic search, Device C for content generation, and
      Device D for document analysis) must work together to complete a
      research paper on &ldquo;Internet of Agents.&rdquo; These agents are
      distributed across devices (e.g., laptops, edge nodes, cloud services),
      and each relies on different execution frameworks or data
      formats.<figure>
          <artwork align="center"><![CDATA[
+---------------------------------------------------------------------------------------+
|                                                                                       |
|                 Task: Write a research paper on "Internet of Agents"                  |
|                                                                                       |
|        +-----------------+        +-----------------+        +-----------------+      |
|        | On-Device AI    |<------>| On-Device AI    |<------>| On-Device AI    |      |
|        | Agent (Device A)|        | Agent (Device B)|        | Agent (Device C)|      |
|        +-----------------+        +-----------------+        +-----------------+      |
|                    \                     |                       /                    |
|                     \                    |                      /                     |
|                      \                   |                     /                      |
|                       \                  |                    /                       |
|                        \                 |                   /                        |
|                         \                |                  /                         |
|                          +---------------+-----------------+                          |
|                                          |                                            |
|                                 +-----------------+                                   |
|                                 | On-Device AI    |                                   |
|                                 | Agent (Device D)|                                   |
|                                 +-----------------+                                   |
|                                          |                                            |
|                             +------------+-------------+                              |
|                             |                          |                              |
|                             |       IoA Server         |                              |
|                             |                          |                              |
|                             +--------------------------+                              |
|                                                                                       |
+---------------------------------------------------------------------------------------+
                      Figure 1: Multi-agent collaboration scenario
                      
]]></artwork>
        </figure></t>

      <t>When Device B encounters a specialized PDF parsing task beyond its
      capability, existing frameworks often fail to dynamically recruit Device
      D due to rigid team formation rules. Likewise, when Device A and Device
      C attempt to synchronize intermediate results in real time, inflexible
      communication channels may result in delays or dropped information.</t>

      <t>Existing solutions exhibit several key limitations:<list
          style="symbols">
          <t>Closed frameworks that restrict integration with third-party
          agents such as AutoGPT or Open Interpreter;</t>

          <t>Single-device simulation that fails to reflect cross-device
          deployment scenarios typical in edge-cloud collaboration;</t>

          <t>Hard-coded workflows that prevent agents from switching between
          synchronous and asynchronous task execution at runtime.</t>
        </list></t>

      <t>To address these challenges, this draft introduces the Internet of
      Agents Task Protocol (IoA Task Protocol)&mdash;a layered, extensible
      collaboration standard designed for intelligent multi-agent systems. The
      core goal of the protocol is to enable seamless collaboration among
      heterogeneous agents across devices, tools, and execution environments.
      It supports:<list style="symbols">
          <t>Agent integration via a standardized interface and registration
          mechanism;</t>

          <t>Dynamic team formation across distributed environments;</t>

          <t>Finite-state machine-based session control for flexible and
          autonomous dialogue management;</t>

          <t>Structured message formats with group routing, task assignment,
          and response coordination.</t>
        </list></t>

      <t>The design of the IoA Task Protocol aligns naturally with the
      evolution of intelligent networked systems, including fixed networks and
      next-generation mobile networks such as 6G, which aim to support
      ubiquitous intelligence through large-scale, low-latency, and
      semantic-driven communication. By enabling agent collaboration across
      fixed-network infrastructures, edge devices, mobile terminals, and cloud
      nodes, IoA supports coordinated intelligence across these heterogeneous
      environments. Its structured message design, dynamic team formation, and
      abstracted dialogue control provide a foundational protocol framework
      for orchestrating intelligent services on top of them.</t>
    </section>

    <section title="Conventions used in this document">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"/>.</t>
    </section>

    <section title="Terminology">
      <t>The following terms are defined in this draft:<list style="symbols">
          <t>IoA: Internet of Agents, a protocol enabling distributed
          collaboration among heterogeneous agents across devices and 6G
          networks, defined in <xref target="sec-ioa-methods"/></t>

          <t>Agent Registry Block: A server-side module storing structured
          capability descriptions of all registered agents, supporting
          semantic search for team formation, defined in <xref
          target="sec-ioa-methods"/></t>

          <t>Team Formation Block: A client-side module responsible for
          initiating, joining, or disbanding agent teams based on task
          requirements, including nested sub-teams, defined in <xref
          target="sec-ioa-methods"/></t>

          <t>Session State Machine: A finite-state model governing
          collaboration states (Discussion, Synchronous Task Assignment,
          Asynchronous Task Assignment, Pause and Trigger, Conclusion) for
          adaptive dialogue management, defined in <xref
          target="sec-ioa-methods"/></t>

          <t>HTTP: Hypertext Transfer Protocol, an application-layer protocol
          for distributed, collaborative, hypermedia information systems,
          referenced in IoA for interoperability with web-based agents,
          defined in <xref target="RFC9110"/></t>

          <t>JSON-RPC: A remote procedure call protocol encoded in JSON,
          referenced in IoA for structured communication between web-based
          agents; the JSON format itself is defined in <xref
          target="RFC8259"/></t>

          <t>QUIC: A transport layer protocol providing secure, low-latency
          communication over UDP, used in IoA for real-time agent messaging,
          defined in <xref target="RFC9000"/></t>
        </list></t>
    </section>

    <section anchor="sec-ioa-methods" title="IoA Methods">
      <section title="IoA Architecture">
        <t>The Internet of Agents Task Protocol (IoA Task Protocol) enables
        distributed collaboration among heterogeneous agents through a layered
        architecture and distributed communication protocol. It supports
        seamless integration across devices, toolchains, and runtime
        environments.</t>

        <t>The IoA system adopts a three-layer architecture implemented
        symmetrically on both the server and client sides:<list style="symbols">
            <t>Server-side: Handles global coordination, agent discovery,
            group management, and message routing.</t>

            <t>Client-side: Encapsulates individual agents and provides
            interfaces for team collaboration and local task execution.</t>
          </list></t>

        <t>An overview of the layered structure is shown in Figure 2.</t>

        <t><figure>
            <artwork align="center"><![CDATA[
+----------------------------------------------------------------------------+ +----------------------------------------------------------------------------+
|                                   Server                                   | |                                   Client                                   |
|----------------------------------------------------------------------------| |----------------------------------------------------------------------------|
| Interaction Layer:                                                         | | Interaction Layer:                                                         |
|   - Agent Query Block: Handles semantic agent search queries               | |   - Team Formation Block: Forms/joins teams for assigned goals             |
|   - Group Setup Block: Manages group/team creation                         | |   - Communication Block: Handles chat messaging and event updates          |
|   - Message Routing Block: Routes messages within chat groups              | |----------------------------------------------------------------------------|
|----------------------------------------------------------------------------| | Data Layer:                                                                |
| Data Layer:                                                                | |   - Agent Contact Block: Caches past collaborators                         |
|   - Agent Registry Block: Stores capability descriptions of all agents     | |   - Group Info Block: Stores task metadata and group state                 |
|   - Session Management Block: Tracks WebSocket sessions and group states   | |   - Task Management Block: Tracks subtasks, assignment, and progress       |
|----------------------------------------------------------------------------| |----------------------------------------------------------------------------|
| Foundation Layer:                                                          | | Foundation Layer:                                                          |
|   - Data Infra Block: Vector database (e.g., Milvus) for semantic search   | |   - Agent Integration Block: Adapter for third-party agents                |
|   - Network Infra Block: WebSocket infrastructure                          | |   - Data Infra Block: Local DB (e.g., SQLite)                              |
|   - Security Block: Authentication and permission control                  | |   - Network Infra Block: WebSocket-based communication                     |
+----------------------------------------------------------------------------+ +----------------------------------------------------------------------------+

                  Figure 2: Layered architecture of IoA system
]]></artwork>
          </figure></t>
      </section>

      <section title="Heterogeneous Agent Integration">
        <t>IoA supports the integration of heterogeneous agents from diverse
        sources through a unified interface, including third-party agents such
        as AutoGPT, Open Interpreter, and embodied robotic agents.</t>

        <t>When a new agent joins the IoA, its client wrapper undergoes a
        registration process with the server. During this registration, the
        agent is expected to provide a comprehensive description of its
        capabilities, skills, and domains of expertise. For an agent c_i, its
        description is denoted as d_i, and is stored in the Agent Registry
        Block within the Data Layer of the server.</t>

        <t>The set of all registered agents is denoted as C = {c_1, c_2, ...,
        c_n}, where each c_i is associated with its capability description
        d_i. This mechanism enables future semantic matching and intelligent
        task allocation.</t>
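
        <t>As a non-normative illustration, the registration step described
        above can be sketched in Python as follows. The class name
        AgentRegistry, its method names, and the example agent identifiers
        and descriptions are hypothetical and are not defined by this
        document; the sketch only shows a mapping from each agent c_i to its
        description d_i.</t>

        <t><figure>
            <artwork><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class AgentRegistry:
    """Toy stand-in for the server-side Agent Registry Block."""

    # Maps agent identifier c_i to capability description d_i.
    descriptions: dict[str, str] = field(default_factory=dict)

    def register(self, agent_id: str, description: str) -> None:
        """Store d_i for agent c_i; reject empty descriptions."""
        if not description.strip():
            raise ValueError("description must not be empty")
        self.descriptions[agent_id] = description

    def agents(self) -> list[str]:
        """Return the registered agent identifiers, sorted."""
        return sorted(self.descriptions)


registry = AgentRegistry()
registry.register("device-b", "academic search over papers")
registry.register("device-d", "document analysis and PDF parsing")
print(registry.agents())
```
]]></artwork>
          </figure></t>

        <t>A real deployment would persist the descriptions and, as Figure 2
        suggests, index them in a vector database; the in-memory dictionary
        here is only an assumption made for brevity.</t>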
      </section>

      <section title="Autonomous Team Formation">
        <t>Agents initiate the search process by submitting capability
        requirements to the Agent Query Block. The server performs semantic
        matching using vector similarity and returns candidate agents from the
        Agent Registry Block.</t>
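
        <t>The matching step can be illustrated with the following
        non-normative Python sketch. It assumes a toy bag-of-words
        embedding in place of a learned embedding model, and the function
        names embed, cosine, and query_agents are hypothetical. It ranks
        registered agents by the cosine similarity between the requirement
        and each capability description.</t>

        <t><figure>
            <artwork><![CDATA[
```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical embedding: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_agents(requirement, registry, top_k=2):
    """Rank agent ids by similarity of d_i to the requirement."""
    q = embed(requirement)
    ranked = sorted(
        registry,
        key=lambda agent_id: cosine(q, embed(registry[agent_id])),
        reverse=True,
    )
    return ranked[:top_k]


registry = {
    "device-a": "conceptual planning and outlining",
    "device-b": "academic search over paper databases",
    "device-d": "document analysis and pdf parsing",
}
candidates = query_agents("pdf parsing", registry, top_k=1)
print(candidates)
```
]]></artwork>
          </figure></t>

        <t>In this example only the description of device-d shares terms
        with the requirement, so it is returned as the single candidate.</t>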

        <t>IoA supports nested team structures. An initial group is formed for
        the main goal, and subgroups are recursively created if subtasks
        require new capabilities. This forms a hierarchical tree structure,
        reducing communication complexity and organizational overhead.</t>

        <t>The entire team formation process is autonomous, task-driven,
        device-agnostic, and self-organizing.</t>
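
        <t>The nested team structure described in this section can be
        sketched as a tree, as in the following non-normative Python
        example. Interpreting team_up_depth as the nesting level of a team
        (0 for the initial group) is an assumption of this sketch, and the
        Team class and its methods are hypothetical.</t>

        <t><figure>
            <artwork><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Team:
    goal: str
    team_members: list[str]
    team_up_depth: int = 0  # assumption: 0 for the initial group
    subteams: list["Team"] = field(default_factory=list)

    def form_subteam(self, goal: str, members: list[str]) -> "Team":
        """Create a child team one level deeper for a subtask."""
        child = Team(goal, members, self.team_up_depth + 1)
        self.subteams.append(child)
        return child


def max_depth(team: Team) -> int:
    """Depth of the deepest team in the tree rooted at team."""
    depths = [max_depth(t) for t in team.subteams]
    return max([team.team_up_depth] + depths)


root = Team(
    "Write a research paper",
    ["device-a", "device-b", "device-c"],
)
parsing = root.form_subteam(
    "Parse cited PDFs", ["device-b", "device-d"]
)
print(max_depth(root))
```
]]></artwork>
          </figure></t>

        <t>Messages for the subtask then only need to reach the members of
        the subteam, which is how the hierarchy is expected to reduce
        communication overhead.</t>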
      </section>

      <section title="Session and Task Management Method">
        <t>IoA models group conversations and collaboration using a
        finite-state machine with five abstract states:<list style="symbols">
            <t>Discussion: Agents engage in general dialogue, exchange ideas,
            and clarify task requirements;</t>

            <t>Synchronous task assignment: Tasks are assigned to specific
            agents, pausing the group chat until completion;</t>

            <t>Asynchronous task assignment: Tasks are assigned without
            interrupting the ongoing discussion;</t>

            <t>Pause &amp; trigger: The group chat is paused, waiting for the
            completion of specified asynchronous tasks;</t>

            <t>Conclusion: Marks the end of the collaboration, prompting a
            final summary.</t>
          </list></t>

        <t>State transitions are managed autonomously by a coordinator agent
        using the conversation history and session context to determine the
        next state and speaker.</t>
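
        <t>A minimal, non-normative Python sketch of this state machine is
        given below. This document does not enumerate the permitted
        transitions, so the sketch assumes that a session starts in
        Discussion and that only Conclusion is terminal; the Session class
        and the enum value strings are hypothetical.</t>

        <t><figure>
            <artwork><![CDATA[
```python
from enum import Enum


class SessionState(Enum):
    DISCUSSION = "discussion"
    SYNC_TASK_ASSIGNMENT = "sync_task_assignment"
    ASYNC_TASK_ASSIGNMENT = "async_task_assignment"
    PAUSE_AND_TRIGGER = "pause_and_trigger"
    CONCLUSION = "conclusion"


class Session:
    """Tracks the current state of one group chat."""

    def __init__(self) -> None:
        # Assumption: every session starts in Discussion.
        self.state = SessionState.DISCUSSION
        self.history = [self.state]

    def transition(self, next_state: SessionState) -> None:
        """Apply the coordinator's decision.

        Assumption: Conclusion is terminal; any other move is
        allowed.
        """
        if self.state is SessionState.CONCLUSION:
            raise RuntimeError("session already concluded")
        self.state = next_state
        self.history.append(next_state)


session = Session()
session.transition(SessionState.ASYNC_TASK_ASSIGNMENT)
session.transition(SessionState.PAUSE_AND_TRIGGER)
session.transition(SessionState.CONCLUSION)
print([s.value for s in session.history])
```
]]></artwork>
          </figure></t>

        <t>In an actual implementation the argument to transition would be
        chosen by the coordinator agent from the conversation history rather
        than hard-coded as it is here.</t>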
      </section>

      <section title="Message Protocol Overview">
        <t>The agent message protocol in IoA is designed for extensibility and
        flexibility, enabling effective collaboration among heterogeneous
        agents. Each message consists of two main parts: a header and a
        payload.</t>

        <t>The header contains essential metadata to ensure proper routing and
        processing. Key fields include:<list style="symbols">
            <t>sender: The unique identifier of the agent sending the
            message.</t>

            <t>state: The current collaboration state associated with the
            message.</t>

            <t>group_id: The identifier of the group chat to which the message
            belongs.</t>
          </list></t>

        <t>The common header fields shared by all message types are
        illustrated in Figure 3.</t>

        <t><figure>
            <artwork align="center"><![CDATA[
+--------------------------+
|         Header           |
+--------------------------+
| sender: str              |
| state: enum              |
| group_id: str            |
+--------------------------+
       Figure 3: Common header fields in IoA Message Protocol
]]></artwork>
          </figure></t>

        <t>The payload carries the main content of the message and varies
        depending on message type. Common fields include:</t>

        <t><list style="symbols">
            <t>message_type: Indicates the purpose of the message (e.g.,
            discussion, task assignment, pause and trigger).</t>

            <t>next_speaker: The identifier(s) of the agent(s) expected to
            respond.</t>
          </list></t>

        <t>The full structure of the message format is illustrated in Figure
        4.</t>

        <t><figure>
            <artwork align="center"><![CDATA[
+-----------------------------+   +-----------------------------+
|  Autonomous Team Formation  |   |      Task Assignment        |
+-----------------------------+   +-----------------------------+
| goal: str                   |   | task_id: str                |
| team_members: list[str]     |   | task_desc: str              |
| team_up_depth: int          |   | task_conclusion: str        |
| max_turns: int              |   | task_abstract: str          |
+-----------------------------+   +-----------------------------+
+-----------------------------+   +-----------------------------+
|         Discussion          |   |    Pause & Trigger          |
+-----------------------------+   +-----------------------------+
| content: str                |   | triggers: list[str]         |
| type: enum                  |   +-----------------------------+
| next_speaker: list[str]     |
+-----------------------------+
       Figure 4: Structure of IoA Message Protocol   
]]></artwork>
          </figure></t>
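
        <t>To make the header and payload split concrete, the following
        non-normative Python sketch builds and serializes a Discussion
        message using the field names from Figures 3 and 4. The JSON wire
        encoding, the "header" and "payload" keys, and the example value
        "proposal" for the type field are assumptions of this sketch; this
        document does not specify them.</t>

        <t><figure>
            <artwork><![CDATA[
```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Header:
    sender: str
    state: str  # one of the five session states
    group_id: str


@dataclass
class DiscussionPayload:
    content: str
    type: str  # enum in Figure 4; values are not specified
    next_speaker: list[str]


def encode(header: Header, payload: DiscussionPayload) -> str:
    """Serialize a message as JSON (encoding is an assumption)."""
    body = {"header": asdict(header), "payload": asdict(payload)}
    return json.dumps(body, sort_keys=True)


msg = encode(
    Header(sender="device-a", state="discussion", group_id="g-1"),
    DiscussionPayload(
        content="Draft the outline first.",
        type="proposal",  # hypothetical enum value
        next_speaker=["device-c"],
    ),
)
decoded = json.loads(msg)
print(decoded["payload"]["next_speaker"])
```
]]></artwork>
          </figure></t>

        <t>The other payload types in Figure 4 could be modeled the same
        way, with one dataclass per message type sharing the common
        header.</t>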
      </section>
    </section>

    <section title="Positioning of the IoA Task Protocol in the Network Layering System">
      <t>From an architectural perspective, the IoA Task Protocol is
      positioned at the application layer, built on top of transport and
      session protocols such as TCP, UDP, WebSocket, and QUIC. This
      positioning allows IoA to remain independent of underlying network
      technologies and enables deployment across heterogeneous networking
      environments, including fixed networks, edge&ndash;cloud
      infrastructures, and mobile networks.</t>

      <t>In terms of functional mapping, IoA's three-layer architecture
      corresponds to the computer network layers as follows:</t>

      <t><list style="symbols">
          <t>Interaction Layer &rarr; Maps to the application layer,
          responsible for high-level logic such as message protocols, group
          collaboration, and session state transitions.</t>

          <t>Data Layer &rarr; Spans the application layer and session layer,
          managing agent states, group metadata, and context tracking.</t>

          <t>Foundation Layer &rarr; Corresponds to the transport layer and
          system infrastructure, including secure communication channels
          (e.g., WebSocket/QUIC), databases, and network service modules.</t>
        </list></t>

      <t>Since the IoA Task Protocol involves intelligent behaviors such as
      agent orchestration, semantic-driven interaction, and session control,
      an intelligence layer can be introduced above the traditional
      application layer. This layer encapsulates core intelligent
      collaboration logic (semantic-based agent matching, AI-driven session
      strategy optimization, dynamic task decomposition, and team
      reorganization) into standardized message formats. It shields
      upper-layer applications and lower-layer protocols from the complexity
      of intelligent decision-making, so that each can focus on its core
      function (e.g., scenario-specific task execution at the application
      layer, reliable data transmission at the transport layer) without
      concern for how intelligence is implemented. Its advantages are
      threefold: it standardizes the collaboration of heterogeneous agents,
      reducing integration costs across diverse deployment environments; it
      improves communication efficiency through semantic compression and
      adaptive feature optimization; and it enables modular extensibility to
      support new intelligent behaviors and emerging application
      scenarios.</t>
    </section>

    <section title="Relation to the A2A Protocol">
      <t>The Agent-to-Agent (A2A) protocol is a communication standard
      designed to support standardized, secure, and modality-agnostic
      interaction between AI agents. Built upon existing web technologies such
      as HTTP, Server-Sent Events (SSE), and JSON-RPC, A2A emphasizes default
      security, support for long-running tasks, and cross-modality
      interoperability. It introduces the concept of an AgentCard to describe
      agent capabilities, enabling effective discovery and invocation.</t>

      <t>The IoA Task Protocol shares the same fundamental goal as A2A: to
      break down communication barriers among
      agents and improve the overall efficiency of multi-agent systems. Both
      protocols rely on network communication technologies and adopt similar
      approaches to message encoding, decoding, and task coordination.</t>

      <t>However, the two protocols diverge significantly in terms of design
      philosophy and core mechanisms:</t>

      <t><list style="symbols">
          <t>A2A focuses on enabling standardized communication through
          web-native technologies, effectively creating a "free trade zone"
          for agents where interoperability is built-in. In contrast, the IoA
          Task Protocol draws inspiration from Internet architecture and
          targets the problem of ecosystem fragmentation. It establishes a
          system-level collaboration platform where heterogeneous agents can
          freely register, discover one another, and collaborate across
          platforms and devices.</t>

          <t>A2A is based on HTTP and JSON-RPC for communication, combined
          with task lifecycle management and capability discovery through
          AgentCard. IoA, on the other hand, offers a more comprehensive
          collaboration framework, including agent registration, autonomous
          nested team formation, finite-state-machine-driven session control,
          and trigger-based task coordination.</t>

          <t>While A2A is suitable for standardized task responses and
          streaming updates, it lacks native support for dynamic session
          management and nested subtask structures. IoA enables adaptive
          interaction flow via a session state machine, and its team_up_depth
          field supports recursive team formation and state
          transitions&mdash;making it more effective for handling complex and
          evolving task scenarios.</t>
        </list></t>

      <t>In summary, A2A is well-suited for lightweight, standardized task
      interfaces, whereas IoA provides a more flexible and system-oriented
      protocol for large-scale, heterogeneous, and dynamic multi-agent
      collaboration. The two protocols can complement each other at different
      layers, jointly advancing the development of agent communication
      technologies.</t>
    </section>

    <section title="Future Enhancements across Heterogeneous Networks">
      <t>To fully realize the potential of intelligent systems operating
      across heterogeneous network environments&mdash;including fixed networks
      and next-generation mobile networks such as 6G&mdash;the Internet of
      Agents Task Protocol (IoA Task Protocol) requires continuous
      architectural evolution and standardization. This section outlines key
      directions for future enhancements to improve scalability,
      decentralization, interoperability, and network integration.</t>

      <section title="Distributed Agent Registration and Discovery">
        <t>The current IoA design relies on a centralized server model, which
        may limit scalability and introduce single points of failure under
        large-scale deployment. A promising direction is to adopt a
        decentralized registration and discovery mechanism, where agents
        publish their capabilities to a shared registry exposed through a
        web-based interface. Inspired by the Domain Name System (DNS) and
        search engines, agents could be discoverable through keyword-based or
        semantic search at scale, enabling lightweight browser-based or
        API-based discovery across domains.</t>

        <t>This decentralized lookup layer would allow IoA to support
        scenarios where agents operate across multiple domains, owners, and
        physical networks, while still maintaining secure and authenticated
        interaction through digital signatures and trust mechanisms.</t>
      </section>

      <section title="Enhanced Scalability and Fault Tolerance">
        <t>To scale beyond millions of agents, the IoA Task Protocol should
        adopt sharding and region-based message routing. Distributed
        registries and dynamic load balancing can reduce latency and avoid
        bottlenecks. Caching of frequent agent metadata at edge nodes is also
        critical for fast retrieval in latency-sensitive deployment
        scenarios.</t>
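
        <t>One possible realization of sharded, region-based routing is
        sketched below in Python. It is non-normative: the function
        shard_for, the region label, the shard naming scheme, and the choice
        of a SHA-256 hash over the group identifier are all assumptions of
        this sketch and are not part of the protocol.</t>

        <t><figure>
            <artwork><![CDATA[
```python
import hashlib


def shard_for(group_id: str, region: str, shards: int) -> str:
    """Map a group chat to a routing shard inside its region."""
    if shards < 1:
        raise ValueError("shards must be at least 1")
    digest = hashlib.sha256(group_id.encode("utf-8")).digest()
    # A stable hash keeps a group on the same shard across calls.
    index = int.from_bytes(digest[:8], "big") % shards
    return f"{region}/shard-{index}"


first = shard_for("g-1", "region-east", 4)
second = shard_for("g-1", "region-east", 4)
print(first == second)
```
]]></artwork>
          </figure></t>

        <t>Simple modulo placement moves many groups when the shard count
        changes; a production design would more likely use consistent
        hashing, which is outside the scope of this sketch.</t>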
      </section>

      <section title="Semantic Interoperability and Ontology Alignment">
        <t>In highly heterogeneous environments, agents may describe their
        capabilities using different terminologies. To address this, the IoA
        Task Protocol should support ontology mapping and alignment
        mechanisms. This allows agents with differing skill descriptors to
        still interoperate, using shared or translated task definitions during
        team formation and dialogue.</t>
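
        <t>As a simple, non-normative illustration of alignment, the Python
        sketch below maps differing skill descriptors onto shared terms
        before comparing them. The alias table, the shared term names, and
        the functions canonical and same_capability are hypothetical; a
        real ontology mapping would be considerably richer.</t>

        <t><figure>
            <artwork><![CDATA[
```python
# Hypothetical alias table: agent-specific term -> shared term.
ALIASES = {
    "pdf parsing": "document-analysis",
    "document analysis": "document-analysis",
    "literature lookup": "academic-search",
    "academic search": "academic-search",
}


def canonical(skill: str) -> str:
    """Return the shared term, or the normalized input if unknown."""
    key = " ".join(skill.lower().split())
    return ALIASES.get(key, key)


def same_capability(a: str, b: str) -> bool:
    """True if both descriptors align to the same shared term."""
    return canonical(a) == canonical(b)


print(same_capability("PDF parsing", "Document  Analysis"))
```
]]></artwork>
          </figure></t>

        <t>Descriptors that are absent from the table fall back to their
        normalized text, so unknown skills still match themselves but are
        not aligned with synonyms.</t>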
      </section>

      <section title="Security and Privacy Enhancements">
        <t>For mission-critical 6G scenarios (e.g., autonomous vehicles,
        medical AI), the protocol must incorporate stronger security
        primitives. These include: <list style="symbols">
            <t>End-to-end encryption with forward secrecy.</t>

            <t>Support for zero-trust architectures with agent attestation and
            secure enclaves.</t>

            <t>Fine-grained access control based on agent role and session
            context.</t>
          </list></t>
      </section>
    </section>

    <section title="Security Considerations">
      <t>IoA servers and agents store sensitive data, including capability
      descriptors, session state metadata, and task execution logs, and
      maintaining this state consumes memory and computational resources. To
      mitigate the risks of resource exhaustion and unauthorized access, IoA
      entities must authenticate peers via token-based validation, for
      example using OAuth 2.0 <xref target="RFC6749"/>, before processing
      registration requests or collaboration messages. Additionally, all
      data transmission between entities must use TLS 1.3 as specified in
      <xref target="RFC8446"/> to ensure confidentiality and integrity and
      to prevent eavesdropping and tampering.</t>
    </section>

    <section title="IANA Considerations">
      <t>[TBD] This document defines a new protocol for heterogeneous agent
      collaboration: the Internet of Agents Task Protocol (IoA Task Protocol).
      The protocol's code point allocation will be determined in subsequent
      revisions as the standard matures, in accordance with IANA's relevant
      registration procedures.</t>
    </section>

    <section title="Acknowledgement">
      <t>The authors thank Weize Chen, Ziming You, Ran Li, Yitong Guan, Chen
      Qian, Chenyang Zhao, Ruobing Xie, Maosong Sun, and Yu Hao for their
      valuable comments on this draft.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include="reference.RFC.9110"?>

      <?rfc include="reference.RFC.8259"?>

      <?rfc include="reference.RFC.9000"?>

      <?rfc include="reference.RFC.7432"?>

      <?rfc include="reference.RFC.6749"?>

      <?rfc include="reference.RFC.8446"?>
    </references>
  </back>
</rfc>
