<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>

<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="std"
  docName="draft-eggert-uidbatches-00"
  ipr="trust200902"
  obsoletes=""
  updates=""
  submissionType="IETF"
  xml:lang="en"
  version="3">
  <front>
    <title>IMAP UIDBATCHES Extension</title>
    <seriesInfo name="Internet-Draft" value="draft-eggert-uidbatches-00"/>
    <author fullname="Daniel Eggert" initials="D" role="editor" surname="Eggert">
      <organization>Apple Inc</organization>
      <address>
        <postal>
          <street>One Apple Park Way</street>
          <city>Cupertino</city>
          <region>CA</region>
          <code>95014</code>
          <country>USA</country>
        </postal>
        <email>deggert@apple.com</email>
      </address>
    </author>
   
    <date year="2024" month="11" day="4"/>

    <area>Art</area>
    <workgroup>MailMaint</workgroup>
    <keyword>IMAP</keyword>

    <abstract>
      <t>The UIDBATCHES extension of the Internet Message Access Protocol (IMAP) allows clients to retrieve UIDs from the server such that these UIDs split the messages of a mailbox into equally sized batches. This lets the client perform operations such as FETCH, SEARCH, or STORE on these specific batches, which limits the number of messages that each command operates on and may limit the size of the response.</t>
    </abstract>
   </front>

  <middle>
    
    <section>
      <name>Introduction</name>
      <t>This document defines an extension to the Internet Message Access Protocol <xref target="RFC3501"/> for retrieving UIDs that split a mailbox's messages into equally sized batches.  This extension is compatible with both IMAP4rev1 <xref target="RFC3501"/> and IMAP4rev2 <xref target="RFC9051"/>.</t>
    </section>

  <section>
      <name>Document Conventions</name>
        
        <t>In protocol examples, this document uses a prefix of "C: " to denote lines sent by the client to the server, and "S: " for lines sent by the server to the client.  Lines prefixed with "// " are comments explaining the previous protocol line.  These prefixes and comments are not part of the protocol.  Lines without any of these prefixes are continuations of the previous line, and no line break is present in the protocol unless specifically mentioned.</t>
        <t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>",
          "<bcp14>SHALL NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT
          RECOMMENDED</bcp14>", "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be
          interpreted as described in BCP 14 <xref target="RFC2119"/>
          <xref target="RFC8174"/> when, and only when, they appear in
          all capitals, as shown here.</t>
        <t>Other capitalised words are IMAP keywords <xref target="RFC3501"/> or keywords from this document.</t>
    </section>

    <section>
      <name>The UIDBATCHES extension</name>
      <t>An IMAP server advertises support for the UIDBATCHES extension by including the UIDBATCHES capability in the CAPABILITY response / response code.</t>
      <section>
        <name>UIDBATCHES Command</name>
        
        <dl newline="true">
          <dt>Arguments:</dt>
          <dd>Message count per batch.<br/>Optional batch range.</dd>
          <dt>Responses:</dt>
          <dd>UIDBATCHES response</dd>
          <dt>Result:</dt>
          <dd>OK<br/>BAD command unknown or arguments invalid</dd>
        </dl>
        
        <t>When the client sends a UIDBATCHES command to the server, the server returns those UIDs that split the messages in the currently selected mailbox into equally sized batches.</t>
        <t>For a mailbox with &lt;N&gt; messages, requesting batches of size &lt;M&gt; will return the UIDs of the messages with the following sequence numbers:</t>
        <sourcecode>
          <![CDATA[
<N-M>, <N-2*M>, <N-3*M>, ...
          ]]>
        </sourcecode>
        
        <t>For example, if the client sends</t>
        <sourcecode>
          <![CDATA[
C: A302 UIDBATCHES 2000
          ]]>
        </sourcecode>
        <t>and the currently selected mailbox has 6823 messages, the server might
        return</t>
        <sourcecode>
          <![CDATA[
S: * UIDBATCHES (TAG "A302") UID ALL 99696 20351 7830
S: A302 OK UIDBATCHES Completed
          ]]>
        </sourcecode>
        <t>where the message with sequence number 4823 has UID 99696, the message with sequence number 2823 has UID 20351, and the message with sequence number 823 has UID 7830.</t>
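        <t>The arithmetic behind this example can be sketched in a few lines of Python. This is a non-normative illustration only: the function name <tt>batch_boundaries</tt> is hypothetical, and stopping before sequence number 1 is one reading of this document's statement that the UID of the message with sequence number 1 is not returned.</t>
        <sourcecode type="python">
          <![CDATA[
```python
def batch_boundaries(message_count: int, batch_size: int) -> list[int]:
    """Return the sequence numbers N-M, N-2*M, ... whose UIDs split the mailbox."""
    if batch_size <= 0:
        raise ValueError("batch size must be a positive integer")
    boundaries = []
    seq = message_count - batch_size
    # Assumption: stop before sequence number 1, whose UID is not returned.
    while seq > 1:
        boundaries.append(seq)
        seq -= batch_size
    return boundaries


# The example mailbox: 6823 messages, batches of 2000.
print(batch_boundaries(6823, 2000))  # [4823, 2823, 823]
```
          ]]>
        </sourcecode>
        <t>A server would then look up the UIDs of the messages at these sequence numbers; how it does so efficiently is an implementation matter.</t>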
        <t>The server <bcp14>MUST</bcp14> reply with a UIDBATCHES response. The UIDBATCHES response has the same format as an ESEARCH untagged response to a UID SEARCH command with <tt>RETURN ()</tt>. Notably, it <bcp14>MUST</bcp14> include the tag of the command it relates to, and it <bcp14>MUST</bcp14> include the UID indicator.</t>
        <t>Note that this will not return the highest UID since the client is expected to know this value from UIDNEXT or any newly received messages. Similarly, this will not return the UID of the message with sequence number 1.</t>
        <t>A client can optionally provide a batch range. The server then limits its response to the UIDs at the given batch indices. For example, if the client sends</t>
        <sourcecode>
          <![CDATA[
C: A302 UIDBATCHES 2000 100:200
          ]]>
        </sourcecode>
        <t>for a mailbox with more than 400,000 messages, the server would return the UIDs for the 100th to 200th batch, corresponding to the 200,000th and 400,000th message respectively. These UIDs would still split the mailbox's messages into batches of size 2000. The first returned UID would thus correspond to the sequence number 200,000, and the second returned UID would correspond to the sequence number 202,000.</t>
        <t>As such, the client can request the first 4 batches with</t>
        <sourcecode>
          <![CDATA[
C: A302 UIDBATCHES 2000 1:4
          ]]>
        </sourcecode>
        <t>If the selected mailbox has more than 8,000 messages, the server would then return a UIDBATCHES response with 4 UIDs.</t>
        <t>When the client issues any valid UIDBATCHES command and the mailbox is empty, the server <bcp14>MUST</bcp14> reply with a UIDBATCHES response. This UIDBATCHES response will not have an <tt>ALL</tt> part, similar to a UID SEARCH that doesn't match any messages. For example:</t>
        <sourcecode>
          <![CDATA[
S: * UIDBATCHES (TAG "A302") UID
S: A302 OK UIDBATCHES Completed
          ]]>
        </sourcecode>

        <t>The server <bcp14>MAY</bcp14> return fewer UIDs than requested by the client even if the mailbox contains more messages. Since the client knows the number of messages in the mailbox, it can determine whether the server returned all UIDs. Servers <bcp14>MUST</bcp14> return at least the first 40 results unless the client requested fewer. Servers <bcp14>SHOULD</bcp14> return at least the first 100 results unless the client requested fewer.</t>
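        <t>A client-side completeness check along these lines might look as follows. This Python sketch is non-normative: the function names are hypothetical, it covers only a command sent without a batch range, and it assumes that a complete response holds one UID for each of the sequence numbers N-M, N-2*M, ... that is greater than 1.</t>
        <sourcecode type="python">
          <![CDATA[
```python
def expected_uid_count(message_count: int, batch_size: int) -> int:
    """Number of UIDs a complete response would hold (no batch range given)."""
    # Assumption: boundaries are N-M, N-2*M, ... and stop before sequence number 1.
    return max(0, (message_count - 2) // batch_size)


def response_is_complete(returned_uids: list[int], message_count: int,
                         batch_size: int) -> bool:
    """True if the server returned every boundary UID for the mailbox."""
    return len(returned_uids) >= expected_uid_count(message_count, batch_size)


# 6823 messages in batches of 2000 yield three boundary UIDs.
print(response_is_complete([99696, 20351, 7830], 6823, 2000))  # True
print(response_is_complete([99696], 6823, 2000))               # False
```
          ]]>
        </sourcecode>
        <t>If the check fails, the client could request the remaining batches with a batch range.</t>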
        <t>Servers <bcp14>MUST</bcp14> respond with <tt>BAD</tt> and a response code <tt>TOO SMALL</tt> if the client uses a batch size that is smaller than the minimum allowed by the server, e.g.</t>
        <sourcecode>
          <![CDATA[
S: A302 BAD [TOO SMALL] Minimum batch size is 500
          ]]>
        </sourcecode>
        <t>Servers <bcp14>MUST</bcp14> allow batch sizes of 500 or larger.</t>
      </section>
      <section>
        <name>Intended Use</name>
        
        <t>When the client selects a mailbox, it can use the UIDBATCHES command to find the UIDs that split the mailbox's messages into batches. For example:</t>
        <sourcecode>
          <![CDATA[
C: A142 SELECT INBOX
S: * 6823 EXISTS
S: * 1 RECENT
S: * OK [UNSEEN 12] Message 12 is first unseen
S: * OK [UIDVALIDITY 3857529045] UIDs valid
S: * OK [UIDNEXT 215296] Predicted next UID
S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
S: * OK [PERMANENTFLAGS (\Deleted \Seen \*)] Limited
S: A142 OK [READ-WRITE] SELECT completed
C: A143 UIDBATCHES 2000
S: * UIDBATCHES (TAG "A143") UID ALL 99696 20351 7830
S: A143 OK UIDBATCHES Completed
          ]]>
        </sourcecode>
        <t>The client can then use these 4 UID ranges:</t>
        <ol type="%d." group="reqs">
            <li>215295:99697</li>
            <li>99696:20352</li>
            <li>20351:7831</li>
            <li>7830:1</li>
        </ol>
        <t>where each range has 2,000 messages in it, except for the last range, which only holds the remaining 823 messages.</t>
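        <t>Deriving non-overlapping UID ranges from the response can be sketched as below. This Python example is non-normative: the function name <tt>uid_ranges</tt> is hypothetical, and it assumes that each returned UID is the highest UID of its batch, so that the batch above it ends at the next higher UID value and holds exactly the requested number of messages.</t>
        <sourcecode type="python">
          <![CDATA[
```python
def uid_ranges(uid_next: int, boundary_uids: list[int]) -> list[str]:
    """Turn the UIDs from a UIDBATCHES response into UID ranges, highest first."""
    upper = uid_next - 1  # highest UID that can currently exist
    if upper < 1:
        return []  # no message has ever been assigned a UID
    ranges = []
    for boundary in boundary_uids:
        # Assumption: the boundary message opens the next (lower) batch,
        # so the current batch ends just above it.
        ranges.append(f"{upper}:{boundary + 1}")
        upper = boundary
    ranges.append(f"{upper}:1")  # the final batch holds the remaining messages
    return ranges


# UIDNEXT and UIDs taken from the SELECT / UIDBATCHES exchange above.
print(uid_ranges(215296, [99696, 20351, 7830]))
```
          ]]>
        </sourcecode>
        <t>Because UID values need not be contiguous, the lower end of each range is only a bound; the lowest message actually present in a batch may have a higher UID.</t>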
        <t>Since new messages cannot appear within these UID ranges, the number of messages in each range cannot grow. It may decrease, though, as messages get deleted.</t>
        <t>The client may choose to keep track of the number of EXPUNGE or VANISH messages and re-run UIDBATCHES when many messages have been deleted. The client <bcp14>MUST NOT</bcp14> excessively re-run UIDBATCHES and specifically <bcp14>MUST NOT</bcp14> re-run UIDBATCHES unless at least N/2 messages have been deleted from the mailbox, where N is the batch size the client has requested.</t>
        <t>Similarly, once new messages arrive in the mailbox, the client can start a new message batch 215296:*. Once N or more new messages have arrived, the client can then create a second new batch based on the UID of the Nth message. Alternatively, the client may choose to re-run UIDBATCHES. The client <bcp14>MUST NOT</bcp14> re-run UIDBATCHES if fewer than N/2 new messages have been received.</t>
        <t>To clarify, the client <bcp14>MUST NOT</bcp14> re-run UIDBATCHES unless at least one of these conditions is met:</t>
        <ol type="%d." group="reqs">
          <li>a new mailbox has been selected</li>
          <li>at least N/2 messages have been expunged from the mailbox</li>
          <li>at least N/2 new messages have been received into the mailbox</li>
        </ol>
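        <t>A client might encode these conditions in a small guard such as the following. This Python sketch is non-normative: the function and parameter names are hypothetical, the counters are assumed to be maintained by the client since its last UIDBATCHES command, and the threshold is taken to be "at least N/2" as stated in the preceding paragraphs.</t>
        <sourcecode type="python">
          <![CDATA[
```python
def may_rerun_uidbatches(batch_size: int, expunged: int, received: int,
                         new_mailbox_selected: bool = False) -> bool:
    """Return True if the client is permitted to re-run UIDBATCHES."""
    # Assumption: the threshold is "at least N/2", with N the batch size
    # the client requested; counts are tracked since the last run.
    threshold = batch_size / 2
    return (new_mailbox_selected
            or expunged >= threshold
            or received >= threshold)


print(may_rerun_uidbatches(2000, expunged=120, received=40))   # False
print(may_rerun_uidbatches(2000, expunged=1000, received=0))   # True
```
          ]]>
        </sourcecode>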
      </section>
      <section>
        <name>Similarity to UID SEARCH Command</name>
        <t>The UIDBATCHES command is in effect nothing more than shorthand for a UID SEARCH command of the form</t>
        <sourcecode>
          <![CDATA[
C: A145 UID SEARCH RETURN () <N-M>,<N-2*M>,<N-3*M>,...
          ]]>
        </sourcecode>
        <t>where N is the number of messages in the mailbox and M is the
        requested batch size.</t>
        <t>The special purpose UIDBATCHES command, though, tries to
        address two problems:</t>
        <ol type="(%c)">
          <li>for many servers, UID SEARCH commands specifying sequence numbers are costly, especially for mailboxes with many messages.</li>
          <li>the UIDONLY extension disallows the use of sequence numbers and thus makes it difficult for the client to split its commands into batches of a size that works well for the client and server.</li>
        </ol>
        <t>By providing a special purpose command, servers can implement a different, optimized code path for determining message batches. And servers using the UIDONLY extension can provide a facility to let the client determine message batches without using sequence numbers in a UID SEARCH command.</t>
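        <t>For a server that does not advertise UIDBATCHES, a client could build the equivalent UID SEARCH command described above roughly as follows. This Python sketch is non-normative: the function name <tt>uid_search_fallback</tt> is hypothetical, stopping before sequence number 1 is an assumption, and the resulting command is not usable on a connection operating in UIDONLY mode because it relies on sequence numbers.</t>
        <sourcecode type="python">
          <![CDATA[
```python
from typing import Optional


def uid_search_fallback(tag: str, message_count: int,
                        batch_size: int) -> Optional[str]:
    """Build the UID SEARCH command that yields the same UIDs as UIDBATCHES."""
    if batch_size <= 0:
        raise ValueError("batch size must be a positive integer")
    # Sequence numbers N-M, N-2*M, ...; assumption: stop before sequence number 1.
    seqs = range(message_count - batch_size, 1, -batch_size)
    if not seqs:
        return None  # the mailbox fits into a single batch
    return f"{tag} UID SEARCH RETURN () {','.join(str(s) for s in seqs)}"


print(uid_search_fallback("A145", 6823, 2000))
# A145 UID SEARCH RETURN () 4823,2823,823
```
          ]]>
        </sourcecode>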
      </section>
      <section>
        <name>Similarity to PARTIAL Extension</name>
        
        <t>The PARTIAL extension provides a different way for the client to split its commands into batches by using paged SEARCH and FETCH.</t>
        <t>The intention of the UIDBATCHES command is to let the client pre-determine message batches of a desired size.</t>
        <t>This makes it easier for the client to share implementation between servers regardless of their support of PARTIAL. Additionally, because the client can issue a corresponding UID SEARCH command to servers that do not implement UIDBATCHES, the client can use similar batching implementations for servers that support UIDBATCHES and those that do not.</t>
      </section>
      <section>
        <name>Interaction with MESSAGELIMIT Extension</name>
        
        <t>When the server supports both the MESSAGELIMIT and UIDBATCHES extensions, the client <bcp14>SHOULD</bcp14> request batches no larger than the specified maximum number of messages that can be processed in a single command. The client <bcp14>MAY</bcp14> choose to use a smaller batch size.</t>
        <t>Additionally, since servers <bcp14>MAY</bcp14> limit the number of UIDs returned in response to UIDBATCHES, it is reasonable to assume that they would at most return N UIDs where N is the limit the server announced as its MESSAGELIMIT.</t>
      </section>
      <section>
        <name>Interaction with UIDONLY Extension</name>
        <t>As noted above, the UIDBATCHES extension allows clients to create UID ranges for message batches even when the connection operates in UIDONLY mode, which otherwise does not allow the use of message sequence numbers.</t>
      </section>
      <section>
        <name>Interaction with SEARCHRES Extension</name>
        
        <t>Servers that support SEARCHRES <xref target="RFC5182"/> <bcp14>MUST NOT</bcp14> store the result of UIDBATCHES in the <tt>$</tt> variable.</t>
      </section>
    </section>
      <section>
        <name>Formal syntax</name>
        
        <t>The following syntax specification uses the Augmented Backus-Naur Form (ABNF) notation as specified in <xref target="RFC5234"/>.</t>
        <t>Non-terminals referenced but not defined below are as defined by
        IMAP4 <xref target="RFC3501"/>.</t>
        <t>Except as noted otherwise, all alphabetic characters are case-insensitive.  The use of upper or lower case characters to define token strings is for editorial clarity only.  Implementations <bcp14>MUST</bcp14> accept these strings in a case-insensitive fashion.</t>
        <sourcecode>
          <![CDATA[
capability          =/ "UIDBATCHES"
                       ;; <capability> from [RFC3501]

command-select      =/ message-batches

message-batches     = "UIDBATCHES" SP nz-number
                      [SP nz-number ":" nz-number]

uidbatches-response = "UIDBATCHES" search-correlator SP "UID"
                      [SP "ALL" SP tagged-ext-simple]

mailbox-data        =/ uidbatches-response
          ]]>
        </sourcecode>
      </section>
      <section>
        <name>Security Considerations</name>
        <t>This document defines an additional IMAP4 capability.  As such, it does not change the underlying security considerations of IMAP4rev1 <xref target="RFC3501"/> and IMAP4rev2 <xref target="RFC9051"/>.</t>
        <t>This document defines an optimization that can reduce both the amount of work performed by the server and the amount of data returned to the client.  Use of this extension is likely to cause the server and the client to use less memory than when the extension is not used.  However, as this is going to be new code in both the client and the server, rigorous testing of such code is required in order to avoid introducing new implementation bugs.</t>
      </section>
    <section>
      <name>IANA Considerations</name>
      <section>
        <name>Changes/additions to the IMAP4 capabilities registry</name>
        <t>IMAP4 capabilities are registered by publishing a standards track or IESG approved Informational or Experimental RFC. The registry is currently located at:</t>
        <t>https://www.iana.org/assignments/imap4-capabilities</t>
        <t>IANA is requested to add registrations of the "UIDBATCHES" capability to this registry, pointing to this document.</t>
      </section>
    </section>

  </middle>

  <back>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3501.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9051.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5234.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5182.xml"/>
      </references>
     </references>
 </back>
</rfc>
