IMAP4 Multimailbox SEARCH Extension
Huawei Technologies
+1 646 827 0648
barryleiba@computer.org
http://internetmessagingtechnology.org/
Isode Limited
14 Castle Mews
Hampton
Middlesex
TW12 2NP
United Kingdom
Alexey.Melnikov@isode.com
http://www.melnikov.ca/
Applications
Applications Area Working Group
IMAP
email
search
multiple mailboxes
imapext
The IMAP4 specification allows the searching of only the selected
mailbox. A user often wants to search multiple mailboxes, and a
client that wishes to support this must issue a series of SELECT and
SEARCH commands, waiting for each to complete before moving on to the
next.
This extension allows a client to search multiple mailboxes
with one command, limiting the delays caused by many round
trips and not requiring disruption of the currently selected
mailbox.
This extension also uses MAILBOX, UIDVALIDITY, and TAG fields in
ESEARCH responses, allowing a client to pipeline the searches if it
chooses. This document updates RFC 4466 and obsoletes RFC 6237.
The IMAP4 specification allows the searching of only the selected
mailbox. A user often wants to search multiple mailboxes, and a
client that wishes to support this must issue a series of SELECT and
SEARCH commands, waiting for each to complete before moving on to the
next. The commands can't be pipelined, because the server might run
them in parallel and the untagged SEARCH responses could not then
be distinguished from each other.
This extension allows a client to search multiple mailboxes
with one command and includes MAILBOX and TAG fields in the ESEARCH response,
yielding the following advantages:
A single command limits the number of round trips needed to search
a set of mailboxes.
A single command eliminates the need to wait for one search to complete
before starting the next.
A single command allows the server to optimize the search if it can.
A command that is not dependent upon the selected mailbox eliminates the
need to disrupt the selection state or to open another IMAP connection.
The MAILBOX, UIDVALIDITY, and TAG fields in the responses
allow a client to distinguish which responses go with which search
(and which mailbox). A client can safely pipeline these search
commands without danger of confusion. The addition of the
MAILBOX and UIDVALIDITY fields updates the search-correlator
item defined in RFC 4466.
This extension was previously published in an Experimental RFC.
There is now implementation experience, giving confidence in the protocol, so
this document puts the extension on the Standards Track, with some minor updates
that were informed by the implementation experience.
A brief summary of the changes since RFC 6237 appears near the end of this document.
In examples, "C:" indicates lines sent by a client that is connected
to a server, and "S:" indicates lines sent by the server to the client.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document
are to be interpreted as described in RFC 2119.
Arguments:  OPTIONAL source options
            OPTIONAL result options
            OPTIONAL charset specification (see RFC 2978)
            searching criteria (one or more)
Responses:  REQUIRED untagged response: ESEARCH
Result:     OK -- search completed
            NO -- error: cannot search that charset or criteria
            BAD -- command unknown or arguments invalid
This section defines a new ESEARCH command, which works similarly to
the UID SEARCH command described in
Section 2.6.1 of RFC 4466
(initially described in Section 6.4.4 of RFC 3501
and extended by RFC 4731).
The ESEARCH command further extends searching by allowing for
optional source and result options.
This document does not define any new result options
(see Section 3.1 of RFC 4731).
A server that supports this extension includes "MULTISEARCH"
in its IMAP capability string.
Because there has been confusion about this, it is worth pointing out
that with ESEARCH, as with any SEARCH or UID SEARCH command, it MUST NOT
be considered an error if the search terms include a range of message
numbers that extends (or, in fact, starts) beyond the end of the mailbox.
For example, a client might want to establish a rolling window through
the search results this way:
C: tag1 UID ESEARCH FROM "frobozz" 1:100
... followed later by this:
C: tag1 UID ESEARCH FROM "frobozz" 101:200
... and so on.
This tells the server to match only the first hundred messages
in the mailbox the first time, the second hundred the second time, etc.
In fact, it is likely to allow the server to optimize the search significantly.
In the above example, whether the mailbox contains 50, 150, or 250 messages,
neither of the search commands shown will result in an error. It is up to
the client to know when to stop moving its search window.
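As an illustration only, the rolling-window approach described above might be sketched in Python as follows. The function name window_commands, the distinct tag per window, and the use of the client's known message count (exists) as the stopping point are assumptions of this sketch and are not defined by this extension; the command text mirrors the example above.

```python
from typing import Iterator


def window_commands(sender: str, exists: int, window: int = 100) -> Iterator[str]:
    """Yield one ESEARCH command line per window of message numbers.

    `exists` is the client's idea of how many messages the mailbox holds
    (an assumption of this sketch). The last window may extend past the
    end of the mailbox, which the server must not treat as an error.
    """
    start = 1
    n = 1
    while start <= exists:
        end = start + window - 1
        yield f'tag{n} UID ESEARCH FROM "{sender}" {start}:{end}'
        start += window
        n += 1


# A mailbox believed to hold 150 messages needs two windows.
commands = list(window_commands("frobozz", exists=150))
```

The sketch stops once the window would start beyond the known message count, which is one possible way for a client to decide when to stop moving its window.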
In response to an ESEARCH command,
the server MUST return ESEARCH responses
(that is, not SEARCH responses).
Because message numbers are not useful for mailboxes that are not selected,
the responses MUST contain information about UIDs, not message numbers.
This is true even if the source options specify that only the selected mailbox
be searched.
Presence of a source option in the absence of a result option implies the "ALL" result
option (see Section 3.1 of RFC 4731). Note that this is not the same
as the result from the SEARCH command described in the IMAP base
protocol (RFC 3501).
Source options describe which mailboxes must be searched for messages.
An ESEARCH command with source options does not affect which
mailbox, if any, is currently selected, regardless of which mailboxes are searched.
For each mailbox satisfying the source options, a single
ESEARCH response MUST be returned
if any messages in that mailbox match the search criteria.
An ESEARCH response MUST NOT
be returned for mailboxes that contain no matching messages.
This is true even when result options such as MIN, MAX, and COUNT are specified
(see Section 3.1 of RFC 4731), and the values returned
(lowest UID matched, highest UID matched, and number of messages matched,
respectively) apply to the mailbox reported in that ESEARCH response.
Note that it is possible for an ESEARCH command to return no untagged responses
(no ESEARCH responses at all) in the case that there are no matches to the search
in any of the mailboxes that satisfy the source options.
Clients can detect this situation by finding the tagged OK response without having
received any matching untagged ESEARCH responses.
Each ESEARCH response MUST contain the MAILBOX, TAG, and UIDVALIDITY correlators.
Correlators allow clients to issue several ESEARCH commands at once (pipelined).
If the SEARCHRES extension is used in an ESEARCH
command, that ESEARCH command MUST be executed by the server after all previous
SEARCH/ESEARCH commands have completed and before any subsequent SEARCH/ESEARCH
commands are executed.
The server MAY perform consecutive ESEARCH commands in parallel
as long as none of them use the SEARCHRES extension.
The source options, if present, MUST contain a mailbox specifier
as defined in the IMAP NOTIFY extension, RFC 5465, Section 6
(using the "filter-mailboxes" ABNF item), with the
following differences:
The "selected-delayed" specifier is not valid here.
A "subtree-one" specifier is added. The "subtree" specifier
results in a search of the specified mailbox and all selectable
mailboxes that are subordinate to it, through an indefinitely
deep hierarchy.
The "subtree-one" specifier results in a search of
the specified mailbox and all selectable child mailboxes,
one hierarchy level down.
If "subtree" is specified, the server MUST defend against loops
in the hierarchy (for example, those caused by recursive file-system
links within the message store).
The server SHOULD do this by keeping track of the mailboxes that have
been searched and by terminating the hierarchy traversal when a repeat
is found.
If it cannot do that, it MAY do it by limiting the hierarchy depth.
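A minimal sketch of the loop defense described above, written in Python for illustration. It assumes a hypothetical in-memory hierarchy (a dict mapping each mailbox to its children) in which a mailbox's name serves as its identity; a real message store would need some stable identifier of its own. The function name subtree_mailboxes and its parameters are inventions of this sketch.

```python
def subtree_mailboxes(root, children, max_depth=None):
    """List the mailboxes reachable from `root`, visiting each only once.

    `children` maps a mailbox to the mailboxes directly below it and may
    contain loops. `max_depth` optionally limits how far below `root`
    the traversal goes (1 corresponds to "subtree-one").
    """
    seen = set()
    order = []
    stack = [(root, 0)]
    while stack:
        name, depth = stack.pop()
        if name in seen:
            continue  # repeat found: do not search or descend again
        seen.add(name)
        order.append(name)
        if max_depth is not None and depth >= max_depth:
            continue  # fallback defense: stop at the depth limit
        for child in reversed(children.get(name, [])):
            stack.append((child, depth + 1))
    return order


# "folder2/peach" links back to "folder2", forming a loop.
hierarchy = {
    "folder2": ["folder2/banana", "folder2/peach"],
    "folder2/peach": ["folder2"],
}
found = subtree_mailboxes("folder2", hierarchy)
```

Tracking visited mailboxes is the approach the text recommends; the optional depth limit corresponds to the fallback that the text permits when tracking is not possible.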
If the source options are not present, the value "selected" is
assumed -- that is, only the currently selected mailbox is searched.
The "personal" source option is a particularly convenient way to search all of
the current user's mailboxes.
Note that there is no way to use wildcard characters to search all mailboxes;
the "mailboxes" source option does not do wildcard expansion.
If the source options include (or default to) "selected",
the IMAP session MUST be in "selected" state.
If the source options specify other mailboxes and NOT "selected",
then the IMAP session MUST be in either "selected" or "authenticated" state.
If the session is not in a correct state,
the ESEARCH command MUST return a "BAD" result.
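The state rules above can be sketched as a small Python check. This is an illustration under assumptions: the function name esearch_state_result, the state strings, and the representation of source options as a list of specifier keywords are all hypothetical.

```python
def esearch_state_result(session_state, specifiers=None):
    """Return "BAD" if the session state does not permit the search, else None.

    `specifiers` lists the source-option keywords in use, for example
    ["mailboxes", "subtree"]; None or an empty list means that no source
    options were given, so "selected" is assumed.
    """
    effective = list(specifiers) if specifiers else ["selected"]
    if "selected" in effective:
        allowed = {"selected"}
    else:
        allowed = {"selected", "authenticated"}
    return None if session_state in allowed else "BAD"
```

For example, a search that defaults to "selected" is rejected in "authenticated" state, while a search of "personal" mailboxes alone is accepted there.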
The client SHOULD NOT provide source options that resolve to including the same
mailbox more than once. A server can, of course, remove the duplicates before
processing, but the server MAY return "BAD" to an ESEARCH command with duplicate
source mailboxes.
If the server supports the SEARCHRES extension, then
the "SAVE" result option is valid only if "selected" is specified
or defaulted to as the sole mailbox to be searched.
If any source option other than "selected" is specified,
the ESEARCH command MUST return a "BAD" result.
If the server supports the CONTEXT=SEARCH and/or CONTEXT=SORT extension
(RFC 5267), then the following additional rules apply:
The CONTEXT return option (Section 4.2 of RFC 5267) can be used
with an ESEARCH command.
If the UPDATE return option is used (Section 4.3 of RFC 5267),
it MUST apply only to the currently selected mailbox.
If UPDATE is used and there is no mailbox currently selected,
the ESEARCH command MUST return a "BAD" result.
The PARTIAL search return option (Section 4.4 of RFC 5267)
can be used and applies to each mailbox searched by the ESEARCH command.
If the server supports the Access Control List (ACL) extension, then
the logged-in user is required to have the "r" right for each mailbox
she wants to search.
In addition, the user is required to have the "l" right for any mailboxes
that are not explicitly named
(those accessed through "personal" or "subtree", for example).
Mailboxes matching the source options for which
the logged-in user lacks sufficient rights MUST be ignored
by the ESEARCH command processing.
In particular, ESEARCH responses MUST NOT be returned for those mailboxes.
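The access rules above might be applied as in the following Python sketch. The function name mailboxes_to_search and the representation of rights as a dict from mailbox name to a rights string are assumptions made for illustration; they are not part of the ACL extension.

```python
def mailboxes_to_search(candidates, rights, explicitly_named):
    """Filter candidate mailboxes by the logged-in user's ACL rights.

    `rights` maps a mailbox name to the rights string held by the user,
    for example "lr"; a missing entry means no rights at all.
    """
    allowed = []
    for mailbox in candidates:
        held = rights.get(mailbox, "")
        if "r" not in held:
            continue  # cannot read: ignore silently, no ESEARCH response
        if mailbox not in explicitly_named and "l" not in held:
            continue  # reached via "personal" or "subtree" without "l"
        allowed.append(mailbox)
    return allowed


rights = {
    "folder1": "r",
    "folder2": "lr",
    "folder2/banana": "r",
    "folder2/peach": "lr",
}
candidates = ["folder1", "folder2", "folder2/banana",
              "folder2/peach", "folder2/secret"]
searched = mailboxes_to_search(candidates, rights, {"folder1", "folder2"})
```

Mailboxes that fail the check are simply dropped, so the client sees no difference between a mailbox it may not search and one with no matching messages.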
The base IMAP SEARCH command (Section 6.4.4 of RFC 3501)
requires strict substring matching in text searches. Many servers, however,
use search engines that match strings in different ways, for
example, matching
"swim" to both "swam" and "swum" or only doing full word matching (where
"swim" will not match "swimming").
This is covered by the "Fuzzy Search" extension
to IMAP (RFC 6203), and that extension is compatible with this one
and can be combined with it.
Whether or not Fuzzy Search is implemented or used, this extension explicitly allows
flexible searching with respect to TEXT and BODY searches. Servers MAY use fuzzy text
matching in multimailbox searches.
To avoid having a search use more than a reasonable share of server resources,
servers MAY apply limits that go beyond loop protection, such as limits on the
number of mailboxes that may be searched at once and/or limits on the number
or total size of messages searched.
A server can apply those limits up front, responding with "NO [LIMIT]"
if a limit is exceeded
(see RFC 5530 for information about response codes).
Alternatively, a server can process the search and terminate it when a limit
is exceeded, responding with "OK [LIMIT]" and returning partial results.
Note that searches that return partial results can cause
complexity for client implementations and confusion to users.
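The two ways of applying a limit might look like the following Python sketch. It is illustrative only: the function run_search, the search_one callable, and the choice of a mailbox-count limit are assumptions, and a real server could limit other resources instead.

```python
def run_search(mailboxes, search_one, max_mailboxes, check_up_front=True):
    """Return (esearch_results, tagged_status) for a multimailbox search.

    `search_one` is a hypothetical callable that returns the matching
    UIDs for one mailbox.
    """
    if check_up_front and len(mailboxes) > max_mailboxes:
        return [], "NO [LIMIT]"  # refuse before doing any work
    results = []
    for searched, mailbox in enumerate(mailboxes):
        if searched >= max_mailboxes:
            return results, "OK [LIMIT]"  # stop early; results are partial
        uids = search_one(mailbox)
        if uids:  # no ESEARCH response for mailboxes without matches
            results.append((mailbox, uids))
    return results, "OK"


store = {"a": [1, 2], "b": [], "c": [7]}
refused = run_search(["a", "b", "c"], store.get, 2)
partial = run_search(["a", "b", "c"], store.get, 2, check_up_front=False)
complete = run_search(["a", "b", "c"], store.get, 5)
```

In the partial case, the client cannot tell from the results alone which mailboxes were never searched, which is one source of the client complexity noted above.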
In the following example, note that two ESEARCH commands are pipelined
and that the server is running them in parallel, interleaving a response
to the second search amid the responses to the first (watch the tags).
C: tag1 ESEARCH IN (mailboxes "folder1" subtree "folder2") unseen
C: tag2 ESEARCH IN (mailboxes "folder1" subtree-one "folder2") subject "chad"
S: * ESEARCH (TAG "tag1" MAILBOX "folder1" UIDVALIDITY 1) UID ALL 4001,4003,4005,4007,4009
S: * ESEARCH (TAG "tag2" MAILBOX "folder1" UIDVALIDITY 1) UID ALL 3001:3004,3788
S: * ESEARCH (TAG "tag1" MAILBOX "folder2/banana" UIDVALIDITY 503) UID ALL 3002,4004
S: * ESEARCH (TAG "tag1" MAILBOX "folder2/peach" UIDVALIDITY 3) UID ALL 921691
S: tag1 OK done
S: * ESEARCH (TAG "tag2" MAILBOX "folder2/salmon" UIDVALIDITY 1111111) UID ALL 50003,50006,50009,50012
S: tag2 OK done
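A client might sort interleaved responses such as those above by their correlators. The Python sketch below is a simplified illustration and not a complete IMAP parser: it assumes correlator values are quoted strings or numbers, handles only the "UID ALL" result form shown in the example, and uses hypothetical function names.

```python
import re
from collections import defaultdict

# Simplified patterns (assumptions of this sketch, see the lead-in).
LINE_RE = re.compile(r'^\* ESEARCH \((?P<corr>[^)]*)\) UID ALL (?P<uids>\S+)$')
CORR_RE = re.compile(r'(TAG|MAILBOX|UIDVALIDITY) (?:"([^"]*)"|(\d+))')


def expand_uid_set(uid_set):
    """Expand a set such as '3001:3004,3788' into a list of integers."""
    uids = []
    for part in uid_set.split(","):
        if ":" in part:
            first, last = (int(x) for x in part.split(":"))
            uids.extend(range(min(first, last), max(first, last) + 1))
        else:
            uids.append(int(part))
    return uids


def parse_esearch(line):
    """Return (tag, mailbox, uidvalidity, uids), or None if not recognized."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    corr = {}
    for name, quoted, number in CORR_RE.findall(m.group("corr")):
        corr[name] = int(number) if number else quoted
    if set(corr) != {"TAG", "MAILBOX", "UIDVALIDITY"}:
        return None  # all three correlators are expected
    return (corr["TAG"], corr["MAILBOX"], corr["UIDVALIDITY"],
            expand_uid_set(m.group("uids")))


def group_by_tag(lines):
    """Collect results per command tag, keyed by (mailbox, uidvalidity)."""
    results = defaultdict(dict)
    for line in lines:
        parsed = parse_esearch(line)
        if parsed is not None:
            tag, mailbox, uidvalidity, uids = parsed
            results[tag][(mailbox, uidvalidity)] = uids
    return results


responses = [
    '* ESEARCH (TAG "tag1" MAILBOX "folder1" UIDVALIDITY 1) UID ALL 4001,4003,4005,4007,4009',
    '* ESEARCH (TAG "tag2" MAILBOX "folder1" UIDVALIDITY 1) UID ALL 3001:3004,3788',
    '* ESEARCH (TAG "tag1" MAILBOX "folder2/banana" UIDVALIDITY 503) UID ALL 3002,4004',
    '* ESEARCH (TAG "tag1" MAILBOX "folder2/peach" UIDVALIDITY 3) UID ALL 921691',
    'tag1 OK done',
    '* ESEARCH (TAG "tag2" MAILBOX "folder2/salmon" UIDVALIDITY 1111111) UID ALL 50003,50006,50009,50012',
    'tag2 OK done',
]
grouped = group_by_tag(responses)
```

Keying the results by tag first keeps the two pipelined searches apart even though both return a response for "folder1".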
The following syntax specification uses the Augmented Backus-Naur Form (ABNF)
as described in RFC 5234.
Terms not defined here are taken from RFC 3501, RFC 5465,
or RFC 4466.
command-auth =/ esearch
        ; Update definition from IMAP base (RFC 3501).
        ; Add new "esearch" command.

command-select =/ esearch
        ; Update definition from IMAP base (RFC 3501).
        ; Add new "esearch" command.

filter-mailboxes-other =/ ("subtree-one" SP one-or-more-mailbox)
        ; Update definition from IMAP Notify (RFC 5465).
        ; Add new "subtree-one" selector.

filter-mailboxes-selected = "selected"
        ; Update definition from IMAP Notify (RFC 5465).
        ; We forbid the use of "selected-delayed".

one-correlator = ("TAG" SP tag-string) /
                 ("MAILBOX" SP astring) / ("UIDVALIDITY" SP nz-number)
        ; Each correlator MUST appear exactly once.

scope-option = scope-option-name [SP scope-option-value]
        ; No options defined here. Syntax for future extensions.

scope-option-name = tagged-ext-label
        ; No options defined here. Syntax for future extensions.

scope-option-value = tagged-ext-val
        ; No options defined here. Syntax for future extensions.

scope-options = scope-option *(SP scope-option)
        ; A given option may only appear once.
        ; No options defined here. Syntax for future extensions.

esearch = "ESEARCH" [SP esearch-source-opts]
          [SP search-return-opts] SP search-program

search-correlator = SP "(" one-correlator *(SP one-correlator) ")"
        ; Updates definition in IMAP4 ABNF (RFC 4466).

esearch-source-opts = "IN" SP "(" source-mbox [SP
                      "(" scope-options ")"] ")"

source-mbox = filter-mailboxes *(SP filter-mailboxes)
        ; "filter-mailboxes" is defined in IMAP Notify (RFC 5465).
        ; See updated definition of filter-mailboxes-other, above.
        ; See updated definition of filter-mailboxes-selected, above.
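To show how the syntax above fits together, the following Python sketch assembles ESEARCH command lines. It is a hedged illustration: the helper names quote and build_esearch are hypothetical, the parenthesized form for several mailboxes and the RETURN (...) form for result options are taken from the NOTIFY and IMAP4 ABNF documents as the author of this sketch understands them, and IMAP literals and non-ASCII mailbox names are not handled.

```python
def quote(value):
    """Render a value as an IMAP quoted string (no literal support here)."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_esearch(tag, criteria, sources=None, return_opts=None):
    """Assemble one ESEARCH command line.

    `sources` is a list of (specifier, mailbox names) pairs; specifiers
    such as "selected" or "personal" take an empty list of names.
    """
    parts = [tag, "ESEARCH"]
    if sources:
        items = []
        for specifier, names in sources:
            items.append(specifier)
            if len(names) == 1:
                items.append(quote(names[0]))
            elif names:
                items.append("(" + " ".join(quote(n) for n in names) + ")")
        parts.append("IN (" + " ".join(items) + ")")
    if return_opts:
        parts.append("RETURN (" + " ".join(return_opts) + ")")
    parts.append(criteria)
    return " ".join(parts)


first = build_esearch(
    "tag1", "unseen",
    [("mailboxes", ["folder1"]), ("subtree", ["folder2"])],
)
second = build_esearch(
    "tag2", 'subject "chad"',
    [("mailboxes", ["folder1"]), ("subtree-one", ["folder2"])],
)
```

The two calls reproduce the pipelined commands from the example earlier in this document.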
This new IMAP ESEARCH command allows a single command to search many
mailboxes at once.
On the one hand, a client could do that by sending many IMAP SEARCH commands.
On the other hand, this makes it easier for a client to overwork a
server by sending a single command that results in an expensive search
of tens of thousands of mailboxes.
Server implementations need to be aware of that and provide mechanisms
that prevent a client from adversely affecting other users.
Limitations on the number of mailboxes that may be searched in one command and/or
on the server resources that will be devoted to responding to a single client,
are reasonable limitations for an implementation to impose
(see also the discussion of server-imposed limits on searches, earlier in this document).
Implementations MUST, of course, apply access controls appropriately,
limiting a user's access to ESEARCH in the same way its access is
limited for any other IMAP commands.
This extension has no data-access risks beyond what may exist in the unextended
IMAP implementation.
Mailboxes matching the source options for which
the logged-in user lacks sufficient rights MUST be ignored
by the ESEARCH command processing
(see the paragraph about this in the description of source options).
In particular,
any attempt to distinguish insufficient access from non-existent mailboxes
may expose information about the mailbox hierarchy that isn't otherwise
available to the client.
If "subtree" is specified, the server MUST defend against loops in the hierarchy
(see the paragraph about this in the description of source options).
The "Internet Message Access Protocol (IMAP) Capabilities Registry" is currently located at
<http://www.iana.org/assignments/imap-capabilities>.
IANA has changed the reference for the IMAP capability "MULTISEARCH"
to point to this document.
Change to Standards Track.
Added paragraph about duplicate mailboxes.
Added text about Fuzzy Search.
Key words for use in RFCs to Indicate Requirement Levels
In many standards track documents several words are used to signify
the requirements in the specification. These words are often
capitalized. This document defines these words as they should be
interpreted in IETF documents. Authors who follow these guidelines
should incorporate this phrase near the beginning of their document:
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
RFC 2119.
Note that the force of these words is modified by the requirement
level of the document in which they are used.
IANA Charset Registration Procedures
Multipurpose Internet Mail Extensions (MIME) and various other Internet protocols are capable of using many different charsets. This in turn means that the ability to label different charsets is essential. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.
INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4rev1
The Internet Message Access Protocol, Version 4rev1 (IMAP4rev1) allows a client to access and manipulate electronic mail messages on a server. IMAP4rev1 permits manipulation of mailboxes (remote message folders) in a way that is functionally equivalent to local folders. IMAP4rev1 also provides the capability for an offline client to resynchronize with the server. IMAP4rev1 includes operations for creating, deleting, and renaming mailboxes, checking for new messages, permanently removing messages, setting and clearing flags, RFC 2822 and RFC 2045 parsing, searching, and selective fetching of message attributes, texts, and portions thereof. Messages in IMAP4rev1 are accessed by the use of numbers. These numbers are either message sequence numbers or unique identifiers. IMAP4rev1 supports a single server. A mechanism for accessing configuration information to support multiple IMAP4rev1 servers is discussed in RFC 2244. IMAP4rev1 does not specify a means of posting mail; this function is handled by a mail transfer protocol such as RFC 2821. [STANDARDS-TRACK]
IMAP4 Access Control List (ACL) Extension
The Access Control List (ACL) extension (RFC 2086) of the Internet Message Access Protocol (IMAP) permits mailbox access control lists to be retrieved and manipulated through the IMAP protocol. This document is a revision of RFC 2086. It defines several new access control rights and clarifies which rights are required for different IMAP commands. [STANDARDS-TRACK]
Collected Extensions to IMAP4 ABNF
Over the years, many documents from IMAPEXT and LEMONADE working groups, as well as many individual documents, have added syntactic extensions to many base IMAP commands described in RFC 3501. For ease of reference, this document collects most of such ABNF changes in one place. This document also suggests a set of standard patterns for adding options and extensions to several existing IMAP commands defined in RFC 3501. The patterns provide for compatibility between existing and future extensions. This document updates ABNF in RFCs 2088, 2342, 3501, 3502, and 3516. It also includes part of the errata to RFC 3501. This document doesn't specify any semantic changes to the listed RFCs. [STANDARDS-TRACK]
IMAP4 Extension to SEARCH Command for Controlling What Kind of Information Is Returned
This document extends IMAP (RFC 3501) SEARCH and UID SEARCH commands with several result options, which can control what kind of information is returned. The following result options are defined: minimal value, maximal value, all found messages, and number of found messages. [STANDARDS-TRACK]
IMAP Extension for Referencing the Last SEARCH Result
Many IMAP clients use the result of a SEARCH command as the input to perform another operation, for example, fetching the found messages, deleting them, or copying them to another mailbox. This can be achieved using standard IMAP operations described in RFC 3501; however, this would be suboptimal. The server will send the list of found messages to the client; after that, the client will have to parse the list, reformat it, and send it back to the server. The client can't pipeline the SEARCH command with the subsequent command, and, as a result, the server might not be able to perform some optimizations. This document proposes an IMAP extension that allows a client to tell a server to use the result of a SEARCH (or Unique Identifier (UID) SEARCH) command as an input to any subsequent command. [STANDARDS-TRACK]
Augmented BNF for Syntax Specifications: ABNF
Internet technical specifications often need to define a formal syntax. Over the years, a modified version of Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been popular among many Internet specifications. The current specification documents ABNF. It balances compactness and simplicity with reasonable representational power. The differences between standard BNF and ABNF involve naming rules, repetition, alternatives, order-independence, and value ranges. This specification also supplies additional rule definitions and encoding for a core lexical analyzer of the type common to several Internet specifications. [STANDARDS-TRACK]
Contexts for IMAP4
The IMAP4rev1 protocol has powerful search facilities as part of the core protocol, but lacks the ability to create live, updated results that can be easily handled. This memo provides such an extension, and shows how it can be used to provide a facility similar to virtual mailboxes. [STANDARDS-TRACK]
The IMAP NOTIFY Extension
This document defines an IMAP extension that allows a client to request specific kinds of unsolicited notifications for specified mailboxes, such as messages being added to or deleted from such mailboxes. [STANDARDS-TRACK]
IMAP Response Codes
IMAP responses consist of a response type (OK, NO, BAD), an optional machine-readable response code, and a human-readable text. This document collects and documents a variety of machine-readable response codes, for better interoperation and error reporting. [STANDARDS-TRACK]
IMAP4 Extension for Fuzzy Search
This document describes an IMAP protocol extension enabling a server to perform searches with inexact matching and assigning relevancy scores for matched messages. [STANDARDS-TRACK]
The authors gratefully acknowledge feedback provided by Timo Sirainen, Peter Coates,
Arnt Gulbrandsen, and Chris Newman.