Keywords: search, full text, email
Network Working Group                                         J. Degener
Request for Comments: 5173                                   P. Guenther
Updates: 5229                                             Sendmail, Inc.
Category: Standards Track                                     April 2008


                 Sieve Email Filtering: Body Extension

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Abstract

   This document defines a new command for the "Sieve" email filtering
   language that tests for the occurrence of one or more strings in the
   body of an email message.

Degener & Guenther          Standards Track                     [Page 1]

RFC 5173         Sieve Email Filtering: Body Extension        April 2008

1.  Introduction

   The "body" test checks for the occurrence of one or more strings in
   the body of an email message.  Such a test was initially discussed
   for the [SIEVE] base document, but was subsequently removed because
   it was thought to be too costly to implement.

   Nevertheless, several server vendors have implemented some form of
   the "body" test.

   This document reintroduces the "body" test as an extension, and
   specifies its syntax and semantics.

2.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [KEYWORDS].

   Conventions for notations are as in [SIEVE] Section 1.1, including
   the use of the "Usage:" label for the definition of text and tagged
   argument syntax.

   The rules for interpreting the grammar are defined in [SIEVE] and
   inherited by this specification.  In particular, readers of this
   document are reminded that according to [SIEVE] Sections 2.6.2 and
   2.6.3, optional arguments such as COMPARATOR and MATCH-TYPE can
   appear in any order.

3.  Capability Identifier

   The capability string associated with the extension defined in this
   document is "body".

4.  Test body

   Usage: "body" [COMPARATOR] [MATCH-TYPE] [BODY-TRANSFORM]
          <key-list: string-list>

   The body test matches content in the body of an email message, that
   is, anything following the first empty line after the header.  (The
   empty line itself, if present, is not considered to be part of the
   body.)

   The COMPARATOR and MATCH-TYPE keyword parameters are defined in
   [SIEVE].  As specified in Sections 2.7.1 and 2.7.3 of [SIEVE], the
   default COMPARATOR is "i;ascii-casemap" and the default MATCH-TYPE
   is ":is".

   The BODY-TRANSFORM is a keyword parameter that governs how a set of
   strings to be matched against are extracted from the body of the
   message.  If a message consists of a header only, not followed by an
   empty line, then that set is empty and all "body" tests return false,
   including those that test for an empty string.  (This is similar to
   how the "header" test always fails when the named header fields
   aren't present.)  Otherwise, the transform must be followed as
   defined below in Section 5.

   Note that the transformations defined here do *not* match against
   each line of the message independently, so the strings will usually
   contain CRLFs.  How these can be matched is governed by the
   comparator and match-type.  For example, with the default comparator
   of "i;ascii-casemap", they can be included literally in the key
   strings, or be matched with the "*" or "?" wildcards of the :matches
   match-type, or be skipped with :contains.

5.  Body Transform

   Prior to matching content in a message body, "transformations" can be
   applied that filter and decode certain parts of the body.  These
   transformations are selected by a "BODY-TRANSFORM" keyword parameter.

   Usage: ":raw"
        / ":content" <content-types: string-list>
        / ":text"

   The default transformation is :text.

5.1.  Body Transform ":raw"

   The ":raw" transform matches against the entire undecoded body of a
   message as a single item.

   If the specified body-transform is ":raw", the [MIME] structure of
   the body is irrelevant.  The implementation MUST NOT remove any
   transfer encoding from the message, MUST NOT refuse to filter
   messages with syntactic errors (unless the environment it is part of
   rejects them outright), and MUST treat multipart boundaries or the
   MIME headers of enclosed body parts as part of the content being
   matched against, instead of MIME structures to interpret.

   Example:

        require "body";

        # This will match a message containing the literal text
        # "MAKE MONEY FAST" in body parts (ignoring any
        # content-transfer-encodings) or MIME headers other than
        # the outermost RFC 2822 header.

        if body :raw :contains "MAKE MONEY FAST" {
                discard;
        }

5.2.  Body Transform ":content"

   If the body transform is ":content", the MIME parts that have the
   specified content types are matched against independently.

   If an individual content type begins or ends with a '/' (slash) or
   contains multiple slashes, then it matches no content types.
   Otherwise, if it contains a slash, then it specifies a full
   <type>/<subtype> pair, and matches only that specific content type.
   If it is the empty string, all MIME content types are matched.
   Otherwise, it specifies a <type> only, and any subtype of that type
   matches it.

   The search for MIME parts matching the :content specification is
   recursive and automatically descends into multipart and
   message/rfc822 MIME parts.  All MIME parts with matching types are
   searched for the key strings.  The test returns true if any
   combination of a searched MIME part and key-list argument match.
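
   As a non-normative illustration, the rules for matching a single
   :content type specification against a MIME part's content type can
   be sketched in Python.  The function name content_spec_matches is
   hypothetical.  The sketch assumes the part's content type is given
   as a bare "type/subtype" string with parameters already removed, and
   it compares case-insensitively, which is an assumption based on MIME
   type names being case-insensitive and is not stated in this section.

```python
def content_spec_matches(spec: str, content_type: str) -> bool:
    """Return True if one :content specification selects a MIME type.

    Assumption: `content_type` is a bare "type/subtype" string, with
    parameters such as charset or boundary already stripped.
    """
    # Assumption: compare case-insensitively.
    spec = spec.lower()
    content_type = content_type.lower()

    # A leading slash, a trailing slash, or more than one slash
    # matches no content types at all.
    if spec.startswith("/") or spec.endswith("/") or spec.count("/") > 1:
        return False

    # Exactly one interior slash: a full <type>/<subtype> pair that
    # matches only that specific content type.
    if "/" in spec:
        return spec == content_type

    # The empty string matches every MIME content type.
    if spec == "":
        return True

    # Otherwise the specification names a <type> only, and any
    # subtype of that type matches.
    return spec == content_type.split("/", 1)[0]


if __name__ == "__main__":
    for spec in ("text", "text/html", "", "/html", "text/", "a/b/c"):
        print(repr(spec), content_spec_matches(spec, "text/html"))
```

   Under these assumptions, the specification "text" selects both
   text/plain and text/html parts, while "text/" selects nothing.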

   If the :content specification matches a multipart MIME part, only the
   prologue and epilogue sections of the part will be searched for the
   key strings, treating the entire prologue and the entire epilogue as
   separate strings; the contents of nested parts are only searched if
   their respective types match the :content specification.

   If the :content specification matches a message/rfc822 MIME part,
   only the header of the nested message will be searched for the key
   strings, treating the header as a single string; the contents of the
   nested message body parts are only searched if their content type
   matches the :content specification.

   For other MIME types, the entire part will be searched as a single
   string.

   (Matches against container types with an empty match string can be
   useful as tests for the existence of such parts.)

   Example:

        From: Whomever
        To: Someone
        Date: Whenever
        Subject: whatever
        Content-Type: multipart/mixed; boundary=outer

      & This is a multi-part message in MIME format.
      &
        --outer
        Content-Type: multipart/alternative; boundary=inner

      & This is a nested multi-part message in MIME format.
      &
        --inner
        Content-Type: text/plain; charset="us-ascii"

      $ Hello
      $
        --inner
        Content-Type: text/html; charset="us-ascii"

      % <html><body>Hello</body></html>
      %
        --inner--
      &
      & This is the end of the inner MIME multipart.
      &
        --outer
        Content-Type: message/rfc822

      ! From: Someone Else
      ! Subject: hello request

      $ Please say Hello
      $
        --outer--
      &
      & This is the end of the outer MIME multipart.

   In the above example, the '&', '