💾 Archived View for gmi.noulin.net › man › man7 › regex.7.gmi captured on 2024-06-16 at 14:16:55. Gemini links have been rewritten to link to archived content
REGEX(7)                 Linux Programmer's Manual                REGEX(7)

NAME
       regex - POSIX.2 regular expressions

DESCRIPTION
       Regular expressions ("RE"s), as defined in POSIX.2, come in two
       forms: modern REs (roughly those of egrep; POSIX.2 calls these
       "extended" REs) and obsolete REs (roughly those of ed(1); POSIX.2
       "basic" REs). Obsolete REs mostly exist for backward compatibility
       in some old programs; they will be discussed at the end. POSIX.2
       leaves some aspects of RE syntax and semantics open; "(!)" marks
       decisions on these aspects that may not be fully portable to other
       POSIX.2 implementations.

       A (modern) RE is one(!) or more nonempty(!) branches, separated by
       '|'. It matches anything that matches one of the branches.

       A branch is one(!) or more pieces, concatenated. It matches a match
       for the first, followed by a match for the second, and so on.

       A piece is an atom possibly followed by a single(!) '*', '+', '?',
       or bound. An atom followed by '*' matches a sequence of 0 or more
       matches of the atom. An atom followed by '+' matches a sequence of
       1 or more matches of the atom. An atom followed by '?' matches a
       sequence of 0 or 1 matches of the atom.

       A bound is '{' followed by an unsigned decimal integer, possibly
       followed by ',' possibly followed by another unsigned decimal
       integer, always followed by '}'. The integers must lie between 0
       and RE_DUP_MAX (255(!)) inclusive, and if there are two of them,
       the first may not exceed the second. An atom followed by a bound
       containing one integer i and no comma matches a sequence of exactly
       i matches of the atom. An atom followed by a bound containing one
       integer i and a comma matches a sequence of i or more matches of
       the atom. An atom followed by a bound containing two integers i and
       j matches a sequence of i through j (inclusive) matches of the
       atom.

       An atom is a regular expression enclosed in "()" (matching a match
       for the regular expression), an empty set of "()" (matching the
       null string)(!), a bracket expression (see below), '.' (matching
       any single character), '^' (matching the null string at the
       beginning of a line), '