-\1f
-File: xemacs.info, Node: Regexps, Next: Search Case, Prev: Regexp Search, Up: Search
-
-Syntax of Regular Expressions
-=============================
-
- Regular expressions have a syntax in which a few characters are
-special constructs and the rest are "ordinary". An ordinary character
-is a simple regular expression that matches that character and nothing
-else. The special characters are `.', `*', `+', `?', `[', `]', `^',
-`$', and `\'; no new special characters will be defined in the future.
-Any other character appearing in a regular expression is ordinary,
-unless a `\' precedes it.
-
- For example, `f' is not a special character, so it is ordinary, and
-therefore `f' is a regular expression that matches the string `f' and
-no other string. (It does _not_ match the string `ff'.) Likewise, `o'
-is a regular expression that matches only `o'.
-
- Any two regular expressions A and B can be concatenated. The result
-is a regular expression that matches a string if A matches some amount
-of the beginning of that string and B matches the rest of the string.
-
- As a simple example, we can concatenate the regular expressions `f'
-and `o' to get the regular expression `fo', which matches only the
-string `fo'. Still trivial. To do something more powerful, you need
-to use one of the special characters. Here is a list of them:
-
-`. (Period)'
- is a special character that matches any single character except a
- newline. Using concatenation, we can make regular expressions
- like `a.b', which matches any three-character string that begins
- with `a' and ends with `b'.
-
-`*'
- is not a construct by itself; it is a quantifying suffix operator
- that means to repeat the preceding regular expression as many
- times as possible. In `fo*', the `*' applies to the `o', so `fo*'
- matches one `f' followed by any number of `o's. The case of zero
- `o's is allowed: `fo*' does match `f'.
-
- `*' always applies to the _smallest_ possible preceding
- expression. Thus, `fo*' has a repeating `o', not a repeating `fo'.
-
- The matcher processes a `*' construct by matching, immediately, as
- many repetitions as can be found; it is "greedy". Then it
- continues with the rest of the pattern. If that fails,
- backtracking occurs, discarding some of the matches of the
- `*'-modified construct in case that makes it possible to match the
- rest of the pattern. For example, in matching `ca*ar' against the
- string `caaar', the `a*' first tries to match all three `a's; but
- the rest of the pattern is `ar' and there is only `r' left to
- match, so this try fails. The next alternative is for `a*' to
- match only two `a's. With this choice, the rest of the regexp
- matches successfully.
-
- Nested repetition operators can be extremely slow if they specify
- backtracking loops. For example, it could take hours for the
- regular expression `\(x+y*\)*a' to match the sequence
- `xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxz'. The slowness is because
- Emacs must try each imaginable way of grouping the 35 `x's before
- concluding that none of them can work. To make sure your regular
- expressions run fast, check nested repetitions carefully.
-
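The backtracking described for `ca*ar' can be observed with a small
sketch. It uses Python's `re' module as a stand-in for the Emacs
matcher, which is an assumption of this example: Python writes a group
as `( ... )' where Emacs writes `\( ... \)', but `*' is greedy and gives
repetitions back in the same way.

```python
import re

# Capture what `a*` ends up holding once backtracking has finished.
# Python spells the Emacs group \( ... \) as ( ... ).
match = re.fullmatch(r"c(a*)ar", "caaar")

# `a*` first takes all three a's, fails on the trailing `ar`,
# then gives one back and succeeds while holding two.
backtracked = match.group(1)
print(backtracked)
```

The nested pattern `\(x+y*\)*a' is deliberately not run here, since the
point of that example is that it may not finish in reasonable time.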
-`+'
- is a quantifying suffix operator similar to `*' except that the
- preceding expression must match at least once. It is also
- "greedy". So, for example, `ca+r' matches the strings `car' and
- `caaaar' but not the string `cr', whereas `ca*r' matches all three
- strings.
-
-`?'
- is a quantifying suffix operator similar to `*', except that the
- preceding expression can match either once or not at all. For
- example, `ca?r' matches `car' or `cr', but does not match anything
- else.
-
-`*?'
- works just like `*', except that rather than matching the longest
- match, it matches the shortest match. `*?' is known as a
- "non-greedy" quantifier, a regexp construct borrowed from Perl.
-
- This construct is useful when you want to match the text inside a
- pair of delimiters. For instance, `/\*.*?\*/' matches each C
- comment in a string, provided the comment contains no newline
- (since `.' never matches one). This could not easily be achieved
- without a non-greedy quantifier.
-
- This construct was not available before XEmacs 20.4. It is not
- available in FSF Emacs.
-
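The difference between `*' and `*?' on delimited text can be sketched
as follows. Python's `re' module is used as an illustration only; the
sample text is invented, and the pattern happens to be spelled the same
way in both syntaxes.

```python
import re

text = "int a; /* first */ int b; /* second */"

# Non-greedy: each match stops at the nearest closing delimiter.
lazy = re.findall(r"/\*.*?\*/", text)

# Greedy: a single match runs from the first opener to the last closer.
greedy = re.findall(r"/\*.*\*/", text)

print(lazy)
print(greedy)
```

With the greedy form, the code between the two comments is swallowed
into one match, which is rarely what is wanted.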
-`+?'
- is the non-greedy version of `+'.
-
-`??'
- is the non-greedy version of `?'.
-
-`\{n,m\}'
- serves as an interval quantifier, analogous to `*' or `+', but
- specifies that the expression must match at least N times and no
- more than M times. This syntax is supported by most Unix regexp
- utilities, and was introduced in XEmacs 20.3.
-
- Unfortunately, the non-greedy version of this quantifier does not
- exist currently, although it does in Perl.
-
-`[ ... ]'
- `[' begins a "character set", which is terminated by a `]'. In
- the simplest case, the characters between the two brackets form
- the set. Thus, `[ad]' matches either one `a' or one `d', and
- `[ad]*' matches any string composed of just `a's and `d's
- (including the empty string), from which it follows that `c[ad]*r'
- matches `cr', `car', `cdr', `caddaar', etc.
-
- The usual regular expression special characters are not special
- inside a character set. A completely different set of special
- characters exists inside character sets: `]', `-' and `^'.
-
- `-' is used for ranges of characters. To write a range, write two
- characters with a `-' between them. Thus, `[a-z]' matches any
- lower case letter. Ranges may be intermixed freely with individual
- characters, as in `[a-z$%.]', which matches any lower case letter
- or `$', `%', or a period.
-
- To include a `]' in a character set, make it the first character.
- For example, `[]a]' matches `]' or `a'. To include a `-', write
- `-' as the first character in the set, or put it immediately after
- a range. (You can replace one individual character C with the
- range `C-C' to make a place to put the `-'.) There is no way to
- write a set containing just `-' and `]'.
-
- To include `^' in a set, put it anywhere but at the beginning of
- the set.
-
-`[^ ... ]'
- `[^' begins a "complement character set", which matches any
- character except the ones specified. Thus, `[^a-z0-9A-Z]' matches
- all characters _except_ letters and digits.
-
- `^' is not special in a character set unless it is the first
- character. The character following the `^' is treated as if it
- were first (thus, `-' and `]' are not special there).
-
- Note that a complement character set can match a newline, unless
- newline is mentioned as one of the characters not to match.
-
-`^'
- is a special character that matches the empty string, but only at
- the beginning of a line in the text being matched. Otherwise it
- fails to match anything. Thus, `^foo' matches a `foo' that occurs
- at the beginning of a line.
-
- When matching a string instead of a buffer, `^' matches at the
- beginning of the string or after a newline character `\n'.
-
-`$'
- is similar to `^' but matches only at the end of a line. Thus,
- `x+$' matches a string of one `x' or more at the end of a line.
-
- When matching a string instead of a buffer, `$' matches at the end
- of the string or before a newline character `\n'.
-
-`\'
- has two functions: it quotes the special characters (including
- `\'), and it introduces additional special constructs.
-
- Because `\' quotes special characters, `\$' is a regular
- expression that matches only `$', and `\[' is a regular expression
- that matches only `[', and so on.
-
- *Please note:* For historical compatibility, special characters are
-treated as ordinary ones if they are in contexts where their special
-meanings make no sense. For example, `*foo' treats `*' as ordinary
-since there is no preceding expression on which the `*' can act. It is
-poor practice to depend on this behavior; quote the special character
-anyway, regardless of where it appears.
-
- For the most part, `\' followed by any character matches only that
-character. However, there are several exceptions: characters that,
-when preceded by `\', are special constructs. Such characters are
-always ordinary when encountered on their own. Here is a table of `\'
-constructs:
-
-`\|'
- specifies an alternative. Two regular expressions A and B with
- `\|' in between form an expression that matches anything that
- either A or B matches.
-
- Thus, `foo\|bar' matches either `foo' or `bar' but no other string.
-
- `\|' applies to the largest possible surrounding expressions.
- Only a surrounding `\( ... \)' grouping can limit the grouping
- power of `\|'.
-
- Full backtracking capability exists to handle multiple uses of
- `\|'.
-
-`\( ... \)'
- is a grouping construct that serves three purposes:
-
- 1. To enclose a set of `\|' alternatives for other operations.
- Thus, `\(foo\|bar\)x' matches either `foox' or `barx'.
-
- 2. To enclose an expression for a suffix operator such as `*' to
- act on. Thus, `ba\(na\)*' matches `bananana', etc., with any
- (zero or more) number of `na' strings.
-
- 3. To record a matched substring for future reference.
-
- This last application is not a consequence of the idea of a
- parenthetical grouping; it is a separate feature that happens to be
- assigned as a second meaning to the same `\( ... \)' construct
- because there is no conflict in practice between the two meanings.
- Here is an explanation of this feature:
-
-`\DIGIT'
- matches the same text that matched the DIGITth occurrence of a `\(
- ... \)' construct.
-
- In other words, after the end of a `\( ... \)' construct, the
- matcher remembers the beginning and end of the text matched by that
- construct. Then, later on in the regular expression, you can use
- `\' followed by DIGIT to match that same text, whatever it may
- have been.
-
- The strings matching the first nine `\( ... \)' constructs
- appearing in a regular expression are assigned numbers 1 through 9
- in the order that the open parentheses appear in the regular
- expression. So you can use `\1' through `\9' to refer to the text
- matched by the corresponding `\( ... \)' constructs.
-
- For example, `\(.*\)\1' matches any newline-free string that is
- composed of two identical halves. The `\(.*\)' matches the first
- half, which may be anything, but the `\1' that follows must match
- the same exact text.
-
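A backreference can be tried out with a short sketch. It assumes
Python's `re' module, where the Emacs pattern `\(.*\)\1' is written
`(.*)\1'; the helper name is hypothetical.

```python
import re

def has_identical_halves(s):
    """Return True if s consists of two identical newline-free halves."""
    # Group 1 matches the first half; \1 must then match the same text.
    return re.fullmatch(r"(.*)\1", s) is not None

print(has_identical_halves("abcabc"))
print(has_identical_halves("abcabd"))
```

Because `.' does not match a newline, a string such as two identical
lines separated by a newline is rejected, in line with the
"newline-free" wording above.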
-`\(?: ... \)'
- is called a "shy" grouping operator, and it is used just like `\(
- ... \)', except that it does not cause the matched substring to be
- recorded for future reference.
-
- This is useful when you need a lot of grouping `\( ... \)'
- constructs but only want to remember one or two--or if you have
- more than nine groupings and need to use backreferences to refer to
- the groupings at the end.
-
- Using `\(?: ... \)' rather than `\( ... \)' when you don't need
- the captured substrings ought to speed up your programs some,
- since it shortens the code path followed by the regular expression
- engine, as well as the amount of memory allocation and string
- copying it must do. The actual performance gain to be observed
- has not been measured or quantified as of this writing.
-
- The shy grouping operator was borrowed from Perl. It was not
- available before XEmacs 20.3, and it is not available in FSF Emacs.
-
-`\w'
- matches any word-constituent character. The editor syntax table
- determines which characters these are. *Note Syntax::.
-
-`\W'
- matches any character that is not a word constituent.
-
-`\sCODE'
- matches any character whose syntax is CODE. Here CODE is a
- character that represents a syntax code: thus, `w' for word
- constituent, `-' for whitespace, `(' for open parenthesis, etc.
- *Note Syntax::, for a list of syntax codes and the characters that
- stand for them.
-
-`\SCODE'
- matches any character whose syntax is not CODE.
-
- The following regular expression constructs match the empty
-string--that is, they don't use up any characters--but whether they
-match depends on the context.
-
-`\`'
- matches the empty string, but only at the beginning of the buffer
- or string being matched against.
-
-`\''
- matches the empty string, but only at the end of the buffer or
- string being matched against.
-
-`\='
- matches the empty string, but only at point. (This construct is
- not defined when matching against a string.)
-
-`\b'
- matches the empty string, but only at the beginning or end of a
- word. Thus, `\bfoo\b' matches any occurrence of `foo' as a
- separate word. `\bballs?\b' matches `ball' or `balls' as a
- separate word.
-
-`\B'
- matches the empty string, but _not_ at the beginning or end of a
- word.
-
-`\<'
- matches the empty string, but only at the beginning of a word.
-
-`\>'
- matches the empty string, but only at the end of a word.
-
- Here is a complicated regexp used by Emacs to recognize the end of a
-sentence together with any whitespace that follows. It is given in Lisp
-syntax to enable you to distinguish the spaces from the tab characters.
-In Lisp syntax, the string constant begins and ends with a
-double-quote. `\"' stands for a double-quote as part of the regexp,
-`\\' for a backslash as part of the regexp, `\t' for a tab and `\n' for
-a newline.
-
- "[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*"
-
-This regexp contains four parts: a character set matching period, `?'
-or `!'; a character set matching close-brackets, quotes or parentheses,
-repeated any number of times; an alternative in backslash-parentheses
-that matches end-of-line, a tab or two spaces; and a character set
-matching whitespace characters, repeated any number of times.
-
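The four parts of the sentence-end regexp can be exercised with a
sketch in Python. This is a translation, not the Emacs regexp itself:
the group loses its backslashes, the two spaces are written ` {2}' so
they stay visible, and the MULTILINE flag is assumed to stand in for
the line-oriented meaning of `$'. The sample text is invented.

```python
import re

# Parts: end punctuation, optional closers, end-of-line / tab / two
# spaces, then any trailing whitespace.
SENTENCE_END = re.compile(
    r"""[.?!][]"')]*($|\t| {2})[ \t\n]*""", re.MULTILINE
)

# Two spaces follow the closing quote; a newline follows the period.
text = 'He said "Stop!"' + " " * 2 + "Then he left.\nNext line"

ends = [m.group(0) for m in SENTENCE_END.finditer(text)]
print(ends)
```

The first match ends a sentence by way of the two-space alternative,
the second by way of `$' followed by the newline.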
-\1f
-File: xemacs.info, Node: Search Case, Next: Replace, Prev: Regexps, Up: Search
-
-Searching and Case
-==================
-
- All searches in Emacs normally ignore the case of the text they are
-searching through; if you specify searching for `FOO', `Foo' and `foo'
-are also considered a match. Regexps, and in particular character
-sets, are included: `[aB]' matches `a' or `A' or `b' or `B'.
-
- If you want a case-sensitive search, set the variable
-`case-fold-search' to `nil'. Then all letters must match exactly,
-including case. `case-fold-search' is a per-buffer variable; altering
-it affects only the current buffer, but there is a default value which
-you can change as well. *Note Locals::. You can also use Case
-Sensitive Search from the Options menu on your screen.
-
-\1f
-File: xemacs.info, Node: Replace, Next: Other Repeating Search, Prev: Search Case, Up: Search
-
-Replacement Commands
-====================
-
- Global search-and-replace operations are not needed as often in
-Emacs as they are in other editors, but they are available. In
-addition to the simple `replace-string' command which is like that
-found in most editors, there is a `query-replace' command which asks
-you, for each occurrence of a pattern, whether to replace it.
-
- The replace commands all replace one string (or regexp) with one
-replacement string. It is possible to perform several replacements in
-parallel using the command `expand-region-abbrevs'. *Note Expanding
-Abbrevs::.
-
-* Menu:
-
-* Unconditional Replace:: Replacing all matches for a string.
-* Regexp Replace:: Replacing all matches for a regexp.
-* Replacement and Case:: How replacements preserve case of letters.
-* Query Replace:: How to use querying.
-
-\1f
-File: xemacs.info, Node: Unconditional Replace, Next: Regexp Replace, Prev: Replace, Up: Replace
-
-Unconditional Replacement
--------------------------
-
-`M-x replace-string <RET> STRING <RET> NEWSTRING <RET>'
- Replace every occurrence of STRING with NEWSTRING.
-
-`M-x replace-regexp <RET> REGEXP <RET> NEWSTRING <RET>'
- Replace every match for REGEXP with NEWSTRING.
-
- To replace every instance of `foo' after point with `bar', use the
-command `M-x replace-string' with the two arguments `foo' and `bar'.
-Replacement occurs only after point: if you want to cover the whole
-buffer you must go to the beginning first. By default, all occurrences
-up to the end of the buffer are replaced. To limit replacement to part
-of the buffer, narrow to that part of the buffer before doing the
-replacement (*note Narrowing::).
-
- When `replace-string' exits, point is left at the last occurrence
-replaced. The value of point when the `replace-string' command was
-issued is remembered on the mark ring; `C-u C-<SPC>' moves back there.
-
- A numeric argument restricts replacement to matches that are
-surrounded by word boundaries.
-
-\1f
-File: xemacs.info, Node: Regexp Replace, Next: Replacement and Case, Prev: Unconditional Replace, Up: Replace
-
-Regexp Replacement
-------------------
-
- `replace-string' replaces exact matches for a single string. The
-similar command `replace-regexp' replaces any match for a specified
-pattern.
-
- In `replace-regexp', the NEWSTRING need not be constant. It can
-refer to all or part of what is matched by the REGEXP. `\&' in
-NEWSTRING stands for the entire text being replaced. `\D' in
-NEWSTRING, where D is a digit, stands for whatever matched the D'th
-parenthesized grouping in REGEXP. For example,
-
- M-x replace-regexp <RET> c[ad]+r <RET> \&-safe <RET>
-
-would replace (for example) `cadr' with `cadr-safe' and `cddr' with
-`cddr-safe'.
-
- M-x replace-regexp <RET> \(c[ad]+r\)-safe <RET> \1 <RET>
-
-would perform exactly the opposite replacements. To include a `\' in
-the text to replace with, you must give `\\'.
-
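The two replacements above can be reproduced in a sketch using
Python's `re.sub'. The notation differs from Emacs and is an assumption
of this example: the whole match `\&' is written `\g<0>', and groups
are written `( ... )' rather than `\( ... \)'.

```python
import re

source = "(cadr x) (cddr y)"

# Emacs `\&' (the entire matched text) is `\g<0>' in Python.
safe = re.sub(r"c[ad]+r", r"\g<0>-safe", source)

# Emacs `\(c[ad]+r\)-safe' with replacement `\1' undoes the change.
restored = re.sub(r"(c[ad]+r)-safe", r"\1", safe)

print(safe)
print(restored)
```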
-\1f
-File: xemacs.info, Node: Replacement and Case, Next: Query Replace, Prev: Regexp Replace, Up: Replace
-
-Replace Commands and Case
--------------------------
-
- If the arguments to a replace command are in lower case, the command
-preserves case when it makes a replacement. Thus, the following
-command:
-
- M-x replace-string <RET> foo <RET> bar <RET>
-
-replaces a lower-case `foo' with a lower case `bar', `FOO' with `BAR',
-and `Foo' with `Bar'. If upper-case letters are used in the second
-argument, they remain upper-case every time that argument is inserted.
-If upper-case letters are used in the first argument, the second
-argument is always substituted exactly as given, with no case
-conversion. Likewise, if the variable `case-replace' is set to `nil',
-replacement is done without case conversion. If `case-fold-search' is
-set to `nil', case is significant in matching occurrences of `foo' to
-replace; also, case conversion of the replacement string is not done.
-
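The case rules above can be approximated by the following sketch. It is
a simplified model written in Python, not the Emacs implementation: the
function name is hypothetical, searching always ignores case (as if
`case-fold-search' were non-`nil'), and case conversion is attempted
only when both arguments are entirely lower case.

```python
import re

def replace_preserving_case(text, old, new):
    """Replace old with new, imitating the case rules described above."""
    # Conversion applies only if neither argument has upper-case letters.
    convert = old == old.lower() and new == new.lower()

    def substitute(match):
        found = match.group(0)
        if not convert:
            return new                  # substituted exactly as given
        if found.isupper():
            return new.upper()          # FOO -> BAR
        if found[:1].isupper():
            return new.capitalize()     # Foo -> Bar
        return new                      # foo -> bar

    return re.sub(re.escape(old), substitute, text, flags=re.IGNORECASE)

converted = replace_preserving_case("foo FOO Foo", "foo", "bar")
print(converted)
```

Passing an upper-case letter in either argument, for example replacing
`foo' with `Bar', makes every occurrence come out exactly as `Bar' in
this model.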
-\1f
-File: xemacs.info, Node: Query Replace, Prev: Replacement and Case, Up: Replace
-
-Query Replace
--------------
-
-`M-% STRING <RET> NEWSTRING <RET>'
-`M-x query-replace <RET> STRING <RET> NEWSTRING <RET>'
- Replace some occurrences of STRING with NEWSTRING.
-
-`M-x query-replace-regexp <RET> REGEXP <RET> NEWSTRING <RET>'
- Replace some matches for REGEXP with NEWSTRING.
-
- If you want to change only some of the occurrences of `foo' to
-`bar', not all of them, you can use `query-replace' (`M-%') instead of
-`replace-string'.
-This command finds occurrences of `foo' one by one, displays each
-occurrence, and asks you whether to replace it. A numeric argument to
-`query-replace' tells it to consider only occurrences that are bounded
-by word-delimiter characters.
-
- Aside from querying, `query-replace' works just like
-`replace-string', and `query-replace-regexp' works just like
-`replace-regexp'.
-
- The things you can type when you are shown an occurrence of STRING
-or a match for REGEXP are:
-
-`<SPC>'
- to replace the occurrence with NEWSTRING. This preserves case,
- just like `replace-string', provided `case-replace' is non-`nil',
- as it normally is.
-
-`<DEL>'
- to skip to the next occurrence without replacing this one.
-
-`, (Comma)'
- to replace this occurrence and display the result. You are then
- prompted for another input character. However, since the
- replacement has already been made, <DEL> and <SPC> are equivalent.
- At this point, you can type `C-r' (see below) to alter the
- replaced text. To undo the replacement, you can type `C-x u'.
- This exits the `query-replace'. If you want to do further
- replacement you must use `C-x <ESC>' to restart (*note Repetition::).
-
-`<ESC>'
- to exit without doing any more replacements.
-
-`. (Period)'
- to replace this occurrence and then exit.
-
-`!'
- to replace all remaining occurrences without asking again.
-
-`^'
- to go back to the location of the previous occurrence (or what
- used to be an occurrence), in case you changed it by mistake.
- This works by popping the mark ring. Only one `^' in a row is
- allowed, because only one previous replacement location is kept
- during `query-replace'.
-
-`C-r'
- to enter a recursive editing level, in case the occurrence needs
- to be edited rather than just replaced with NEWSTRING. When you
- are done, exit the recursive editing level with `C-M-c' and the
- next occurrence will be displayed. *Note Recursive Edit::.
-
-`C-w'
- to delete the occurrence, and then enter a recursive editing level
- as in `C-r'. Use the recursive edit to insert text to replace the
- deleted occurrence of STRING. When done, exit the recursive
- editing level with `C-M-c' and the next occurrence will be
- displayed.
-
-`C-l'
- to redisplay the screen and then give another answer.
-
-`C-h'
- to display a message summarizing these options, then give another
- answer.
-
- If you type any other character, Emacs exits the `query-replace', and
-executes the character as a command. To restart the `query-replace',
-use `C-x <ESC>', which repeats the `query-replace' because it used the
-minibuffer to read its arguments. *Note C-x ESC: Repetition.
-
-\1f
-File: xemacs.info, Node: Other Repeating Search, Prev: Replace, Up: Search
-
-Other Search-and-Loop Commands
-==============================
-
- Here are some other commands that find matches for a regular
-expression. They all operate from point to the end of the buffer.
-
-`M-x occur'
- Print each line that follows point and contains a match for the
- specified regexp. A numeric argument specifies the number of
- context lines to print before and after each matching line; the
- default is none.
-
- The buffer `*Occur*' containing the output serves as a menu for
- finding occurrences in their original context. Find an occurrence
- as listed in `*Occur*', position point there, and type `C-c C-c';
- this switches to the buffer that was searched and moves point to
- the original of the same occurrence.
-
-`M-x list-matching-lines'
- Synonym for `M-x occur'.
-
-`M-x count-matches'
- Print the number of matches following point for the specified
- regexp.
-
-`M-x delete-non-matching-lines'
- Delete each line that follows point and does not contain a match
- for the specified regexp.
-
-`M-x delete-matching-lines'
- Delete each line that follows point and contains a match for the
- specified regexp.
-
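The "from point to the end of the buffer" behavior shared by these
commands can be modeled with a rough sketch. Everything here is an
assumption made for illustration: the buffer is a Python string, point
is an index into it, the function names merely echo the commands, and
the handling of a partial line at point is not meant to mirror Emacs.

```python
import re

def occur(buffer, point, regexp):
    """Return the lines after point that contain a match."""
    lines = buffer[point:].split("\n")
    return [line for line in lines if re.search(regexp, line)]

def count_matches(buffer, point, regexp):
    """Count the matches that follow point."""
    return sum(1 for _ in re.finditer(regexp, buffer[point:]))

def delete_matching_lines(buffer, point, regexp):
    """Drop each line after point that contains a match."""
    lines = buffer[point:].split("\n")
    kept = [line for line in lines if not re.search(regexp, line)]
    return buffer[:point] + "\n".join(kept)

buffer = "foo 1\nbar\nfoo 2 foo\n"
point = 6  # just after the first line

print(occur(buffer, point, "foo"))
print(count_matches(buffer, point, "foo"))
print(repr(delete_matching_lines(buffer, point, "foo")))
```

The first line is left alone in every case because it lies before
point.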
-\1f
-File: xemacs.info, Node: Fixit, Next: Files, Prev: Search, Up: Top
-
-Commands for Fixing Typos
-*************************
-
- This chapter describes commands that are especially useful when you
-catch a mistake in your text just after you have made it, or when you
-change your mind while composing text on line.
-
-* Menu:
-
-* Kill Errors:: Commands to kill a batch of recently entered text.
-* Transpose:: Exchanging two characters, words, lines, lists...
-* Fixing Case:: Correcting case of last word entered.
-* Spelling:: Apply spelling checker to a word, or a whole file.
-
-\1f
-File: xemacs.info, Node: Kill Errors, Next: Transpose, Prev: Fixit, Up: Fixit
-
-Killing Your Mistakes
-=====================
-
-`<DEL>'
- Delete last character (`delete-backward-char').
-
-`M-<DEL>'
- Kill last word (`backward-kill-word').
-
-`C-x <DEL>'
- Kill to beginning of sentence (`backward-kill-sentence').
-
- The <DEL> character (`delete-backward-char') is the most important
-correction command. When used among graphic (self-inserting)
-characters, it can be thought of as canceling the last character typed.
-
- When your mistake is longer than a couple of characters, it might be
-more convenient to use `M-<DEL>' or `C-x <DEL>'. `M-<DEL>' kills back
-to the start of the last word, and `C-x <DEL>' kills back to the start
-of the last sentence. `C-x <DEL>' is particularly useful when you are
-thinking of what to write as you type it, in case you change your mind
-about phrasing. `M-<DEL>' and `C-x <DEL>' save the killed text for
-`C-y' and `M-y' to retrieve. *Note Yanking::.
-
- `M-<DEL>' is often useful even when you have typed only a few
-characters wrong, if you know you are confused in your typing and aren't
-sure exactly what you typed. At such a time, you cannot correct with
-<DEL> except by looking at the screen to see what you did. It requires
-less thought to kill the whole word and start over.
-