   1
   2 @node Search, Fixit, Display, Top
   3 @chapter Searching and Replacement
   4 @cindex searching
   5
   6   Like other editors, Emacs has commands for searching for occurrences of
   7 a string.  The principal search command is unusual in that it is
   8 @dfn{incremental}: it begins to search before you have finished typing the
   9 search string.  There are also non-incremental search commands more like
  10 those of other editors.
  11
  12   Besides the usual @code{replace-string} command that finds all
  13 occurrences of one string and replaces them with another, Emacs has a fancy
  14 replacement command called @code{query-replace} which asks interactively
  15 which occurrences to replace.
  16
  17 @menu
  18 * Incremental Search::     Search happens as you type the string.
  19 * Non-Incremental Search:: Specify entire string and then search.
  20 * Word Search::            Search for sequence of words.
  21 * Regexp Search::          Search for match for a regexp.
  22 * Regexps::                Syntax of regular expressions.
  23 * Search Case::            To ignore case while searching, or not.
  24 * Replace::                Search, and replace some or all matches.
  25 * Other Repeating Search:: Operating on all matches for some regexp.
  26 @end menu
  27
  28 @node Incremental Search, Non-Incremental Search, Search, Search
  29 @section Incremental Search
  30
  31   An incremental search begins searching as soon as you type the first
  32 character of the search string.  As you type in the search string, Emacs
  33 shows you where the string (as you have typed it so far) is found.
  34 When you have typed enough characters to identify the place you want, you
  35 can stop.  Depending on what you do next, you may or may not need to
  36 terminate the search explicitly with a @key{RET}.
  37
  38 @c WideCommands
  39 @table @kbd
  40 @item C-s
  41 Incremental search forward (@code{isearch-forward}).
  42 @item C-r
  43 Incremental search backward (@code{isearch-backward}).
  44 @end table
  45
  46 @kindex C-s
  47 @kindex C-r
  48 @findex isearch-forward
  49 @findex isearch-backward
  50   @kbd{C-s} starts an incremental search.  @kbd{C-s} reads characters from
  51 the keyboard and positions the cursor at the first occurrence of the
  52 characters that you have typed.  If you type @kbd{C-s} and then @kbd{F},
  53 the cursor moves right after the first @samp{F}.  Type an @kbd{O}, and see
  54 the cursor move to after the first @samp{FO}.  After another @kbd{O}, the
  55 cursor is after the first @samp{FOO} after the place where you started the
  56 search.  Meanwhile, the search string @samp{FOO} has been echoed in the
  57 echo area.@refill
  58
  59   The echo area display ends with three dots when actual searching is going
  60 on.  When search is waiting for more input, the three dots are removed.
  61 (On slow terminals, the three dots are not displayed.)
  62
  63   If you make a mistake in typing the search string, you can erase
  64 characters with @key{DEL}.  Each @key{DEL} cancels the last character of the
  65 search string.  This does not happen until Emacs is ready to read another
  66 input character; first it must either find, or fail to find, the character
  67 you want to erase.  If you do not want to wait for this to happen, use
  68 @kbd{C-g} as described below.@refill
  69
  70   When you are satisfied with the place you have reached, you can type
@key{RET} (or @kbd{C-m}), which stops searching, leaving the cursor where
  72 the search brought it.  Any command not specially meaningful in searches also
  73 stops the search and is then executed.  Thus, typing @kbd{C-a} exits the
  74 search and then moves to the beginning of the line.  @key{RET} is necessary
  75 only if the next command you want to type is a printing character,
  76 @key{DEL}, @key{ESC}, or another control character that is special
  77 within searches (@kbd{C-q}, @kbd{C-w}, @kbd{C-r}, @kbd{C-s}, or @kbd{C-y}).
  78
  79   Sometimes you search for @samp{FOO} and find it, but were actually
  80 looking for a different occurrence of it.  To move to the next occurrence
  81 of the search string, type another @kbd{C-s}.  Do this as often as
  82 necessary.  If you overshoot, you can cancel some @kbd{C-s}
  83 characters with @key{DEL}.
  84
  85   After you exit a search, you can search for the same string again by
  86 typing just @kbd{C-s C-s}: the first @kbd{C-s} is the key that invokes
  87 incremental search, and the second @kbd{C-s} means ``search again''.
  88
  89   If the specified string is not found at all, the echo area displays
  90 the text @samp{Failing I-Search}.  The cursor is after the place where
  91 Emacs found as much of your string as it could.  Thus, if you search for
  92 @samp{FOOT}, and there is no @samp{FOOT}, the cursor may be after the
  93 @samp{FOO} in @samp{FOOL}.  At this point there are several things you
  94 can do.  If you mistyped the search string, correct it.  If you like the
  95 place you have found, you can type @key{RET} or some other Emacs command
  96 to ``accept what the search offered''.  Or you can type @kbd{C-g}, which
  97 removes from the search string the characters that could not be found
  98 (the @samp{T} in @samp{FOOT}), leaving those that were found (the
  99 @samp{FOO} in @samp{FOOT}).  A second @kbd{C-g} at that point cancels
 100 the search entirely, returning point to where it was when the search
 101 started.
 102
 103   If a search is failing and you ask to repeat it by typing another
 104 @kbd{C-s}, it starts again from the beginning of the buffer.  Repeating
 105 a failing backward search with @kbd{C-r} starts again from the end.  This
 106 is called @dfn{wrapping around}.  @samp{Wrapped} appears in the search
 107 prompt once this has happened.
 108
 109 @cindex quitting (in search)
 110   The @kbd{C-g} ``quit'' character does special things during searches;
 111 just what it does depends on the status of the search.  If the search has
 112 found what you specified and is waiting for input, @kbd{C-g} cancels the
 113 entire search.  The cursor moves back to where you started the search.  If
 114 @kbd{C-g} is typed when there are characters in the search string that have
 115 not been found---because Emacs is still searching for them, or because it
 116 has failed to find them---then the search string characters which have not
 117 been found are discarded from the search string.  The
 118 search is now successful and waiting for more input, so a second @kbd{C-g}
 119 cancels the entire search.
 120
 121   To search for a control character such as @kbd{C-s} or @key{DEL} or
 122 @key{ESC}, you must quote it by typing @kbd{C-q} first.  This function
 123 of @kbd{C-q} is analogous to its meaning as an Emacs command: it causes
 124 the following character to be treated the way a graphic character would
 125 normally be treated in the same context.
 126
 127  To search backwards, you can use @kbd{C-r} instead of @kbd{C-s} to
 128 start the search; @kbd{C-r} is the key that runs the command
 129 (@code{isearch-backward}) to search backward.  You can also use
 130 @kbd{C-r} to change from searching forward to searching backwards.  Do
 131 this if a search fails because the place you started was too far down in the
 132 file.  Repeated @kbd{C-r} keeps looking for more occurrences backwards.
 133 @kbd{C-s} starts going forward again.  You can cancel @kbd{C-r} in a
 134 search with @key{DEL}.
 135
 136   The characters @kbd{C-y} and @kbd{C-w} can be used in incremental search
 137 to grab text from the buffer into the search string.  This makes it
 138 convenient to search for another occurrence of text at point.  @kbd{C-w}
 139 copies the word after point as part of the search string, advancing
 140 point over that word.  Another @kbd{C-s} to repeat the search will then
 141 search for a string including that word.  @kbd{C-y} is similar to @kbd{C-w}
 142 but copies the rest of the current line into the search string.
 143
 144   The characters @kbd{M-p} and @kbd{M-n} can be used in an incremental
 145 search to recall things which you have searched for in the past.  A
 146 list of the last 16 things you have searched for is retained, and
 147 @kbd{M-p} and @kbd{M-n} let you cycle through that ring.
 148
 149 The character @kbd{M-@key{TAB}} does completion on the elements in
 150 the search history ring.  For example, if you know that you have
 151 recently searched for the string @code{POTATOE}, you could type
 152 @kbd{C-s P O M-@key{TAB}}.  If you had searched for other strings
 153 beginning with @code{PO} then you would be shown a list of them, and
 154 would need to type more to select one.
 155
 156   You can change any of the special characters in incremental search via
 157 the normal keybinding mechanism: simply add a binding to the
 158 @code{isearch-mode-map}.  For example, to make the character
 159 @kbd{C-b} mean ``search backwards'' while in isearch-mode, do this:
 160
 161 @example
 162 (define-key isearch-mode-map "\C-b" 'isearch-repeat-backward)
 163 @end example
 164
 165 These are the default bindings of isearch-mode:
 166
 167 @findex isearch-delete-char
 168 @findex isearch-exit
 169 @findex isearch-quote-char
 170 @findex isearch-repeat-forward
 171 @findex isearch-repeat-backward
 172 @findex isearch-yank-line
 173 @findex isearch-yank-word
 174 @findex isearch-abort
 175 @findex isearch-ring-retreat
 176 @findex isearch-ring-advance
 177 @findex isearch-complete
 178
 179 @kindex DEL (isearch-mode)
 180 @kindex RET (isearch-mode)
 181 @kindex C-q (isearch-mode)
 182 @kindex C-s (isearch-mode)
 183 @kindex C-r (isearch-mode)
 184 @kindex C-y (isearch-mode)
 185 @kindex C-w (isearch-mode)
 186 @kindex C-g (isearch-mode)
 187 @kindex M-p (isearch-mode)
 188 @kindex M-n (isearch-mode)
 189 @kindex M-TAB (isearch-mode)
 190
 191 @table @kbd
 192 @item DEL
 193 Delete a character from the incremental search string (@code{isearch-delete-char}).
 194 @item RET
 195 Exit incremental search (@code{isearch-exit}).
 196 @item C-q
 197 Quote special characters for incremental search (@code{isearch-quote-char}).
 198 @item C-s
 199 Repeat incremental search forward (@code{isearch-repeat-forward}).
 200 @item C-r
 201 Repeat incremental search backward (@code{isearch-repeat-backward}).
 202 @item C-y
 203 Pull rest of line from buffer into search string (@code{isearch-yank-line}).
 204 @item C-w
 205 Pull next word from buffer into search string (@code{isearch-yank-word}).
 206 @item C-g
 207 Cancels input back to what has been found successfully, or aborts the
 208 isearch (@code{isearch-abort}).
 209 @item M-p
 210 Recall the previous element in the isearch history ring
 211 (@code{isearch-ring-retreat}).
 212 @item M-n
 213 Recall the next element in the isearch history ring
 214 (@code{isearch-ring-advance}).
 215 @item M-@key{TAB}
 216 Do completion on the elements in the isearch history ring
 217 (@code{isearch-complete}).
 218
 219 @end table
 220
 221 Any other character which is normally inserted into a buffer when typed
 222 is automatically added to the search string in isearch-mode.
 223
 224 @subsection Slow Terminal Incremental Search
 225
 226   Incremental search on a slow terminal uses a modified style of display
 227 that is designed to take less time.  Instead of redisplaying the buffer at
 228 each place the search gets to, it creates a new single-line window and uses
 229 that to display the line the search has found.  The single-line window
 230 appears as soon as point gets outside of the text that is already
 231 on the screen.
 232
  When the search is terminated, the single-line window is removed.  Only
at this time is the window in which the search was done redisplayed to show
its new value of point.
 236
 237   The three dots at the end of the search string, normally used to indicate
 238 that searching is going on, are not displayed in slow style display.
 239
 240 @vindex search-slow-speed
 241   The slow terminal style of display is used when the terminal baud rate is
 242 less than or equal to the value of the variable @code{search-slow-speed},
 243 initially 1200.
 244
 245 @vindex search-slow-window-lines
 246   The number of lines to use in slow terminal search display is controlled
 247 by the variable @code{search-slow-window-lines}.  Its normal value is 1.
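
  As a sketch, the two variables described above could be set from your
init file along the following lines.  The particular values (2400 and 3)
are illustrative assumptions only, not recommendations:

@example
;; Use the slow-terminal display style at 2400 baud or below.
;; (2400 is an arbitrary illustrative value; the default is 1200.)
(setq search-slow-speed 2400)

;; Show three lines instead of one in the temporary window.
;; (3 is likewise an arbitrary illustrative value.)
(setq search-slow-window-lines 3)
@end example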
 248
 249 @node Non-Incremental Search, Word Search, Incremental Search, Search
 250 @section Non-Incremental Search
 251 @cindex non-incremental search
 252
  Emacs also has conventional non-incremental search commands, which require
that you type the entire search string before searching begins.
 255
 256 @table @kbd
 257 @item C-s @key{RET} @var{string} @key{RET}
 258 Search for @var{string}.
 259 @item C-r @key{RET} @var{string} @key{RET}
 260 Search backward for @var{string}.
 261 @end table
 262
 263   To do a non-incremental search, first type @kbd{C-s @key{RET}}
 264 (or @kbd{C-s C-m}).  This enters the minibuffer to read the search string.
 265 Terminate the string with @key{RET} to start the search.  If the string
 266 is not found, the search command gets an error.
 267
 268  By default, @kbd{C-s} invokes incremental search, but if you give it an
 269 empty argument, which would otherwise be useless, it invokes non-incremental
 270 search.  Therefore, @kbd{C-s @key{RET}} invokes non-incremental search.
 271 @kbd{C-r @key{RET}} also works this way.
 272
 273 @findex search-forward
 274 @findex search-backward
 275   Forward and backward non-incremental searches are implemented by the
 276 commands @code{search-forward} and @code{search-backward}.  You can bind
 277 these commands to keys.  The reason that incremental
 278 search is programmed to invoke them as well is that @kbd{C-s @key{RET}}
 279 is the traditional sequence of characters used in Emacs to invoke
 280 non-incremental search.
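
  For instance, bindings along the following lines could be placed in
your init file.  The key sequences @kbd{C-c s} and @kbd{C-c r} are
arbitrary choices made for illustration; any keys you do not otherwise
use would do:

@example
;; Bind the non-incremental search commands directly to keys.
;; The keys chosen here are hypothetical examples.
(global-set-key "\C-cs" 'search-forward)
(global-set-key "\C-cr" 'search-backward)
@end example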
 281
 282  Non-incremental searches performed using @kbd{C-s @key{RET}} do
 283 not call @code{search-forward} right away.  They first check
 284 if the next character is @kbd{C-w}, which requests a word search.
 285 @ifinfo
 286 @xref{Word Search}.
 287 @end ifinfo
 288
 289 @node Word Search, Regexp Search, Non-Incremental Search, Search
 290 @section Word Search
 291 @cindex word search
 292
 293   Word search looks for a sequence of words without regard to how the
 294 words are separated.  More precisely, you type a string of many words,
 295 using single spaces to separate them, and the string is found even if
 296 there are multiple spaces, newlines or other punctuation between the words.
 297
 298   Word search is useful in editing documents formatted by text formatters.
 299 If you edit while looking at the printed, formatted version, you can't tell
where the line breaks are in the source file.  Word search allows you
to search without having to know where the line breaks are.
 302
 303 @table @kbd
 304 @item C-s @key{RET} C-w @var{words} @key{RET}
 305 Search for @var{words}, ignoring differences in punctuation.
 306 @item C-r @key{RET} C-w @var{words} @key{RET}
 307 Search backward for @var{words}, ignoring differences in punctuation.
 308 @end table
 309
 310   Word search is a special case of non-incremental search.  It is invoked
 311 with @kbd{C-s @key{RET} C-w} followed by the search string, which
 312 must always be terminated with another @key{RET}.  Being non-incremental, this
 313 search does not start until the argument is terminated.  It works by
 314 constructing a regular expression and searching for that.  @xref{Regexp
 315 Search}.
 316
 317  You can do a backward word search with @kbd{C-r @key{RET} C-w}.
 318
 319 @findex word-search-forward
 320 @findex word-search-backward
 321   Forward and backward word searches are implemented by the commands
 322 @code{word-search-forward} and @code{word-search-backward}.  You can
 323 bind these commands to keys.  The reason that incremental
 324 search is programmed to invoke them as well is that @kbd{C-s @key{RET} C-w}
 325 is the traditional Emacs sequence of keys for word search.
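
  The same commands can also be called from Lisp.  The following is a
minimal sketch; the search text is an arbitrary illustration, and the
optional arguments (@code{nil} for no bound, @code{t} to return
@code{nil} instead of signaling an error on failure) follow the usual
convention of the Emacs search functions:

@example
;; Look for the words "incremental" and "search", in that order,
;; however they are separated (spaces, newlines, punctuation).
;; save-excursion restores point afterwards; the value is the
;; position after the match, or nil if there is no match.
(save-excursion
  (goto-char (point-min))
  (word-search-forward "incremental search" nil t))
@end example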
 326
 327 @node Regexp Search, Regexps, Word Search, Search
 328 @section Regular Expression Search
 329 @cindex regular expression
 330 @cindex regexp
 331
 332   A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
 333 denotes a (possibly infinite) set of strings.  Searching for matches
 334 for a regexp is a powerful operation that editors on Unix systems have
 335 traditionally offered.
 336
 337  To gain a thorough understanding of regular expressions and how to use
 338 them to best advantage, we recommend that you study @cite{Mastering
 339 Regular Expressions, by Jeffrey E.F. Friedl, O'Reilly and Associates,
1997}.  (It's known as the ``Hip Owls'' book, because of the picture on its
 341 cover.)  You might also read the manuals to @ref{(gawk)Top},
 342 @ref{(ed)Top}, @cite{sed}, @cite{grep}, @ref{(perl)Top},
 343 @ref{(regex)Top}, @ref{(rx)Top}, @cite{pcre}, and @ref{(flex)Top}, which
 344 also make good use of regular expressions.
 345
  The XEmacs regular expression syntax most closely resembles that of
@cite{ed} or @cite{grep}, the GNU versions of which both use the GNU
@cite{regex} library.  XEmacs' version of @cite{regex} has recently been
extended with some Perl-like capabilities, described in the next
section.
 351
 352  In XEmacs, you can search for the next match for a regexp either
 353 incrementally or not.
 354
 355 @kindex M-C-s
 356 @kindex M-C-r
 357 @findex isearch-forward-regexp
 358 @findex isearch-backward-regexp
 359   Incremental search for a regexp is done by typing @kbd{M-C-s}
 360 (@code{isearch-forward-regexp}).  This command reads a search string
 361 incrementally just like @kbd{C-s}, but it treats the search string as a
 362 regexp rather than looking for an exact match against the text in the
 363 buffer.  Each time you add text to the search string, you make the regexp
 364 longer, and the new regexp is searched for.  A reverse regexp search command
 365 @code{isearch-backward-regexp} also exists, bound to @kbd{M-C-r}.
 366
  All of the control characters that do special things within an ordinary
incremental search have the same functionality in incremental regexp search.
Typing @kbd{C-s} or @kbd{C-r} immediately after starting an incremental
regexp search retrieves the last incremental search regexp used, not the
last ordinary search string:
incremental regexp and non-regexp searches have independent defaults.
 372
 373 @findex re-search-forward
 374 @findex re-search-backward
 375   Non-incremental search for a regexp is done by the functions
 376 @code{re-search-forward} and @code{re-search-backward}.  You can invoke
 377 them with @kbd{M-x} or bind them to keys.  You can also call
 378 @code{re-search-forward} by way of incremental regexp search with
 379 @kbd{M-C-s @key{RET}}; similarly for @code{re-search-backward} with
 380 @kbd{M-C-r @key{RET}}.
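
  Called from Lisp, a non-incremental regexp search might look like the
sketch below.  The regexp @samp{fo+} (one @samp{f} followed by one or
more @samp{o}s) and the use of @code{save-excursion} are illustrative
choices, not part of the commands themselves:

@example
;; Search forward from point for the regexp fo+ without moving
;; point permanently.  With t as the third argument, a failed
;; search returns nil instead of signaling an error; a successful
;; one returns the position after the match.
(save-excursion
  (re-search-forward "fo+" nil t))
@end example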
 381
 382 @node Regexps, Search Case, Regexp Search, Search
 383 @section Syntax of Regular Expressions
 384
 385   Regular expressions have a syntax in which a few characters are
 386 special constructs and the rest are @dfn{ordinary}.  An ordinary
 387 character is a simple regular expression that matches that character and
 388 nothing else.  The special characters are @samp{.}, @samp{*}, @samp{+},
 389 @samp{?}, @samp{[}, @samp{]}, @samp{^}, @samp{$}, and @samp{\}; no new
 390 special characters will be defined in the future.  Any other character
 391 appearing in a regular expression is ordinary, unless a @samp{\}
 392 precedes it.
 393
 394 For example, @samp{f} is not a special character, so it is ordinary, and
 395 therefore @samp{f} is a regular expression that matches the string
 396 @samp{f} and no other string.  (It does @emph{not} match the string
 397 @samp{ff}.)  Likewise, @samp{o} is a regular expression that matches
 398 only @samp{o}.@refill
 399
 400 Any two regular expressions @var{a} and @var{b} can be concatenated.  The
 401 result is a regular expression that matches a string if @var{a} matches
 402 some amount of the beginning of that string and @var{b} matches the rest of
 403 the string.@refill
 404
 405 As a simple example, we can concatenate the regular expressions @samp{f}
 406 and @samp{o} to get the regular expression @samp{fo}, which matches only
 407 the string @samp{fo}.  Still trivial.  To do something more powerful, you
 408 need to use one of the special characters.  Here is a list of them:
 409
 410 @need 1200
 411 @table @kbd
 412 @item .@: @r{(Period)}
 413 @cindex @samp{.} in regexp
 414 is a special character that matches any single character except a newline.
 415 Using concatenation, we can make regular expressions like @samp{a.b}, which
 416 matches any three-character string that begins with @samp{a} and ends with
 417 @samp{b}.@refill
 418
 419 @item *
 420 @cindex @samp{*} in regexp
 421 is not a construct by itself; it is a quantifying suffix operator that
 422 means to repeat the preceding regular expression as many times as
 423 possible.  In @samp{fo*}, the @samp{*} applies to the @samp{o}, so
 424 @samp{fo*} matches one @samp{f} followed by any number of @samp{o}s.
 425 The case of zero @samp{o}s is allowed: @samp{fo*} does match
 426 @samp{f}.@refill
 427
 428 @samp{*} always applies to the @emph{smallest} possible preceding
 429 expression.  Thus, @samp{fo*} has a repeating @samp{o}, not a
 430 repeating @samp{fo}.@refill
 431
 432 The matcher processes a @samp{*} construct by matching, immediately, as
many repetitions as can be found; it is ``greedy''.  Then it continues
 434 with the rest of the pattern.  If that fails, backtracking occurs,
 435 discarding some of the matches of the @samp{*}-modified construct in
 436 case that makes it possible to match the rest of the pattern.  For
 437 example, in matching @samp{ca*ar} against the string @samp{caaar}, the
 438 @samp{a*} first tries to match all three @samp{a}s; but the rest of the
 439 pattern is @samp{ar} and there is only @samp{r} left to match, so this
 440 try fails.  The next alternative is for @samp{a*} to match only two
 441 @samp{a}s.  With this choice, the rest of the regexp matches
 442 successfully.@refill
 443
 444 Nested repetition operators can be extremely slow if they specify
 445 backtracking loops.  For example, it could take hours for the regular
 446 expression @samp{\(x+y*\)*a} to match the sequence
 447 @samp{xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxz}.  The slowness is because
 448 Emacs must try each imaginable way of grouping the 35 @samp{x}'s before
 449 concluding that none of them can work.  To make sure your regular
 450 expressions run fast, check nested repetitions carefully.
 451
 452 @item +
 453 @cindex @samp{+} in regexp
 454 is a quantifying suffix operator similar to @samp{*} except that the
preceding expression must match at least once.  It is also ``greedy''.
 456 So, for example, @samp{ca+r} matches the strings @samp{car} and
 457 @samp{caaaar} but not the string @samp{cr}, whereas @samp{ca*r} matches
 458 all three strings.
 459
 460 @item ?
 461 @cindex @samp{?} in regexp
 462 is a quantifying suffix operator similar to @samp{*}, except that the
 463 preceding expression can match either once or not at all.  For example,
 464 @samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anything
 465 else.
 466
 467 @item *?
 468 @cindex @samp{*?} in regexp
 469 works just like @samp{*}, except that rather than matching the longest
 470 match, it matches the shortest match.  @samp{*?} is known as a
 471 @dfn{non-greedy} quantifier, a regexp construct borrowed from Perl.
 472 @c Did perl get this from somewhere?  What's the real history of *? ?
 473
This construct is very useful for when you want to match the text inside
a pair of delimiters.  For instance, @samp{/\*.*?\*/} will match a C
comment in a string, provided the comment contains no newline (because
@samp{.} does not match a newline).  This could not easily be achieved
without the use of a non-greedy quantifier.
 478
 479 This construct has not been available prior to XEmacs 20.4.  It is not
 480 available in FSF Emacs.
 481
 482 @item +?
 483 @cindex @samp{+?} in regexp
 484 is the non-greedy version of @samp{+}.
 485
 486 @item ??
 487 @cindex @samp{??} in regexp
 488 is the non-greedy version of @samp{?}.
 489
 490 @item \@{n,m\@}
 491 @c Note the spacing after the close brace is deliberate.
 492 @cindex @samp{\@{n,m\@} }in regexp
 493 serves as an interval quantifier, analogous to @samp{*} or @samp{+}, but
 494 specifies that the expression must match at least @var{n} times, but no
more than @var{m} times.  This syntax is supported by most Unix regexp
utilities, and was introduced in XEmacs 20.3.
 497
 498 Unfortunately, the non-greedy version of this quantifier does not exist
 499 currently, although it does in Perl.
 500
 501 @item [ @dots{} ]
 502 @cindex character set (in regexp)
 503 @cindex @samp{[} in regexp
 504 @cindex @samp{]} in regexp
 505 @samp{[} begins a @dfn{character set}, which is terminated by a
 506 @samp{]}.  In the simplest case, the characters between the two brackets
 507 form the set.  Thus, @samp{[ad]} matches either one @samp{a} or one
 508 @samp{d}, and @samp{[ad]*} matches any string composed of just @samp{a}s
 509 and @samp{d}s (including the empty string), from which it follows that
 510 @samp{c[ad]*r} matches @samp{cr}, @samp{car}, @samp{cdr},
 511 @samp{caddaar}, etc.@refill
 512
 513 The usual regular expression special characters are not special inside a
 514 character set.  A completely different set of special characters exists
 515 inside character sets: @samp{]}, @samp{-} and @samp{^}.@refill
 516
 517 @samp{-} is used for ranges of characters.  To write a range, write two
 518 characters with a @samp{-} between them.  Thus, @samp{[a-z]} matches any
 519 lower case letter.  Ranges may be intermixed freely with individual
 520 characters, as in @samp{[a-z$%.]}, which matches any lower case letter
 521 or @samp{$}, @samp{%}, or a period.@refill
 522
 523 To include a @samp{]} in a character set, make it the first character.
 524 For example, @samp{[]a]} matches @samp{]} or @samp{a}.  To include a
 525 @samp{-}, write @samp{-} as the first character in the set, or put it
 526 immediately after a range.  (You can replace one individual character
 527 @var{c} with the range @samp{@var{c}-@var{c}} to make a place to put the
 528 @samp{-}.)  There is no way to write a set containing just @samp{-} and
 529 @samp{]}.
 530
 531 To include @samp{^} in a set, put it anywhere but at the beginning of
 532 the set.
 533
 534 @item [^ @dots{} ]
 535 @cindex @samp{^} in regexp
 536 @samp{[^} begins a @dfn{complement character set}, which matches any
 537 character except the ones specified.  Thus, @samp{[^a-z0-9A-Z]}
 538 matches all characters @emph{except} letters and digits.@refill
 539
 540 @samp{^} is not special in a character set unless it is the first
 541 character.  The character following the @samp{^} is treated as if it
 542 were first (thus, @samp{-} and @samp{]} are not special there).
 543
 544 Note that a complement character set can match a newline, unless
 545 newline is mentioned as one of the characters not to match.
 546
 547 @item ^
 548 @cindex @samp{^} in regexp
 549 @cindex beginning of line in regexp
 550 is a special character that matches the empty string, but only at the
 551 beginning of a line in the text being matched.  Otherwise it fails to
 552 match anything.  Thus, @samp{^foo} matches a @samp{foo} that occurs at
 553 the beginning of a line.
 554
 555 When matching a string instead of a buffer, @samp{^} matches at the
 556 beginning of the string or after a newline character @samp{\n}.
 557
 558 @item $
 559 @cindex @samp{$} in regexp
 560 is similar to @samp{^} but matches only at the end of a line.  Thus,
 561 @samp{x+$} matches a string of one @samp{x} or more at the end of a line.
 562
 563 When matching a string instead of a buffer, @samp{$} matches at the end
 564 of the string or before a newline character @samp{\n}.
 565
 566 @item \
 567 @cindex @samp{\} in regexp
 568 has two functions: it quotes the special characters (including
 569 @samp{\}), and it introduces additional special constructs.
 570
 571 Because @samp{\} quotes special characters, @samp{\$} is a regular
 572 expression that matches only @samp{$}, and @samp{\[} is a regular
 573 expression that matches only @samp{[}, and so on.
 574
 575 @c Removed a paragraph here in lispref about doubling backslashes inside
 576 @c of Lisp strings.
 577
 578 @end table
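
The constructs in the table above can be tried out from Lisp with
@code{string-match}, which returns the index at which the first match
starts, or @code{nil} if there is no match.  The forms below are a
minimal sketch, assuming you evaluate them one at a time (for example
in the @samp{*scratch*} buffer); the test strings are arbitrary
illustrations.  Inside a Lisp string each @samp{\} of a regexp must be
written twice, because @samp{\} is also the escape character of Lisp
string syntax.  The non-greedy form needs a version that supports
@samp{*?}.

@example
;; * may match zero times, so fo* matches a lone "f".
(string-match "fo*" "f")
     @result{} 0

;; + needs at least one repetition; * does not.
(string-match "ca+r" "cr")
     @result{} nil
(string-match "ca*r" "cr")
     @result{} 0

;; A character set with a repetition operator.
(string-match "c[ad]*r" "caddaar")
     @result{} 0

;; Greedy versus non-greedy: match-end shows where each match stops.
(progn (string-match "a.*b" "aXbXb") (match-end 0))
     @result{} 5
(progn (string-match "a.*?b" "aXbXb") (match-end 0))
     @result{} 3

;; In a string, ^ also matches just after a newline.
(string-match "^foo" "bar\nfoo")
     @result{} 4

;; The regexp \$ (a quoted dollar sign) is written "\\$" in Lisp.
(string-match "\\$" "cost: $5")
     @result{} 6
@end example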
 579
 580 @strong{Please note:} For historical compatibility, special characters
 581 are treated as ordinary ones if they are in contexts where their special
 582 meanings make no sense.  For example, @samp{*foo} treats @samp{*} as
 583 ordinary since there is no preceding expression on which the @samp{*}
 584 can act.  It is poor practice to depend on this behavior; quote the
 585 special character anyway, regardless of where it appears.@refill
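
As an illustration of this advice, the sketch below quotes the
@samp{*} explicitly rather than relying on its position in the regexp.
The test string is an arbitrary example.  The regexp @samp{\*foo} is
written with a doubled backslash because it appears inside a Lisp
string.  @code{regexp-quote} is the standard Lisp function that
produces such a quoted regexp from literal text:

@example
;; Quote the * explicitly instead of depending on context.
(string-match "\\*foo" "a *foo")
     @result{} 2

;; regexp-quote builds the quoted regexp from a literal string.
(regexp-quote "*foo")
     @result{} "\\*foo"
@end example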
 586
 587 For the most part, @samp{\} followed by any character matches only
 588 that character.  However, there are several exceptions: characters
 589 that, when preceded by @samp{\}, are special constructs.  Such
 590 characters are always ordinary when encountered on their own.  Here
 591 is a table of @samp{\} constructs:
 592
 593 @table @kbd
 594 @item \|
 595 @cindex @samp{|} in regexp
 596 @cindex regexp alternative
 597 specifies an alternative.
 598 Two regular expressions @var{a} and @var{b} with @samp{\|} in
 599 between form an expression that matches anything that either @var{a} or
 600 @var{b} matches.@refill
 601
 602 Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
 603 but no other string.@refill
 604
 605 @samp{\|} applies to the largest possible surrounding expressions.  Only a
 606 surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
 607 @samp{\|}.@refill
 608
 609 Full backtracking capability exists to handle multiple uses of @samp{\|}.
 610
 611 @item \( @dots{} \)
 612 @cindex @samp{(} in regexp
 613 @cindex @samp{)} in regexp
 614 @cindex regexp grouping
 615 is a grouping construct that serves three purposes:
 616
 617 @enumerate
 618 @item
 619 To enclose a set of @samp{\|} alternatives for other operations.
 620 Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.
 621
 622 @item
 623 To enclose an expression for a suffix operator such as @samp{*} to act
 624 on.  Thus, @samp{ba\(na\)*} matches @samp{bananana}, etc., with any
 625 (zero or more) number of @samp{na} strings.@refill
 626
 627 @item
 628 To record a matched substring for future reference.
 629 @end enumerate
 630
 631 This last application is not a consequence of the idea of a
 632 parenthetical grouping; it is a separate feature that happens to be
 633 assigned as a second meaning to the same @samp{\( @dots{} \)} construct
 634 because there is no conflict in practice between the two meanings.
 635 Here is an explanation of this feature:
 636
 637 @item \@var{digit}
 638 matches the same text that matched the @var{digit}th occurrence of a
 639 @samp{\( @dots{} \)} construct.
 640
In other words, after the end of a @samp{\( @dots{} \)} construct, the
 642 matcher remembers the beginning and end of the text matched by that
 643 construct.  Then, later on in the regular expression, you can use
 644 @samp{\} followed by @var{digit} to match that same text, whatever it
 645 may have been.
 646
 647 The strings matching the first nine @samp{\( @dots{} \)} constructs
 648 appearing in a regular expression are assigned numbers 1 through 9 in
 649 the order that the open parentheses appear in the regular expression.
 650 So you can use @samp{\1} through @samp{\9} to refer to the text matched
 651 by the corresponding @samp{\( @dots{} \)} constructs.
 652
 653 For example, @samp{\(.*\)\1} matches any newline-free string that is
 654 composed of two identical halves.  The @samp{\(.*\)} matches the first
 655 half, which may be anything, but the @samp{\1} that follows must match
 656 the same exact text.
 657
 658 @item \(?: @dots{} \)
 659 @cindex @samp{\(?:} in regexp
 660 @cindex regexp grouping
 661 is called a @dfn{shy} grouping operator, and it is used just like
 662 @samp{\( @dots{} \)}, except that it does not cause the matched
 663 substring to be recorded for future reference.
 664
 665 This is useful when you need a lot of grouping @samp{\( @dots{} \)}
 666 constructs but only want to remember one or two, or when you have
 667 more than nine groupings and need to use backreferences to refer to
 668 the groupings at the end.
 669
 670 Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} when you
 671 don't need the captured substrings ought to speed up your programs some,
 672 since it shortens the code path followed by the regular expression
 673 engine, as well as the amount of memory allocation and string copying it
 674 must do.  The actual performance gain to be observed has not been
 675 measured or quantified as of this writing.
 676 @c This is used to good advantage by the font-locking code, and by
 677 @c `regexp-opt.el'.
 678
 679 The shy grouping operator has been borrowed from Perl, and has not been
 680 available prior to XEmacs 20.3, nor is it available in FSF Emacs.
 681
 682 @item \w
 683 @cindex @samp{\w} in regexp
 684 matches any word-constituent character.  The editor syntax table
 685 determines which characters these are.  @xref{Syntax}.
 686
 687 @item \W
 688 @cindex @samp{\W} in regexp
 689 matches any character that is not a word constituent.
 690
 691 @item \s@var{code}
 692 @cindex @samp{\s} in regexp
 693 matches any character whose syntax is @var{code}.  Here @var{code} is a
 694 character that represents a syntax code: thus, @samp{w} for word
 695 constituent, @samp{-} for whitespace, @samp{(} for open parenthesis,
 696 etc.  @xref{Syntax}, for a list of syntax codes and the characters that
 697 stand for them.
 698
 699 @item \S@var{code}
 700 @cindex @samp{\S} in regexp
 701 matches any character whose syntax is not @var{code}.
 702 @end table
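
  The grouping, back-reference, and shy-grouping constructs described
above can be tried from Lisp with @code{string-match} and
@code{match-string}.  The following is only a small sketch; the sample
strings are made up for illustration, and each backslash is doubled
because the regexps are written as Lisp string constants.

@example
;; Grouping a set of alternatives: the match starts at index 2.
(string-match "\\(foo\\|bar\\)x" "a barx")            ; 2
(match-string 1 "a barx")                             ; "bar"

;; Back-reference: the whole string must be two identical halves.
(string-match "\\`\\(.*\\)\\1\\'" "abcabc")           ; 0
(string-match "\\`\\(.*\\)\\1\\'" "abcabd")           ; nil

;; A shy group is not numbered, so group 1 is the "x+" part.
(string-match "\\(?:foo\\|bar\\)\\(x+\\)" "barxx")    ; 0
(match-string 1 "barxx")                              ; "xx"
@end example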
 703
 704   The following regular expression constructs match the empty string---that is,
 705 they don't use up any characters---but whether they match depends on the
 706 context.
 707
 708 @table @kbd
 709 @item \`
 710 @cindex @samp{\`} in regexp
 711 matches the empty string, but only at the beginning
 712 of the buffer or string being matched against.
 713
 714 @item \'
 715 @cindex @samp{\'} in regexp
 716 matches the empty string, but only at the end of
 717 the buffer or string being matched against.
 718
 719 @item \=
 720 @cindex @samp{\=} in regexp
 721 matches the empty string, but only at point.
 722 (This construct is not defined when matching against a string.)
 723
 724 @item \b
 725 @cindex @samp{\b} in regexp
 726 matches the empty string, but only at the beginning or
 727 end of a word.  Thus, @samp{\bfoo\b} matches any occurrence of
 728 @samp{foo} as a separate word.  @samp{\bballs?\b} matches
 729 @samp{ball} or @samp{balls} as a separate word.@refill
 730
 731 @item \B
 732 @cindex @samp{\B} in regexp
 733 matches the empty string, but @emph{not} at the beginning or
 734 end of a word.
 735
 736 @item \<
 737 @cindex @samp{\<} in regexp
 738 matches the empty string, but only at the beginning of a word.
 739
 740 @item \>
 741 @cindex @samp{\>} in regexp
 742 matches the empty string, but only at the end of a word.
 743 @end table
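
  As a rough illustration of the word-boundary constructs, the
following calls show where they do and do not match.  The sample
strings are invented for this sketch, and the results assume that
letters have word syntax and the space character does not, as in the
standard syntax table.

@example
;; \b requires a word boundary on both sides of "foo".
(string-match "\\bfoo\\b" "a foo b")        ; 2
(string-match "\\bfoo\\b" "afoob")          ; nil

;; The optional "s" still ends at a word boundary.
(string-match "\\bballs?\\b" "two balls")   ; 4

;; \< matches only where a word begins, so the "foo" inside
;; "xfoo" is skipped.
(string-match "\\<foo" "xfoo foo")          ; 5
@end example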
 744
 745   Here is a complicated regexp used by Emacs to recognize the end of a
 746 sentence together with any whitespace that follows.  It is given in Lisp
 747 syntax to enable you to distinguish the spaces from the tab characters.  In
 748 Lisp syntax, the string constant begins and ends with a double-quote.
 749 @samp{\"} stands for a double-quote as part of the regexp, @samp{\\} for a
 750 backslash as part of the regexp, @samp{\t} for a tab and @samp{\n} for a
 751 newline.
 752
 753 @example
 754 "[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*"
 755 @end example
 756
 757 @noindent
 758 This regexp contains four parts: a character set matching
 759 period, @samp{?} or @samp{!}; a character set matching close-brackets,
 760 quotes or parentheses, repeated any number of times; an alternative in
 761 backslash-parentheses that matches end-of-line, a tab or two spaces; and
 762 a character set matching whitespace characters, repeated any number of
 763 times.
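
  To see the four parts working together, the regexp can be applied to
a short string.  This is only a sketch with made-up sample text: the
first call matches the period and the two spaces after it, while the
second fails because a single space is not one of the three
alternatives.

@example
(string-match "[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*"
              "Done.  Next")                ; 4
(match-end 0)                               ; 7

(string-match "[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*"
              "Done. Next")                 ; nil
@end example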
 764
 765 @node Search Case, Replace, Regexps, Search
 766 @section Searching and Case
 767
 768 @vindex case-fold-search
 769   All searches in Emacs normally ignore the case of the text they
 770 are searching through; if you specify searching for @samp{FOO},
 771 @samp{Foo} and @samp{foo} are also considered a match.  Regexps, and in
 772 particular character sets, are included: @samp{[aB]} matches @samp{a}
 773 or @samp{A} or @samp{b} or @samp{B}.@refill
 774
 775   If you want a case-sensitive search, set the variable
 776 @code{case-fold-search} to @code{nil}.  Then all letters must match
 777 exactly, including case.  @code{case-fold-search} is a per-buffer
 778 variable; altering it affects only the current buffer, but
 779 there is a default value which you can change as well.  @xref{Locals}.
 780 You can also use @b{Case Sensitive Search} from the @b{Options} menu
 781 on your screen.
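
  In Lisp code, the same variable can be bound temporarily around a
search, which leaves both the buffer-local value and the default value
untouched.  A minimal sketch, using an invented sample string:

@example
;; Case is ignored: "foo" matches "FOO" at index 2.
(let ((case-fold-search t))
  (string-match "foo" "a FOO"))             ; 2

;; Case is significant: no match.
(let ((case-fold-search nil))
  (string-match "foo" "a FOO"))             ; nil
@end example

@noindent
To change the default value rather than the current buffer's value,
use @code{setq-default}, for example in your init file.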
 782
 783 @node Replace, Other Repeating Search, Search Case, Search
 784 @section Replacement Commands
 785 @cindex replacement
 786 @cindex string substitution
 787 @cindex global substitution
 788
 789   Global search-and-replace operations are not needed as often in Emacs as
 790 they are in other editors, but they are available.  In addition to the
 791 simple @code{replace-string} command which is like that found in most
 792 editors, there is a @code{query-replace} command which asks you, for each
 793 occurrence of a pattern, whether to replace it.
 794
 795   The replace commands all replace one string (or regexp) with one
 796 replacement string.  It is possible to perform several replacements in
 797 parallel using the command @code{expand-region-abbrevs}.  @xref{Expanding
 798 Abbrevs}.
 799
 800 @menu
 801 * Unconditional Replace::  Replacing all matches for a string.
 802 * Regexp Replace::         Replacing all matches for a regexp.
 803 * Replacement and Case::   How replacements preserve case of letters.
 804 * Query Replace::          How to use querying.
 805 @end menu
 806
 807 @node Unconditional Replace, Regexp Replace, Replace, Replace
 808 @subsection Unconditional Replacement
 809 @findex replace-string
 810 @findex replace-regexp
 811
 812 @table @kbd
 813 @item M-x replace-string @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
 814 Replace every occurrence of @var{string} with @var{newstring}.
 815 @item M-x replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
 816 Replace every match for @var{regexp} with @var{newstring}.
 817 @end table
 818
 819   To replace every instance of @samp{foo} after point with @samp{bar},
 820 use the command @kbd{M-x replace-string} with the two arguments
 821 @samp{foo} and @samp{bar}.  Replacement occurs only after point: if you
 822 want to cover the whole buffer you must go to the beginning first.  By
 823 default, all occurrences up to the end of the buffer are replaced.  To
 824 limit replacement to part of the buffer, narrow to that part of the
 825 buffer before doing the replacement (@pxref{Narrowing}).
 826
 827   When @code{replace-string} exits, point is left at the last occurrence
 828 replaced.  The value of point when the @code{replace-string} command was
 829 issued is remembered on the mark ring; @kbd{C-u C-@key{SPC}} moves back
 830 there.
 831
 832   A numeric argument restricts replacement to matches that are surrounded
 833 by word boundaries.
 834
 835 @node Regexp Replace, Replacement and Case, Unconditional Replace, Replace
 836 @subsection Regexp Replacement
 837
 838   @code{replace-string} replaces exact matches for a single string.  The
 839 similar command @code{replace-regexp} replaces any match for a specified
 840 pattern.
 841
 842   In @code{replace-regexp}, the @var{newstring} need not be constant.  It
 843 can refer to all or part of what is matched by the @var{regexp}.  @samp{\&}
 844 in @var{newstring} stands for the entire text being replaced.
 845 @samp{\@var{d}} in @var{newstring}, where @var{d} is a digit, stands for
 846 whatever matched the @var{d}'th parenthesized grouping in @var{regexp}.
 847 For example,@refill
 848
 849 @example
 850 M-x replace-regexp @key{RET} c[ad]+r @key{RET} \&-safe @key{RET}
 851 @end example
 852
 853 @noindent
 854 would replace (for example) @samp{cadr} with @samp{cadr-safe} and @samp{cddr}
 855 with @samp{cddr-safe}.
 856
 857 @example
 858 M-x replace-regexp @key{RET} \(c[ad]+r\)-safe @key{RET} \1 @key{RET}
 859 @end example
 860
 861 @noindent
 862 would perform exactly the opposite replacements.  To include a @samp{\}
 863 in the text to replace with, you must give @samp{\\}.
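
  In a Lisp program, a replacement like the first one above is usually
written as a loop around @code{re-search-forward} and
@code{replace-match}.  The following is only a sketch: the function
name @code{my-add-safe-suffix} is hypothetical, and passing @code{t} as
the second argument of @code{replace-match} turns off case conversion
of the replacement text.

@example
(defun my-add-safe-suffix ()
  "Append -safe to each match for c[ad]+r after point."
  (save-excursion
    ;; Search from point to the end of the buffer; the third
    ;; argument makes the search return nil instead of
    ;; signaling an error when nothing more is found.
    (while (re-search-forward "c[ad]+r" nil t)
      ;; \& in the replacement stands for the matched text.
      (replace-match "\\&-safe" t))))
@end example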
 864
 865 @node Replacement and Case, Query Replace, Regexp Replace, Replace
 866 @subsection Replace Commands and Case
 867
 868 @vindex case-replace
 869 @vindex case-fold-search
 870   If the arguments to a replace command are in lower case, the command
 871 preserves case when it makes a replacement.  Thus, the following command:
 872
 873 @example
 874 M-x replace-string @key{RET} foo @key{RET} bar @key{RET}
 875 @end example
 876
 877 @noindent
 878 replaces a lower-case @samp{foo} with a lower-case @samp{bar}, @samp{FOO}
 879 with @samp{BAR}, and @samp{Foo} with @samp{Bar}.  If upper-case letters are
 880 used in the second argument, they remain upper-case every time that
 881 argument is inserted.  If upper-case letters are used in the first
 882 argument, the second argument is always substituted exactly as given, with
 883 no case conversion.  Likewise, if the variable @code{case-replace} is set
 884 to @code{nil}, replacement is done without case conversion.  If
 885 @code{case-fold-search} is set to @code{nil}, case is significant in
 886 matching occurrences of @samp{foo} to replace; also, case conversion of the
 887 replacement string is not done.
 888
 889 @node Query Replace,, Replacement and Case, Replace
 890 @subsection Query Replace
 891 @cindex query replace
 892
 893 @table @kbd
 894 @item M-% @var{string} @key{RET} @var{newstring} @key{RET}
 895 @itemx M-x query-replace @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
 896 Replace some occurrences of @var{string} with @var{newstring}.
 897 @item M-x query-replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
 898 Replace some matches for @var{regexp} with @var{newstring}.
 899 @end table
 900
 901 @kindex M-%
 902 @findex query-replace
 903   If you want to change only some of the occurrences of @samp{foo} to
 904 @samp{bar}, not all of them, use @kbd{M-%} (@code{query-replace}) instead
 905 of @code{replace-string}.  This command finds occurrences of @samp{foo} one by one,
 906 displays each occurrence, and asks you whether to replace it.  A numeric
 907 argument to @code{query-replace} tells it to consider only occurrences
 908 that are bounded by word-delimiter characters.@refill
 909
 910 @findex query-replace-regexp
 911   Aside from querying, @code{query-replace} works just like
 912 @code{replace-string}, and @code{query-replace-regexp} works
 913 just like @code{replace-regexp}.@refill
 914
 915   The things you can type when you are shown an occurrence of @var{string}
 916 or a match for @var{regexp} are:
 917
 918 @kindex SPC (query-replace)
 919 @kindex DEL (query-replace)
 920 @kindex , (query-replace)
 921 @kindex ESC (query-replace)
 922 @kindex . (query-replace)
 923 @kindex ! (query-replace)
 924 @kindex ^ (query-replace)
 925 @kindex C-r (query-replace)
 926 @kindex C-w (query-replace)
 927 @kindex C-l (query-replace)
 928
 929 @c WideCommands
 930 @table @kbd
 931 @item @key{SPC}
 932 to replace the occurrence with @var{newstring}.  This preserves case, just
 933 like @code{replace-string}, provided @code{case-replace} is non-@code{nil},
 934 as it normally is.@refill
 935
 936 @item @key{DEL}
 937 to skip to the next occurrence without replacing this one.
 938
 939 @item , @r{(Comma)}
 940 to replace this occurrence and display the result.  You are then
 941 prompted for another input character.  However, since the replacement has
 942 already been made, @key{DEL} and @key{SPC} are equivalent.  At this
 943 point, you can type @kbd{C-r} (see below) to alter the replaced text.  To
 944 undo the replacement, you can type @kbd{C-x u}.
 945 This exits the @code{query-replace}.  If you want to do further
 946 replacement you must use @kbd{C-x @key{ESC}} to restart (@pxref{Repetition}).
 947
 948 @item @key{ESC}
 949 to exit without doing any more replacements.
 950
 951 @item .@: @r{(Period)}
 952 to replace this occurrence and then exit.
 953
 954 @item !
 955 to replace all remaining occurrences without asking again.
 956
 957 @item ^
 958 to go back to the location of the previous occurrence (or what used to
 959 be an occurrence), in case you changed it by mistake.  This works by
 960 popping the mark ring.  Only one @kbd{^} in a row is allowed, because
 961 only one previous replacement location is kept during @code{query-replace}.
 962
 963 @item C-r
 964 to enter a recursive editing level, in case the occurrence needs to be
 965 edited rather than just replaced with @var{newstring}.  When you are
 966 done, exit the recursive editing level with @kbd{C-M-c} and the next
 967 occurrence will be displayed.  @xref{Recursive Edit}.
 968
 969 @item C-w
 970 to delete the occurrence, and then enter a recursive editing level as
 971 in @kbd{C-r}.  Use the recursive edit to insert text to replace the
 972 deleted occurrence of @var{string}.  When done, exit the recursive
 973 editing level with @kbd{C-M-c} and the next occurrence will be
 974 displayed.
 975
 976 @item C-l
 977 to redisplay the screen and then give another answer.
 978
 979 @item C-h
 980 to display a message summarizing these options, then give another
 981 answer.
 982 @end table
 983
 984   If you type any other character, Emacs exits the @code{query-replace}, and
 985 executes the character as a command.  To restart the @code{query-replace},
 986 use @kbd{C-x @key{ESC}}, which repeats the @code{query-replace} because it
 987 used the minibuffer to read its arguments.  @xref{Repetition, C-x ESC}.
 988
 989 @node Other Repeating Search,, Replace, Search
 990 @section Other Search-and-Loop Commands
 991
 992   Here are some other commands that find matches for a regular expression.
 993 They all operate from point to the end of the buffer.
 994
 995 @findex list-matching-lines
 996 @findex occur
 997 @findex count-matches
 998 @findex delete-non-matching-lines
 999 @findex delete-matching-lines
1000 @c grosscommands
1001 @table @kbd
1002 @item M-x occur
1003 Print each line that follows point and contains a match for the
1004 specified regexp.  A numeric argument specifies the number of context
1005 lines to print before and after each matching line; the default is
1006 none.
1007
1008 @kindex C-c C-c (Occur mode)
1009 The buffer @samp{*Occur*} containing the output serves as a menu for
1010 finding occurrences in their original context.  Find an occurrence
1011 as listed in @samp{*Occur*}, position point there, and type @kbd{C-c
1012 C-c}; this switches to the buffer that was searched and moves point to
 1013 the original location of that occurrence.
1014
1015 @item M-x list-matching-lines
1016 Synonym for @kbd{M-x occur}.
1017
1018 @item M-x count-matches
1019 Print the number of matches following point for the specified regexp.
1020
1021 @item M-x delete-non-matching-lines
1022 Delete each line that follows point and does not contain a match for
1023 the specified regexp.
1024
1025 @item M-x delete-matching-lines
1026 Delete each line that follows point and contains a match for the
1027 specified regexp.
1028 @end table