   1 @c -*-texinfo-*-
   2 @c This is part of the XEmacs Lisp Reference Manual.
   3 @c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc.
   4 @c See the file lispref.texi for copying conditions.
   5 @setfilename ../../info/syntax.info
   6 @node Syntax Tables, Abbrevs, Searching and Matching, Top
   7 @chapter Syntax Tables
   8 @cindex parsing
   9 @cindex syntax table
  10 @cindex text parsing
  11
  12   A @dfn{syntax table} specifies the syntactic textual function of each
  13 character.  This information is used by the parsing commands, the
  14 complex movement commands, and others to determine where words, symbols,
  15 and other syntactic constructs begin and end.  The current syntax table
  16 controls the meaning of the word motion functions (@pxref{Word Motion})
  17 and the list motion functions (@pxref{List Motion}) as well as the
  18 functions in this chapter.
  19
  20 @menu
  21 * Basics: Syntax Basics.     Basic concepts of syntax tables.
  22 * Desc: Syntax Descriptors.  How characters are classified.
  23 * Syntax Table Functions::   How to create, examine and alter syntax tables.
  24 * Motion and Syntax::        Moving over characters with certain syntaxes.
  25 * Parsing Expressions::      Parsing balanced expressions
  26                                 using the syntax table.
  27 * Standard Syntax Tables::   Syntax tables used by various major modes.
  28 * Syntax Table Internals::   How syntax table information is stored.
  29 @end menu
  30
  31 @node Syntax Basics
  32 @section Syntax Table Concepts
  33
  34 @ifinfo
  35   A @dfn{syntax table} provides Emacs with the information that
  36 determines the syntactic use of each character in a buffer.  This
  37 information is used by the parsing commands, the complex movement
  38 commands, and others to determine where words, symbols, and other
  39 syntactic constructs begin and end.  The current syntax table controls
  40 the meaning of the word motion functions (@pxref{Word Motion}) and the
  41 list motion functions (@pxref{List Motion}) as well as the functions in
  42 this chapter.
  43 @end ifinfo
  44
  45   Under XEmacs 20, a syntax table is a particular subtype of the
  46 primitive char table type (@pxref{Char Tables}), and each element of the
  47 char table is an integer that encodes the syntax of the character in
  48 question, or a cons of such an integer and a matching character (for
  49 characters with parenthesis syntax).
  50
  51   Under XEmacs 19, a syntax table is a vector of 256 elements; it
  52 contains one entry for each of the 256 possible characters in an 8-bit
  53 byte.  Each element is an integer that encodes the syntax of the
  54 character in question. (The matching character, if any, is embedded
  55 in the bits of this integer.)
  56
  57   Syntax tables are used only for moving across text, not for the Emacs
  58 Lisp reader.  XEmacs Lisp uses built-in syntactic rules when reading Lisp
  59 expressions, and these rules cannot be changed.
  60
  61   Each buffer has its own major mode, and each major mode has its own
  62 idea of the syntactic class of various characters.  For example, in Lisp
  63 mode, the character @samp{;} begins a comment, but in C mode, it
  64 terminates a statement.  To support these variations, XEmacs makes the
  65 choice of syntax table local to each buffer.  Typically, each major
  66 mode has its own syntax table and installs that table in each buffer
  67 that uses that mode.  Changing this table alters the syntax in all
  68 those buffers as well as in any buffers subsequently put in that mode.
  69 Occasionally several similar modes share one syntax table.
  70 @xref{Example Major Modes}, for an example of how to set up a syntax
  71 table.
  72
  73 A syntax table can inherit the data for some characters from the
  74 standard syntax table, while specifying other characters itself.  The
  75 ``inherit'' syntax class means ``inherit this character's syntax from
  76 the standard syntax table.''  Most major modes' syntax tables inherit
  77 the syntax of character codes 0 through 31 and 128 through 255.  This is
  78 useful with character sets such as ISO Latin-1 that have additional
  79 alphabetic characters in the range 128 to 255.  Just changing the
  80 standard syntax for these characters affects all major modes.
  81
  82 @defun syntax-table-p object
   83 This function returns @code{t} if @var{object} is a vector of 256
  84 elements.  This means that the vector may be a syntax table.  However,
  85 according to this test, any vector of length 256 is considered to be a
  86 syntax table, no matter what its contents.
  87 @end defun
  88
  89 @node Syntax Descriptors
  90 @section Syntax Descriptors
  91 @cindex syntax classes
  92
  93   This section describes the syntax classes and flags that denote the
  94 syntax of a character, and how they are represented as a @dfn{syntax
  95 descriptor}, which is a Lisp string that you pass to
  96 @code{modify-syntax-entry} to specify the desired syntax.
  97
  98   XEmacs defines a number of @dfn{syntax classes}.  Each syntax table
  99 puts each character into one class.  There is no necessary relationship
 100 between the class of a character in one syntax table and its class in
 101 any other table.
 102
 103   Each class is designated by a mnemonic character, which serves as the
 104 name of the class when you need to specify a class.  Usually the
 105 designator character is one that is frequently in that class; however,
 106 its meaning as a designator is unvarying and independent of what syntax
 107 that character currently has.
 108
 109 @cindex syntax descriptor
 110   A syntax descriptor is a Lisp string that specifies a syntax class, a
 111 matching character (used only for the parenthesis classes) and flags.
 112 The first character is the designator for a syntax class.  The second
 113 character is the character to match; if it is unused, put a space there.
 114 Then come the characters for any desired flags.  If no matching
 115 character or flags are needed, one character is sufficient.
 116
 117   For example, the descriptor for the character @samp{*} in C mode is
 118 @samp{@w{. 23}} (i.e., punctuation, matching character slot unused,
  119 second character of a comment-starter, first character of a
 120 comment-ender), and the entry for @samp{/} is @samp{@w{. 14}} (i.e.,
 121 punctuation, matching character slot unused, first character of a
 122 comment-starter, second character of a comment-ender).
 123
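  As a minimal sketch of this layout, the following helper splits a
descriptor string into its three parts.  The function name
@code{my-split-syntax-descriptor} is hypothetical and is not part of
XEmacs; it relies only on @code{substring}, @code{length} and
@code{string=}.

@example
@group
(defun my-split-syntax-descriptor (descriptor)
  "Return (CLASS MATCH FLAGS) for the string DESCRIPTOR.
CLASS is the one-character class designator, MATCH is the matching
character as a string (nil if the slot is unused or absent), and
FLAGS is the possibly empty string of flag characters."
  (let ((len (length descriptor)))
    (list (substring descriptor 0 1)
          (if (and (> len 1)
                   (not (string= (substring descriptor 1 2) " ")))
              (substring descriptor 1 2))
          (if (> len 2) (substring descriptor 2) ""))))
@end group

@group
(my-split-syntax-descriptor ". 23")
     @result{} ("." nil "23")
@end group

@group
(my-split-syntax-descriptor "(^")
     @result{} ("(" "^" "")
@end group
@end example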
 124 @menu
 125 * Syntax Class Table::      Table of syntax classes.
 126 * Syntax Flags::            Additional flags each character can have.
 127 @end menu
 128
 129 @node Syntax Class Table
 130 @subsection Table of Syntax Classes
 131
 132   Here is a table of syntax classes, the characters that stand for them,
 133 their meanings, and examples of their use.
 134
 135 @deffn {Syntax class} @w{whitespace character}
 136 @dfn{Whitespace characters} (designated with @w{@samp{@ }} or @samp{-})
 137 separate symbols and words from each other.  Typically, whitespace
 138 characters have no other syntactic significance, and multiple whitespace
 139 characters are syntactically equivalent to a single one.  Space, tab,
 140 newline and formfeed are almost always classified as whitespace.
 141 @end deffn
 142
 143 @deffn {Syntax class} @w{word constituent}
 144 @dfn{Word constituents} (designated with @samp{w}) are parts of normal
 145 English words and are typically used in variable and command names in
 146 programs.  All upper- and lower-case letters, and the digits, are typically
 147 word constituents.
 148 @end deffn
 149
 150 @deffn {Syntax class} @w{symbol constituent}
 151 @dfn{Symbol constituents} (designated with @samp{_}) are the extra
 152 characters that are used in variable and command names along with word
 153 constituents.  For example, the symbol constituents class is used in
 154 Lisp mode to indicate that certain characters may be part of symbol
 155 names even though they are not part of English words.  These characters
 156 are @samp{$&*+-_<>}.  In standard C, the only non-word-constituent
 157 character that is valid in symbols is underscore (@samp{_}).
 158 @end deffn
 159
 160 @deffn {Syntax class} @w{punctuation character}
  161 @dfn{Punctuation characters} (designated with @samp{.}) are those
  162 characters that are used as punctuation in English, or are used in some
  163 way in a programming language to separate symbols from one another.
  164 Most programming language modes, including Emacs Lisp mode, have no
  165 characters in this class since the few characters that are not symbol or
  166 word constituents all have other uses.
 167 @end deffn
 168
 169 @deffn {Syntax class} @w{open parenthesis character}
 170 @deffnx {Syntax class} @w{close parenthesis character}
 171 @cindex parenthesis syntax
 172 Open and close @dfn{parenthesis characters} are characters used in
 173 dissimilar pairs to surround sentences or expressions.  Such a grouping
 174 is begun with an open parenthesis character and terminated with a close.
 175 Each open parenthesis character matches a particular close parenthesis
 176 character, and vice versa.  Normally, XEmacs indicates momentarily the
 177 matching open parenthesis when you insert a close parenthesis.
 178 @xref{Blinking}.
 179
 180 The class of open parentheses is designated with @samp{(}, and that of
 181 close parentheses with @samp{)}.
 182
 183 In English text, and in C code, the parenthesis pairs are @samp{()},
 184 @samp{[]}, and @samp{@{@}}.  In XEmacs Lisp, the delimiters for lists and
 185 vectors (@samp{()} and @samp{[]}) are classified as parenthesis
 186 characters.
 187 @end deffn
 188
 189 @deffn {Syntax class} @w{string quote}
 190 @dfn{String quote characters} (designated with @samp{"}) are used in
 191 many languages, including Lisp and C, to delimit string constants.  The
 192 same string quote character appears at the beginning and the end of a
 193 string.  Such quoted strings do not nest.
 194
 195 The parsing facilities of XEmacs consider a string as a single token.
 196 The usual syntactic meanings of the characters in the string are
 197 suppressed.
 198
 199 The Lisp modes have two string quote characters: double-quote (@samp{"})
 200 and vertical bar (@samp{|}).  @samp{|} is not used in XEmacs Lisp, but it
 201 is used in Common Lisp.  C also has two string quote characters:
 202 double-quote for strings, and single-quote (@samp{'}) for character
 203 constants.
 204
 205 English text has no string quote characters because English is not a
 206 programming language.  Although quotation marks are used in English,
 207 we do not want them to turn off the usual syntactic properties of
 208 other characters in the quotation.
 209 @end deffn
 210
 211 @deffn {Syntax class} @w{escape}
 212 An @dfn{escape character} (designated with @samp{\}) starts an escape
 213 sequence such as is used in C string and character constants.  The
 214 character @samp{\} belongs to this class in both C and Lisp.  (In C, it
 215 is used thus only inside strings, but it turns out to cause no trouble
 216 to treat it this way throughout C code.)
 217
 218 Characters in this class count as part of words if
 219 @code{words-include-escapes} is non-@code{nil}.  @xref{Word Motion}.
 220 @end deffn
 221
 222 @deffn {Syntax class} @w{character quote}
 223 A @dfn{character quote character} (designated with @samp{/}) quotes the
 224 following character so that it loses its normal syntactic meaning.  This
 225 differs from an escape character in that only the character immediately
 226 following is ever affected.
 227
 228 Characters in this class count as part of words if
 229 @code{words-include-escapes} is non-@code{nil}.  @xref{Word Motion}.
 230
 231 This class is used for backslash in @TeX{} mode.
 232 @end deffn
 233
 234 @deffn {Syntax class} @w{paired delimiter}
 235 @dfn{Paired delimiter characters} (designated with @samp{$}) are like
 236 string quote characters except that the syntactic properties of the
 237 characters between the delimiters are not suppressed.  Only @TeX{} mode
 238 uses a paired delimiter presently---the @samp{$} that both enters and
 239 leaves math mode.
 240 @end deffn
 241
 242 @deffn {Syntax class} @w{expression prefix}
 243 An @dfn{expression prefix operator} (designated with @samp{'}) is used
 244 for syntactic operators that are part of an expression if they appear
 245 next to one.  These characters in Lisp include the apostrophe, @samp{'}
 246 (used for quoting), the comma, @samp{,} (used in macros), and @samp{#}
 247 (used in the read syntax for certain data types).
 248 @end deffn
 249
 250 @deffn {Syntax class} @w{comment starter}
 251 @deffnx {Syntax class} @w{comment ender}
 252 @cindex comment syntax
 253 The @dfn{comment starter} and @dfn{comment ender} characters are used in
 254 various languages to delimit comments.  These classes are designated
 255 with @samp{<} and @samp{>}, respectively.
 256
 257 English text has no comment characters.  In Lisp, the semicolon
 258 (@samp{;}) starts a comment and a newline or formfeed ends one.
 259 @end deffn
 260
 261 @deffn {Syntax class} @w{inherit}
 262 This syntax class does not specify a syntax.  It says to look in the
 263 standard syntax table to find the syntax of this character.  The
 264 designator for this syntax code is @samp{@@}.
 265 @end deffn
 266
 267 @node Syntax Flags
 268 @subsection Syntax Flags
 269 @cindex syntax flags
 270
 271   In addition to the classes, entries for characters in a syntax table
 272 can include flags.  There are six possible flags, represented by the
 273 characters @samp{1}, @samp{2}, @samp{3}, @samp{4}, @samp{b} and
 274 @samp{p}.
 275
 276   All the flags except @samp{p} are used to describe multi-character
 277 comment delimiters.  The digit flags indicate that a character can
 278 @emph{also} be part of a comment sequence, in addition to the syntactic
 279 properties associated with its character class.  The flags are
 280 independent of the class and each other for the sake of characters such
 281 as @samp{*} in C mode, which is a punctuation character, @emph{and} the
 282 second character of a start-of-comment sequence (@samp{/*}), @emph{and}
 283 the first character of an end-of-comment sequence (@samp{*/}).
 284
 285 The flags for a character @var{c} are:
 286
 287 @itemize @bullet
 288 @item
 289 @samp{1} means @var{c} is the start of a two-character comment-start
 290 sequence.
 291
 292 @item
 293 @samp{2} means @var{c} is the second character of such a sequence.
 294
 295 @item
 296 @samp{3} means @var{c} is the start of a two-character comment-end
 297 sequence.
 298
 299 @item
 300 @samp{4} means @var{c} is the second character of such a sequence.
 301
 302 @item
 303 @c Emacs 19 feature
 304 @samp{b} means that @var{c} as a comment delimiter belongs to the
 305 alternative ``b'' comment style.
 306
 307 Emacs supports two comment styles simultaneously in any one syntax
 308 table.  This is for the sake of C++.  Each style of comment syntax has
 309 its own comment-start sequence and its own comment-end sequence.  Each
 310 comment must stick to one style or the other; thus, if it starts with
 311 the comment-start sequence of style ``b'', it must also end with the
 312 comment-end sequence of style ``b''.
 313
 314 The two comment-start sequences must begin with the same character; only
 315 the second character may differ.  Mark the second character of the
 316 ``b''-style comment-start sequence with the @samp{b} flag.
 317
 318 A comment-end sequence (one or two characters) applies to the ``b''
 319 style if its first character has the @samp{b} flag set; otherwise, it
 320 applies to the ``a'' style.
 321
 322 The appropriate comment syntax settings for C++ are as follows:
 323
 324 @table @asis
 325 @item @samp{/}
 326 @samp{124b}
 327 @item @samp{*}
 328 @samp{23}
 329 @item newline
 330 @samp{>b}
 331 @end table
 332
 333 This defines four comment-delimiting sequences:
 334
 335 @table @asis
 336 @item @samp{/*}
 337 This is a comment-start sequence for ``a'' style because the
 338 second character, @samp{*}, does not have the @samp{b} flag.
 339
 340 @item @samp{//}
 341 This is a comment-start sequence for ``b'' style because the second
 342 character, @samp{/}, does have the @samp{b} flag.
 343
 344 @item @samp{*/}
 345 This is a comment-end sequence for ``a'' style because the first
  346 character, @samp{*}, does not have the @samp{b} flag.
 347
 348 @item newline
 349 This is a comment-end sequence for ``b'' style, because the newline
 350 character has the @samp{b} flag.
 351 @end table
 352
 353 @item
 354 @c Emacs 19 feature
 355 @samp{p} identifies an additional ``prefix character'' for Lisp syntax.
 356 These characters are treated as whitespace when they appear between
 357 expressions.  When they appear within an expression, they are handled
 358 according to their usual syntax codes.
 359
 360 The function @code{backward-prefix-chars} moves back over these
 361 characters, as well as over characters whose primary syntax class is
 362 prefix (@samp{'}).  @xref{Motion and Syntax}.
 363 @end itemize
 364
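  The C++ settings in the table above could be installed as follows.
This is only a sketch: the function name
@code{my-make-c++-like-syntax-table} is hypothetical, and giving
@samp{/} and @samp{*} the punctuation class follows the C mode
descriptors shown earlier in this chapter rather than the actual source
of any C++ mode.

@example
@group
(defun my-make-c++-like-syntax-table ()
  "Return a new syntax table with C++-style comment delimiters."
  (let ((table (make-syntax-table)))
    ;; `/' starts both comment styles and ends style `a'; its `b'
    ;; flag marks it as the second character of the starter `//'.
    (modify-syntax-entry ?/ ". 124b" table)
    ;; `*' is the second character of `/*' and the first of `*/'.
    (modify-syntax-entry ?* ". 23" table)
    ;; Newline is a comment ender that belongs to style `b'.
    (modify-syntax-entry ?\n "> b" table)
    table))
@end group
@end example

Because the entries are made in a fresh table, no existing buffer is
affected until the table is installed with @code{set-syntax-table}.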
 365 @node Syntax Table Functions
 366 @section Syntax Table Functions
 367
 368   In this section we describe functions for creating, accessing and
 369 altering syntax tables.
 370
 371 @defun make-syntax-table &optional table
 372 This function creates a new syntax table.  Character codes 0 through
 373 31 and 128 through 255 are set up to inherit from the standard syntax
 374 table.  The other character codes are set up by copying what the
 375 standard syntax table says about them.
 376
 377 Most major mode syntax tables are created in this way.
 378 @end defun
 379
 380 @defun copy-syntax-table &optional table
 381 This function constructs a copy of @var{table} and returns it.  If
 382 @var{table} is not supplied (or is @code{nil}), it returns a copy of the
 383 current syntax table.  Otherwise, an error is signaled if @var{table} is
 384 not a syntax table.
 385 @end defun
 386
 387 @deffn Command modify-syntax-entry char syntax-descriptor  &optional table
 388 This function sets the syntax entry for @var{char} according to
 389 @var{syntax-descriptor}.  The syntax is changed only for @var{table},
 390 which defaults to the current buffer's syntax table, and not in any
 391 other syntax table.  The argument @var{syntax-descriptor} specifies the
 392 desired syntax; this is a string beginning with a class designator
 393 character, and optionally containing a matching character and flags as
 394 well.  @xref{Syntax Descriptors}.
 395
 396 This function always returns @code{nil}.  The old syntax information in
 397 the table for this character is discarded.
 398
 399 An error is signaled if the first character of the syntax descriptor is not
  400 one of the syntax class designator characters.  An error is also
 401 signaled if @var{char} is not a character.
 402
 403 @example
 404 @group
 405 @exdent @r{Examples:}
 406
 407 ;; @r{Put the space character in class whitespace.}
 408 (modify-syntax-entry ?\  " ")
 409      @result{} nil
 410 @end group
 411
 412 @group
 413 ;; @r{Make @samp{$} an open parenthesis character,}
 414 ;;   @r{with @samp{^} as its matching close.}
 415 (modify-syntax-entry ?$ "(^")
 416      @result{} nil
 417 @end group
 418
 419 @group
 420 ;; @r{Make @samp{^} a close parenthesis character,}
 421 ;;   @r{with @samp{$} as its matching open.}
 422 (modify-syntax-entry ?^ ")$")
 423      @result{} nil
 424 @end group
 425
 426 @group
 427 ;; @r{Make @samp{/} a punctuation character,}
 428 ;;   @r{the first character of a start-comment sequence,}
 429 ;;   @r{and the second character of an end-comment sequence.}
 430 ;;   @r{This is used in C mode.}
 431 (modify-syntax-entry ?/ ". 14")
 432      @result{} nil
 433 @end group
 434 @end example
 435 @end deffn
 436
 437 @defun char-syntax character
 438 This function returns the syntax class of @var{character}, represented
 439 by its mnemonic designator character.  This @emph{only} returns the
 440 class, not any matching parenthesis or flags.
 441
  442 An error is signaled if @var{character} is not a character.
 443
 444 The following examples apply to C mode.  The first example shows that
 445 the syntax class of space is whitespace (represented by a space).  The
 446 second example shows that the syntax of @samp{/} is punctuation.  This
 447 does not show the fact that it is also part of comment-start and -end
 448 sequences.  The third example shows that open parenthesis is in the class
 449 of open parentheses.  This does not show the fact that it has a matching
 450 character, @samp{)}.
 451
 452 @example
 453 @group
 454 (char-to-string (char-syntax ?\ ))
 455      @result{} " "
 456 @end group
 457
 458 @group
 459 (char-to-string (char-syntax ?/))
 460      @result{} "."
 461 @end group
 462
 463 @group
 464 (char-to-string (char-syntax ?\())
 465      @result{} "("
 466 @end group
 467 @end example
 468 @end defun
 469
 470 @defun set-syntax-table table &optional buffer
 471 This function makes @var{table} the syntax table for @var{buffer}, which
 472 defaults to the current buffer if omitted.  It returns @var{table}.
 473 @end defun
 474
 475 @defun syntax-table &optional buffer
 476 This function returns the syntax table for @var{buffer}, which defaults
 477 to the current buffer if omitted.
 478 @end defun
 479
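  A common use of these two functions together is to install a table
temporarily and then restore the previous one.  The following is a
sketch of that pattern; @code{my-call-with-syntax-table} is a
hypothetical helper, not an XEmacs function.

@example
@group
(defun my-call-with-syntax-table (table body-function)
  "Call BODY-FUNCTION with TABLE as the current syntax table.
The previous syntax table of the current buffer is restored
afterwards, even if BODY-FUNCTION signals an error.  The value of
BODY-FUNCTION is returned."
  (let ((old-table (syntax-table)))
    (unwind-protect
        (progn
          (set-syntax-table table)
          (funcall body-function))
      (set-syntax-table old-table))))
@end group

@group
;; @r{Treat @samp{_} as a word constituent for the duration of one call.}
(let ((table (copy-syntax-table (standard-syntax-table))))
  (modify-syntax-entry ?_ "w" table)
  (my-call-with-syntax-table
   table
   (lambda () (char-to-string (char-syntax ?_)))))
     @result{} "w"
@end group
@end example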
 480 @node Motion and Syntax
 481 @section Motion and Syntax
 482
 483   This section describes functions for moving across characters in
 484 certain syntax classes.  None of these functions exists in Emacs
 485 version 18 or earlier.
 486
 487 @defun skip-syntax-forward syntaxes &optional limit buffer
 488 This function moves point forward across characters having syntax classes
 489 mentioned in @var{syntaxes}.  It stops when it encounters the end of
 490 the buffer, or position @var{limit} (if specified), or a character it is
 491 not supposed to skip.  Optional argument @var{buffer} defaults to the
 492 current buffer if omitted.
 493 @ignore @c may want to change this.
 494 The return value is the distance traveled, which is a nonnegative
 495 integer.
 496 @end ignore
 497 @end defun
 498
 499 @defun skip-syntax-backward syntaxes &optional limit buffer
 500 This function moves point backward across characters whose syntax
 501 classes are mentioned in @var{syntaxes}.  It stops when it encounters
 502 the beginning of the buffer, or position @var{limit} (if specified), or a
 503 character it is not supposed to skip.  Optional argument @var{buffer}
 504 defaults to the current buffer if omitted.
 505
 506 @ignore @c may want to change this.
 507 The return value indicates the distance traveled.  It is an integer that
 508 is zero or less.
 509 @end ignore
 510 @end defun
 511
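  For instance, the following sketch skips one word or symbol and the
whitespace after it, then returns the remaining text.  It assumes that
@code{with-temp-buffer} is available and that the temporary buffer uses
the standard syntax table, in which @samp{_} is a word or symbol
constituent.

@example
@group
(with-temp-buffer
  (insert "foo_bar   baz")
  (goto-char (point-min))
  ;; @r{Skip word and symbol constituents, then whitespace.}
  (skip-syntax-forward "w_")
  (skip-syntax-forward " ")
  (buffer-substring (point) (point-max)))
     @result{} "baz"
@end group
@end example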
 512 @defun backward-prefix-chars &optional buffer
 513 This function moves point backward over any number of characters with
 514 expression prefix syntax.  This includes both characters in the
 515 expression prefix syntax class, and characters with the @samp{p} flag.
 516 Optional argument @var{buffer} defaults to the current buffer if
 517 omitted.
 518 @end defun
 519
 520 @node Parsing Expressions
 521 @section Parsing Balanced Expressions
 522
 523   Here are several functions for parsing and scanning balanced
 524 expressions, also known as @dfn{sexps}, in which parentheses match in
 525 pairs.  The syntax table controls the interpretation of characters, so
 526 these functions can be used for Lisp expressions when in Lisp mode and
 527 for C expressions when in C mode.  @xref{List Motion}, for convenient
 528 higher-level functions for moving over balanced expressions.
 529
 530 @defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment buffer
 531 This function parses a sexp in the current buffer starting at
 532 @var{start}, not scanning past @var{limit}.  It stops at position
 533 @var{limit} or when certain criteria described below are met, and sets
 534 point to the location where parsing stops.  It returns a value
 535 describing the status of the parse at the point where it stops.
 536
 537 If @var{state} is @code{nil}, @var{start} is assumed to be at the top
 538 level of parenthesis structure, such as the beginning of a function
 539 definition.  Alternatively, you might wish to resume parsing in the
 540 middle of the structure.  To do this, you must provide a @var{state}
 541 argument that describes the initial status of parsing.
 542
 543 @cindex parenthesis depth
 544 If the third argument @var{target-depth} is non-@code{nil}, parsing
 545 stops if the depth in parentheses becomes equal to @var{target-depth}.
 546 The depth starts at 0, or at whatever is given in @var{state}.
 547
 548 If the fourth argument @var{stop-before} is non-@code{nil}, parsing
 549 stops when it comes to any character that starts a sexp.  If
 550 @var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
 551 start of a comment.
 552
 553 @cindex parse state
 554 The fifth argument @var{state} is an eight-element list of the same
 555 form as the value of this function, described below.  The return value
 556 of one call may be used to initialize the state of the parse on another
 557 call to @code{parse-partial-sexp}.
 558
 559 The result is a list of eight elements describing the final state of
 560 the parse:
 561
 562 @enumerate 0
 563 @item
 564 The depth in parentheses, counting from 0.
 565
 566 @item
 567 @cindex innermost containing parentheses
 568 The character position of the start of the innermost parenthetical
 569 grouping containing the stopping point; @code{nil} if none.
 570
 571 @item
 572 @cindex previous complete subexpression
 573 The character position of the start of the last complete subexpression
 574 terminated; @code{nil} if none.
 575
 576 @item
 577 @cindex inside string
 578 Non-@code{nil} if inside a string.  More precisely, this is the
 579 character that will terminate the string.
 580
 581 @item
 582 @cindex inside comment
 583 @code{t} if inside a comment (of either style).
 584
 585 @item
 586 @cindex quote character
 587 @code{t} if point is just after a quote character.
 588
 589 @item
 590 The minimum parenthesis depth encountered during this scan.
 591
 592 @item
 593 @code{t} if inside a comment of style ``b''.
 594 @end enumerate
 595
 596 Elements 0, 3, 4, 5 and 7 are significant in the argument @var{state}.
 597
 598 @cindex indenting with parentheses
 599 This function is most often used to compute indentation for languages
 600 that have nested parentheses.
 601 @end defun
 602
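  The following sketch parses a short buffer in two steps, passing the
state returned by the first call to the second.  It assumes that
@code{with-temp-buffer} is available and that the temporary buffer uses
the standard syntax table, where @samp{(} and @samp{)} are parentheses
and @samp{"} is a string quote.

@example
@group
(with-temp-buffer
  (insert "(a (b \"c")
  (let* ((before (parse-partial-sexp (point-min) 5))
         (after (parse-partial-sexp 5 (point-max) nil nil before)))
    (list (nth 0 before)           ; @r{depth after @samp{(a (}}
          (nth 0 after)            ; @r{depth at the end of the buffer}
          (if (nth 3 after) t))))  ; @r{non-@code{nil}: inside a string}
     @result{} (2 2 t)
@end group
@end example

The second call starts at depth 2 only because @var{state} was
supplied; with a @code{nil} state it would start counting from 0 again.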
 603 @defun scan-lists from count depth &optional buffer noerror
 604 This function scans forward @var{count} balanced parenthetical groupings
 605 from character number @var{from}.  It returns the character position
 606 where the scan stops.
 607
 608 If @var{depth} is nonzero, parenthesis depth counting begins from that
 609 value.  The only candidates for stopping are places where the depth in
 610 parentheses becomes zero; @code{scan-lists} counts @var{count} such
 611 places and then stops.  Thus, a positive value for @var{depth} means go
 612 out @var{depth} levels of parenthesis.
 613
 614 Scanning ignores comments if @code{parse-sexp-ignore-comments} is
 615 non-@code{nil}.
 616
 617 If the scan reaches the beginning or end of the buffer (or its
 618 accessible portion), and the depth is not zero, an error is signaled.
 619 If the depth is zero but the count is not used up, @code{nil} is
 620 returned.
 621
 622 If optional arg @var{buffer} is non-@code{nil}, scanning occurs in that
 623 buffer instead of in the current buffer.
 624
 625 If optional arg @var{noerror} is non-@code{nil}, @code{scan-lists}
 626 will return @code{nil} instead of signalling an error.
 627 @end defun
 628
 629 @defun scan-sexps from count &optional buffer noerror
 630 This function scans forward @var{count} sexps from character position
 631 @var{from}.  It returns the character position where the scan stops.
 632
 633 Scanning ignores comments if @code{parse-sexp-ignore-comments} is
 634 non-@code{nil}.
 635
 636 If the scan reaches the beginning or end of (the accessible part of) the
 637 buffer in the middle of a parenthetical grouping, an error is signaled.
 638 If it reaches the beginning or end between groupings but before count is
 639 used up, @code{nil} is returned.
 640
 641 If optional arg @var{buffer} is non-@code{nil}, scanning occurs in
 642 that buffer instead of in the current buffer.
 643
 644 If optional arg @var{noerror} is non-@code{nil}, @code{scan-sexps}
  645 will return @code{nil} instead of signalling an error.
 646 @end defun
 647
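  As an illustration, the sketch below applies both scanning functions
to a small buffer.  It assumes that @code{with-temp-buffer} is available
and that the temporary buffer uses the standard syntax table; the
positions are character positions counted from 1.

@example
@group
(with-temp-buffer
  (insert "(a (b c) d) e")
  (list (scan-sexps 1 1)      ; @r{end of the outer list}
        (scan-lists 2 1 0)    ; @r{end of the next group, @samp{(b c)}}
        (scan-lists 5 1 1)))  ; @r{out one level from inside @samp{(b c)}}
     @result{} (12 9 9)
@end group
@end example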
 648 @defvar parse-sexp-ignore-comments
 649 @cindex skipping comments
 650 If the value is non-@code{nil}, then comments are treated as
 651 whitespace by the functions in this section and by @code{forward-sexp}.
 652
  653 In older Emacs versions, this feature worked only when the comment
  654 terminator was something like @samp{*/}, and appeared only to end a
  655 comment.  In languages where newlines terminate comments, it was
  656 necessary to make this variable @code{nil}, since not every newline is the
  657 end of a comment.  This limitation no longer exists.
 658 @end defvar
 659
 660 You can use @code{forward-comment} to move forward or backward over
 661 one comment or several comments.
 662
 663 @defun forward-comment count &optional buffer
 664 This function moves point forward across @var{count} comments (backward,
 665 if @var{count} is negative).  If it finds anything other than a comment
 666 or whitespace, it stops, leaving point at the place where it stopped.
 667 It also stops after satisfying @var{count}.
 668
 669   Optional argument @var{buffer} defaults to the current buffer.
 670 @end defun
 671
 672 To move forward over all comments and whitespace following point, use
 673 @code{(forward-comment (buffer-size))}.  @code{(buffer-size)} is a good
 674 argument to use, because the number of comments in the buffer cannot
 675 exceed that many.
 676
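  The idiom just described could be wrapped in a small helper, as in
this sketch.  The name @code{my-skip-comments-and-whitespace} is
hypothetical, and the usage example assumes that
@code{with-temp-buffer} is available.

@example
@group
(defun my-skip-comments-and-whitespace ()
  "Move point forward over all comments and whitespace after point."
  (forward-comment (buffer-size)))
@end group

@group
(with-temp-buffer
  (set-syntax-table emacs-lisp-mode-syntax-table)
  (insert ";; a note\n   (foo)")
  (goto-char (point-min))
  (my-skip-comments-and-whitespace)
  (looking-at "(foo)"))
     @result{} t
@end group
@end example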
 677 @node Standard Syntax Tables
 678 @section Some Standard Syntax Tables
 679
 680   Most of the major modes in XEmacs have their own syntax tables.  Here
 681 are several of them:
 682
 683 @defun standard-syntax-table
 684 This function returns the standard syntax table, which is the syntax
 685 table used in Fundamental mode.
 686 @end defun
 687
 688 @defvar text-mode-syntax-table
 689 The value of this variable is the syntax table used in Text mode.
 690 @end defvar
 691
 692 @defvar c-mode-syntax-table
 693 The value of this variable is the syntax table for C-mode buffers.
 694 @end defvar
 695
 696 @defvar emacs-lisp-mode-syntax-table
 697 The value of this variable is the syntax table used in Emacs Lisp mode
 698 by editing commands.  (It has no effect on the Lisp @code{read}
 699 function.)
 700 @end defvar
 701
 702 @node Syntax Table Internals
 703 @section Syntax Table Internals
 704 @cindex syntax table internals
 705
 706   Each element of a syntax table is an integer that encodes the syntax
 707 of one character: the syntax class, possible matching character, and
 708 flags.  Lisp programs don't usually work with the elements directly; the
 709 Lisp-level syntax table functions usually work with syntax descriptors
 710 (@pxref{Syntax Descriptors}).
 711
 712   The low 8 bits of each element of a syntax table indicate the
 713 syntax class.
 714
 715 @table @asis
 716 @item @i{Integer}
 717 @i{Class}
 718 @item 0
 719 whitespace
 720 @item 1
 721 punctuation
 722 @item 2
 723 word
 724 @item 3
 725 symbol
 726 @item 4
 727 open parenthesis
 728 @item 5
 729 close parenthesis
 730 @item 6
 731 expression prefix
 732 @item 7
 733 string quote
 734 @item 8
 735 paired delimiter
 736 @item 9
 737 escape
 738 @item 10
 739 character quote
 740 @item 11
 741 comment-start
 742 @item 12
 743 comment-end
 744 @item 13
 745 inherit
 746 @end table
 747
 748   The next 8 bits are the matching opposite parenthesis (if the
 749 character has parenthesis syntax); otherwise, they are not meaningful.
 750 The next 6 bits are the flags.
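
  The layout above could be unpacked with a sketch such as the
following.  The function name @code{my-decode-syntax-entry} is
hypothetical, and the code assumes exactly the bit layout described in
this section for integer entries.  It does not interpret the individual
flag bits, since their order is not given here.

@example
@group
(defun my-decode-syntax-entry (entry)
  "Split the integer ENTRY into (CLASS-CODE MATCH-CODE FLAG-BITS)."
  (list (logand entry 255)             ; @r{low 8 bits: syntax class}
        (logand (ash entry -8) 255)    ; @r{next 8 bits: matching character}
        (logand (ash entry -16) 63)))  ; @r{next 6 bits: flags}
@end group

@group
;; @r{Class 4 (open parenthesis) with matching character code 41,}
;; @r{which is @samp{)}: 4 + (41 * 256) = 10500.}
(my-decode-syntax-entry 10500)
     @result{} (4 41 0)
@end group
@end example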