+Gnus parameters
+
+Add
+@lisp
+(const :tag "Spam: Blackbox" (spam spam-use-blackbox))
+(const :tag "Ham: Blackbox" (ham spam-use-blackbox))
+@end lisp
+to the @code{spam-process} group parameter in @code{gnus.el}. Make
+sure you do it twice, once for the parameter and once for the
+variable customization.
+
+Add
+@lisp
+(variable-item spam-use-blackbox)
+@end lisp
+to the @code{spam-autodetect-methods} group parameter in
+@code{gnus.el}.
+
+@end enumerate
+
+
+@node Filtering Spam Using Statistics with spam-stat
+@subsection Filtering Spam Using Statistics with spam-stat
+@cindex Paul Graham
+@cindex Graham, Paul
+@cindex naive Bayesian spam filtering
+@cindex Bayesian spam filtering, naive
+@cindex spam filtering, naive Bayesian
+
+Paul Graham has written an excellent essay about spam filtering using
+statistics: @uref{http://www.paulgraham.com/spam.html,A Plan for
+Spam}. In it he describes the inherent deficiency of rule-based
+filtering as used by SpamAssassin, for example: Somebody has to write
+the rules, and everybody else has to install these rules. You are
+always late. It would be much better, he argues, to filter mail based
+on whether it somehow resembles spam or non-spam. One way to measure
+this is word distribution. He then goes on to describe a solution
+that checks whether a new mail resembles any of your other spam mails
+or not.
+
+The basic idea is this: Create two collections of your mail, one
+with spam, one with non-spam. Count how often each word appears in
+either collection, weight this by the total number of mails in the
+collections, and store this information in a dictionary. For every
+word in a new mail, determine its probability to belong to a spam or a
+non-spam mail. Then take the 15 most conspicuous words and compute the
+total probability of the mail being spam. If this probability is higher
+than a certain threshold, the mail is considered to be spam.
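+
+The combination step can be sketched in Emacs Lisp as follows. This is
+only an illustration of the formula from Graham's essay, not the code
+that @code{spam-stat} runs; the names @code{my-most-conspicuous} and
+@code{my-combine-probabilities} are hypothetical, and the sketch assumes
+that every per-word probability lies strictly between 0 and 1.
+
+@lisp
+;; Hypothetical helpers for illustration; not part of spam-stat.
+(defun my-most-conspicuous (probs n)
+  "Return the N elements of PROBS that are farthest from 0.5."
+  (let ((sorted (sort (copy-sequence probs)
+                      (lambda (a b)
+                        (> (abs (- a 0.5)) (abs (- b 0.5)))))))
+    (butlast sorted (max 0 (- (length sorted) n)))))
+
+(defun my-combine-probabilities (probs)
+  "Combine the per-word spam probabilities PROBS into one score.
+Assumes each probability is strictly between 0 and 1."
+  (let ((spam (apply #'* 1.0 probs))
+        (ham (apply #'* 1.0 (mapcar (lambda (p) (- 1.0 p)) probs))))
+    (/ spam (+ spam ham))))
+
+;; Two spam-like words and one neutral word give a score near 1.
+(my-combine-probabilities
+ (my-most-conspicuous '(0.99 0.5 0.95) 15))
+@end lisp
+
+A word with probability 0.5 contributes equally to both products, so
+neutral words do not move the score; only words far from 0.5 matter.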
+
+Gnus supports this kind of filtering. But it needs some setting up.
+First, you need two collections of your mail, one with spam, one with
+non-spam. Then you need to create a dictionary using these two
+collections, and save it. And last but not least, you need to use
+this dictionary in your fancy mail splitting rules.
+
+@menu
+* Creating a spam-stat dictionary::
+* Splitting mail using spam-stat::
+* Low-level interface to the spam-stat dictionary::
+@end menu
+
+@node Creating a spam-stat dictionary
+@subsubsection Creating a spam-stat dictionary
+
+Before you can begin to filter spam based on statistics, you must
+create these statistics based on two mail collections, one with spam,
+one with non-spam. These statistics are then stored in a dictionary
+for later use. In order for these statistics to be meaningful, you
+need several hundred emails in both collections.
+
+Gnus currently supports only the nnml back end for automated dictionary
+creation. The nnml back end stores all mails in a directory, one file
+per mail. Use the following:
+
+@defun spam-stat-process-spam-directory
+Create spam statistics for every file in this directory. Every file
+is treated as one spam mail.
+@end defun
+
+@defun spam-stat-process-non-spam-directory
+Create non-spam statistics for every file in this directory. Every
+file is treated as one non-spam mail.
+@end defun
+
+Usually you would call @code{spam-stat-process-spam-directory} on a
+directory such as @file{~/Mail/mail/spam} (this usually corresponds
+to the group @samp{nnml:mail.spam}), and you would call
+@code{spam-stat-process-non-spam-directory} on a directory such as
+@file{~/Mail/mail/misc} (this usually corresponds to the group
+@samp{nnml:mail.misc}).
+
+When you are using @acronym{IMAP}, you won't have the mails available
+locally, so these functions cannot read them directly. One solution is
+to use the Gnus Agent to cache the articles. Then you can use
+directories such as
+@file{~/News/agent/nnimap/mail.yourisp.com/personal_spam} for
+@code{spam-stat-process-spam-directory}. @xref{Agent as Cache}.
+
+@defvar spam-stat
+This variable holds the hash-table with all the statistics---the
+dictionary we have been talking about. For every word in either
+collection, this hash-table stores a vector describing how often the
+word appeared in spam and how often it appeared in non-spam mails.
+@end defvar
+
+If you want to regenerate the statistics from scratch, you need to
+reset the dictionary.
+
+@defun spam-stat-reset
+Reset the @code{spam-stat} hash-table, deleting all the statistics.
+@end defun
+
+When you are done, you must save the dictionary. The dictionary may
+be rather large. If you will not update the dictionary incrementally
+(instead, you will recreate it once a month, for example), then you
+can reduce the size of the dictionary by deleting all words that did
+not appear often enough or that do not clearly belong to only spam or
+only non-spam mails.
+
+@defun spam-stat-reduce-size
+Reduce the size of the dictionary. Use this only if you do not want
+to update the dictionary incrementally.
+@end defun
+
+@defun spam-stat-save
+Save the dictionary.
+@end defun
+
+@defvar spam-stat-file
+The filename used to store the dictionary. This defaults to
+@file{~/.spam-stat.el}.
+@end defvar
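+
+For example, to keep the dictionary somewhere other than the default
+location, set @code{spam-stat-file} before loading or saving the
+dictionary. The path below is only a placeholder.
+
+@lisp
+(require 'spam-stat)
+;; Hypothetical location; pick any file name you like.
+(setq spam-stat-file "~/Mail/spam-stat.el")
+@end lisp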
+
+@node Splitting mail using spam-stat
+@subsubsection Splitting mail using spam-stat
+
+In order to use @code{spam-stat} to split your mail, you need to add the
+following to your @file{~/.gnus.el} file:
+
+@lisp
+(require 'spam-stat)
+(spam-stat-load)
+@end lisp
+
+This will load the necessary Gnus code, and the dictionary you
+created.
+
+Next, you need to adapt your fancy splitting rules: You need to
+determine how to use @code{spam-stat}. The following examples are for
+the nnml back end. Using the nnimap back end works just as well. Just
+use @code{nnimap-split-fancy} instead of @code{nnmail-split-fancy}.
+
+In the simplest case, you only have two groups, @samp{mail.misc} and
+@samp{mail.spam}. The following expression says that mail is either
+spam or it should go into @samp{mail.misc}. If it is spam, then
+@code{spam-stat-split-fancy} will return @samp{mail.spam}.
+
+@lisp
+(setq nnmail-split-fancy
+ `(| (: spam-stat-split-fancy)
+ "mail.misc"))
+@end lisp
+
+@defvar spam-stat-split-fancy-spam-group
+The group to use for spam. Default is @samp{mail.spam}.
+@end defvar
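+
+For instance, to file spam under a different group, set the variable
+before the splitting rules are used. The group name @samp{mail.junk}
+is made up for this example.
+
+@lisp
+;; "mail.junk" is a hypothetical group name.
+(setq spam-stat-split-fancy-spam-group "mail.junk")
+@end lisp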
+
+If you also filter mail with specific subjects into other groups, use
+the following expression. Only mails not matching the regular
+expression are considered potential spam.
+
+@lisp
+(setq nnmail-split-fancy
+ `(| ("Subject" "\\bspam-stat\\b" "mail.emacs")
+ (: spam-stat-split-fancy)
+ "mail.misc"))
+@end lisp
+
+If you want to filter for spam first, then you must be careful when
+creating the dictionary. Note that @code{spam-stat-split-fancy} must
+consider both mails in @samp{mail.emacs} and in @samp{mail.misc} as
+non-spam, therefore both should be in your collection of non-spam
+mails, when creating the dictionary!
+
+@lisp
+(setq nnmail-split-fancy
+ `(| (: spam-stat-split-fancy)
+ ("Subject" "\\bspam-stat\\b" "mail.emacs")
+ "mail.misc"))
+@end lisp
+
+You can combine this with traditional filtering. Here, we move all
+HTML-only mails into the @samp{mail.spam.filtered} group. Note that since
+@code{spam-stat-split-fancy} will never see them, the mails in
+@samp{mail.spam.filtered} should be neither in your collection of spam mails,
+nor in your collection of non-spam mails, when creating the
+dictionary!
+
+@lisp
+(setq nnmail-split-fancy
+ `(| ("Content-Type" "text/html" "mail.spam.filtered")
+ (: spam-stat-split-fancy)
+ ("Subject" "\\bspam-stat\\b" "mail.emacs")
+ "mail.misc"))
+@end lisp
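+
+With the nnimap back end, the simplest rule above would be written as
+follows. This sketch only swaps the variable name, as described
+earlier; the group names are assumed to exist on your @acronym{IMAP}
+server, and any further nnimap splitting setup is not shown.
+
+@lisp
+;; Same rule as the first nnml example, for nnimap.
+(setq nnimap-split-fancy
+      `(| (: spam-stat-split-fancy)
+          "mail.misc"))
+@end lisp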
+
+
+@node Low-level interface to the spam-stat dictionary
+@subsubsection Low-level interface to the spam-stat dictionary
+
+The main interface to @code{spam-stat} consists of the following functions:
+
+@defun spam-stat-buffer-is-spam
+Called in a buffer, that buffer is considered to be a new spam mail.
+Use this for new mail that has not been processed before.
+@end defun
+
+@defun spam-stat-buffer-is-non-spam
+Called in a buffer, that buffer is considered to be a new non-spam
+mail. Use this for new mail that has not been processed before.
+@end defun
+
+@defun spam-stat-buffer-change-to-spam
+Called in a buffer, that buffer is no longer considered to be normal
+mail but spam. Use this to change the status of a mail that has
+already been processed as non-spam.
+@end defun
+
+@defun spam-stat-buffer-change-to-non-spam
+Called in a buffer, that buffer is no longer considered to be spam but
+normal mail. Use this to change the status of a mail that has already
+been processed as spam.
+@end defun
+
+@defun spam-stat-save
+Save the hash table to the file. The filename used is stored in the
+variable @code{spam-stat-file}.
+@end defun
+
+@defun spam-stat-load
+Load the hash table from a file. The filename used is stored in the
+variable @code{spam-stat-file}.
+@end defun
+
+@defun spam-stat-score-word
+Return the spam score for a word.
+@end defun
+
+@defun spam-stat-score-buffer
+Return the spam score for a buffer.
+@end defun
+
+@defun spam-stat-split-fancy
+Use this function for fancy mail splitting. Add the rule @samp{(:
+spam-stat-split-fancy)} to @code{nnmail-split-fancy}.
+@end defun
+
+Make sure you load the dictionary before using it. This requires the
+following in your @file{~/.gnus.el} file:
+
+@lisp
+(require 'spam-stat)
+(spam-stat-load)
+@end lisp
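+
+As a sketch of how these functions fit together, the following
+commands score the current buffer and register it as spam. The command
+names are hypothetical, and both assume that the current buffer holds
+exactly one mail.
+
+@lisp
+(require 'spam-stat)
+
+;; Hypothetical commands built on the functions described above.
+(defun my-spam-stat-show-score ()
+  "Show the spam-stat score of the mail in the current buffer."
+  (interactive)
+  (message "spam-stat score: %s" (spam-stat-score-buffer)))
+
+(defun my-spam-stat-learn-spam ()
+  "Register the current buffer as new spam and save the dictionary."
+  (interactive)
+  (spam-stat-buffer-is-spam)
+  (spam-stat-save))
+@end lisp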
+
+A typical test will involve calls to the following functions:
+
+@smallexample
+Reset: (setq spam-stat (make-hash-table :test 'equal))
+Learn spam: (spam-stat-process-spam-directory "~/Mail/mail/spam")
+Learn non-spam: (spam-stat-process-non-spam-directory "~/Mail/mail/misc")
+Save table: (spam-stat-save)
+File size: (nth 7 (file-attributes spam-stat-file))
+Number of words: (hash-table-count spam-stat)
+Test spam: (spam-stat-test-directory "~/Mail/mail/spam")
+Test non-spam: (spam-stat-test-directory "~/Mail/mail/misc")
+Reduce table size: (spam-stat-reduce-size)
+Save table: (spam-stat-save)
+File size: (nth 7 (file-attributes spam-stat-file))
+Number of words: (hash-table-count spam-stat)
+Test spam: (spam-stat-test-directory "~/Mail/mail/spam")
+Test non-spam: (spam-stat-test-directory "~/Mail/mail/misc")
+@end smallexample
+
+Here is how you would create your dictionary:
+
+@smallexample
+Reset: (setq spam-stat (make-hash-table :test 'equal))
+Learn spam: (spam-stat-process-spam-directory "~/Mail/mail/spam")
+Learn non-spam: (spam-stat-process-non-spam-directory "~/Mail/mail/misc")
+Repeat for any other non-spam group you need...
+Reduce table size: (spam-stat-reduce-size)
+Save table: (spam-stat-save)
+@end smallexample
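+
+The steps above can be wrapped in a single command so that the
+dictionary is easy to rebuild from time to time. This is a sketch
+only: the command name is hypothetical, and the directories (including
+@file{~/Mail/mail/emacs} as a second non-spam group) are assumptions
+that you should replace with your own.
+
+@lisp
+(require 'spam-stat)
+
+;; Hypothetical command; adjust the directories to your setup.
+(defun my-spam-stat-rebuild ()
+  "Rebuild the spam-stat dictionary from scratch and save it."
+  (interactive)
+  (spam-stat-reset)
+  (spam-stat-process-spam-directory "~/Mail/mail/spam")
+  ;; Every non-spam group should contribute to the dictionary.
+  (dolist (dir '("~/Mail/mail/misc" "~/Mail/mail/emacs"))
+    (spam-stat-process-non-spam-directory dir))
+  (spam-stat-reduce-size)
+  (spam-stat-save))
+@end lisp
+
+Leave out the call to @code{spam-stat-reduce-size} if you intend to
+update the dictionary incrementally afterwards.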
+
+@node Other modes
+@section Interaction with other modes
+
+@subsection Dired
+@cindex dired
+
+@code{gnus-dired-minor-mode} provides some useful functions for dired
+buffers. It is enabled with
+@lisp
+(add-hook 'dired-mode-hook 'turn-on-gnus-dired-mode)
+@end lisp
+
+@table @kbd
+@item C-c C-m C-a
+@findex gnus-dired-attach
+Send dired's marked files as an attachment (@code{gnus-dired-attach}).
+You will be prompted for a message buffer.
+
+@item C-c C-m C-l
+@findex gnus-dired-find-file-mailcap
+Visit a file according to the appropriate mailcap entry
+(@code{gnus-dired-find-file-mailcap}). With prefix, open file in a new
+buffer.
+
+@item C-c C-m C-p
+@findex gnus-dired-print
+Print file according to the mailcap entry (@code{gnus-dired-print}). If
+there is no print command, print to a PostScript image.
+@end table
+
+@node Various Various
+@section Various Various
+@cindex mode lines
+@cindex highlights
+
+@table @code
+
+@item gnus-home-directory
+@vindex gnus-home-directory
+All Gnus file and directory variables will be initialized from this
+variable, which defaults to @file{~/}.
+
+@item gnus-directory
+@vindex gnus-directory
+Most Gnus storage file and directory variables will be initialized from
+this variable, which defaults to the @env{SAVEDIR} environment
+variable, or @file{~/News/} if that variable isn't set.
+
+Note that Gnus is mostly loaded when the @file{.gnus.el} file is read.
+This means that other directory variables that are initialized from this
+variable won't be set properly if you set this variable in
+@file{.gnus.el}. Set this variable in @file{.emacs} instead.
+
+@item gnus-default-directory
+@vindex gnus-default-directory
+Not related to the above variable at all---this variable says what the
+default directory of all Gnus buffers should be. If you issue commands
+like @kbd{C-x C-f}, the prompt you'll get starts in the current buffer's
+default directory. If this variable is @code{nil} (which is the
+default), the default directory will be the default directory of the
+buffer you were in when you started Gnus.
+
+@item gnus-verbose
+@vindex gnus-verbose
+This variable is an integer between zero and ten. The higher the value,
+the more messages will be displayed. If this variable is zero, Gnus
+will never flash any messages; if it is seven (which is the default),
+most important messages will be shown; and if it is ten, Gnus won't ever
+shut up, but will flash so many messages it will make your head swim.
+
+@item gnus-verbose-backends
+@vindex gnus-verbose-backends
+This variable works the same way as @code{gnus-verbose}, but it applies
+to the Gnus back ends instead of Gnus proper.
+
+@item nnheader-max-head-length
+@vindex nnheader-max-head-length
+When the back ends read straight heads of articles, they all try to read
+as little as possible. This variable (default 4096) specifies
+the absolute max length the back ends will try to read before giving up
+on finding a separator line between the head and the body. If this
+variable is @code{nil}, there is no upper read bound. If it is
+@code{t}, the back ends won't try to read the articles piece by piece,
+but read the entire articles. This makes sense with some versions of
+@code{ange-ftp} or @code{efs}.
+
+@item nnheader-head-chop-length
+@vindex nnheader-head-chop-length
+This variable (default 2048) says how big a piece of each article to
+read when doing the operation described above.
+
+@item nnheader-file-name-translation-alist
+@vindex nnheader-file-name-translation-alist
+@cindex file names
+@cindex invalid characters in file names
+@cindex characters in file names
+This is an alist that says how to translate characters in file names.
+For instance, if @samp{:} is not a valid character in file names
+on your system (you OS/2 user you), you could say something like:
+
+@lisp
+@group
+(setq nnheader-file-name-translation-alist
+ '((?: . ?_)))
+@end group
+@end lisp