10. How do I get a plain text man page without all that ^H^_ stuff?
Have a look at col(1), because
col can filter out backspace sequences. Just in
case you can't wait that long:
funnyprompt$ groff -t -e -mandoc -Tascii manpage.1 |
col -bx > manpage.txt
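To see what those backspace sequences are and what col does with them, here is a small sketch. nroff-style output makes a character bold by printing it, backing up, and printing it again, and underlines it by printing an underscore, backing up, and printing the character. The function name strip_overstrike is made up for this example, and the sed fallback is only a stand-in for systems that lack col; it is not what col does internally.

```shell
# strip_overstrike: read text on stdin, write it without overstrike sequences.
# The name is hypothetical; the real work is done by "col -bx".
strip_overstrike() {
    if command -v col >/dev/null 2>&1; then
        # -b: keep only the last character written to each column
        # -x: output spaces instead of tabs
        col -bx
    else
        # Stand-in when col is missing: drop every "character, backspace" pair.
        sed "s/.$(printf '\b')//g"
    fi
}

# Bold "NAME" (N ^H N A ^H A ...) followed by underlined "file" (_ ^H f ...).
printf 'N\bNA\bAM\bME\bE _\bf_\bi_\bl_\be\n' | strip_overstrike
```

Piping the raw printf output through "cat -v" instead shows the ^H characters the question is complaining about.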
The -t and -e switches
tell groff to preprocess using
tbl and eqn. This is overkill
for man pages that don't require preprocessing but it does no
harm apart from a few CPU cycles wasted. On the other hand, not
using -t when it is actually required does harm:
the table is terribly formatted. You can even find out (well,
"guess" is a better word) what command is needed to
format a certain groff document (not just man
pages) by issuing
funnyprompt$ grog /usr/man/man7/signal.7
groff -t -man /usr/man/man7/signal.7
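To give an idea of the kind of guessing involved, here is a rough sketch that picks the -t and -e switches by looking for the requests that delimit tbl input (.TS ... .TE) and eqn input (.EQ ... .EN). This is not grog's actual implementation, only an illustration of the idea; the function name guess_flags and the sample page are invented for this example.

```shell
# guess_flags FILE: print a groff command line with the preprocessor
# switches the source seems to need. Hypothetical helper, not part of groff.
guess_flags() {
    flags=""
    # Tables for tbl start with a .TS request at the beginning of a line.
    if grep -q '^\.TS' "$1"; then flags="$flags -t"; fi
    # Equations for eqn start with a .EQ request at the beginning of a line.
    if grep -q '^\.EQ' "$1"; then flags="$flags -e"; fi
    echo "groff$flags -mandoc -Tascii $1"
}

# Try it on a tiny made-up page source that contains a table.
sample=$(mktemp)
printf '.TH DEMO 7\n.SH NAME\ndemo \\- a page with a table\n.TS\nl l.\nleft\tright\n.TE\n' > "$sample"
guess_flags "$sample"
rm -f "$sample"
```

Like grog, a check this simple can be fooled, for instance by a macro package that wraps the table requests, so treat the result as a suggestion.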
"Grog" stands for "GROff Guess", and it
does what it says--guess. If it were perfect we wouldn't need
options any more. I've seen it guess incorrectly on macro
packages and on preprocessors.
Here is a little Perl script I wrote
that deletes the page headers and footers, thereby saving you a
few pages (and mother nature a tree) when printing long and
elaborate man pages. Save it in a file named
strip-headers and make it executable with chmod 755.
#!/usr/bin/perl -wn
# make it slurp the whole file at once (in BEGIN, so that it
# takes effect before -n reads the first line):
BEGIN { undef $/; }
# delete first header:
s/^\n*.*\n+//;
# delete last footer:
s/\n+.*\n+$/\n/g;
# delete page breaks:
s/\n\n+[^ \t].*\n\n+(\S+).*\1\n\n+/\n/g;
# collapse two or more blank lines into a single one:
s/\n{3,}/\n\n/g;
# see what's left...
print;
You have to use it as the first filter after the
man command as it relies on the number of
newlines being output by groff. For
example:
funnyprompt$ man bash | strip-headers | col -bx > bash.txt