Abstract

Vilistextum is an HTML to ASCII converter written specifically to produce ASCII text that is comfortable to read.

Some features:

REQUIREMENTS:

For the main program a decent gcc installation suffices.

If you want to use the GUI frontend, you need to have kaptain installed.

INSTALL:

./configure
make
make install (as root)

DOWNLOAD:

vilistextum-2.4.0.tar.gz
vilistextum-2.4.0.tar.bz2

USAGE:

vilistextum [OPTIONS] [inputfile|-] [outputfile|-]

This is the command line program.

kilistextum

GUI frontend using kaptain. Its usage should be obvious, even if you haven't read this manual.
Start it with "kilistextum". The makefile tries to guess where kaptain resides. If it fails, you can put something like "#!/pathto/kaptain" on the first line of the script, or start it with "kaptain kilistextum".

Command line arguments

inputfile|-, outputfile|-
Use '-' in place of inputfile to read from standard input, and '-' in place of outputfile to write to standard output.
--version
Reports version number and release date.
-h,--help
Prints a list of the command line options.
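
As a rough illustration of the calling convention above, the following Python sketch builds an argument list from options documented in this manual and pipes HTML through the program using '-' for both files. The helper names vilistextum_argv and html_to_text are made up for this example and are not part of vilistextum; the second function assumes vilistextum is installed and on the PATH.

```python
import subprocess


def vilistextum_argv(infile="-", outfile="-", width=None,
                     links=False, convert_tags=False, shrinklines=False):
    """Build an argument list using only options described in this manual.

    Hypothetical helper for illustration; not shipped with vilistextum.
    """
    argv = ["vilistextum"]
    if width is not None:
        argv += ["-w", str(width)]   # -w, --width number
    if convert_tags:
        argv.append("-c")            # -c, --convert-tags
    if shrinklines:
        argv.append("-s")            # -s, --shrinklines
    if links:
        argv.append("-l")            # -l, --links
    argv += [infile, outfile]        # '-' means stdin resp. stdout
    return argv


def html_to_text(html, **options):
    """Pipe an HTML string through vilistextum and return the text output.

    Assumes the vilistextum binary is installed and found on the PATH.
    """
    result = subprocess.run(vilistextum_argv("-", "-", **options),
                            input=html, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

For example, vilistextum_argv(width=60, links=True) corresponds to the shell command "vilistextum -w 60 -l - -".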

-c, --convert-tags
Some tags are converted to special characters.
E.g.: "<B>Bold</B> isn't <I>italic</I> isn't <U>underlined</U> isn't <EM>emphasized</EM> but is like <STRONG>strong</STRONG>."
is output as "*Bold* isn't /italic/ isn't _underlined_ isn't /emphasized/ but is like *strong*."
-p, --palm
Produces text better suited to reading on a PDA.
Palm text readers do their own word wrapping, so the width is set to infinity and the program doesn't right-justify or center the text.
-w, --width number
The width of the output text.
Default: 72.
-m, --nomicrosoft
The Windows-1252 character references &#128; through &#159;, and the named entities for the same characters, will not be converted.
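
The tag-to-character mapping described for -c can be sketched in a few lines of Python. This is only an illustration of the mapping shown in the example for that option, not the program's actual C implementation. The names MARKERS and convert_tags are invented here, and the sketch assumes the tags carry no attributes.

```python
import re

# Marker characters as shown in the -c example: B and STRONG become '*',
# I and EM become '/', U becomes '_'.
MARKERS = {"b": "*", "strong": "*", "i": "/", "em": "/", "u": "_"}

# Matches an opening or closing tag without attributes, in any letter case.
TAG_RE = re.compile(r"</?(b|strong|i|em|u)\s*>", re.IGNORECASE)


def convert_tags(html):
    """Replace each recognised opening or closing tag with its marker."""
    return TAG_RE.sub(lambda m: MARKERS[m.group(1).lower()], html)


print(convert_tags("<B>Bold</B> isn't <I>italic</I>"))
```

Opening and closing tags map to the same character, which is why a single substitution pass is enough in this sketch.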

-i, --defimage string
IMG tags without alt attribute are output as [string].
Default: Image.
-r, --remove-empty-alt
If an IMG tag has an empty ALT attribute (e.g. <IMG src="..." alt="">), don't output '[]'.
-s, --shrinklines
If there are more than two consecutive newlines, only two are output, so there is never more than one completely empty line in a row.
-l, --links
Numbers the links in the document and lists each link as a footnote at the end of the file, similar to 'lynx -dump'. Note: Relative URIs are not resolved and are not printed.
-e, --errorlevel number
Sets the level of verbosity for error messages.
 0: No error messages
 1: Show unrecognized entities
 2: Show unknown tags
>2: Mostly debugging information
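
The behaviour of -i, -r and -s described above can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the program's own code. The function names are invented for this example. The sketch assumes that a non-empty ALT text is printed in square brackets, which the manual implies but does not state, and it treats only runs of bare newline characters as empty lines.

```python
import re


def image_placeholder(alt, defimage="Image", remove_empty_alt=False):
    """Return the text printed for an IMG tag.

    alt is None when the tag has no ALT attribute at all.
    """
    if alt is None:
        return "[" + defimage + "]"      # -i string, default "Image"
    if alt == "" and remove_empty_alt:
        return ""                        # -r: suppress the empty '[]'
    return "[" + alt + "]"               # assumption: ALT text in brackets


def shrink_lines(text):
    """Collapse runs of three or more newlines into exactly two (-s)."""
    return re.sub(r"\n{3,}", "\n\n", text)


print(image_placeholder(None))
print(repr(shrink_lines("first\n\n\n\nsecond")))
```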

BUGS and similar features:

The parsing of tables is not very good.
Character sets other than latin1 are not yet fully supported.
The handling of OL is broken: the program treats it as UL, and more than 6 levels of nested lists confuse it.
Text is never justified.

How to read HTML mail with gnus or mutt using vilistextum

If you want to use vilistextum to convert HTML mail to ASCII automatically, read the HTMLMAIL file.

Bug reports or comments:

You can send your comments or bug reports in English or German to this address. If you've discovered a bug, please give the link to, or attach a copy of, the HTML file that triggered it.
Patric Müller
Last modified: Mon Sep 3 01:09:29 CEST 2001