soffice2html.pl
2006-01-22
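A Perl script that converts OpenOffice.org files to HTML files.
The Goodday version (0.76) appears to fix a few bugs, but it is unclear whether those fixes have made it into the latest original release.
Also, the script is described as being for SXW; whether it also works with ODF is unconfirmed.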
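Converts StarOffice/OpenOffice sxw files to html files (including images, tables, and most formatting).
Requires ImageMagick's convert and xsltproc (from the GNOME libxslt project).
Several options are available; run the script with -h or with no arguments to display the help.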
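Since the script depends on two external binaries, it can help to confirm that both are reachable before converting anything. The following is a minimal Python sketch, not part of soffice2html itself; the helper name find_tools is made up for illustration. It only looks the names up on PATH.

```python
import shutil


def find_tools(names=("convert", "xsltproc")):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

If a binary is installed outside PATH, its location can be passed to the script with -c (convert) or -x (xsltproc), as listed in the usage text.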
Many thanks to Adrend van Beelen jr (http://www.liacs.nl/~dvbeelen) for xslt
fixes and additional styles.
Author: Steve Slaven - http://hoopajoo.net
--------------------------------------------------------------------------------
Usage: $0 [-hqvwp] [-e encoding] [-i imagedir]
[-s style_base] [-t toc_class] FILE
-h This help
-q Quiet mode (deprecated, on by default)
-v Verbose mode (used to be default)
-i Image directory (default image/)
-s CSS base style that all content is wrapped in (default soffice)
-t Class name to build TOC from (default none)
-e Output encoding, it will now try to autodetect it but you can
override it with this switch (e.g. -e iso-8859-2)
-p Output png files instead of jpg files
-w Wrap output with <html></html>
-c Path to "convert" binary
-x Path to "xsltproc" binary
-T Generate TOC only
-B Generate body only
Converts the FILE to content.html with no standard body wrapper so it can
be inserted into existing templates. All images are converted to JPG (or
PNG with -p) and all styles are converted to CSS included in content.html.
Requires ImageMagick's 'convert' and 'xsltproc' from libxslt.
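As an illustration of how the flags combine, here is a small Python sketch that assembles a command line from the options documented above. It assumes the script is saved as soffice2html.pl in the current directory and is run through perl; the helper names build_command and convert, and the input name report.sxw, are hypothetical.

```python
import subprocess


def build_command(source, script="soffice2html.pl", wrap=False, png=False,
                  encoding=None, image_dir=None):
    """Build the argument list for one conversion run."""
    cmd = ["perl", script]
    if wrap:
        cmd.append("-w")             # wrap output with <html></html>
    if png:
        cmd.append("-p")             # png images instead of jpg
    if encoding:
        cmd += ["-e", encoding]      # override the autodetected encoding
    if image_dir:
        cmd += ["-i", image_dir]     # image directory written into the html
    cmd.append(source)
    return cmd


def convert(source, **options):
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(source, **options), check=True)


if __name__ == "__main__":
    # Prints the command without running it.
    print(" ".join(build_command("report.sxw", wrap=True, png=True)))
```

Passing the arguments as a list avoids shell quoting problems when the file name contains spaces.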
If you use -i, the directory name is reflected in the html file, but you
will need to rename the image directory yourself. This is mostly a safety
measure: the image/ directory is deleted when doing the conversion, so if
-i were "." the script could delete stuff it shouldn't.
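The manual rename after a run with -i could be scripted along these lines. This is a sketch only: it assumes the conversion left the images in a directory named image under the working directory, and the helper rename_image_dir is hypothetical. It applies the same caution as the note above by refusing a target that resolves to the working directory itself.

```python
import os


def rename_image_dir(target, source="image", base="."):
    """Rename base/source to base/target and return the new absolute path."""
    src = os.path.abspath(os.path.join(base, source))
    dst = os.path.abspath(os.path.join(base, target))
    if dst == os.path.abspath(base):
        # Covers targets such as "." that point at the working directory.
        raise ValueError("refusing to use the working directory as image directory")
    if dst == src:
        return dst                   # already has the requested name
    if os.path.exists(dst):
        raise FileExistsError(dst)   # never overwrite an existing directory
    os.rename(src, dst)
    return dst
```

For example, after converting with -i pics, calling rename_image_dir("pics") makes the directory name match the paths written into content.html.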
2005-01-05 10:14 bpk
* soffice2html.xsl: Handlers for table borders, some fixes to the
xlate-px option in the styles handler, and corrected width
handling.
2005-01-04 21:08 bpk
* soffice2html-frontend.pl: Fixed and tested replace list for smart
quotes
2005-01-04 21:04 bpk
* soffice2html-frontend.pl: Code to strip smart quotes from
content.xml before conversion
2005-01-04 20:46 bpk
* soffice2html-frontend.pl: Command line flags to set path to
xsltproc/convert, flags to toggle body or toc only generation
2004-12-13 14:02 bpk
* soffice2html.xsl: increment version number
2004-02-06 18:11 bpk
* soffice2html-frontend.pl, soffice2html.xsl: Fixed many css
related length bugs, fixed wrapper code for full body support
2004-02-04 16:04 bpk
* README, soffice2html-frontend.pl, soffice2html.xsl: Several fixes
for buggy image support, new list styles, improved TOC handling,
ability to generate a full HTML doc with html/body tags, and
several minor formatting enhancements.
2003-05-12 23:04 psocccer
* MANIFEST: Added manifest file for building
2003-04-30 09:09 psocccer
* soffice2html-frontend.pl, soffice2html.xsl: Added output encoding
support and auto detection for non utf-8 charsets
2003-04-29 09:05 psocccer
* soffice2html-frontend.pl: Moved @params for xsltproc, older
versions needed them as the first args
2003-04-29 09:01 psocccer
* make-soffice2html, soffice2html-frontend.pl: Fixed version string
bug, added gpl
2003-04-28 16:45 psocccer
* README, make-soffice2html, soffice2html-frontend.pl,
soffice2html.xsl: Initial revision
2003-04-28 16:45 psocccer
* README, make-soffice2html, soffice2html-frontend.pl,
soffice2html.xsl: Initial import