Electronic Document Publishing using DjVu

Abstract: Online access to complex compound documents with client-side search and browsing capability is one of the key requirements of effective content management. “DjVu” is a highly efficient document image compression methodology, a file format, and a delivery platform that, considered together, have been shown to address this requirement effectively. Originally developed for scanned color documents, the DjVu technology was recently expanded to electronic documents. The small file sizes and very efficient document browsing make DjVu a compelling alternative to document interchange formats such as PostScript or PDF. In addition, DjVu offers a uniform viewing experience for electronic or scanned original documents, on any platform and at any connection speed, which is ideal for digital libraries and electronic publishing. This paper describes the basics of DjVu encoding, with emphasis on the particular challenges posed by electronic sources. The DjVu Virtual Printer Driver, which we implemented as the “Universal DjVu Converter”, is then introduced. Basic performance statistics are given, and enterprise workflow applications of this technology are highlighted.

Artem Mikheev, Luc Vincent, Mike Hawrylycz and Léon Bottou: Electronic Document Publishing using DjVu, Proceedings of the IAPR International Workshop on Document Analysis (DAS'02), Princeton, NJ, August 2002.

iapr-2002.djvu iapr-2002.pdf iapr-2002.ps.gz

  author = {Mikheev, Artem and Vincent, Luc and Hawrylycz, Mike and Bottou, L\'{e}on},
  title = {Electronic Document Publishing using {DjVu}},
  booktitle = {Proceedings of the IAPR International Workshop on Document Analysis (DAS'02)},
  address = {Princeton, NJ},
  month = {August},
  year = {2002},
  url = {http://leon.bottou.org/papers/mikheev-2002},