TIGER OF THE STRIPE

Tiger of the Stripe Blog

Publishing in ConTeXt: the Search for Structured Input and Beautiful Printed Books

For years I have typeset books for Tiger of the Stripe using InDesign, Adobe's desktop publishing (DTP) program. Its great strength is the level and subtlety of typographical control it offers. Its greatest weaknesses are: (1) poor support for structured input, a failing it shares with all DTP systems, and the consequent lack of consistent output; and (2) Adobe's subscription model. Paying a hefty subscription would be bad enough if it meant that the program was really stable and getting better all the time. However, the opposite seems to have happened; Adobe sits on its cash pile and does little to improve the program or even fix bugs. Its XML support is useless, probably because Adobe wants to sell you a subscription to FrameMaker.

The unstructured nature of the input means that InDesign files are full of garbage. If your final product is a printed book, you probably won't notice this, but what if you want to produce an ePub or Kindle edition? To be fair, InDesign makes quite a good stab at making ePub3s – at least fixed-layout ones. But reflowable ones need quite a lot of extra work in an XML editor before they work properly and, as I said, they're full of garbage, which makes the files bigger than they need to be and that bit more difficult to edit.

The rebellion against the subscription model has seen a renewed QuarkXPress and the emergence of Serif's Affinity Publisher. However, neither of these offers the more structured approach I would like.

When I worked in scientific publishing, I used various flavours of TeX and LaTeX. LaTeX is very good at setting mathematics (and text) and offers automatic page make-up and automated tools for indexes, cross-references and bibliographies. However, LaTeX doesn't offer as structured an approach as XML, and controlling the typography and layout is quite hard.

Later I worked briefly for a company specialising in XML-based typesetting. Frankly, I wasn't very impressed with the software – it was hard to learn and seemed rather difficult to bend to one's will. Needless to say, it was also very expensive. Nonetheless, the idea of using XML as the input file is immensely appealing. Using XML forces you to define the structure of the document precisely, something I wish I had been able to do when typesetting Kennedy's New Latin Primer. It also separates the structure and content of the text from presentational elements (in fact, the XML text file itself doesn't deal with presentation), just as HTML and CSS are separated in web pages (at least, that is best practice).

The trouble with XML is that it is completely flexible. That doesn't sound like a problem, but it is: it means that there are many different flavours of XML, and so there is no standard way to transform it into a printed book, an ebook, a website, or whatever. Also, because the presentational elements of the document are separate from the XML text file, you have to have a way to transform the file. You can do this with eXtensible Stylesheet Language Transformations (XSLT – don't you hate the jargon?). These aren't too bad for producing HTML (which is, after all, a close relative of XML and, in its XHTML form, an XML vocabulary in its own right). However, if you want to create a PDF file for a printed book, it gets really complicated. If you look at most of the PDF documents produced from XML this way, they're really not up to publishing standards.
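To make the separation of structure and presentation a little more concrete, here is a minimal sketch in Python rather than XSLT. It uses only the standard library. The element names in the sample document (`book`, `chapter`, `para` and so on) and the mapping to HTML tags are invented for illustration; they are not taken from any real schema, and a real XSLT stylesheet would do far more than this.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document: it records structure only, no presentation.
SOURCE = """<book>
  <title>A Sample Book</title>
  <chapter>
    <heading>Beginnings</heading>
    <para>Plain text with an <emph>emphasised</emph> word.</para>
  </chapter>
</book>"""

# The mapping from structural elements to HTML tags lives outside the
# document, much as a stylesheet would. All of these names are assumptions.
TAG_MAP = {
    "book": "article",
    "title": "h1",
    "chapter": "section",
    "heading": "h2",
    "para": "p",
    "emph": "em",
}


def to_html(elem):
    """Recursively convert an element and its children to an HTML string."""
    tag = TAG_MAP[elem.tag]
    parts = [f"<{tag}>", escape(elem.text or "")]
    for child in elem:
        parts.append(to_html(child))
        # Text that follows a child element belongs to the parent.
        parts.append(escape(child.tail or ""))
    parts.append(f"</{tag}>")
    return "".join(parts)


html_out = to_html(ET.fromstring(SOURCE))
print(html_out)
```

Swapping `TAG_MAP` for a different mapping changes the output without touching the source text, which is the whole attraction. Producing a well-made PDF page from the same source is a much harder job than this.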

For many years, I've been looking with interest at something called the Text Encoding Initiative (TEI). It has come up with a set of incredibly thorough guidelines for coding books and articles – so thorough, in fact, that the guidelines in PDF form take up 1870 pages. Fortunately, there is also a 'Lite' version. Even so, this doesn't solve the problem of transforming your text into print.
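For a flavour of what TEI coding looks like, here is a small sketch that reads a tiny TEI-style fragment with Python's standard library and lists its chapters. The fragment is my own illustration: it uses common TEI element names and the TEI namespace, but it has not been validated against the TEI Lite schema, and the title and chapter text are made up.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; the "tei" prefix is just a local label for lookups.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Illustrative fragment only, not validated against TEI Lite.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A Sample Book</title></titleStmt>
      <publicationStmt><p>Unpublished sample.</p></publicationStmt>
      <sourceDesc><p>Written for this sketch.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="chapter">
        <head>Beginnings</head>
        <p>First paragraph.</p>
        <p>Second paragraph.</p>
      </div>
      <div type="chapter">
        <head>Endings</head>
        <p>Last paragraph.</p>
      </div>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(SAMPLE)

# The header carries metadata such as the title.
title = root.findtext(
    "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", namespaces=TEI_NS
)

# Each chapter is a <div>; collect its heading and paragraph count.
chapters = [
    (div.findtext("tei:head", namespaces=TEI_NS), len(div.findall("tei:p", TEI_NS)))
    for div in root.findall("tei:text/tei:body/tei:div", TEI_NS)
]

print(title)
for heading, paragraph_count in chapters:
    print(f"{heading}: {paragraph_count} paragraph(s)")
```

Because the structure is explicit, pulling out a table of contents is a few lines of code. Getting the same information out of a DTP file is nothing like so straightforward.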

I've looked at various solutions for transforming XML (probably TEI Lite) to print and none of them is perfect. However, the best seems to be ConTeXt, a typesetting system built on TeX. It is free software, it understands XML and it has very good typographical controls. Now, I've just got to learn the TEI coding and the typographical and layout commands for ConTeXt. Will I live long enough to produce a book this way?