3. Structural markup: a primer

Older formatting languages like Tex, Texinfo, and Troff supported presentation markup. In these systems, the instructions you gave were about the appearance and physical layout of the text (font changes, indentation changes, that sort of thing).

Presentation markup was adequate as long as your objective was to print to a single medium or type of display device. You run into its limits, however, when you want to mark up a document so that (a) it can be formatted for very different display media (such as printing vs. Web display), or (b) you want to support searching and indexing the document by its logical structure (as you are likely to want to do, for example, if you are incorporating it into a hypertext system).

To support these capabilities properly, you need a system of structural markup. In structural markup, you describe not the physical appearance of the document but the logical properties of its parts.

As an example: In a presentation-markup language, if you want to emphasize a word, you might instruct the formatter to set it in boldface. In troff(1) this would look like so:

All your base
.B are
belong to us!

In a structural-markup language, you would tell the formatter to emphasize the word:

All your base <emphasis>are</emphasis> belong to us!

The "<emphasis>" and </emphasis>in the line above are called markup tags, or just tags for short. They are the instructions to your formatter.

In a structural-markup language, the physical appearance of the final document would be controlled by a stylesheet . It is the stylesheet that would tell the formatter "render emphasis as a font change to boldface". One advantage of structural-markup languages is that by changing a stylesheet you can globally change the presentation of the document (to use different fonts, for example) without having to hack all the the individual instances of (say) .B in the document itself.