Markup
- Text is either "raw", or else marked up in some way.
- The term "markup" originates in the publishing industry, where it
refers to annotations on a manuscript that indicate layout, font size,
etc.
- Every electronic document standard involves markup which is not actually part
of the text, but explains something about the structure of the text.
- Markup can describe layout (how the text should look) or content (what the
text is).
- Examples of electronic markup languages are RTF, PostScript, HTML, SGML and
XML.
- Most word processing software hides the markup from the user, who only sees
the end result.
- Normally, when dealing with corpora, we want explicit markup that we can
see. For the most part, we manipulate it with an ordinary text editor.
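
To make the distinction between markup and raw text concrete, here is a minimal sketch in Python that splits a small HTML fragment into its tags and its raw text. The class name `MarkupSplitter`, the helper `split_markup` and the sample string are illustrative assumptions, not part of any corpus toolkit; only the standard-library `html.parser` module is used.

```python
from html.parser import HTMLParser


class MarkupSplitter(HTMLParser):
    """Hypothetical helper: collects tag names and raw text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []   # markup: names of the opening tags, in order
        self.text = []   # raw text: the character data between tags

    def handle_starttag(self, tag, attrs):
        # Called once for every opening tag such as <p> or <b>.
        self.tags.append(tag)

    def handle_data(self, data):
        # Called for the text that is not part of any tag.
        self.text.append(data)


def split_markup(document):
    """Return (list of opening tag names, raw text) for an HTML string."""
    parser = MarkupSplitter()
    parser.feed(document)
    parser.close()
    return parser.tags, "".join(parser.text)


# Illustrative sample: <p> marks structure, <b> marks layout (bold).
sample = "<p>The <b>quick</b> fox</p>"
tags, raw = split_markup(sample)
print(tags)  # ['p', 'b']
print(raw)   # The quick fox
```

The same fragment opened in a web browser would show only "The quick fox" with one word in bold, whereas a text editor shows the tags themselves, which is the view we usually want when working with corpora.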