Header identifiers in HTML, LaTeX, and ConTeXt
Pandoc extension.
Each header element in pandoc’s HTML and ConTeXt output is given a unique identifier. This identifier is based on the text of the header. To derive the identifier from the header text,
- Remove all formatting, links, etc.
- Remove all punctuation, except underscores, hyphens, and periods.
- Replace all spaces and newlines with hyphens.
- Convert all alphabetic characters to lowercase.
- Remove everything up to the first letter (identifiers may not begin with a number or punctuation mark).
- If nothing is left after this, use the identifier `section`.
Thus, for example,
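| Header                     | Identifier                   |
|----------------------------|------------------------------|
| Header identifiers in HTML | `header-identifiers-in-html` |
| Dogs?--in my house?        | `dogs--in-my-house`          |
| HTML, S5, or RTF?          | `html-s5-or-rtf`             |
| 3. Applications            | `applications`               |
| 33                         | `section`                    |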
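The derivation rules above can be sketched in Python. This is only an illustration of the listed steps, not pandoc's actual implementation: the function name `header_identifier` is hypothetical, the input is assumed to be plain text with formatting and links already removed, and all whitespace (not only spaces and newlines) is treated as a separator.

```python
def header_identifier(text: str) -> str:
    """Approximate the identifier rules for already-plain header text."""
    # Drop punctuation, keeping letters, digits, underscores, hyphens,
    # periods, and whitespace (assumption: tabs count as whitespace too).
    kept = "".join(
        ch for ch in text
        if ch.isalnum() or ch in "_-." or ch.isspace()
    )
    # Replace each whitespace character with a hyphen, then lowercase.
    lowered = "".join("-" if ch.isspace() else ch for ch in kept).lower()
    # Remove everything before the first letter.
    for i, ch in enumerate(lowered):
        if ch.isalpha():
            return lowered[i:]
    # Nothing usable is left: fall back to the fixed identifier.
    return "section"


if __name__ == "__main__":
    for header in ["Header identifiers in HTML", "3. Applications", "33"]:
        print(header, "->", header_identifier(header))
```

Run on the headers in the table, this sketch yields the identifiers listed there; for instance, the leading `3.` in `3. Applications` is discarded by the "first letter" step rather than by the punctuation step.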
These rules should, in most cases, allow one to determine the identifier from the header text. The exception is when several headers have the same text; in this case, the first will get an identifier as described above; the second will get the same identifier with `-1` appended; the third with `-2`; and so on.
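The numbering of repeated identifiers can be sketched as follows. The helper name `unique_identifiers` is hypothetical, and the sketch covers only the case described here (several headers with the same text); how a generated identifier such as `overview-1` interacts with a header whose own identifier is already `overview-1` is not specified in this section and is not handled.

```python
def unique_identifiers(base_ids):
    """Suffix repeated identifiers with -1, -2, ... in document order."""
    seen = {}      # identifier -> number of times it has appeared so far
    result = []
    for ident in base_ids:
        count = seen.get(ident, 0)
        # The first occurrence is kept as is; later ones get a numeric suffix.
        result.append(ident if count == 0 else f"{ident}-{count}")
        seen[ident] = count + 1
    return result


if __name__ == "__main__":
    print(unique_identifiers(["overview", "overview", "details", "overview"]))
```

With the input shown, the three `overview` headers become `overview`, `overview-1`, and `overview-2`, while `details` is left unchanged.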
These identifiers are used to provide link targets in the table of contents generated by the `--toc` (or `--table-of-contents`) option. They also make it easy to provide links from one section of a document to another. A link to this section, for example, might look like this:
    See the section on
    [header identifiers](#header-identifiers-in-html-latex-and-context).
Note, however, that this method of providing links to sections works only in HTML, LaTeX, and ConTeXt formats.
If the `--section-divs` option is specified, then each section will be wrapped in a `div` (or a `section`, if `--html5` was specified), and the identifier will be attached to the enclosing `<div>` (or `<section>`) tag rather than the header itself. This allows entire sections to be manipulated using JavaScript or treated differently in CSS.