If you use pandoc to convert user-contributed content in a web application, here are some things to keep in mind:
Although pandoc itself will not create or modify any files other than those you explicitly ask it to create (with the exception of temporary files used in producing PDFs), a filter or custom writer could in principle do anything on your file system. Please audit filters and custom writers very carefully before using them.
If your application uses pandoc as a Haskell library (rather than shelling out to the executable), it is possible to use it in a mode that fully isolates pandoc from your file system, by running the pandoc operations in the PandocPure monad. See the document Using the pandoc API for more details.
Pandoc’s parsers can exhibit pathological performance on some corner cases. It is wise to put any pandoc operations under a timeout, to avoid denial-of-service (DoS) attacks that exploit these issues. If you are using the pandoc executable, you can add the command-line options +RTS -M512M -RTS (for example) to limit the heap size to 512MB.
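As a rough illustration of combining a timeout with the heap limit, here is a minimal Python sketch for an application that shells out to the pandoc executable. The function names, the 10-second default timeout, and the choice to return None on timeout are all assumptions made for this example, not part of pandoc; it also assumes pandoc is on the PATH.

```python
import subprocess


def build_pandoc_command(from_format="markdown", to_format="html", max_heap="512M"):
    """Return the argument list for a heap-limited pandoc invocation.

    The +RTS ... -RTS pair passes options to the Haskell runtime;
    -M sets the maximum heap size.
    """
    return [
        "pandoc",
        "--from", from_format,
        "--to", to_format,
        "+RTS", f"-M{max_heap}", "-RTS",
    ]


def convert_untrusted(text, timeout_seconds=10.0, **command_options):
    """Convert user-supplied text with pandoc (hypothetical helper).

    Returns the converted output, or None if pandoc did not finish within
    timeout_seconds. Raises subprocess.CalledProcessError if pandoc exits
    with a non-zero status, for example after exhausting its heap limit.
    """
    try:
        result = subprocess.run(
            build_pandoc_command(**command_options),
            input=text,
            capture_output=True,
            encoding="utf-8",         # pandoc reads and writes UTF-8
            timeout=timeout_seconds,  # the child is killed when this expires
            check=True,
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout
```

A caller might use convert_untrusted(user_text) and treat a None result as a rejected submission. Passing the arguments as a list, rather than as a shell string, keeps user input out of any shell interpretation.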
The HTML generated by pandoc is not guaranteed to be safe. If raw_html is enabled for the Markdown input, users can inject arbitrary HTML. Even if raw_html is disabled, users can include dangerous content in attributes for headings, spans, and code blocks. To be safe, you should run all the generated HTML through an HTML sanitizer.