carloskiki/pulldown-latex

pulldown-latex

A pull parser for $\LaTeX$ with MathML rendering.

Try it out!

This project is inspired by KaTeX, Temml, MathJax, etc. It is actively maintained, and is at a stage of development where 95% of what KaTeX and the like support is working properly and minimally tested. This software should be functional for most use cases. However, it is not recommended for large-scale production use, as more robust testing is required.

Rust Version

This crate requires Rust version 1.74.1 or higher.

Goals

Follow modern LaTeX principles: Ideally, this library should be mostly compatible with LaTeX2e and amsmath. "Mostly" here refers to the mathematical commands exposed by these packages; typesetting prose is out of scope for this crate. Another consequence of this goal is that some deprecated plain-TeX commands (e.g., \atop, \over, etc.) are not supported by this crate.

Closely resemble conventional LaTeX: This crate aims to generate aesthetically pleasing equations. This means that the MathML output may be tweaked to make it resemble what pdflatex, KaTeX, or MathJax output.
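
As an illustration of the first goal, the snippet below contrasts the plain-TeX infix fraction with the LaTeX form this crate targets. It is a minimal sketch: only \frac is shown, since it is the standard LaTeX replacement for \over.

```latex
% Plain-TeX infix form (deprecated, not supported by this crate):
%   {a + b \over c}
% LaTeX form to use instead:
\frac{a + b}{c}
```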

Development Notes

To Test

  • Everything that Temml and KaTeX test
  • Errors
  • Comment parsing
  • Fuzzing (we would really benefit from having a fuzzer)
  • Cargo Mutants

TODOs/Known Bugs

  • raise and lower boxes.
  • \sideset
  • \mathop, \mathbin, etc.
  • Correctly use the accent attribute: https://w3c.github.io/mathml-core/#dfn-accent
  • Match the MathML API to pulldown-cmark's API.
  • Square bracket matrices do not have equal spacing on the left and the right in Chromium.
  • Italic digits are not rendered in italics because Unicode has no italic digit characters.

Unsupported Plain-TeX & LaTeX behavior

  • Changing catcodes of characters
  • \if* macros
  • ^^_ & ^^[0-9a-f][0-9a-f] as a way of specifying characters
  • Redefining active characters: This library currently only supports default active characters, and hence does not allow for the definition of active characters.
  • Implicit characters as whitespace tokens: In the TeXbook p. 265, Knuth specifies that a space token stands for an explicit or implicit space. This library does not currently support implicit space tokens when a space token is required.
  • Use of internal values and parameters, such as registers, and things like \tolerance (See TeXbook p. 267 for a complete definition)
  • \magnification parameter & true sizes
  • Case-insensitive keyword matching: According to TeXbook p. 265, keywords such as pt, em, mm, etc. are matched case-insensitively (e.g., pT would match pt). This library does not support this behavior; keywords must match exactly (i.e., em, mm, pt, etc.).
  • fil units: TeX allows the use of fil(ll...) units; this library does not.
  • \outer specifier on definitions
  • \edef: pre-expansion of macros is not supported.
  • \csname & \endcsname
  • \begingroup and {, and \endgroup and } behave the same way; that is to say, \begingroup and \endgroup do not have the property of "keeping the same mode" (TeXbook p. 275), which only makes sense in text mode.
  • All vertical list manipulation commands, such as \vskip, \vfil, \moveleft, etc.
  • \hfil, \hfill
  • \over, \atop, and all deprecated "fraction like" control sequences.
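
To make the keyword-matching difference in the list above concrete, here is a minimal Rust sketch. The `Unit` enum and both functions are hypothetical names invented for illustration; they are not this crate's API. The sketch only contrasts exact matching with the case-insensitive matching TeX performs.

```rust
/// Hypothetical unit type for illustration only; not part of the crate's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Unit {
    Pt,
    Em,
    Mm,
}

/// Exact (case-sensitive) keyword matching, the behavior described above:
/// only the lowercase spellings are accepted.
pub fn parse_unit_exact(keyword: &str) -> Option<Unit> {
    match keyword {
        "pt" => Some(Unit::Pt),
        "em" => Some(Unit::Em),
        "mm" => Some(Unit::Mm),
        _ => None,
    }
}

/// TeX-style matching (TeXbook p. 265): keywords are compared
/// case-insensitively, so `pT` is accepted as `pt`.
pub fn parse_unit_tex(keyword: &str) -> Option<Unit> {
    parse_unit_exact(&keyword.to_ascii_lowercase())
}
```

Under these assumptions, `parse_unit_exact("pT")` yields `None`, whereas `parse_unit_tex("pT")` yields `Some(Unit::Pt)`; the former mirrors the restriction documented in the list.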

Unsupported KaTeX/Temml Options

  • Macros preamble
  • Wrap
  • Left equation numbers
  • colorIsTextColor
  • throwOnError
  • maxSize
  • trust
  • \toggle groups

Miscellaneous References & Tools

Sources used during the development of this crate. Any reference in code comments refers to these links specifically.