
Hugo, Pandoc, Nix and MathML


Augmenting Hugo through Nix’ and Pandoc’s versatility.

The new academic year will start in just over a week. It promises to be yet another year full of unknowns. I do hope, however, that a year and a half’s worth of experience teaching under changing conditions means the constant adapting will prove less stressful. In that case, I could finish writing down my thoughts on introductory programming courses. In anticipation, I now find myself dusting off my dormant blog, which I previously hosted on nullptr.club.

For this new site, I wanted to minimise my use of JavaScript. I started with eight lines to obfuscate my email address, which is fine. But then I added fifty more lines to make the bunny hop. I guess I failed. But I promise that is as far as I will go[1]!

Most of all, I wanted to avoid relying on third-party code. I do not want the added responsibility of dealing with the security implications. There was just one problem. At the time of writing, there exists no universal method to embed mathematical formulas in webpages. The culprit? Blink, the rendering engine developed under the auspices of Google and used by the majority of browsers: Brave, Chrome, Edge, Opera and Vivaldi, among others.

MathML

Safari may get a bad rep for not (quickly) adopting features added to the living standard of HTML5[2], but it and Firefox have at least had support for the MathML specification since 2011. I find it somewhat ironic that the company which develops the Noto fonts has not yet implemented support for one of the most universal languages of all, understood by students, scientists and engineers all over the globe. A language that is undoubtedly used by a fair share of Google’s own employees.

As a consequence, authors tend to rely on third-party JavaScript libraries such as MathJax or KaTeX that emulate (parts of) TeX’s math typesetting engine. Unfortunately, this approach conflicts with my goal to avoid third-party JavaScript. How, now, should I solve this? Should I notify Edge and Chrome users that this site is best viewed in Safari or Firefox? Or should I take the easy route, add a couple of includes to each page and require JavaScript?

In the end, I decided to adopt MathML and serve MathJax to user agents built on Blink. The remainder of this post details how.

Hugo

This site is generated by Hugo, a static site generator written in Go. By default, it uses Goldmark to convert Markdown into HTML. Goldmark has an extension which adds support for MathJax, but, alas, does not support MathML. However, it is possible to change the renderer per page by setting the markup option in the front matter (for example, markup: pandoc).

One of the options is to use Pandoc as an external application for the conversion. This opens a whole range of possibilities: Pandoc supports filters that can read and modify its internal representation, and it can convert TeX-like formulas into MathML when passed the --mathml flag on the command line.
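To see the effect of the --mathml flag outside of Hugo, the following minimal sketch pipes a single line of Markdown through Pandoc. It assumes a pandoc binary is available on the PATH; the sample formula and the explicit --from and --to options are illustrative choices, not taken from this site's setup.

```shell
#!/bin/sh
# Sketch: convert a TeX-like inline formula to MathML with Pandoc.
# Assumes `pandoc` is on PATH; the sample sentence is made up.
if command -v pandoc >/dev/null 2>&1; then
  html=$(printf '%s\n' 'Pythagoras: $a^2 + b^2 = c^2$' |
    pandoc --from markdown --to html --mathml)
  # The formula should come out as a <math> element, not as MathJax markup.
  printf '%s\n' "$html"
else
  html=""
  echo "pandoc not found; nothing to demonstrate" >&2
fi
```

Swapping --mathml for --mathjax in the same command shows the markup that Hugo would otherwise receive.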

Hugo chooses to pass the --mathjax flag instead and does not provide any means to alter this behaviour, nor is there a way to change which command is executed. What is left is to write a wrapper script that disguises itself as pandoc. But that requires the wrapper to take priority on my PATH, which in turn will affect other applications.
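The drawback of a hand-placed wrapper can be sketched in plain shell. To keep the demonstration self-contained, the "real" pandoc below is a hypothetical stub that only prints the arguments it receives; the directory names are likewise made up for illustration.

```shell
#!/bin/sh
# Sketch: a wrapper named `pandoc` that shadows the real binary via PATH.
demo=$(mktemp -d)
mkdir -p "$demo/real" "$demo/wrap"

# Stand-in for the real pandoc (hypothetical): prints its arguments.
cat > "$demo/real/pandoc" <<'EOF'
#!/bin/sh
printf '%s\n' "$*"
EOF

# The wrapper: discards the caller's flags (such as --mathjax) and
# always invokes the real binary with --mathml instead.
cat > "$demo/wrap/pandoc" <<EOF
#!/bin/sh
exec "$demo/real/pandoc" --mathml
EOF
chmod +x "$demo/real/pandoc" "$demo/wrap/pandoc"

# Whichever directory comes first on PATH wins, and it wins for every
# program that inherits this PATH, not only for Hugo.
seen=$(
  PATH="$demo/wrap:$PATH"
  export PATH
  pandoc --mathjax
)
echo "$seen" # the stub reports which flags actually reached it
rm -rf "$demo"
```

The subshell confines the modified PATH to a single command here; a wrapper installed globally has no such boundary, which is exactly the problem.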

Enter… Nix

Nix is, among many things, a declarative package manager built around the concept of derivations. It comes with a language, library and a default set of derivations that make it incredibly easy to add additional “packages”. Each installed package has a dedicated directory in Nix’ store. The binaries, manual pages and libraries are made available by placing symlinks to the store in the bin, man and lib directories of a user’s or system’s environment. Adding or removing packages creates a new generation of the environment which then replaces its predecessor.

The nix-shell utility applies this idea to development environments. It is like a language-agnostic version of Python’s virtual environments. Below is a listing of a shell.nix that provides both Hugo and the wrapped version of Pandoc.

{ pkgs ? import <nixpkgs> {} }:
let
  wrappedPandoc = pkgs.writeShellScriptBin "pandoc" ''
    exec ${pkgs.pandoc}/bin/pandoc --mathml
  '';
in
pkgs.mkShell {
  buildInputs = [
    pkgs.hugo
    wrappedPandoc
  ];
}

To enter the development environment, cd to the directory containing shell.nix and execute nix-shell. The effect of entering and exiting the development environment is best illustrated by a short demonstration.

site % which pandoc
/usr/local/bin/pandoc
site % nix-shell

[nix-shell:~/site]$ which pandoc
/nix/store/8cfniz8md36pfa28mrra8lr5pv7h56p4-pandoc/bin/pandoc
[nix-shell:~/site]$ exit
site % which pandoc
/usr/local/bin/pandoc

When started from the development environment, Hugo will use the wrapped version of Pandoc. Outside this ephemeral environment, software continues to use the version as shipped by its author.

Supporting Chrome cum suis

Next on the list was the conditional loading of MathJax. An option would be to inspect the user agent information, either at the server or client level. Both would be blunt instruments: support for MathML might get added to future versions of Blink, user agent information can be spoofed, and browser extensions could come to fill the functional void. Instead, I discovered an issue on GitHub containing a useful snippet to detect a browser’s MathML capabilities using a few lines of JavaScript. I combined this with a check for math tags to avoid the inclusion of MathJax when not needed. The result is shown below[3].

// wrapped in a function so that the early return below is valid
(function () {
  // only load mathjax if present page contains math tags…
  const n_math_els = document.getElementsByTagName("math").length;
  if (n_math_els <= 0) {
    return;
  }

  // …and the user's webbrowser cannot render them natively
  // code from https://github.com/mathjax/MathJax/issues/182
  let hasMathML = false;
  if (document.createElement) {
    // render a hidden fraction: with native MathML it is taller than wide
    const div = document.createElement("div");
    div.style.position = "absolute";
    div.style.top = div.style.left = 0;
    div.style.visibility = "hidden";
    div.style.width = div.style.height = "auto";
    div.style.fontFamily = "serif";
    div.style.lineHeight = "normal";
    div.innerHTML = "<math><mfrac><mi>xx</mi><mi>yy</mi></mfrac></math>";
    document.body.appendChild(div);
    hasMathML = div.offsetHeight > div.offsetWidth;
    div.remove();
  }

  if (!hasMathML) {
    const mathjax_files = [
      "https://polyfill.io/v3/polyfill.min.js?features=es6",
      "https://cdn.jsdelivr.net/npm/mathjax@3/es5/mml-chtml.js",
    ];
    mathjax_files
      .map((url) => {
        const js = document.createElement("script");
        js.setAttribute("src", url);
        // injected scripts are async by default; keep the listed order
        js.async = false;
        return js;
      })
      .forEach((el) => document.body.appendChild(el));
  }
})();
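The loader fetches its scripts without integrity checks. Should the URL ever be pinned to an exact release, a Subresource Integrity value could be computed along the following lines. This is only a sketch: mathjax-copy.js is a hypothetical stand-in for a downloaded copy of the pinned bundle, and the script assumes openssl is installed.

```shell
#!/bin/sh
# Sketch: compute a Subresource Integrity (SRI) value for a script file.
# `mathjax-copy.js` is a hypothetical stand-in for the pinned bundle.
work=$(mktemp -d)
printf 'console.log("stand-in for the real bundle");\n' > "$work/mathjax-copy.js"

# An SRI value has the form "<algorithm>-<base64 of the binary digest>".
digest=$(openssl dgst -sha384 -binary "$work/mathjax-copy.js" | openssl base64 -A)
sri="sha384-$digest"
echo "$sri"
rm -rf "$work"
```

The resulting value would go into the integrity attribute of the script element, together with crossorigin="anonymous" for a file served from a CDN. It only stays valid for as long as the file behind the URL does not change.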

Examples

The next two paragraphs show the result. First is a “standalone” equation that occupies a paragraph of its own:

$$i\hbar{}\dot{\Psi}=H\Psi$$

This paragraph uses inline mathematical notation to define the converse of $P \rightarrow Q$ as $Q \rightarrow P$, and its contrapositive as $\lnot{}Q \rightarrow \lnot{}P$.

Outlook

The method of overriding Hugo’s invocation of Pandoc presented here opens the door to further additions. Next, I intend to add support for including biblatex bibliographies.


  1. for now…

  2. See, for instance, “Safari is the new IE”, “Safari isn’t protecting the web, it’s killing it” and “Safari is Not the New IE, But…”.

  3. There are no integrity checks. MathJax’s getting started instructions list no checksums, and a checksum would eventually break the site anyway, since the URL tracks the most recent version.