💾 Archived View for republic.circumlunar.space › users › johngodlee › posts › 2020-05-20-rmarkdown.g… captured on 2023-09-28 at 16:20:37. Gemini links have been rewritten to link to archived content

View Raw

More Information

⬅️ Previous capture (2021-12-04)

-=-=-=-=-=-=-

Adventures in tweaking RMarkdown

DATE: 2020-05-20

AUTHOR: John L. Godlee

For a lab meeting we were tasked with "doing something in RMarkdown to present to the group". As I'm already familiar with RMarkdown so I thought I would try something slightly more advanced with the typesetting, then I took it too far. This is a breakdown of my discoveries. For the record, I don't like authoring documents with RMarkdown, I'd much rather use LaTex+R+Bash, and this exercise only served to solidify my feelings on the limitations of RMarkdown.

First thing I did was try to align all figures in a document automatically. One can align figure outputs from each code chunk using {r fig.align"center"}, but adding that to every chunk is a pain. To solve this, fig.align="center" can be added as a global option with :

knitr::opts_chunk$set(echo = TRUE, fig.align="center")

Next, I wanted to add references, which I did using pandoc-citeproc, which allows citation syntax like [@Janzen1970; @Tilman2014]. Apart from installing pandoc-citeproc, this was completely standard usage.

I wanted to add internal references as well, but RMarkdown doesn't support this by default, so I had to compile my document as the html_document2 type from the {bookdown} package.

I wanted to remove the number from the Preamble section of the document, and have section numbering start at the Introduction. I did this with # Preamble {-}.

I used MathJax to provide inline ($x+1$) and equation style ($\sum(x)$) math environments. In the meantime I learned that while MathJax and Tex syntax is very similar, they're not the same, with LaTeX offering a wider range of functions.

I used MathJax to provide a degree symbol with $^\circ$.

I defined a custom CSS template which removed the rounded borders on the code chunks, changed the font to something Serifed, changed the maximum document width of the page, and added a dot after the section number:

body {
    font-family: serif;
    font-size: 1.5em;
    padding: 30px;
    margin: 0 auto;
    max-width: 85%;
}

pre, code {
    background-color: #e0e0e0;
    border-radius: 0;
    overflow: hidden;
    white-space: pre-wrap;
    word-break: keep-all;
}

.highlighter-rouge .highlight {
    background: #e0e0e0;
}

.header-section-number::after {
  content: ". ";
}

This is then referenced in the YAML header:

output:
  html_document2: 
    css: css/style.css

I wanted the date to automatically update to today's date on compilation, and to have the date in a nice format, which involved embedding R code in the YAML header:

date: '`r format(Sys.Date(), "%d %B, %Y")`'

I used inline R code to reference statistics in the code chunks so that the text doesn't have to be updated manually:

`r conspec_a`

I wrote a custom knit option to output to both HTML and PDF with one Knit command. Normally I would just do this in a shell script with two render commands, but as a challenge, I did it in the RMarkdown document:

knit: (function(inputFile, encoding) {
  rmarkdown::render(inputFile, encoding = encoding,
  output_dir = ".", output_format = c("bookdown::html_document2", "pdf_document")) })

I altered the default pandoc TeX template so that the PDF title is not on a separate page to the rest of the document, and I reduced the blank space above the title. I also adjusted the LaTeX geometry arguments in the yaml header with geometry: "left=3cm,right=3cm,top=2cm,bottom=2cm".

At this point when I compiled to PDF I kept getting an error during LaTeX compilation, something about \includegraphics being unrecognised, which is obviously silly because it's a base function in LaTeX. After some searching around[1], I found that adding graphics: yes to the YAML header solved the problem. Seems very hacky and unsatisfying.

1: https://github.com/rstudio/rmarkdown/issues/325

Once images started appearing in the PDF, I found that the floating of images by LaTeX didn't mix well with the code chunks. The only sensible strategy is to have images occur directly after the code chunks, so I had to add extra_dependencies: ["float"] to the YAML header, then add knitr::opts_chunk$set(fig.pos = "H", out.extra = "") as a global option to set the position of all figures in PDF to [H].

Finally, the line breaking of code in PDFs as standard by RMarkdown is abyssmal. The only way I managed to get decent wrapping on both code and code verbatim output, was to add a custom post-hook to Knitr, with this function:

# Function for robust line wrapping of output
##' Normally this chunk would be `echoFALSE`, 
##' but for demonstration I've left it in.
library(knitr)
hook_output = knit_hooks$get('output')
knit_hooks$set(output = function(x, options) {
  # this hook is used only when the linewidth option is not NULL
  if (!is.null(n <- options$linewidth)) {
    x = knitr:::split_lines(x)
    # any lines wider than n should be wrapped
    if (any(nchar(x) > n)) x = strwrap(x, width = n)
    x = paste(x, collapse = '\n')
  }
  hook_output(x, options)
})

Then add a global option for the line width cutoff: knitr::opts_chunk$set(tidy.opts=list(width.cutoff=60),tidy=TRUE).

Only after all this strife did my document look anything like professional. I guess most of this stuff is for PDF, the HTML version was much easier to get looking nice.

The PDF version is here

The HTML verion is here