💾 Archived View for republic.circumlunar.space › users › johngodlee › posts › 2020-05-20-rmarkdown.g… captured on 2024-07-09 at 00:50:16. Gemini links have been rewritten to link to archived content
⬅️ Previous capture (2021-12-04)
-=-=-=-=-=-=-
DATE: 2020-05-20
AUTHOR: John L. Godlee
For a lab meeting we were tasked with "doing something in RMarkdown to present to the group". As I'm already familiar with RMarkdown so I thought I would try something slightly more advanced with the typesetting, then I took it too far. This is a breakdown of my discoveries. For the record, I don't like authoring documents with RMarkdown, I'd much rather use LaTex+R+Bash, and this exercise only served to solidify my feelings on the limitations of RMarkdown.
First thing I did was try to align all figures in a document automatically. One can align figure outputs from each code chunk using {r fig.align"center"}, but adding that to every chunk is a pain. To solve this, fig.align="center" can be added as a global option with :
knitr::opts_chunk$set(echo = TRUE, fig.align="center")
Next, I wanted to add references, which I did using pandoc-citeproc, which allows citation syntax like [@Janzen1970; @Tilman2014]. Apart from installing pandoc-citeproc, this was completely standard usage.
I wanted to add internal references as well, but RMarkdown doesn't support this by default, so I had to compile my document as the html_document2 type from the {bookdown} package.
I wanted to remove the number from the Preamble section of the document, and have section numbering start at the Introduction. I did this with # Preamble {-}.
I used MathJax to provide inline ($x+1$) and equation style ($\sum(x)$) math environments. In the meantime I learned that while MathJax and Tex syntax is very similar, they're not the same, with LaTeX offering a wider range of functions.
I used MathJax to provide a degree symbol with $^\circ$.
I defined a custom CSS template which removed the rounded borders on the code chunks, changed the font to something Serifed, changed the maximum document width of the page, and added a dot after the section number:
body { font-family: serif; font-size: 1.5em; padding: 30px; margin: 0 auto; max-width: 85%; } pre, code { background-color: #e0e0e0; border-radius: 0; overflow: hidden; white-space: pre-wrap; word-break: keep-all; } .highlighter-rouge .highlight { background: #e0e0e0; } .header-section-number::after { content: ". "; }
This is then referenced in the YAML header:
output: html_document2: css: css/style.css
I wanted the date to automatically update to today's date on compilation, and to have the date in a nice format, which involved embedding R code in the YAML header:
date: '`r format(Sys.Date(), "%d %B, %Y")`'
I used inline R code to reference statistics in the code chunks so that the text doesn't have to be updated manually:
`r conspec_a`
I wrote a custom knit option to output to both HTML and PDF with one Knit command. Normally I would just do this in a shell script with two render commands, but as a challenge, I did it in the RMarkdown document:
knit: (function(inputFile, encoding) { rmarkdown::render(inputFile, encoding = encoding, output_dir = ".", output_format = c("bookdown::html_document2", "pdf_document")) })
I altered the default pandoc TeX template so that the PDF title is not on a separate page to the rest of the document, and I reduced the blank space above the title. I also adjusted the LaTeX geometry arguments in the yaml header with geometry: "left=3cm,right=3cm,top=2cm,bottom=2cm".
At this point when I compiled to PDF I kept getting an error during LaTeX compilation, something about \includegraphics being unrecognised, which is obviously silly because it's a base function in LaTeX. After some searching around[1], I found that adding graphics: yes to the YAML header solved the problem. Seems very hacky and unsatisfying.
1: https://github.com/rstudio/rmarkdown/issues/325
Once images started appearing in the PDF, I found that the floating of images by LaTeX didn't mix well with the code chunks. The only sensible strategy is to have images occur directly after the code chunks, so I had to add extra_dependencies: ["float"] to the YAML header, then add knitr::opts_chunk$set(fig.pos = "H", out.extra = "") as a global option to set the position of all figures in PDF to [H].
Finally, the line breaking of code in PDFs as standard by RMarkdown is abyssmal. The only way I managed to get decent wrapping on both code and code verbatim output, was to add a custom post-hook to Knitr, with this function:
# Function for robust line wrapping of output ##' Normally this chunk would be `echoFALSE`, ##' but for demonstration I've left it in. library(knitr) hook_output = knit_hooks$get('output') knit_hooks$set(output = function(x, options) { # this hook is used only when the linewidth option is not NULL if (!is.null(n <- options$linewidth)) { x = knitr:::split_lines(x) # any lines wider than n should be wrapped if (any(nchar(x) > n)) x = strwrap(x, width = n) x = paste(x, collapse = '\n') } hook_output(x, options) })
Then add a global option for the line width cutoff: knitr::opts_chunk$set(tidy.opts=list(width.cutoff=60),tidy=TRUE).
Only after all this strife did my document look anything like professional. I guess most of this stuff is for PDF, the HTML version was much easier to get looking nice.