Home

Reducing frictions in writing with R Markdown for html and pdf

I write my thesis in R Markdown, with bookdown. Each chapter is a standalone paper, with the pdf output in mind initially. When I assembled them together for the thesis, I hoped bookdown::gitbook should just work. In contrast to the nice pdf, the compiling for html complained bitterly: images not shown, ?? in cross-references, and etc. While fixing bits and pieces, I’ve learned several tips to lubricate this process.

Automatic switch between png and pdf for external figures with knitr.graphics.auto_pdf = TRUE

We let knitr to take care of the plotting device for R figures: pdf for latex and png for html, by default. However, we often rely on external figures to demonstrate conceptual models. We end up exporting a set of images with identical names but different extensions, ext-fig.pdf and ext-fig.png, in order to maintain high-quality images for pdf and html. To insert images, we use ![](), that inevitably lends itself to an if-else statement. No doubt Yihui has already provided an optimal solution to keep away from the sub-optimal if-else. It turns out knitr::include_graphics("ext-fig.png", auto_pdf = TRUE) does the trick to automatically switch between png and pdf. If you don’t want to specify auto_pdf = TRUE every time, turn on the option options("knitr.graphics.auto_pdf" = TRUE) up front.

Using include_graphics() in a knitr chunk allows us to adjust image width/height, add captions, arrange side-by-side figures. The option out.width = "80%" remains as is for html and is translated to out.width = "0.8\\linewidth" for latex.

```{r ext-graphics, fig.cap = "(ref:ext-graphics-cap)", out.width = "80%"}
knitr::include_graphics("ext-fig.png", auto_pdf = TRUE)
```

(ref:ext-graphics-cap) I prefer the figure caption outside the chunk.
(1) Special symbols, like `\@ref()`, don't need escaping. We have to do `fig.cap = \\@ref()` in a chunk.
(2) The chunk won't be re-evaluated, if we modify the caption.

Output conditional on formats with knitr::is_[html/latex]_output

We know knitr is a powerhouse that processes our chunks and documents in the background. As an R package, it also comes with many handy functions (hidden gems), such as include_[app/url/graphics]. The pair of is_[html/latex]_output are useful for producing contextual results due to different media. For example, the online welcome page for R for Data Science starts with “This is the website”. This page, however, doesn’t occur in the printed version, because “This is not a website”.

To handle this kind of disparity, we can apply the following trick. Inline R expressions put the html comment <!-- --> around the text if the output format is html. The text only shows up in pdf.

Coupled with the eval option, is_[html/latex]_output gives the control over whether to evaluate the chunk depending on the output formats.

Cross-reference with \@ref() instead of \ref{}

Scientific writing cannot avoid mentioning figures, tables, and equations. I already got used to typing *italic* over \textit{italic} and **bold** over \textbf{bold}. But when outputting html, the Rmd cross-reference system has bitten me a bit. Referring to figures and tables with \ref{} compiles with no problems for pdf, and it is naturally embedded into the muscle memory as a latex user. I was confronted with the initial surprise that it is not recognisable when switching to the html output. The recommended syntax for cross-references is \@ref() in Rmd, which both latex and html embrace with delights.