Links to the other two parts of the workshop:
ggplot2 notes: https://verticalmeadows.github.io/ggplot_basics.html
Introduction to ggmap: https://verticalmeadows.github.io/ggmap_basics.html
One kind out of many.
Let’s go back to RStudio and open a new R Markdown document.
From https://oliviergimenez.github.io/intro_rmarkdown/#13
An R Markdown script contains three different parts: a YAML header, Markdown text, and code chunks.
Each part is written in its own language, but you don’t have to know a lot about them to get going with R Markdown.
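As a rough sketch of how the parts fit together, the R code below assembles a tiny .Rmd file from a YAML header, some Markdown text and one code chunk, and writes it to a temporary file. The title, the chunk name `speed-summary` and the file contents are made up for illustration; only base R functions are used.

```r
# The three parts of a hypothetical minimal .Rmd file, kept as character vectors.
fence <- strrep("`", 3)  # three backticks open and close a code chunk

# Part 1: YAML header (document-wide settings), delimited by "---" lines.
yaml_header <- c(
  "---",
  "title: \"A minimal example\"",
  "output: html_document",
  "---"
)

# Part 2: Markdown text.
markdown_text <- c(
  "",
  "Some *Markdown* text that describes the analysis.",
  ""
)

# Part 3: an R code chunk with an illustrative name.
code_chunk <- c(
  paste0(fence, "{r speed-summary}"),
  "summary(cars$speed)",
  fence
)

rmd_lines <- c(yaml_header, markdown_text, code_chunk)

# Write the file to a temporary location and show its contents.
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_file)
cat(readLines(rmd_file), sep = "\n")

# To knit it (requires the rmarkdown package and pandoc):
# rmarkdown::render(rmd_file)
```

In practice you would create the file through RStudio's menu rather than from code; building it line by line here only makes the three parts visible.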
The header is written in YAML.
This is essentially where you set default values for the look of the whole document.
Markdown is a language of its own, but it also serves as glue between the other languages that make up the final document. Let’s list what it can do and cover how to control the look.
This command inserts the image that you see now
This is a paragraph. You need two spaces at the end of a line to force a line break; a blank line starts a new paragraph. (I didn’t put two spaces at the end of ‘This is a paragraph.’)
Inline code: 3+2
It only looks like code; it is not executed.
Executable inline code: 5. The code is actually executed when r is inserted right after the opening backtick.
Inline equations with LaTeX syntax: \(D_{ij}^{ling}=\frac{\sum D_Q}{n}\) Make sure not to put a space between the delimiters and the equation.
“Everything is related to everything else, but near things are more related than distant things”
Tobler (1970)
A Simple Table | Has headers |
---|---|
It is inconvenient | And doesn’t look good in the script |
But there is a solution | !!! |
Writing [link](https://verticalmeadows.github.io/ggplot_basics.html) without backticks produces a link.
Besides, <https://verticalmeadows.github.io/ggplot_basics.html> produces a link as well: https://verticalmeadows.github.io/ggplot_basics.html. This form makes the actual text of the hyperlink stand out more in the script.
Without anything added, web links are still recognised. https://verticalmeadows.github.io/ggplot_basics.html
Colouring text is also possible, but it needs a bit more tweaking.
Two tabs at the start of a new line turn that line into highlighted text.
<!--This text is commented out, thus not visible below -->
Code chunks are named, easily navigable and collapsible in RStudio.
You can document your code chunks in the Markdown content.
As you see, this is much easier to read than comments in the code.
# This is a comment in the code chunk
print("This is a code chunk above, and I'm the output.")
## [1] "This is a code chunk above, and I'm the output."
Let’s use the next chunk to write a custom function to colorize text
# https://bookdown.org/yihui/rmarkdown-cookbook/font-color.html
colorize <- function(x, color) {
  if (knitr::is_latex_output()) {
    sprintf("\\textcolor{%s}{%s}", color, x)
  } else if (knitr::is_html_output()) {
    sprintf("<span style='color: %s;'>%s</span>", color, x)
  } else x
}
Writing r colorize("some words in red", "red") with only one “`” on each side produces
some words in red
When you write your R code as an Rmd, the code is automatically wrapped in RStudio, unlike in usual .R files.
By default, R Markdown includes both the code and the output.
You can decide:
You can specify the size (in inches) of the output figures in the code chunk’s options, for example {r easyhist, echo=FALSE, fig.width=3, fig.height=5}.
Notice that we don’t see the code here, only the output.
message=FALSE prevents messages from appearing in the output (such as the messages a library() command produces). Besides, warning=FALSE mutes warnings, such as those about data points missing from a plot.
Use knitr::opts_chunk$set(echo = FALSE) to make your choice the default if you use the same options a lot.
You may still override the default in each chunk.
Refer to all chunk options: https://yihui.name/knitr/options/
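A minimal sketch of what the body of a first "setup" chunk might look like, combining the options mentioned above. The particular values are only illustrative, not the settings used for these notes.

```r
# Body of a hypothetical first ("setup") chunk of the document.
library(knitr)

# Defaults applied to every following chunk unless that chunk overrides them.
opts_chunk$set(
  echo       = FALSE,  # hide the code, keep the output
  message    = FALSE,  # drop messages, e.g. from library()
  warning    = FALSE,  # drop warnings
  fig.width  = 3,      # figure width in inches
  fig.height = 5       # figure height in inches
)

# Inspect some of the defaults that are now in effect.
opts_chunk$get(c("echo", "fig.width"))
```

A single chunk can still deviate from these defaults, for example by setting echo=TRUE in its own header.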
Plot equations as standalones in the markdown content:
\[E=mc^{2}\]
Let’s plot good looking tables called kables.
library(tidyverse)
sdatsToPlot <- read.csv("sdats_99.csv", header = TRUE, stringsAsFactors = FALSE)
knitr::kable(sdatsToPlot[1:5,6:8], format = "html", caption = "A kable table")
Landing_Reisezeit | Landing_Herkunft_Vater | Landing_Herkunft_Mutter |
---|---|---|
30’ | Brig | Brig |
NA | Simplon Dorf | Leukerbad |
2 | Engelberg | Engelberg |
1h | Sumiswald | Wasen i.E. |
20min | Stans | Lungern |
More options at https://bookdown.org/yihui/rmarkdown-cookbook/kable.html
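Since sdats_99.csv is not distributed with these notes, here is a small sketch of the same idea on the built-in mtcars data. The column labels and the caption are invented for the example; col.names, digits and caption are standard kable() arguments.

```r
library(knitr)

# Built-in data, so the example does not depend on any external file.
cars_head <- head(mtcars[, c("mpg", "cyl", "hp")], 5)

tbl <- kable(
  cars_head,
  format    = "html",
  caption   = "First five cars in mtcars",
  col.names = c("Miles per gallon", "Cylinders", "Horsepower"),
  digits    = 1
)

# In a chunk, printing the object places the table in the document.
tbl
```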
kableExtra is yet another extension to its capabilities: https://bookdown.org/yihui/rmarkdown-cookbook/kableextra.html
kables can do more, but the package DT provides even nicer tables.
It is an R interface to the JavaScript library DataTables.
With these datatables, one can even:
- search the contents
- filter by columns
- edit cells
- manipulate the appearance
- rename rows and columns for the display
- add captions
- sketch a custom table container using htmltools
Discover its main reference https://rstudio.github.io/DT/
library(DT)
options(DT.options = list(pageLength = 10))
datatable(sdatsToPlot[1:30, 25:27]) %>%
  formatRound(1:ncol(sdatsToPlot[1:30, 25:27]), digits = 4)
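A few of the datatable features listed earlier can be tried on built-in data. This is only a sketch: the iris data, the display names and the caption are stand-ins chosen for illustration, while filter, editable, caption, colnames and options are regular datatable() arguments.

```r
library(DT)

# Built-in data, so the example does not depend on any external file.
flowers <- head(iris, 20)

tbl <- datatable(
  flowers,
  rownames = FALSE,
  # Display names only; the underlying data frame keeps its column names.
  colnames = c("Sepal length", "Sepal width", "Petal length",
               "Petal width", "Species"),
  filter   = "top",                   # one filter box per column
  editable = TRUE,                    # cells can be edited by double-clicking
  caption  = "First 20 rows of the built-in iris data",
  options  = list(pageLength = 5)     # rows shown per page
)

# Print `tbl` in a chunk of an HTML document to display the interactive table.
```

The search box appears by default, so no extra argument is needed for searching.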
Check out a real plot.
lawyer_data <- read.csv("lawyer_data.csv", header = TRUE, stringsAsFactors = TRUE)
# Inspect data to see factors
# str(lawyer_data)
# names(lawyer_data)
# Load libraries for building CI trees and RFs
library(party)
library(randomForestSRC)
library(ggRandomForests)
# Start with CI tree
lawyer.ct <- ctree(
  eval_answer_overall ~ question + speaker + accent + quality + mcpr3 +
    P_gender + P_age + P_year_in_law + P_region + P_parentsOcc,
  data = lawyer_data
)
lawyer.ct
##
## Conditional inference tree with 3 terminal nodes
##
## Response: eval_answer_overall
## Inputs: question, speaker, accent, quality, mcpr3, P_gender, P_age, P_year_in_law, P_region, P_parentsOcc
## Number of observations: 549
##
## 1) quality == {high}; criterion = 1, statistic = 162.943
##   2) question == {Q1, Q10, Q2, Q4, Q5, Q6, Q7}; criterion = 0.995, statistic = 27.728
##     3)* weights = 217
##   2) question == {Q3, Q9}
##     4)* weights = 54
## 1) quality == {low}
##   5)* weights = 278
plot(lawyer.ct)
#par(mar = c(4, 4, .1, .1))
plot(cars$speed, cars$dist)
plot(mpg ~ hp, data = mtcars, pch = 19)
fig.show is the key here. If fig.show="hide", knitr will generate the plots created in the chunk, but not include them in the final document. If fig.show="hold", knitr will delay displaying the plots created by the chunk until the end of the chunk. If fig.show="animate", knitr will combine all of the plots created by the chunk into an animation!
What can be produced: https://oliviergimenez.github.io/intro_rmarkdown/#47
If we produce an HTML document, we can give it a custom .css stylesheet (Cascading Style Sheets): https://bootswatch.com/
Further themes: https://www.datadreaming.org/post/r-markdown-theme-gallery/
There are also LaTeX and Word templates.
Managing the bibliography: https://oliviergimenez.github.io/intro_rmarkdown/#52
R Markdown is useful for many other things beyond documents. Check out the gallery: https://rmarkdown.rstudio.com/gallery.html
These notes are mainly based on:
https://oliviergimenez.github.io/intro_rmarkdown/
https://yongfu.name/2019-fju-rmd-talk/slide/#1
Special Thanks to Erez Levon for some of the data.
Getting started: https://www.dataquest.io/blog/r-markdown-guide-cheatsheet/
Reference guide for R Markdown: https://rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf
Bookdown: Writing professional technical documents with R Markdown: https://bookdown.org/yihui/bookdown/
R Markdown Cookbook: https://bookdown.org/yihui/rmarkdown-cookbook/
Add citations and cross-references: https://www.earthdatascience.org/courses/earth-analytics/document-your-science/add-citations-to-rmarkdown-report/
Advanced Image Processing for your html output, with package ‘magick’: https://cran.r-project.org/web/packages/magick/vignettes/intro.html
References appear at the end of the document, based on the bibliography file provided.
Tobler, Waldo R. 1970. “A computer movie simulating urban growth in the Detroit region.” Economic Geography 46 (2): 234–40.