3 Getting Setup with R and RStudio

By: Christian Testa

We recommend before moving on, readers should download and install R and RStudio and the recommended R packages and dependencies.

3.0.1 Principles for Reproducible Workflow and Programming

Two principles underlie why R and RStudio have been chosen to use throughout these course materials. First, as a matter of equity, the use of free (both free to use and freely licensed) software such as R and RStudio removes financial and administrative barriers to engaging with our work and facilitates the inclusion of people from more diverse backgrounds in science. Second, data analysis and science should be open and transparent wherever and whenever possible, so as to promote the external reproduction, verification, and validation of findings.

The code examples in this book assume familiarity with the R programming language. You can still benefit from this book by reading the exposition without focusing on the code examples if you are not very familiar with R programming.

If you want to become more familiar with R programming, you may want to start by familiarizing yourself with R. The R for Data Science book, available free and online here: https://r4ds.had.co.nz/, which provides a free, online, accessible introduction.

3.1 Downloading and Installing R and RStudio

In order to follow along with these resources, you will need to have R and RStudio installed and setup. You can download R from https://www.r-project.org/. You can download RStudio from https://www.rstudio.com/.

3.2 Basic Features of R

Throughout this text, the example code given will depend on various R packages which are all available free and open-source, easily installed in R with the install.packages function.

To work through the case examples here, you will need to install at least the following packages:

# for data manipulation and visualization
install.packages("Hmisc") # mostly for the weighted quantile function
install.packages("fastDummies") # for creating "dummy"/ "indicator" variables

# for retrieving census data
install.packages("tidycensus") # note the special instructions below

# for mapping
install.packages("sf") # note the special instructions below

# for visualization/color palettes

# for multilevel modeling

# for spatial modeling
install.packages("INLA",repos=c(getOption("repos"),INLA="https://inla.r-inla-download.org/R/stable"), dep=TRUE)

3.2.1 Note about “compiling from source”

If, when installing these packages, R prompts you asking whether you would like to install these packages “from source,” you do not need to. Sometimes compiling from source can be more difficult than installing the pre-built packages from CRAN if there are compilation errors.

3.2.2 Tidycensus Special Instructions

You need to register for a Census API key as part of the setup procedures to use tidycensus.

See the instructions for installing on the tidycensus website.

Note that you only need to run the census_api_key("YOUR API KEY GOES HERE") command once.

3.2.3 sf Special Instructions

For the sf package, there are additional instructions here: https://r-spatial.github.io/sf/ which are OS (Mac, Windows, Linux) specific, which walk through getting setup with the gdal (or Geospatial Data Abstraction Library, a translator library for raster and vector geospatial data formats) which is a dependency of sf.

3.2.4 INLA Install Instructions

The code above installs the stable version of INLA. Should you ever need to upgrade your installation, instructions are online here: https://www.r-inla.org/download-install

3.3 References for Spatial Programming in R

As additional reference material to supplement the R programming code provided here, we would recommend: