R crash course: Workspace, packages and data import

In this crash course section, we’ll talk about importing all sorts of data into R and installing fancy new packages. We’ll also learn to find our way around the workspace.

Your workspace in R is like the desk you work at. It’s where all the data, defined variables and other objects you’re currently working with are stored. As with a desk, you might want to clean it every once in a while and throw out stuff you don’t need any more. A few commands help you do that: ls() lists the objects currently in your workspace, and rm() removes the ones you name. Try them out!
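Here is a minimal sketch of tidying up a workspace. The object names my_numbers and my_text are made up for illustration. Be careful with the last command: it deletes everything in your workspace without asking.

```r
# Create two example objects (hypothetical names, just for illustration)
my_numbers <- c(4, 8, 15)
my_text <- "hello"

# List everything that is currently in the workspace
ls()

# Remove a single object by name
rm(my_numbers)

# Check whether an object still exists
exists("my_numbers")

# Remove ALL objects in the workspace in one go
rm(list = ls())
ls()
```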

By the way, the function remove() does the same thing as rm(); the latter is just a little nicer to write.

The working directory

Aside from your workspace, there’s also your working directory. It’s where R looks for files and writes them by default. So setting your working directory to the folder where you keep your files can save you a lot of typing. There are two useful functions for that: getwd() shows the current working directory, and setwd() changes it.
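A small sketch of how the two functions work together. We use tempdir() here only as a stand-in for your own project folder, so the example runs on any machine.

```r
# Remember where we started
old_wd <- getwd()
print(old_wd)

# Switch to another folder; tempdir() is a placeholder for your project folder
setwd(tempdir())
print(getwd())

# List the files R can now see without a full path
list.files()

# Switch back to where we started
setwd(old_wd)
```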

There are a lot of functions, like setwd(), that require you to specify a file path. Remember to always put paths in quotes, like character strings. Also, you’ll have to use a forward slash (/) instead of the backslash (\) that Windows normally uses, since the backslash serves as the escape character in R strings. A doubled backslash (\\) works as well.
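To make the slash rule concrete, here is a short sketch. The folder and file names ("data", "survey.csv") are hypothetical. The base function file.path() builds a path from its parts, so you don’t have to type the separators yourself.

```r
# A path written with forward slashes, in quotes like any character string
path_forward <- "data/survey.csv"

# file.path() glues the parts together with a forward slash
path_built <- file.path("data", "survey.csv")
identical(path_forward, path_built)

# A doubled backslash in your code is a single backslash in the string,
# so this path has the same number of characters as the one above
windows_style <- "data\\survey.csv"
nchar(windows_style) == nchar(path_forward)
```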

Importing different data formats

If you want to work with data in R, the first hurdle is to load the data into your workspace. How that works depends on the data format. I prefer to convert everything to .csv before loading it into R because R handles .csv files well. But you might not have the luxury of having a .csv at hand, so it’s useful to know how to load other file types. Try this with any dataset you want!
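As a rough sketch of what importing looks like, the example below first writes a tiny made-up dataset to temporary files and then reads it back in. The data and file names are invented for illustration; with your own data, you would only need the read functions and the path to your file.

```r
# Hypothetical example data, written to temporary files so the sketch runs anywhere
people <- data.frame(name = c("Anna", "Ben", "Cem"), age = c(34, 28, 45))
csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".txt")
write.csv(people, csv_path, row.names = FALSE)
write.table(people, tsv_path, sep = "\t", row.names = FALSE)

# Comma-separated file
from_csv <- read.csv(csv_path)

# Tab-separated file: read.delim() assumes tabs and a header row
from_tsv <- read.delim(tsv_path)

# read.table() is the general function; here we spell out the settings ourselves
from_table <- read.table(tsv_path, header = TRUE, sep = "\t")

# Take a look at what we imported
str(from_csv)
head(from_tsv)
```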

Functions such as read.csv() and read.table() are the standard import functions in R. Depending on the quality of the data you’re working with (and chances are it’s not always the highest), you might have to fiddle with the function arguments a little to tell R exactly how to read your files. If the regular help page is not enough, reference sites like Quick-R are always worth looking at.
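Here is one way such fiddling might look. The file below is a made-up example of a messy export: it uses semicolons as separators, decimal commas, and the text "k.A." for missing values. The arguments sep, dec and na.strings tell read.csv() how to handle each of those.

```r
# Write a small, deliberately messy example file (invented data)
messy_path <- tempfile(fileext = ".csv")
writeLines(c("name;height;city",
             "Anna;1,70;Berlin",
             "Ben;k.A.;Hamburg"),
           messy_path)

# Tell R about the separator, the decimal mark and the missing-value code
heights <- read.csv(messy_path,
                    sep = ";",
                    dec = ",",
                    na.strings = "k.A.",
                    stringsAsFactors = FALSE)

# height should now be a numeric column with one missing value
str(heights)
```

For files like this one, read.csv2() already uses the semicolon and decimal comma as defaults.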

Although these are the basic functions, R can read data of almost any type. If a function isn’t available in the basic packages, chances are there will be a package for that. Just google “read xy files R” and you’ll probably stumble upon something.

Installing packages

One of the great things about R being open source is the abundance of extension packages available for free online. They’re created to make different aspects of R better, or add to its capabilities. There are packages like “xlsx”, “jsonlite” or “twitteR” that help you import data from a variety of sources. Packages like “plyr” and “dplyr” can make it easier to manipulate large datasets, and “ggplot2” is a wonderful alternative to R’s built-in graphics functions.

To install a package, simply type install.packages("packagename") into your console, with the name of the package in quotes. Once the package is installed, you’ll have to load it into R. You’ll have to do that every time you open a new session, since R only loads the basic packages by default in each session. There are two functions that load packages into R: library() and require().
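A minimal sketch of installing and loading. The install line is commented out because it needs an internet connection, and "jsonlite" is just one example from the packages mentioned above. To keep the example runnable everywhere, we load "tools", a package that ships with every R installation.

```r
# Install a package from CRAN (needs internet; you only do this once per package)
# install.packages("jsonlite")

# Load a package for the current session with library() ...
library(tools)

# ... or with require(), which also reports whether loading worked
loaded_ok <- require(tools)
print(loaded_ok)

# search() shows which packages are currently attached
search()
```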

The two do almost the same thing. If you use them in a function though (we’ll show you how to write your own functions soon), there’s a difference in what happens if something goes wrong.

If library() can’t find the package you specify, it will stop the function execution and throw an error saying “there is no package called ‘nonexistent’”. require(), on the other hand, only throws a warning alerting you to the missing package and returns FALSE. Unlike errors, warnings won’t stop the execution of a function; they just alert you that something might not be going the way you want it to.
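You can see the difference for yourself with the sketch below. It assumes that no package called "nonexistent" is installed on your machine. We wrap library() in tryCatch() so the error is caught instead of stopping the script, and we silence the warning from require() to keep the output tidy.

```r
# Assumption: there is no installed package with this name
pkg <- "nonexistent"

# library() throws an error, which we catch here
lib_result <- tryCatch({
  library(pkg, character.only = TRUE)
  "loaded"
}, error = function(e) {
  "error"
})
print(lib_result)

# require() only warns and returns FALSE, so the script simply carries on
req_result <- suppressWarnings(require(pkg, character.only = TRUE))
print(req_result)
```

Because require() returns TRUE or FALSE, it is handy inside functions where you want to react to a missing package yourself.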

Wrapping your head around the functionality of new packages might be a bit difficult at first, but usually, the package creators provide useful reference material to get you started.

The {installr} package

Lastly, we’d like to point you to a useful package. Test your installing skills on the installr package, which is written for Windows. It provides an easy way to keep your R version up to date. Once installed and loaded, use the updateR() function to check for new R releases, install them and update all your packages. If you run it from RStudio, the function will ask you to switch to the built-in R console since it works best there. Working with the latest version of R helps avoid problems with newer packages. So if you get inexplicable errors, first check whether your R needs updating.
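A cautious sketch of how this could look in practice. The check around updateR() is our own addition: it only starts the update in an interactive Windows session where installr is already installed, and otherwise just prints a note.

```r
# Which version of R am I running right now?
current_version <- R.version.string
print(current_version)

# installr targets Windows, so check the operating system first
on_windows <- .Platform$OS.type == "windows"

if (interactive() && on_windows && requireNamespace("installr", quietly = TRUE)) {
  # Checks for a newer R release and walks you through the update
  installr::updateR()
} else {
  message("Skipping updateR(): needs an interactive Windows session with installr installed.")
}
```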

That’s all from us for today. It was a lot already. Good job!


{Credits for the awesome featured image go to Phil Ninh}
