R crash course: Writing functions

R crash course: Writing functions

As you know by now, R is all about functions. In the event that there isn’t one for the exact thing you want to do, you can even write your own! Writing your own functions is a very useful way to automate your work. Once defined, it’s easy to call new functions as often as you need. It’s a good habit to get into when programming with R — and with lots of other languages as well.

Defining a function uses another function simply called function(). Function names follow pretty much the same rules as variable names, so you can call them anything that would also be acceptable as a variable name.

Let’s try an easy example to see how function definitions work:

A function of questionable usefulness: It essentially does the same thing as print(). It takes an argument called x, and prints whatever you put as x to the console.

Theoretically, you can make your function take as many arguments as you want. Just write them in the parentheses of function(). You can call the arguments however you want, too. Also, your functions will probably often require more than one line. In that case, just put whatever you want your function to do in curly brackets {}. It will look somewhat like this:

Let’s mess with that one a bit! Run the following code line by line and try to guess what went wrong.

Possible errors while writing functions

Errors aren’t just a necessary evil in coding. By making mistakes, you get to know your programming language better and find out what works — and, of course, what doesn’t work. Let’s go through the errors one by one:

  • squareadd(3): You passed the function only one argument (3, which was attributed to the “x” argument) to work with when it expected two values, one for x and one for y.
  • squareadd(3,”two”): Now you passed the function two arguments, but one’s not a number. It’s a character, since it has quotes around it. But R can’t execute the function with a character. After all, what is 3^2 + “two” supposed to mean?
  • squareadd(3,two): No quotes this time in the second argument. Because the “y” argument is not in quotes and not a number, either, R assumes it’s a variable or some other object. Problem is: R can’t find the object called two anywhere
  • After you define the object two to be equal to 2, though, R does find a matching object to put as an argument. So this time around, squareadd(3,two) should return the number 11

After we change the function definition to include only the “x” argument, the errors we get change a little. Note that we there’s still a “y” in the function body.

  • squareadd2(3,2): Other way around this time. Your function expected only one argument, but got two.
  • squareadd2(3): You passed the correct number of arguments, but R can’t find anything to use for the y in the function body, neither inside the function nor in the global environment.
  • This is why, after you defined y to be equal to four in the global environment, squareadd2(3) works fine and will return 13 (since 3^2 + 4 = 13).

Scoping Rules in R

Some of the errors you’ll get, such as those in the last two lines, are due to something called the scoping rules of R. These rules define how R looks for the variables it needs to execute a function. It does that by looking through different environments — sub-spaces of your working environment that have their own variables and object definitions — in a certain order. There’s two basic types of scoping:

  • Lexical scoping: Looking for missing objects in the environment where the function was defined.
  • Dynamic scoping: Looking for missing objects in the environment where the function was called.

R uses lexical scoping. So if it doesn’t find the stuff it needs within the function (which, incidentally, has its own little environment), it goes on to look in the environment where the function was defined. In many cases, this will be the global environment, which is what you’re coding in if you’re not inside a specific function. If it doesn’t find what it needs there either, it will continue down the search list of environments. You can take a look at the list by typing search() into your console.

Let’s take a quick look at the difference between dynamic and lexical scoping. Look at the following code and try to guess its output. Execute it in RStudio and see if you’re right.

The output depends on the scoping rules your programming languages uses. As you just learned, R uses lexical scoping. So if you call check(), a is set to FALSE only on the function environment of check(). But since istrue() was defined in the global environment, where a is still equal to TRUE, it will print “that’s right!” to your console. If R used dynamic scoping, it would go with a <- FALSE, since that is accurate for the environment where istrue() was called.

You don’t have to worry too much about the specifics of scoping rules and environments when starting to code, but it’s a useful thing to keep in mind. There’s lots of good info on scoping, searching and environments in R on the web, as well as more tutorials on writing your own functions. We’ll be putting together some resources on our website soon, so stay tuned for that.

But for now — well done! That was a lot of new info to process. print() yourself a “Good job!” to the console before you go on and practice writing some more functions. We’re looking forward to your coding experiences!

Bonus round: Can you count how often the word “function” appears in this text? Guess right and win a complimentary function congratulating you on your newly acquired coding skills.


{Credits for the awesome featured image go to Phil Ninh}

Leave a reply

Your email address will not be published.

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

This site uses Akismet to reduce spam. Learn how your comment data is processed.