3 Formatting Text and Numbers

3.1 Introduction

A common task when working with character strings involves printing and displaying them on the screen or on a file. In this chapter you will learn about the different functions and options in R to print strings in a wide variety of common—and not so common—formats.

3.2 Printing and formatting

R provides a series of functions for printing strings. Some of the printing functions are useful when creating print methods for programmed objects’ classes. Other functions are useful for printing output either in the R console or to a given file. In this chapter we will describe the following print-related functions:

Function Description
print() generic printing
noquote() print with no quotes
cat() concatenation
format() special formats
toString() convert to string
sprintf() C-style printing

3.3 Generic printing

The workhorse printing function in R is print(). As its names indicates, this function prints its argument on the R console:

# text string
my_string <- "programming with data is fun"

print(my_string)
#> [1] "programming with data is fun"

To be more precise, print() is a generic function, which means that you should use this function when creating printing methods for programmed classes.

As you can see from the previous example, print() displays text in quoted form by default. If you want to print character strings with no quotes you can set the argument quote = FALSE

# print without quotes
print(my_string, quote = FALSE)
#> [1] programming with data is fun

3.3.1 When to use print()?

When you type the name of an obbject in the R console, R calls the corresponding print method associated to the class of the object. If the object is a "data.frame", then R will dispatch the method print.data.frame and display the output on screen accordingly.

Most of the times you don’t really need to invoke print(). Usually, simply typing the name of the object will suffice. So when do you actually call print()? You use print() when your code is inside an R expression (i.e. code inside curly braces { }) and you want to see the results of one or more computational steps. Typical examples that require an explicit call to print() is when you are interested in looking at some value within a loop, or a conditional structure.

Consider the following dummy for loop. It iterates five times, each time adding 1 to the value of the iterator i:

for (i in 1:5) {
  i + 1
}

The above code works and R executes the additions, but nothing is displayed on the console. This is because the command i + 1 forms part of an R expression, that is, it is within the braces { }. To be able to see the actual computations you should call print() like so:

for (i in 1:5) {
  print(i + 1)
}
#> [1] 2
#> [1] 3
#> [1] 4
#> [1] 5
#> [1] 6

3.4 Concatenate and print with cat()

Another very useful function is cat() which allows you to concatenate objects and print them either on screen or to a file. Its usage has the following structure:

cat(..., file = "", sep = " ", fill = FALSE, labels = NULL, append = FALSE)

The argument ... implies that cat() accepts several types of R objects (typically vectors). However, when we pass numeric and/or complex vectors, they are automatically converted to character strings by cat(). By default, the strings are concatenated with a space character as separator. This can be modified with the sep argument.

If you use cat() with only one single string, you get a similar (although not identical) result as noquote():

# simply print with 'cat()'
cat(my_string)
#> programming with data is fun

As you can see, cat() prints its arguments without quotes. In essence, cat() simply displays its content (on screen or in a file). Compared to noquote(), cat() does not print the numeric line indicator ([1] in this case).

The usefulness of cat() is when we have two or more strings that we want to concatenate:

# concatenate and print
cat(my_string, "with R")
#> programming with data is fun with R

You can use the argument sep to indicate a chacracter vector that will be included to separate the concatenated elements:

# especifying 'sep'
cat(my_string, "with R", sep=" =) ")
#> programming with data is fun =) with R

# another example
cat(1:10, sep = "-")
#> 1-2-3-4-5-6-7-8-9-10

When we pass vectors to cat(), each of the elements are treated as though they were separate arguments:

# first four months
cat(month.name[1:4], sep = " ")
#> January February March April

# first four months
cat(month.name[1:4], sep = "-")
#> January-February-March-April

# first four months
cat(month.name[1:4], sep = "")
#> JanuaryFebruaryMarchApril

The argument fill allows us to break long strings; this is achieved when we specify the string width with an integer number:

# fill = 30
cat("Loooooooooong strings", "can be displayed", "in a nice format", 
    "by using the 'fill' argument", fill = 30)
#> Loooooooooong strings 
#> can be displayed 
#> in a nice format 
#> by using the 'fill' argument

Last but not least, we can specify a file output in cat(). For instance, let’s suppose that we want to save the output in the file output.txt located in our working directory. This is done by specifying the file argument:

# cat with output in a given file
cat(my_string, "with R", file = "output.txt")

3.5 Encoding strings with format()

The function format() allows you to format an R object for pretty printing. Essentially, format() treats the elements of a vector as character strings using a common format. This is especially useful when printing numbers and quantities under different formats.

# default usage
format(13.7)
#> [1] "13.7"

# another example
format(13.12345678)
#> [1] "13.1"

Some useful arguments used in format():

  • width the (minimum) width of strings produced
  • trim if set to TRUE there is no padding with spaces
  • justify controls how padding takes place for strings. Takes the values "left", "right", "centre", "none"

For controling the printing of numbers, use these arguments:

  • digits The number of digits to the right of the decimal place.
  • scientific use TRUE for scientific notation, FALSE for standard notation
# use of 'nsmall'
format(13.7, nsmall = 3)
#> [1] "13.700"

# use of 'digits'
format(c(6.0, 13.1), digits = 2)
#> [1] " 6" "13"

# use of 'digits' and 'nsmall'
format(c(6.0, 13.1), digits = 2, nsmall = 1)
#> [1] " 6.0" "13.1"

By default, format() pads the strings with spaces so that they all have the same length.

# justify options
format(c("A", "BB", "CCC"), width = 5, justify = "centre")
#> [1] "  A  " " BB  " " CCC "

format(c("A", "BB", "CCC"), width = 5, justify = "left")
#> [1] "A    " "BB   " "CCC  "

format(c("A", "BB", "CCC"), width = 5, justify = "right")
#> [1] "    A" "   BB" "  CCC"

format(c("A", "BB", "CCC"), width = 5, justify = "none")
#> [1] "A"   "BB"  "CCC"

# digits
format(1/1:5, digits = 2)
#> [1] "1.00" "0.50" "0.33" "0.25" "0.20"

# use of 'digits', widths and justify
format(format(1/1:5, digits = 2), width = 6, justify = "c")
#> [1] " 1.00 " " 0.50 " " 0.33 " " 0.25 " " 0.20 "

For printing large quantities with a sequenced format we can use the arguments big.mark or big.interval. For instance, here is how we can print a number with sequences separated by a comma ","

# big.mark
format(123456789, big.mark = ",")
#> [1] "1.23e+08"

3.6 Exercises

  1. Why do we say that print() is not really one function but a family of functions?

  2. Mention three differences between print() and cat().

  3. What happens when you pass a matrix to cat()? For instance:

m <- matrix(1:12, nrow = 3, ncol = 4)
cat(m)
  1. What happens when you pass a data frame object to cat()? For instance:
dfr <- data.frame(a = 1, b = 2, c = 3)
cat(dfr)

Make a donation

If you find this resource useful, please consider making a one-time donation in any amount. Your support really matters.