3.1 Types of Variables
To illustrate some of the ideas presented in this chapter, I’m going to use a toy example with data about characters from the Star Wars universe. You can find the corresponding CSV file in the data/ folder of the book’s GitHub repository.
    name              gender  height  weight  species      jedi      weapon
 1  Anakin Skywalker  male      1.88    84.0  human        yes_jedi  lightsaber
 2  Padme Amidala     female    1.65    45.0  human        no_jedi   unarmed
 3  Luke Skywalker    male      1.72    77.0  human        yes_jedi  lightsaber
 4  Leia Organa       female    1.50    49.0  human        no_jedi   blaster
 5  Qui-Gon Jinn      male      1.93    88.5  human        yes_jedi  lightsaber
 6  Obi-Wan Kenobi    male      1.82    77.0  human        yes_jedi  lightsaber
 7  Han Solo          male      1.80    80.0  human        no_jedi   blaster
 8  Sheev Palpatine   male      1.73    75.0  human        no_jedi   force-lightning
 9  R2-D2             male      0.96    32.0  droid        no_jedi   unarmed
10  C-3PO             male      1.67    75.0  droid        no_jedi   unarmed
11  Yoda              male      0.66    17.0  yoda         yes_jedi  lightsaber
12  Darth Maul        male      1.75    80.0  dathomirian  no_jedi   lightsaber
13  Dooku             male      1.93    86.0  human        yes_jedi  lightsaber
14  Chewbacca         male      2.28   112.0  wookiee      no_jedi   bowcaster
15  Jabba             male      3.90      NA  hutt         no_jedi   unarmed
16  Lando Calrissian  male      1.78    79.0  human        no_jedi   blaster
17  Boba Fett         male      1.83    78.0  human        no_jedi   blaster
18  Jango Fett        male      1.83    79.0  human        no_jedi   blaster
19  Grievous          male      2.16   159.0  kaleesh      no_jedi   slugthrower
20  Chief Chirpa      male      1.00    50.0  ewok         no_jedi   spear
The table consists of 20 rows and 7 columns. The rows correspond to individuals and the columns correspond to variables. Although this data set is a toy example, it contains variables of different types commonly found in real data sets.
Interestingly, we can classify variables in a couple of different ways.
The most basic and usual way to classify variables is into two distinct types: quantitative variables and categorical (or qualitative) variables.
The variables height and weight are examples of quantitative variables because their values represent quantities. That is, they can be measured numerically on some sort of interval scale.
In turn, variables such as name, gender, species, jedi, and weapon are categorical or qualitative variables because their values represent categories (or qualities). More formally, they describe a quality of an individual, and allow you to place an individual into a category or group, such as male or female.
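As a rough illustration of this first split, here is a minimal Python sketch that labels each column of a few rows copied from the table. The helper variable_type is hypothetical, and its rule (numbers are quantitative, everything else is categorical) is only a heuristic about how values are stored, not a definition.

```python
# A few rows copied from the table above (a subset, for brevity).
rows = [
    {"name": "Anakin Skywalker", "gender": "male", "height": 1.88, "weight": 84.0,
     "species": "human", "jedi": "yes_jedi", "weapon": "lightsaber"},
    {"name": "Padme Amidala", "gender": "female", "height": 1.65, "weight": 45.0,
     "species": "human", "jedi": "no_jedi", "weapon": "unarmed"},
    {"name": "Yoda", "gender": "male", "height": 0.66, "weight": 17.0,
     "species": "yoda", "jedi": "yes_jedi", "weapon": "lightsaber"},
    {"name": "Jabba", "gender": "male", "height": 3.90, "weight": None,  # NA in the table
     "species": "hutt", "jedi": "no_jedi", "weapon": "unarmed"},
]


def variable_type(values):
    """Hypothetical helper: 'quantitative' if every non-missing value is a number."""
    present = [v for v in values if v is not None]
    is_number = lambda v: isinstance(v, (int, float)) and not isinstance(v, bool)
    if present and all(is_number(v) for v in present):
        return "quantitative"
    return "categorical"


# Classify every column by looking at the values it holds.
types = {col: variable_type([row[col] for row in rows]) for col in rows[0]}
for col, kind in types.items():
    print(f"{col:8s} {kind}")
```

A rule like this can be fooled: identifiers stored as numbers would be labeled quantitative even though they only act as names, so the final call still depends on what the values mean.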
The division between categorical and quantitative variables is not the only one. Often, data scientists further classify categorical variables as nominal or ordinal. Likewise, quantitative variables can be classified as discrete or continuous. This next level of classification is chiefly based on the notion of scales of measurement of the variables.

Figure 3.2: Further classification of variables
3.1.1 Nominal Variable
A categorical variable is nominal when it results from naming or labeling values that don’t have a natural order. An example of a nominal variable is weapon, which has the following values:
[1] "blaster" "bowcaster" "force-lightning" "lightsaber"
[5] "slugthrower" "spear" "unarmed"
Can you order the categories in a “natural” way? Not really. According to the dictionary, the term nominal means “existing in name only”. Thus, nominal values are just that: names. There is no reason why blaster is better or greater than lightsaber. You could say that you prefer a blaster over a lightsaber, but that’s a different variable: personal preference.
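To make this concrete, here is a small Python sketch (standard library only) that lists the distinct values of weapon and counts them. The values are copied from the table; the variable names are my own.

```python
from collections import Counter

# The weapon column, copied from the 20 rows of the table above.
weapons = [
    "lightsaber", "unarmed", "lightsaber", "blaster", "lightsaber",
    "lightsaber", "blaster", "force-lightning", "unarmed", "unarmed",
    "lightsaber", "lightsaber", "lightsaber", "bowcaster", "unarmed",
    "blaster", "blaster", "blaster", "slugthrower", "spear",
]

# Distinct labels. Sorting them only gives alphabetical order, which is a
# property of the spelling, not of the weapons themselves.
categories = sorted(set(weapons))
print(categories)

# Counting how often each label occurs is a meaningful operation on nominal data.
counts = Counter(weapons)
print(counts.most_common(3))
```

The sorted list matches the seven values shown above, but the order is just alphabetical. Counting how often each label appears makes sense for names; ranking or averaging them does not.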
Other typical examples of nominal variables are:
- the sex of a newborn child: e.g. female or male
- the ethnicity of an individual: e.g. Native-American, African-American, Asian, White
- ice cream flavors: e.g. chocolate, vanilla, strawberry
- the numbers on the players’ jerseys of a soccer team: numbers used as identifiers
3.1.2 Ordinal Variable
A categorical variable is ordinal when it results from ordering values into a series of categories when no appropriate numerical scale is available. For example, consider a variable “usage frequency” measured with values never, sometimes, and always. In this case we can order the categories from less usage to more usage, or vice versa.
Some examples of ordinal variables are:
- size of clothes: extra-small, small, medium, large, extra-large
- college year: freshman, sophomore, junior, senior
- spiciness: none, mild, moderate, very
- Jedi ranks: youngling, padawan, knight, master, and grand master
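One way to see what the ordering adds is to store it explicitly. The following Python sketch uses the Jedi ranks listed above; the observed values and the helper outranks are made up for illustration.

```python
# The order of this list encodes the order of the categories, lowest first.
RANKS = ["youngling", "padawan", "knight", "master", "grand master"]
position = {rank: i for i, rank in enumerate(RANKS)}


def outranks(a, b):
    """Hypothetical helper: True when rank a is higher than rank b."""
    return position[a] > position[b]


# Made-up observations, not taken from the data set.
observed = ["master", "padawan", "grand master", "knight", "padawan"]

# Sorting by position respects the ranks; plain sorting is only alphabetical.
by_rank = sorted(observed, key=lambda rank: position[rank])
by_spelling = sorted(observed)

print(by_rank)
print(by_spelling)
print(outranks("knight", "padawan"))
```

The positions 0 to 4 only record which rank comes first. They are not measurements, so the gap between padawan and knight should not be read as equal to the gap between knight and master.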
3.1.3 Discrete Variable
A quantitative variable is discrete when it results from counting. To be more precise, a discrete variable takes on zero or a positive integer value. Some examples of discrete variables are:
- the number of male ewoks in a family with four children (0, 1, 2, 3, or 4)
- the number of robots per Imperial Star Destroyer
- the number of moons orbiting around a planet
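As a quick sketch of how counting produces discrete values, the Python snippet below counts the characters of each species in the table. The species values are copied from the table; "gungan" is a label that does not appear in it and is used only to show a count of zero.

```python
from collections import Counter

# The species column, in the row order of the table above.
species = ["human"] * 8 + ["droid"] * 2 + [
    "yoda", "dathomirian", "human", "wookiee", "hutt",
    "human", "human", "human", "kaleesh", "ewok",
]

# Counting individuals per category yields a discrete variable.
per_species = Counter(species)
print(per_species["human"])   # number of humans in the table
print(per_species["gungan"])  # a label absent from the table counts as zero

# Every count is zero or a positive integer.
all_whole = all(isinstance(n, int) and n >= 0 for n in per_species.values())
print(all_whole)
```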
3.1.4 Continuous Variable
A quantitative variable is continuous when it results from measuring. More technically, a continuous variable can theoretically take on an infinite number of possible values; however, its reported values are subject to the precision or accuracy of the measurement device. Some examples of continuous variables are:
- the height of an individual
- the weight of a robot
- the speed of a starship
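The point about precision can be sketched in a few lines of Python. The "true" heights below are hypothetical values chosen for illustration, and report is a made-up helper that mimics a measuring device with a given number of decimals.

```python
def report(true_value, decimals):
    """Hypothetical device: reports a value rounded to a fixed number of decimals."""
    return round(true_value, decimals)


# A hypothetical exact height in meters, reported at different precisions.
true_height = 1.8834
for decimals in (0, 1, 2, 3):
    print(decimals, report(true_height, decimals))

# Two different hypothetical heights become indistinguishable at two decimals.
a, b = 1.826, 1.834
same_at_two_decimals = report(a, 2) == report(b, 2)
print(same_at_two_decimals)
```

The underlying quantity can vary continuously, but the recorded value can only be as fine as the device allows. That is why two individuals can share a reported height, as in the table, without being exactly the same height.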
3.1.5 Caveat
Keep in mind that not all variables fit neatly and unambiguously into one of the previous classes. For example, the age of an individual could be considered a discrete variable when it is reported in (whole) number of years. However, age could also be considered continuous when measured on a more granular scale: e.g. days, or hours, or seconds. Moreover, sometimes age is reported in ordered categories such as 0 to 5 years, 6 to 10, 11 to 15, and so on. These values would turn age into an ordinal variable.
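The age example can be sketched in Python by reporting the same underlying value in three ways. The function names and the 365.25 days-per-year conversion are assumptions made for this illustration.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length, ignoring calendar details


def age_in_years(age_days):
    """Age on a fine-grained scale: treated as continuous."""
    return age_days / DAYS_PER_YEAR


def age_in_whole_years(age_days):
    """Age in completed years: treated as discrete."""
    return int(age_in_years(age_days))


def age_group(age_days):
    """Age in ordered groups (0 to 5, 6 to 10, 11 to 15, ...): treated as ordinal."""
    years = age_in_whole_years(age_days)
    if years <= 5:
        return "0 to 5"
    lower = ((years - 1) // 5) * 5 + 1
    return f"{lower} to {lower + 4}"


age_days = 4000  # hypothetical individual
print(age_in_years(age_days))
print(age_in_whole_years(age_days))
print(age_group(age_days))
```

Nothing about the individual changes between the three calls. Only the way the value is recorded changes, and with it the class the variable falls into.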