Chapter 4 Transformations

In many situations you will need to transform the raw values of your data.

Transformations are useful for many reasons, for example:

  • to change the scale of a variable from continuous to discrete
  • to linearize a non-linear relationship between variables
  • to make distributions more symmetric
  • to re-center the data (mean-center, or center around another reference value)
  • to stretch or compress the values (by standard deviation, by range, by eigenvalues)
  • to binarize or dummify a categorical variable

In this chapter, I will cover common transformations:

  • dummification
  • mean-centering
  • standardization
  • logarithmic transformation
  • power transformation
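
As a rough preview of the transformations listed above, here is a minimal sketch in Python using only the standard library. The toy data and the variable names (values, colors, and so on) are hypothetical and chosen only for illustration. The choices of base 10 for the logarithm, the square root for the power transformation, and the sample standard deviation for standardization are assumptions as well, since each transformation admits other variants.

```python
import math
import statistics

# Hypothetical toy data, invented for illustration only.
values = [1.0, 10.0, 100.0, 1000.0]
colors = ["red", "blue", "red", "green"]

# Mean-centering: subtract the mean so the result averages to zero.
mean = statistics.fmean(values)
centered = [v - mean for v in values]

# Standardization: divide the centered values by the sample
# standard deviation (assumed here; other scalings are possible).
sd = statistics.stdev(values)
standardized = [c / sd for c in centered]

# Logarithmic transformation (base 10 assumed);
# it requires strictly positive values.
logged = [math.log10(v) for v in values]

# Power transformation: square root, i.e. exponent 0.5 (assumed).
powered = [v ** 0.5 for v in values]

# Dummification: one 0/1 indicator column per category,
# with the categories taken in alphabetical order.
levels = sorted(set(colors))
dummies = [[1 if c == level else 0 for level in levels] for c in colors]
```

Each row of dummies corresponds to one observation in colors, and each column to one entry in levels. Libraries for data analysis offer ready-made versions of all five operations; the explicit loops here are only meant to show what each transformation does to the values.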