Gaston Sanchez

I'm an applied statistician, data scientist, and faculty member in the Department of Statistics at UC Berkeley.

Among other things, I am very interested in:

  • multivariate methods for exploring, analyzing, and visualizing high-dimensional data.
  • the use of graphical displays to understand data.
  • topics about computational reproducibility and open science.
  • helping researchers and scientists analyze their data.

I am serious about Open Science and Open Knowledge, and sincerely committed to the principle that everything I produce (tutorials, slides, teaching materials, software, data) should be immediately and freely available for anyone to access, download, use, and extend.

This website is my personal space where I try to declutter my ideas and put all my work in order (which is easier said than done). If you find any valuable resources here (which I'm pretty sure you will), please consider giving something back from my wishlist.



Some of the courses I've taught in the Department of Statistics at UC Berkeley:

  • Stat 2: Introduction to Statistics
  • Stat 20: Introduction to Probability and Statistics
  • Stat 131A: Introduction to Probability and Statistics for Life Scientists
  • Stat 133: Concepts in Computing with Data
  • Stat 154: Modern Statistical Prediction and Machine Learning
  • Stat 159: Reproducible and Collaborative Statistical Data Science
  • Stat 243: Introduction to Computational Statistics


I'm a passionate R user, as well as the developer and maintainer of several R packages. All the code is available in my GitHub repositories.

  • plspm provides a toolkit exclusively dedicated to Partial Least Squares Path Modeling (PLS-PM) analysis.
  • plsdepot provides a set of tools for performing Partial Least Squares (PLS) analysis of one or two data tables.
  • pathmox is dedicated to the Pathmox approach for obtaining segmentation trees in Partial Least Squares Path Modeling (PLS-PM) analysis.
  • arcdiagram is a minimalist package to help you plot pretty arc diagrams in R.
  • colortools is designed to help users generate color schemes and color palettes.
  • matrixkit is an R package that provides a first aid kit for some matrix operations commonly used in multivariate data analysis methods.
  • turner provides a set of handy functions to turn vectors (and lists of vectors) into other indexed data structures.
  • tester provides human-readable functions to test characteristics of some common R objects.
  • cointoss is a toy package with simple functions for simulating tossing a coin.


Slides of some of my talks.



Occasionally, I like to talk about data analysis, visualization, statistics, R, and related topics on my blog, Data Analysis Visually EnfoRced.