Gaston Sanchez

Data Science and Statistics Educator


About


I'm a lecturer in the Department of Statistics at the University of California, Berkeley. Among other things, I ...

  • enjoy authoring materials (e.g. books, tutorials, slides) for teaching purposes,

  • love using graphical displays to understand data,

  • cherish multivariate methods for exploring, analyzing, and visualizing data in a context of multiple variables and high dimensionality,

  • care about computational reproducibility topics and open science,

  • and like helping researchers and scientists analyze their data.


Give Now

If you find the resources on this site valuable, please consider making a one-time donation in any amount. Your support really matters.

Alternatively, you can also help me with my Amazon wishlist.

Teaching

Some of the courses I've taught at UC Berkeley


Stat 33A

Introduction to
Programming in R

Stat 33B

Introduction to Advanced
Programming in R

Stat 133

Concepts in Computing
with Data

Stat 2

Introduction to
Statistics

Stat 20

Introduction to
Probability and Statistics

Stat 131A

Introduction to
Probability and Statistics

Stat 151A

Linear
Modeling

Stat 154

Modern Statistical Prediction
and Machine Learning

Stat 159

Reproducible and Collaborative
Statistical Data Science

Stat 243

Introduction to
Computational Statistics

Books




Introductory text about concepts of Statistical Learning, covering some of the common supervised as well as unsupervised methods (work in progress).


This book aims to provide a comprehensive introduction to Principal Component Analysis (PCA) for Data Science.




The ultimate goal of this book is to teach you how to create a relatively simple R package based on the so-called S3 classes.


This book aims to help you get started with manipulating strings with R. It covers useful functions in packages "base" and "stringr", printing and formatting characters, regular expressions, and other tricks.




This book provides a hands-on introduction to Partial Least Squares Path Modeling (PLS-PM) using the R package "plspm".


What we know today as Partial Least Squares (PLS) is the result of a long period of evolution, with a vast range of methods and techniques proposed since the late 1960s / early 1970s. This book narrates the story behind the origins, development, and evolution of PLS methods.

Data Visualization


Machine and Statistical Learning


Data Technologies and R


Other Tools

Software


I'm a passionate R user, and in the not-so-distant past, I was an active developer and maintainer of several R packages. All the code is available in my GitHub repositories.

plspm provides a toolkit exclusively dedicated to Partial Least Squares Path Modeling (PLS-PM) analysis.

pathmox is dedicated to the Pathmox approach for obtaining segmentation trees in PLS-PM analysis.

plsdepot provides a set of tools for performing Partial Least Squares (PLS) analysis of one or two data tables.


arcdiagram is a minimalist package to help you plot pretty arc diagrams in R.

colortools is designed to help users generate color schemes and color palettes.

pathdiagram provides simple functions to draw basic PLS path diagrams in R.


cointoss is a toy package with simple functions for simulating tossing a coin.

dieroller is a toy package with simple functions for simulating rolling a die.

binomial is a toy package with simple functions for computing binomial probabilities.


matrixkit is an R package that provides a first aid kit for some matrix operations commonly used in multivariate data analysis methods.

turner provides a set of handy functions to turn vectors (and lists of vectors) into other indexed data structures.

tester provides human-readable functions to test characteristics of some common R objects.



Experiments


Rtist: weird but beautiful random paintings.

Got Plot: tiny collection of polished charts in R.

Star Wars Arc Diagram: visualizing Star Wars movie scripts.


genbiovis: deprecated experiment for visualizing titles of genetics & biology papers.

Mining twitter with R: deprecated experiment that keeps catching people's attention.

Blog


Some years ago, I used to write about data analysis, visualization, statistics, R, and related topics in my blog Data Analysis Visually EnfoRced (unfortunately, I no longer have the time to do that).



Beyond


Fundamentos Teoricos de Maniobras con Cuerdas

In a parallel life, I've been a rope rigging enthusiast (yes, seriously). My fascination with these activities has been such that I even wrote a book in Spanish about the theoretical fundamentals of rope techniques.



Poemario

Once in a blue moon, my alter ego feels compelled to write (mostly in my mother tongue) what to me seems, feels, and sounds poetry-ish. I call the result Poemario, a random collection of personal poems in Spanish.



Something Unique About You

A curated collection of what students have told me when I have asked them to "tell me something unique about you".



Utility Poles

Photo album of an assortment of utility poles (transmission poles, telephone poles, power poles, etc.).