Tidy Data - a tutorial in R and Python

Get and clean data - an example tutorial


Welcome to the Tidy Data tutorial

This tutorial describes the transformations used to convert the raw data (wearables data from the UCI repository) into a tidy data format.

Tidy data

Tidy is an adjective meaning "arranged neatly and in order". Applied to data, it describes a table or a data frame where:

  • Each variable is in one column
  • Each different observation is in a different row
  • If the variables are of different kinds, there is one table for each kind of variable
  • If there are multiple tables, they include a column that links them
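
As a rough illustration of the rules above, here is a minimal Python sketch that reshapes a small "wide" table into a tidy one. The data, the column names (`subject`, `activity`, `mean_acceleration`, `group`) and the helper `to_long` are all made up for this example and are not taken from the tutorial's data set. It uses only the standard library; pandas' `melt` performs the same kind of reshaping on data frames.

```python
# Hypothetical messy table: one row per subject, one column per activity.
# The activity names are values hiding in the column headers.
wide_rows = [
    {"subject": 1, "walking": 0.28, "sitting": 0.26},
    {"subject": 2, "walking": 0.27, "sitting": 0.29},
]


def to_long(rows, id_column, variable_name, value_name):
    """Reshape wide rows into one row per (id, variable) observation."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id is repeated on every output row instead
            long_rows.append(
                {
                    id_column: row[id_column],
                    variable_name: column,
                    value_name: value,
                }
            )
    return long_rows


# Tidy result: each variable is a column, each observation is a row.
tidy_rows = to_long(wide_rows, "subject", "activity", "mean_acceleration")

# A second table for a different kind of variable (facts about subjects),
# linked to the first through the shared "subject" column.
subjects = [
    {"subject": 1, "group": "train"},
    {"subject": 2, "group": "test"},
]
group_by_subject = {row["subject"]: row["group"] for row in subjects}

for row in tidy_rows:
    print(row, "->", group_by_subject[row["subject"]])
```

Keeping the subject facts in their own table avoids repeating them on every measurement row, and the shared `subject` column is what allows the two tables to be joined again when needed.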

The general principles of tidy data are laid out by Hadley Wickham in this paper. You can find more information in J.T. Leek's repo.

The tutorial

Check out our documentation