The analysis is conducted on the student-mat dataset! Here I read in the data from the math class:

May 1, 2018

import pandas as pd

# Read in class scores for the math class
df = pd.read_csv('data/student-mat.csv')

The data was collected from a Portuguese school, so maybe that is the source of the confusion?

The other available data is from a Portuguese language class in a Portuguese school. If you go to the UCI Machine Learning Repository, you can download both datasets.
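If you want to look at both classes side by side, here is a minimal sketch of reading the two files. The helper names `load_class_scores` and `load_both_classes` are hypothetical, and I am assuming the language-class file is saved as `student-por.csv` in the same `data/` folder. The raw files from the repository are, as far as I know, semicolon-delimited, so pass `sep=';'` if every column ends up in a single field.

```python
from pathlib import Path

import pandas as pd


def load_class_scores(path, sep=","):
    """Read one class file into a DataFrame (hypothetical helper)."""
    return pd.read_csv(Path(path), sep=sep)


def load_both_classes(data_dir="data", sep=","):
    """Return (math_df, por_df) read from the assumed file names."""
    data_dir = Path(data_dir)
    # File names are assumptions; adjust them to match your download.
    math_df = load_class_scores(data_dir / "student-mat.csv", sep=sep)
    por_df = load_class_scores(data_dir / "student-por.csv", sep=sep)
    return math_df, por_df
```

For example, `math_df, por_df = load_both_classes('data', sep=';')` would read the raw downloads, and `math_df.shape` and `por_df.shape` give a quick check that both loaded with the same columns.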

Written by Will Koehrsen
