May 1, 2018
The analysis is conducted on the student-mat dataset! Here I read in the data from the math class:
import pandas as pd

# Read in class scores
df = pd.read_csv('data/student-mat.csv')
The data was collected from a Portuguese school, so maybe that is the source of the confusion?
The other available data is from a Portuguese language class in a Portuguese school. If you go to the UCI Machine Learning Repository, you can download both datasets.
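If you want both classes side by side, here is a minimal sketch of how the two files could be read. The helper name `load_class_scores`, the `course` column, and the `sep` parameter are my own illustrative choices, not part of the original analysis. As far as I know, the files distributed by UCI are semicolon-delimited, so you may need `sep=";"` depending on which copy you have.

```python
import pandas as pd


def load_class_scores(math_path, por_path, sep=","):
    """Read the math and Portuguese class files and tag each row with its course.

    `sep` is exposed because the delimiter depends on which copy of the
    files you have (assumption: the UCI download uses ';').
    """
    math = pd.read_csv(math_path, sep=sep)
    por = pd.read_csv(por_path, sep=sep)

    # Hypothetical label column so the two classes stay distinguishable
    math["course"] = "math"
    por["course"] = "portuguese"
    return math, por
```

For example, `math, por = load_class_scores('data/student-mat.csv', 'data/student-por.csv', sep=';')` would read both files, assuming the Portuguese file is saved next to the math one under the name `student-por.csv`.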