import pandas as pd
Performing Experiments
Let's introduce a bigger dataset. It was downloaded from Kaggle and is called Toy Dataset because it is designed for experimentation, so we will use it for exercises and to practice the new concepts we learn.
toy_dataset = pd.read_csv('toy_dataset.csv')
toy_dataset.head()
Notice that this dataset still has a numbered person, their city, and their age, but it also includes their gender, their income, and whether or not they have an illness.
toy_dataset.shape
We can also see that this dataset has $150000$ rows (people).
Let's start with a simple random experiment: picking a person at random and printing their income.
income = toy_dataset.sample(1)['Income'].values[0]
income
Now try exploring different possibilities for this kind of experiment: What if you choose more people? What if you print a different characteristic? What if you repeat the experiment until something happens?
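As one possible starting point, here is a minimal sketch of the three variations suggested above. So that it runs without the CSV file, it uses a small made-up DataFrame called `people` in place of `toy_dataset`. The column name `City`, the values in the table, and the income threshold of 50000 are all assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in for toy_dataset (values invented for illustration).
people = pd.DataFrame({
    'City': ['Dallas', 'Boston', 'Dallas', 'Austin', 'Boston', 'Austin'],
    'Age': [41, 54, 42, 40, 46, 36],
    'Income': [40367.0, 45084.0, 52483.0, 40941.0, 50289.0, 50786.0],
})

# Choose more people: sample three rows (without replacement by default).
three_incomes = people.sample(3)['Income'].values

# Print a different characteristic: the city of one random person.
city = people.sample(1)['City'].values[0]

# Repeat until something happens: keep picking one person at a time
# until their income exceeds the (assumed) threshold.
threshold = 50000
draws = 0
while True:
    draws += 1
    income = people.sample(1)['Income'].values[0]
    if income > threshold:
        break

print(three_incomes)
print(city)
print(draws, income)
```

The same calls should work on `toy_dataset` itself once the column names are adjusted to match what `toy_dataset.head()` shows.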
We could also try a deterministic experiment: Printing the age of the first person.
first_age = toy_dataset['Age'].loc[0]
first_age
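To see the difference between the two kinds of experiment, it can help to repeat each one several times. The sketch below does this on a small made-up DataFrame called `people` (a hypothetical stand-in for `toy_dataset`, with invented values): the deterministic experiment gives the same outcome on every repetition, while the random one may give a different outcome each time.

```python
import pandas as pd

# Hypothetical stand-in for toy_dataset (values invented for illustration).
people = pd.DataFrame({'Age': [41, 54, 42, 40, 46]})

# Deterministic experiment, repeated five times: with the default integer
# index, label 0 is the first row, so the outcome never changes.
first_ages = [people['Age'].loc[0] for _ in range(5)]

# Random experiment, repeated five times: each repetition picks a person
# at random, so the outcomes may differ from one another.
random_ages = [people.sample(1)['Age'].values[0] for _ in range(5)]

print(first_ages)
print(random_ages)
```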
Now perform some deterministic experiments yourself!