1.1. Experiments

Last updated: May 26th, 2020
In [1]:
import pandas as pd


You have already been working with datasets, so here is a very small one I created. We will use it for a few examples that illustrate some important concepts.

In [2]:
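dataset = pd.DataFrame({
    'Person #': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'City': ['SF', 'SF', 'NY', 'NY', 'NY', 'SF', 'NY', 'SF', 'SF', 'SF'],
    'Age': [41, 26, 28, 53, 32, 51, 65, 49, 25, 33],
})
dataset  # display the table

   Person # City  Age
0         1   SF   41
1         2   SF   26
2         3   NY   28
3         4   NY   53
4         5   NY   32
5         6   SF   51
6         7   NY   65
7         8   SF   49
8         9   SF   25
9        10   SF   33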

In it we can see 10 rows, each representing a person with the city they live in (San Francisco or New York) and their age.

In probability, and in particular in data science, we are going to have to perform experiments. But what is an experiment?

An experiment (sometimes called a trial) is any procedure that we can repeat.

For example, let's perform the experiment of printing how many different cities there are in the dataset, and repeat it 5 times:

In [27]:
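The code for this cell is not shown here. As a minimal sketch, assuming the dataset is rebuilt from the table above and the count is taken with pandas' `nunique()` on the `City` column (the variable names `counts` and `n_cities` are illustrative, not from the original notebook), the experiment might look like this:

```python
import pandas as pd

# Rebuild the small dataset from the table above.
dataset = pd.DataFrame({
    'Person #': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'City': ['SF', 'SF', 'NY', 'NY', 'NY', 'SF', 'NY', 'SF', 'SF', 'SF'],
    'Age': [41, 26, 28, 53, 32, 51, 65, 49, 25, 33],
})

# Repeat the experiment 5 times: count the distinct cities in the dataset.
counts = []
for _ in range(5):
    n_cities = dataset['City'].nunique()
    counts.append(n_cities)
    print(n_cities)
```

Nothing in this procedure depends on chance, so every repetition prints the same number.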

Now, let’s perform the experiment of picking a person and printing the city they live in and repeat it 5 times:

In [32]:
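The code for this cell is not shown here either. One possible sketch, assuming a person is picked uniformly at random with pandas' `DataFrame.sample` (the original notebook may have picked the person differently, and the names `cities` and `person` are illustrative), could be:

```python
import pandas as pd

# Rebuild the small dataset from the table above.
dataset = pd.DataFrame({
    'Person #': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'City': ['SF', 'SF', 'NY', 'NY', 'NY', 'SF', 'NY', 'SF', 'SF', 'SF'],
    'Age': [41, 26, 28, 53, 32, 51, 65, 49, 25, 33],
})

# Repeat the experiment 5 times: pick one person at random and print their city.
cities = []
for _ in range(5):
    person = dataset.sample(n=1)   # one row, drawn uniformly at random
    city = person['City'].iloc[0]  # the city of the selected person
    cities.append(city)
    print(city)
```

No seed is set in this sketch, so the printed cities can differ from one run to the next.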

What do we notice?

In the first experiment, we always get the same result, no matter how many times we perform the experiment. If I asked you what result you would get if you performed the experiment a 6th time, you would be certain that the answer is 2.

This is what we call a deterministic experiment.

In the second experiment, the possible results are San Francisco and New York, but before carrying out the experiment we cannot be sure which of the two will come up. There is uncertainty about what the result would be if we performed the experiment a 6th time.

This is what we call a random experiment.

In probability we will work with random experiments, so in further videos we will be introducing some vocabulary and notation to work with them.

But now let's go and practice performing and classifying experiments!
