1.2. Outcomes and Sample Space

Last updated: January 13th, 2020
In [1]:
import pandas as pd


Outcomes and Sample Space

We have defined what an experiment is, performed a few, and classified them as random or deterministic. Now we are going to define two important concepts about experiments: outcomes and the sample space.

Remember our experiment of choosing a person and printing their city:

In [2]:
dataset = pd.DataFrame({
    'Person #': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'City': ['SF', 'SF', 'NY', 'NY', 'NY', 'SF', 'NY', 'SF', 'SF', 'SF'],
    'Age': [41, 26, 28, 53, 32, 51, 65, 49, 25, 33]
})
dataset

Out[2]:
   Person # City  Age
0         1   SF   41
1         2   SF   26
2         3   NY   28
3         4   NY   53
4         5   NY   32
5         6   SF   51
6         7   NY   65
7         8   SF   49
8         9   SF   25
9        10   SF   33
In [6]:
dataset.sample(1)['City'].values[0]

Out[6]:
'SF'

The result we get is called an outcome. Because this is a random experiment, we cannot be certain which outcome we will get, but after running it we can say "SF is the outcome of our experiment" or "NY is the outcome of our experiment".

We are going to collect all the possible outcomes in a set called the sample space, denoted with the capital Greek letter omega: $\Omega$.

In our example, the sample space would be $\Omega = \{\text{SF}, \text{NY}\}$.
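One way to check this sample space in code is to collect the distinct values of the City column. This is only a minimal sketch: it rebuilds the dataset so that it can run on its own, and the name `sample_space_city` is introduced here for illustration.

```python
import pandas as pd

# Same dataset as in the lesson, rebuilt so this cell is self-contained.
dataset = pd.DataFrame({
    'Person #': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'City': ['SF', 'SF', 'NY', 'NY', 'NY', 'SF', 'NY', 'SF', 'SF', 'SF'],
    'Age': [41, 26, 28, 53, 32, 51, 65, 49, 25, 33]
})

# Each distinct city is a possible outcome of
# "pick a person and print their city".
sample_space_city = set(dataset['City'])
print(sample_space_city)
```

A Python `set` is a natural fit here because, like $\Omega$, it has no duplicates and no order.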

We could think of another experiment: pick a person and print their age.

In [7]:
dataset.sample(1)['Age'].values[0]

Out[7]:
26

In this case, the sample space would be $\Omega = \{25,26,28,32,33,41,49,51,53,65\}$.
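The same idea can be sketched for the age experiment. The snippet below rebuilds the dataset, lists the distinct ages, and then runs the experiment a few times to confirm that every result belongs to that set. The name `sample_space_age` and the choice of 100 repetitions are assumptions made for this illustration.

```python
import pandas as pd

# Same dataset as in the lesson, rebuilt so this cell is self-contained.
dataset = pd.DataFrame({
    'Person #': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'City': ['SF', 'SF', 'NY', 'NY', 'NY', 'SF', 'NY', 'SF', 'SF', 'SF'],
    'Age': [41, 26, 28, 53, 32, 51, 65, 49, 25, 33]
})

# Distinct ages, sorted only to make the result easier to read.
sample_space_age = sorted(set(dataset['Age'].tolist()))
print(sample_space_age)

# Run the random experiment several times: whatever age we draw,
# it must be one of the elements of the sample space.
for _ in range(100):
    outcome = dataset.sample(1)['Age'].values[0]
    assert outcome in sample_space_age
```

Every person in this dataset has a different age, so the sample space has as many elements as there are rows.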

A generic outcome of the sample space is denoted with a lowercase omega, $\omega$, so that $\omega \in \Omega$.

Going back to the example of counting how many different cities there are in our dataset, what do you think the sample space will be?

In [8]:
len(dataset['City'].unique())

Out[8]:
2

$\Omega = \{2\}$

An interesting thing to notice is that the sample space of a deterministic experiment always has exactly one element, because the experiment produces the same result every time it is performed.
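This difference can be observed by repeating each experiment and collecting the results in a set. The sketch below assumes 50 repetitions, and the names `observed_count` and `observed_city` are hypothetical. Keep in mind that the outcomes observed in a finite number of runs are only a subset of the sample space: a random experiment may simply not have produced every possible outcome yet.

```python
import pandas as pd

# Same dataset as in the lesson, rebuilt so this cell is self-contained.
dataset = pd.DataFrame({
    'Person #': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'City': ['SF', 'SF', 'NY', 'NY', 'NY', 'SF', 'NY', 'SF', 'SF', 'SF'],
    'Age': [41, 26, 28, 53, 32, 51, 65, 49, 25, 33]
})

# Deterministic experiment: count the distinct cities, 50 times.
observed_count = {dataset['City'].nunique() for _ in range(50)}

# Random experiment: pick a person and read their city, 50 times.
observed_city = {dataset.sample(1)['City'].values[0] for _ in range(50)}

print(observed_count)  # a single element, however often we repeat
print(observed_city)   # a subset of the sample space of cities
```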

Now let's go and practice finding some sample spaces from different experiments!