# Boolean Arrays

Last updated: March 27th, 2019

# Filtering with Boolean Arrays (also called masks)

In our previous lessons, we saw how boolean operators are broadcast over NumPy arrays. We'll now see how to combine boolean arrays with regular selection to create boolean filters.

## Hands on!

In [ ]:
import sys
import numpy as np


First, let's review regular selection with a simple array:

In [ ]:
a = np.arange(6)
a


If we want to access the first and last elements, there are several options. For example:

1. Regular indexing

In [ ]:
a[0], a[-1]


2. Multiple indices

In [ ]:
a[[0, -1]]


Besides these two familiar options, we can also use boolean arrays:

3. Boolean Array

In [ ]:
a[[True, False, False, False, False, True]]


When we pass a boolean array to the regular selection operation, we indicate which elements we want to retrieve: those at the positions holding True values.
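As a small sketch of the same selection, the boolean array can be stored in its own variable first. The names `mask` and `selected` are chosen here only for illustration; the mask holds one boolean per element of `a`.

```python
import numpy as np

a = np.arange(6)

# One boolean per element of `a`: True keeps the element, False drops it.
mask = np.array([True, False, False, False, False, True])

selected = a[mask]  # keeps only the first and last elements
print(selected)
```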

As you saw in our previous lesson, broadcasting also works with boolean operators, and the result is a boolean array:

In [ ]:
a > 2


In this case, there are True values for those elements satisfying our condition (element > 2).

We can now combine this operation with the selection process to create filters. For example, "all the elements that are greater than 2":

In [ ]:
a[a > 2]


More examples:

In [ ]:
a % 2 == 0

In [ ]:
a[a % 2 == 0]

In [ ]:
a.mean()

In [ ]:
a[a > a.mean()]
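One detail worth knowing about these filters: selecting with a boolean array returns a new array (a copy), so changing the filtered result leaves the original untouched. A minimal sketch, where the name `above_mean` is only illustrative:

```python
import numpy as np

a = np.arange(6)

# Filter: the mean of 0..5 is 2.5, so this keeps 3, 4 and 5.
above_mean = a[a > a.mean()]

# Modify the filtered result; `a` itself is not affected.
above_mean[0] = -1

print(above_mean)
print(a)
```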


### Logical Operators

You're probably already familiar with Python's logical operators (and, or and not). We'll now see NumPy's counterparts. From now on, this table might be useful:

| Python | NumPy |
|--------|-------|
| and    | &     |
| or     | \|    |
| not    | ~     |

The best way to understand logical operators is with examples, so let's work through a few:

In [ ]:
a

##### Example 1: All elements greater than or equal to 2 *AND* less than 5
In [ ]:
(a >= 2) & (a < 5)

##### Example 2: Elements equal to 0 *OR* equal to 1
In [ ]:
(a == 0) | (a == 1)


As these examples show, the parentheses around each comparison are required: & and | bind more tightly than comparison operators such as >= and <, so without the parentheses these expressions would fail.
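To make the failure concrete, here is a minimal sketch of the condition from Example 1 written with and without parentheses. The variable names and the `try`/`except` wrapper are only there for illustration.

```python
import numpy as np

a = np.arange(6)

# With parentheses: each comparison is evaluated first, then combined.
with_parens = (a >= 2) & (a < 5)

# Without parentheses, & is applied before >= and <, so Python reads
# `a >= 2 & a < 5` as the chained comparison `a >= (2 & a) < 5`.
# Evaluating that chain needs a single True/False value for a whole
# array, which NumPy refuses to provide.
try:
    a >= 2 & a < 5
    failed = False
except ValueError as error:
    failed = True
    print("ValueError:", error)
```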

Now check these examples with the not (~) operator:

##### Example 3: All elements greater than 2
In [ ]:
a > 2

In [ ]:
~(a <= 2)


The results are the same! Saying "everything greater than 2" is the same as saying "all the elements that are *not* less than or equal to 2".
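Rather than comparing the two outputs by eye, the equivalence can be checked in code. This sketch uses `np.array_equal`, and also negates the condition from Example 1 as a second illustration; the variable names are assumptions made for the example.

```python
import numpy as np

a = np.arange(6)

# "greater than 2" versus "not (less than or equal to 2)"
same_result = np.array_equal(a > 2, ~(a <= 2))

# Negating a combined condition: "not (>= 2 and < 5)" is the same as
# "less than 2 or greater than or equal to 5".
negated = ~((a >= 2) & (a < 5))
rewritten = (a < 2) | (a >= 5)
same_negation = np.array_equal(negated, rewritten)

print(same_result, same_negation)
```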

### Logical Operators in filtering

We can combine boolean filtering (masks) with logical expressions to achieve more advanced filtering. We'll use the same examples as before:

##### Example 1: All elements greater than or equal to 2 *AND* less than 5
In [ ]:
(a >= 2) & (a < 5)

In [ ]:
a[(a >= 2) & (a < 5)]

##### Example 2: Elements equal to 0 *OR* equal to 1
In [ ]:
(a == 0) | (a == 1)

In [ ]:
a[(a == 0) | (a == 1)]

##### Example 3: All elements greater than 2
In [ ]:
~(a <= 2)

In [ ]:
a[~(a <= 2)]
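NumPy also offers function forms of these operators: `np.logical_and`, `np.logical_or` and `np.logical_not`. The sketch below, with illustrative variable names, suggests that for boolean arrays they give the same filters as `&`, `|` and `~`.

```python
import numpy as np

a = np.arange(6)

# Operator form, as used in the examples above.
with_operators = a[(a >= 2) & (a < 5)]

# Function form: np.logical_and takes the two boolean arrays as arguments,
# so no extra parentheses around the comparisons are needed.
with_functions = a[np.logical_and(a >= 2, a < 5)]

# The same holds for | / np.logical_or and ~ / np.logical_not.
or_matches = np.array_equal((a == 0) | (a == 1), np.logical_or(a == 0, a == 1))
not_matches = np.array_equal(~(a <= 2), np.logical_not(a <= 2))

print(with_operators, with_functions, or_matches, not_matches)
```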


### Assignment with condition

Finally, we'll see how we can leverage filtering and masks to also make modifications to our arrays.

In [ ]:
a = np.arange(10)
a


It's possible to modify the elements of an array that match a given condition, for example:

In [ ]:
a[a >= 4] = 99


The array a has been modified:

In [ ]:
a


It's also possible to compute the new values from the very elements being modified. Another example:

In [ ]:
a = np.arange(10)
a

In [ ]:
a[a >= 4] = (a[a >= 4] * 100)

In [ ]:
a


In this example, we've replaced each element greater than or equal to 4 with that same element multiplied by 100.
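Two alternative spellings of this update may be handy. The sketch below shows the in-place operator `*=` applied through a mask, and `np.where`, which builds a new array instead of modifying the original. The names `b` and `c` are introduced only for this illustration.

```python
import numpy as np

# In-place: multiply only the elements selected by the mask.
a = np.arange(10)
a[a >= 4] *= 100

# Out-of-place: np.where(condition, value_if_true, value_if_false)
# returns a new array and leaves `b` unchanged.
b = np.arange(10)
c = np.where(b >= 4, b * 100, b)

print(a)
print(b)
print(c)
```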

### Methods and patterns for boolean arrays

There are two useful methods that we use with boolean arrays.

• any returns True if there's at least one True value. Otherwise, it returns False.
• all returns True if ALL the elements are True. Otherwise, it returns False.
In [ ]:
a = np.array([99, 4, 101, 251])
a

In [ ]:
a >= 99

In [ ]:
(a >= 99).any()

In [ ]:
a >= 99

In [ ]:
(a >= 99).all()

In [ ]:
a < 1_000

In [ ]:
(a < 1_000).all()
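One edge case is worth keeping in mind when a filter matches nothing: on an empty boolean array, any returns False (there is no True value) while all returns True (there is no False value). A minimal sketch, where the name `huge` is only illustrative:

```python
import numpy as np

a = np.array([99, 4, 101, 251])

# No element is greater than 1_000, so the filtered array is empty.
huge = a[a > 1_000]

any_on_empty = (huge > 0).any()  # no True value present
all_on_empty = (huge > 0).all()  # no False value present

print(huge.size, any_on_empty, all_on_empty)
```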


It's also very common to ask "how many elements satisfy the condition?". any tells you whether there's at least one such element, but not how many. For that, we'll use the np.sum function, which counts each True as 1 and each False as 0.

In [ ]:
a

In [ ]:
a > 99

In [ ]:
(a > 99).any()

In [ ]:
np.sum(a > 99)


We now know that 2 elements are greater than 99.

In [ ]:
np.sum(a[a > 99])


And the sum of those 2 elements is 352.
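The same counting pattern can be written in a few other ways. This sketch uses `np.count_nonzero` as an alternative to `np.sum`, and the mean of the mask to get the fraction of matching elements; the variable names are assumptions made for the example.

```python
import numpy as np

a = np.array([99, 4, 101, 251])
mask = a > 99

# Counting matches: True counts as 1, False as 0.
count_with_sum = np.sum(mask)
count_with_nonzero = np.count_nonzero(mask)

# The mean of a boolean array is the fraction of True values.
fraction = mask.mean()

# Summing the matching elements themselves, as in the cell above.
total = a[mask].sum()

print(count_with_sum, count_with_nonzero, fraction, total)
```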