# Filtering with Boolean Arrays *(also called masks)*

In our previous lessons, we saw how to use boolean operators with broadcasting on numpy arrays. We'll now see how to combine boolean arrays with regular selection to create *boolean filters*.

## Hands on!

```
import sys
import numpy as np
```

First, let's review regular selection with a simple array:

```
a = np.arange(6)
a
```

If we want to access the *first* and *last* elements, there are several options. For example:

**1. Regular indexing**

```
a[0], a[-1]
```

**2. Multiple indices**

```
a[[0, -1]]
```

Besides these two familiar options, we can also use *boolean arrays*:

**3. Boolean Array**

```
a[[True, False, False, False, False, True]]
```

When we pass a boolean array to the regular selection operation, we indicate which elements we want to retrieve: those at the positions holding `True` values.
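A boolean mask needs one entry per element of the array it selects from. The following is a minimal sketch, assuming the six-element `a` from above; the names `mask` and `short_mask` are illustrative and not part of the original lesson. It shows a valid mask and what happens when the lengths differ:

```python
import numpy as np

a = np.arange(6)

# A mask with exactly one boolean per element of `a`
mask = np.array([True, False, False, False, False, True])
selected = a[mask]
print(selected)  # [0 5]

# A mask of the wrong length is rejected with an IndexError
short_mask = np.array([True, False])
try:
    a[short_mask]
    mismatch_raised = False
except IndexError as err:
    mismatch_raised = True
    print("IndexError:", err)
```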

#### Relation with Broadcasting

As you saw in our previous lesson, broadcasting also works with boolean operators, and the result is a boolean array:

```
a > 2
```

In this case, there are `True` values for those elements satisfying our condition (`element > 2`).

We can now combine this operation with the selection process to create filters. For example, "all the elements that are greater than 2":

```
a[a > 2]
```

More examples:

```
a % 2 == 0
```

```
a[a % 2 == 0]
```

```
a.mean()
```

```
a[a > a.mean()]
```
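One detail worth keeping in mind with the filters above: selecting with a boolean array produces a new array, not a view into the original. A small sketch, where `filtered` is an illustrative name:

```python
import numpy as np

a = np.arange(6)

# Boolean filtering builds a new array holding the selected values
filtered = a[a > 2]
filtered[0] = -1  # change the filtered result only

print(filtered)  # [-1  4  5]
print(a)         # [0 1 2 3 4 5]
```

The original `a` is left untouched, so changing it through a filter requires assigning to `a[...]` directly.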

### Logical Operators

You're probably already familiar with Python's logical operators (`and`, `or` and `not`). We'll now see `numpy`'s element-wise counterparts. From now on, this table might be useful:

Python | Numpy |
---|---|
`and` | `&` |
`or` | `\|` |
`not` | `~` |
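To illustrate why the table matters, here is a short sketch of what happens when Python's `and` is applied to arrays, next to the element-wise `&`. It also uses `np.logical_and`, the function form of the same operation; the variable names are illustrative:

```python
import numpy as np

a = np.arange(6)

# Python's `and` asks for a single truth value, which a multi-element array can't provide
try:
    (a >= 2) and (a < 5)
    and_failed = False
except ValueError:
    and_failed = True

# `&` works element by element; np.logical_and is the function form
with_operator = (a >= 2) & (a < 5)
with_function = np.logical_and(a >= 2, a < 5)
print(with_operator)  # [False False  True  True  True False]
```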

The best way to understand logical operators is through examples. Let's go through a few:

```
a
```

##### Example 1: All elements greater than or equal to 2 *AND* less than 5

```
(a >= 2) & (a < 5)
```

##### Example 2: Elements equal to 0 *OR* equal to 1

```
(a == 0) | (a == 1)
```

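As these examples show, **it's very important to wrap each comparison in parentheses**. `&` and `|` bind more tightly than the comparison operators, so without parentheses an expression such as `a >= 2 & a < 5` is read as `a >= (2 & a) < 5` and raises an error.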
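A quick sketch of that failure, assuming the same six-element `a`; the flag `unparenthesized_failed` is only there for illustration:

```python
import numpy as np

a = np.arange(6)

# Without parentheses Python reads this as: a >= (2 & a) < 5
try:
    a >= 2 & a < 5
    unparenthesized_failed = False
except ValueError:
    unparenthesized_failed = True

# With parentheses each comparison is evaluated first
mask = (a >= 2) & (a < 5)
print(unparenthesized_failed)  # True
print(a[mask])                 # [2 3 4]
```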

Now check these examples with the *not* (`~`) operator:

##### Example 3: All elements greater than 2

```
a > 2
```

```
~(a <= 2)
```

The results are the same! Saying "everything greater than 2" is the same as saying "all the elements that are ***not*** less than or equal to 2".
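The equivalence can be checked with `np.array_equal`. The sketch below also includes a caveat that goes slightly beyond this lesson's integer arrays: a float array `b` (a made-up example) containing `NaN`, for which both comparisons are `False`, so the negated form selects the `NaN` entry.

```python
import numpy as np

a = np.arange(6)

greater_than_two = a > 2
not_at_most_two = ~(a <= 2)
print(np.array_equal(greater_than_two, not_at_most_two))  # True

# Caveat: comparisons with NaN are always False, so the two forms differ
b = np.array([1.0, np.nan, 3.0])
print(b > 2)      # [False False  True]
print(~(b <= 2))  # [False  True  True]
```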

### Logical Operators in filtering

We can combine boolean filtering (masks) with logical expressions to achieve more advanced filtering. We'll use the same examples as before:

##### Example 1: All elements greater than or equal to 2 *AND* less than 5

```
(a >= 2) & (a < 5)
```

```
a[(a >= 2) & (a < 5)]
```

##### Example 2: Elements equal to 0 *OR* equal to 1

```
(a == 0) | (a == 1)
```

```
a[(a == 0) | (a == 1)]
```

##### Example 3: All elements greater than 2

```
~(a <= 2)
```

```
a[~(a <= 2)]
```
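When conditions get longer, one option is to store each mask in a variable and combine them by name. This is a sketch of that pattern rather than something the lesson prescribes; `in_range` and `zero_or_one` are illustrative names:

```python
import numpy as np

a = np.arange(6)

in_range = (a >= 2) & (a < 5)      # mask from Example 1
zero_or_one = (a == 0) | (a == 1)  # mask from Example 2

print(a[in_range])                # [2 3 4]
print(a[in_range | zero_or_one])  # [0 1 2 3 4]
```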

### Assignment with condition

Finally, we'll see how we can leverage filtering and masks to also make modifications to our arrays.

```
a = np.arange(10)
a
```

It's possible to modify elements from an array that match a given condition, for example:

```
a[a >= 4] = 99
```

The array `a` has been modified:

```
a
```

It's also possible to compute the new values from the elements being replaced. Another example:

```
a = np.arange(10)
a
```

```
a[a >= 4] = (a[a >= 4] * 100)
```

```
a
```

In this example, we've replaced each element greater than or equal to 4 with the same element multiplied by `100`.
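If the original array should stay unchanged, a different technique is `np.where`, which builds a new array instead of assigning in place. A minimal sketch, with `scaled` as an illustrative name:

```python
import numpy as np

a = np.arange(10)

# np.where(condition, x, y) takes values from x where condition is True, else from y
scaled = np.where(a >= 4, a * 100, a)

print(scaled)  # [  0   1   2   3 400 500 600 700 800 900]
print(a)       # [0 1 2 3 4 5 6 7 8 9]
```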

### Methods and patterns for boolean arrays

There are two useful methods that we use with boolean arrays.

`any` returns `True` if there's **at least** one `True` value. Otherwise, it returns `False`.

`all` returns `True` if **ALL** the elements are `True`. Otherwise, it returns `False`.

```
a = np.array([99, 4, 101, 251])
a
```

```
a >= 99
```

```
(a >= 99).any()
```

```
a >= 99
```

```
(a >= 99).all()
```

```
a < 1_000
```

```
(a < 1_000).all()
```
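One edge case that can be surprising: on an empty boolean array, `any` returns `False` while `all` returns `True`, since no element contradicts the condition. A short sketch with a made-up empty array:

```python
import numpy as np

empty = np.array([], dtype=int)
mask = empty >= 99  # a boolean array with no elements

print(mask.any())  # False
print(mask.all())  # True
```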

It's also very common to ask "how many elements satisfy the condition?". `any` will tell you whether there's at least one such element, but not how many. For that, we'll use the `np.sum` function, which counts each `True` as 1.

```
a
```

```
a > 99
```

```
(a > 99).any()
```

```
np.sum(a > 99)
```

We now know that 2 elements are greater than 99.

```
np.sum(a[a > 99])
```

And the sum of those 2 elements is 352.
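Two related patterns, shown here as a sketch: `np.count_nonzero` gives the same count as `np.sum` on a boolean array, and the mean of a mask gives the fraction of elements that satisfy the condition. The names `count` and `fraction` are illustrative:

```python
import numpy as np

a = np.array([99, 4, 101, 251])
mask = a > 99

count = np.count_nonzero(mask)  # same count as np.sum(mask)
fraction = mask.mean()          # True counts as 1, False as 0

print(count)     # 2
print(fraction)  # 0.5
```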