```
import time
import itertools
from queue import Queue
from threading import Thread
import requests
```

You've probably heard about the feared "Python GIL". In this lecture we'll learn about it and why it's such a problem. But first, let's talk about computing prime numbers.

We'll write a **very** naive function to check if a number is prime or not:

```
def is_prime(n):
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for divisor in range(3, n, 2):
        if n % divisor == 0:
            return False
    return True
```

```
is_prime(19)
```

```
is_prime(8)
```

We have several numbers in a text file that we want to check for their "primality":

```
!cat prime_mixture.txt
```

```
with open('prime_mixture.txt') as fp:
    numbers = [int(n.strip()) for n in fp.read().split() if n]
```

```
numbers[:5]
```

```
start = time.time()
```

```
is_prime(numbers[0])
```

```
time.time() - start
```

Approximately 0.5 seconds (we don't need to be very accurate, don't worry). If we have 10 numbers to check:

```
len(numbers)
```

we could expect a single threaded solution to take ~5 seconds. Let's start with that approach first:

#### Single threaded approach

A single threaded solution is the most basic one we can think of. No threads, locks, queues or concurrency. Plain old Python code to get the job done:

```
def search_primes_single_thread(numbers):
    return [n for n in numbers if is_prime(n)]
```

```
start = time.time()
```

```
results = search_primes_single_thread(numbers)
```

```
f"Time: {time.time() - start} seconds. Found {len(results)} primes out of {len(numbers)} total numbers."
```

As we can see, it took less than 5 seconds, but it's within the same order of magnitude: between 1 and 10 seconds.

#### Speeding things up with multiple threads

We quickly realize that we could improve the solution a lot by using multiple threads. If this computer has 16 cores, each one can check a number at the same time, and we'll be done much faster. How much faster? Assuming 16 cores and one number assigned per core, our solution should take no more than a second: the slowest number to check determines the total time.

Let's try it out!

```
def check_prime_worker(number, results):
    if is_prime(number):
        results.append(number)
```

(Ideally we'd use a thread-safe collection here, but in CPython `list.append` is atomic and therefore thread safe.)
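If we'd rather not rely on that CPython implementation detail, the `Queue` imported at the top of this notebook is thread safe by design. A minimal sketch of the same worker pattern, using a small hypothetical list of numbers instead of `prime_mixture.txt`:

```python
from queue import Queue
from threading import Thread

def is_prime(n):
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    return all(n % d for d in range(3, n, 2))

def check_prime_worker(number, results_queue):
    # Queue.put is thread safe by design; no reliance on list.append atomicity
    if is_prime(number):
        results_queue.put(number)

sample = [19, 8, 23, 42]  # hypothetical numbers, not the ones from the file
results_queue = Queue()
threads = [Thread(target=check_prime_worker, args=(n, results_queue)) for n in sample]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Drain the queue into a plain list once all workers are done
primes = []
while not results_queue.empty():
    primes.append(results_queue.get())
```

The trade-off is a tiny bit more ceremony when collecting results, in exchange for code that stays correct on any Python implementation.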

```
results = []
```

```
threads = [Thread(target=check_prime_worker, args=(number, results)) for number in numbers]
```

```
start = time.time()
```

```
[t.start() for t in threads];
```

```
[t.join() for t in threads];
```

```
f"Time: {time.time() - start} seconds. Found {len(results)} primes out of {len(numbers)} total numbers."
```

**WHAT! 😦** 4 seconds! That's even **slower** than the sequential single-threaded solution. What is going on here? 🤔

Congratulations, let me introduce you to:

## What is the GIL?

The GIL is a safety mechanism added to **CPython** (and CPython only) that prevents multiple threads from executing Python bytecode at the same time. This is completely counterintuitive, as the whole point of having multiple threads is to run them **in parallel**: that is, 2 or more at the same time.

Understanding the GIL in depth is outside the scope of this presentation, but there's one important point you should take away: your threads will "suffer" from the GIL **only** if they're running CPU-bound tasks. I/O-bound tasks *release* the GIL and let other threads run.

If you want to know more about the GIL, this talk from Larry Hastings is just amazing: https://www.youtube.com/watch?v=KVKufdTphKs

As a summary:

- The GIL is a necessary evil. It made Python's C API simple to understand and extend.
- Guido is open to removing the GIL, but only if doing so doesn't hurt single-threaded performance (Larry demonstrated in another talk that this is very hard to achieve).
- The GIL is a CPython detail. Other interpreters, such as Jython and IronPython, don't have one.
- The GIL's behavior was "broken" in Python 2 and was fixed in Python 3.2. Larry believes we've reached "peak" performance there, so he doesn't expect significant further improvements.
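One concrete detail behind that "fixed in 3.2" point: the new GIL hands control between threads on a configurable time interval rather than a bytecode count. You can inspect (and change) it with the `sys` module:

```python
import sys

# How long a thread may hold the GIL before it's asked to release it,
# in seconds (0.005, i.e. 5 ms, is the default in CPython 3)
interval = sys.getswitchinterval()
print(interval)
```

A larger interval means fewer context switches (good for CPU-bound throughput), while a smaller one makes threads feel more responsive.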

## An I/O bound demonstration

Let's revisit the example from our first lesson and run an I/O-bound task with multiple threads to prove the GIL is not an issue in this case. We'll use our `crypto_db` command, in this case with a 2-second artificial sleep parameter:

```
$ python crypto_db --sleep 2
```

#### Sequential first

We'll check our crypto prices again. This time, each request is artificially delayed 2 seconds. We'll check only 3 prices, for the same date, so it's going to take approximately 6 seconds:

```
BASE_URL = "http://localhost:8080"
```

```
EXCHANGES = ['bitfinex', 'bitstamp', 'kraken']
```

```
start = time.time()
```

```
prices = [
    requests.get(f"{BASE_URL}/price/{exchange}/btc/2020-04-01").json()['close']
    for exchange in EXCHANGES
]
```

```
time.time() - start
```

As expected, ~6 seconds.

#### Now, the multithreaded version

We'll now try the same example with multiple threads. If our claim that I/O-bound tasks release the GIL is true, the whole task should take around 2 seconds. Let's check it out:

```
def check_price(exchange, results):
    results.append(requests.get(f"{BASE_URL}/price/{exchange}/btc/2020-04-01").json()['close'])
```

```
results = []
```

```
threads = [Thread(target=check_price, args=(exchange, results)) for exchange in EXCHANGES]
```

```
start = time.time()
```

```
[t.start() for t in threads];
```

```
[t.join() for t in threads];
```

```
time.time() - start
```

Success! We've corroborated that the GIL is released while a thread waits for I/O, which lets our programs run effectively in parallel.
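As a side note, the standard library's `concurrent.futures.ThreadPoolExecutor` wraps the start/join boilerplate above for us. A minimal sketch, using `time.sleep` as a stand-in for the delayed HTTP request (sleeping releases the GIL just like socket I/O does, and `100.0` is a hypothetical close price):

```python
import time
from concurrent.futures import ThreadPoolExecutor

EXCHANGES = ['bitfinex', 'bitstamp', 'kraken']

def check_price(exchange):
    # Stand-in for the real request; the sleep releases the GIL like I/O would
    time.sleep(0.2)
    return (exchange, 100.0)  # hypothetical close price

start = time.time()
with ThreadPoolExecutor(max_workers=len(EXCHANGES)) as pool:
    # map runs check_price concurrently and returns results in order
    prices = dict(pool.map(check_price, EXCHANGES))
elapsed = time.time() - start
# elapsed is ~0.2s, not ~0.6s: the three sleeps overlapped
```

Another advantage over raw `Thread` objects: the workers can simply `return` their values instead of appending to a shared list.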

## Summary

In this lesson we learned about one of the most "hated" features of Python: the GIL. Every post you read that is titled "why not Python?" or "Python vs [insert language]" will mention the GIL as a major drawback.

In our next lesson we'll learn how we can improve our code if it's CPU bound.