Intro to Dictionaries and Sets

Last updated: May 17th, 2019

Dictionaries!

Dictionaries are a completely different data structure from what we've seen so far. A dictionary is usually called a "mapping" type, which sets it apart from the "sequences" we've worked with (lists, tuples).

A simple dictionary:

In [ ]:
{
    'name': 'Jane Doe',
    'email': 'jane@rmotr.com',
    'age': 27,
    'city': 'San Jose',
    'state': 'CA'
}

As you can see, the dictionary stores a user's values, each with a corresponding "label" (name, email, age, etc). The same information could have been stored in a list:

In [1]:
#        0              1           2     3         4
l = ['Jane Doe', 'jane@rmotr.com', 27, 'San Jose', 'CA']
l
Out[1]:
['Jane Doe', 'jane@rmotr.com', 27, 'San Jose', 'CA']
In [8]:
l[1]
Out[8]:
'jane@rmotr.com'

But by looking at that list, how do you know what each field represents? How do you know that San Jose is the city and not their school?

Dictionaries solve this problem: each value has a key (the label), which provides instant documentation.
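To make the contrast concrete, here is a small sketch that stores the same record both ways. The names as_list and as_dict are only illustrative and don't appear elsewhere in this lesson.

```python
# The same record stored by position (list) and by label (dict).
as_list = ['Jane Doe', 'jane@rmotr.com', 27, 'San Jose', 'CA']
as_dict = {
    'name': 'Jane Doe',
    'email': 'jane@rmotr.com',
    'age': 27,
    'city': 'San Jose',
    'state': 'CA',
}

# With the list we have to remember that position 3 holds the city.
city_from_list = as_list[3]

# With the dict, the key itself says what we are asking for.
city_from_dict = as_dict['city']

print(city_from_list == city_from_dict)  # True
```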

Properties of dictionaries

  • Insertion-ordered since Python 3.7 (order was not guaranteed in earlier versions)
  • Mutable
  • Key-Value pairs
  • Keys must be unique
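A quick sketch of what "mutable" and "keys must be unique" mean in practice. The profile and repeated dicts are made-up examples, not part of the user data used in this lesson.

```python
profile = {'name': 'Jane', 'age': 27}

# Mutable: we can change a value in place.
profile['age'] = 28

# Unique keys: assigning to an existing key overwrites its value,
# it does not create a second 'name' entry.
profile['name'] = 'Jane Doe'

# Repeating a key inside a literal keeps only the last value.
repeated = {'a': 1, 'a': 2}

print(profile)       # {'name': 'Jane Doe', 'age': 28}
print(len(profile))  # 2
print(repeated)      # {'a': 2}
```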

Constructing dictionaries

As you've seen, we use {} to construct dictionaries. Dictionaries are heterogeneous, so keys and values of different types can be mixed:

In [4]:
{
    'name': 'Jane',
    19: 'some value',
    (1, 1, 2): 'a tuple as a key?'
}
Out[4]:
{'name': 'Jane', 19: 'some value', (1, 1, 2): 'a tuple as a key?'}
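Not every object can be a key, though: keys have to be hashable, which in practice means immutable types such as strings, numbers and tuples of immutable items. The sketch below (with the illustrative names valid and message) shows what happens when a list is used instead of a tuple.

```python
# Strings, numbers and tuples are hashable, so they work as keys.
valid = {'name': 'Jane', 19: 'some value', (1, 1, 2): 'a tuple as a key'}

# A list is mutable and therefore unhashable: using it as a key fails.
try:
    invalid = {[1, 1, 2]: 'a list as a key?'}
except TypeError as error:
    message = str(error)
    print('TypeError:', message)
```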

Even though we can use multiple different types of keys, we try to keep it simple and just use strings. Let's create our user dict again:

In [5]:
user = {
    'name': 'Jane Doe',
    'email': 'jane@rmotr.com',
    'age': 27,
    'city': 'San Jose',
    'state': 'CA'
}
Accessing values:
In [6]:
user['name']
Out[6]:
'Jane Doe'
In [7]:
user['age']
Out[7]:
27
Creating new values:
In [9]:
# country didn't exist
user['country'] = 'US'
In [10]:
user
Out[10]:
{'name': 'Jane Doe',
 'email': 'jane@rmotr.com',
 'age': 27,
 'city': 'San Jose',
 'state': 'CA',
 'country': 'US'}
In [11]:
user['country']
Out[11]:
'US'
What happens if we try to access a key that doesn't exist?
In [12]:
user['school']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-12-3108039ab3ac> in <module>
----> 1 user['school']

KeyError: 'school'

An error is raised. Python is strict when accessing dictionary keys: they must exist. There are two ways to handle a key that might be missing:

Option 1: Checking if the key exists
In [15]:
'school' in user
Out[15]:
False
In [16]:
'age' in user
Out[16]:
True
In [17]:
if 'school' in user:
    print(user['school'])
else:
    print("Key `school` doesn't exist")
Key `school` doesn't exist
Option 2: Using the get method:

The get method will not fail if the key doesn't exist. It'll just return None in that case.

In [18]:
user['school']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-18-3108039ab3ac> in <module>
----> 1 user['school']

KeyError: 'school'
In [19]:
user.get('school')  # this is None
In [20]:
print(user.get('school'))  # when we print it we "see it"
None
In [21]:
user.get('email')
Out[21]:
'jane@rmotr.com'

get also accepts a "default" value in case the key doesn't exist:

In [22]:
user.get('school', 'San Jose High School')  # just in case, we provide a default value
Out[22]:
'San Jose High School'
In [23]:
user.get('email', 'default@rmotr.com')
Out[23]:
'jane@rmotr.com'
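One place where the default value is handy is counting. The following is only a sketch with made-up data (words and counts are hypothetical names); it also shows that get never adds the missing key to the dictionary.

```python
words = ['san', 'jose', 'san', 'francisco', 'san']

counts = {}
for word in words:
    # A word we haven't seen yet starts from 0 instead of raising KeyError.
    counts[word] = counts.get(word, 0) + 1

print(counts)  # {'san': 3, 'jose': 1, 'francisco': 1}

# get only reads: asking for a missing key does not create it.
missing = counts.get('berkeley', 0)
print(missing)               # 0
print('berkeley' in counts)  # False
```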
Warning! The in operator only checks keys, not values:
In [24]:
user
Out[24]:
{'name': 'Jane Doe',
 'email': 'jane@rmotr.com',
 'age': 27,
 'city': 'San Jose',
 'state': 'CA',
 'country': 'US'}
In [25]:
"Jane Doe" in user
Out[25]:
False
Accessing values

We can access only the values of a dictionary with the values method:

In [26]:
user.values()
Out[26]:
dict_values(['Jane Doe', 'jane@rmotr.com', 27, 'San Jose', 'CA', 'US'])

Now we can ask if "Jane Doe" is within the collection of values:

In [ ]:
"Jane Doe" in user.values()
Accessing only keys

As we have a values method, there's also a keys method that will retrieve only the keys:

In [27]:
user.keys()
Out[27]:
dict_keys(['name', 'email', 'age', 'city', 'state', 'country'])
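Both keys() and values() return "view" objects rather than plain lists. A small sketch of what that implies, using a shortened user dict for illustration: membership tests work on either view, a view follows later changes to the dictionary, and list() or sorted() give us a regular list when we need one.

```python
user = {'name': 'Jane Doe', 'age': 27}

# Membership: `in user` looks at keys, `in user.values()` looks at values.
print('Jane Doe' in user)           # False
print('Jane Doe' in user.values())  # True

# A view is live: it reflects changes made after it was created.
keys = user.keys()
user['city'] = 'San Jose'
print(list(keys))  # ['name', 'age', 'city']

# sorted() returns a new, regular list.
print(sorted(user.keys()))  # ['age', 'city', 'name']
```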
Combining dictionaries

We can use the update method to combine dictionaries:

In [28]:
user
Out[28]:
{'name': 'Jane Doe',
 'email': 'jane@rmotr.com',
 'age': 27,
 'city': 'San Jose',
 'state': 'CA',
 'country': 'US'}
In [29]:
school_info = {
    'high school': 'San Jose High School',
    'university': 'San Jose State University'
}

We "merge" school_info into user:

In [30]:
user.update(school_info)

And now user contains:

In [31]:
user
Out[31]:
{'name': 'Jane Doe',
 'email': 'jane@rmotr.com',
 'age': 27,
 'city': 'San Jose',
 'state': 'CA',
 'country': 'US',
 'high school': 'San Jose High School',
 'university': 'San Jose State University'}

That's all the info from both dicts. school_info is still the same:

In [32]:
school_info
Out[32]:
{'high school': 'San Jose High School',
 'university': 'San Jose State University'}
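Two details are worth a quick sketch: update modifies the dictionary in place, and when both dictionaries share a key, the value from the argument wins. If we'd rather keep the original untouched, one option (Python 3.5+) is to unpack both into a new dict. The changes and merged names below are illustrative.

```python
user = {'name': 'Jane Doe', 'city': 'San Jose'}
changes = {'city': 'Oakland', 'country': 'US'}

# Build a merged copy; `user` itself is not modified here.
merged = {**user, **changes}
print(user)    # {'name': 'Jane Doe', 'city': 'San Jose'}
print(merged)  # {'name': 'Jane Doe', 'city': 'Oakland', 'country': 'US'}

# update() changes `user` in place; the shared 'city' key is overwritten.
user.update(changes)
print(user == merged)  # True
```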
Deleting key-value pairs
In [33]:
user['high school']
Out[33]:
'San Jose High School'
In [ ]:
user['high school'] = "Value"
In [34]:
del user['high school']
In [35]:
user
Out[35]:
{'name': 'Jane Doe',
 'email': 'jane@rmotr.com',
 'age': 27,
 'city': 'San Jose',
 'state': 'CA',
 'country': 'US',
 'university': 'San Jose State University'}

Trying to delete a key that doesn't exist also raises an exception, just like trying to access it:

In [39]:
hs = user['high school']  # raises KeyError: 'high school'
In [40]:
del user['high school']  # raises KeyError: 'high school'
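If we're not sure the key is there, the same two strategies we used for reading apply to deleting. This is a minimal sketch with a shortened user dict; the deleted flag is just an illustrative variable.

```python
user = {'name': 'Jane Doe', 'age': 27}

# Option 1: check first, delete only if the key is present.
if 'high school' in user:
    del user['high school']

# Option 2: try the delete and handle the KeyError.
try:
    del user['high school']
    deleted = True
except KeyError:
    deleted = False

print(deleted)  # False
print(user)     # {'name': 'Jane Doe', 'age': 27}
```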

Let's restore high school back in place:

In [43]:
user['high school'] = school_info['high school']
Removing elements with pop

del removes the element and now it's completely lost. The pop method will remove the element, but also return it, so we can store it in a variable for later usage:

In [44]:
hs = user.pop('high school')
In [45]:
hs
Out[45]:
'San Jose High School'

☝️ it was returned. I can try storing it in a variable:

In [46]:
uni = user.pop('university')

The key is no longer there:

In [47]:
user
Out[47]:
{'name': 'Jane Doe',
 'email': 'jane@rmotr.com',
 'age': 27,
 'city': 'San Jose',
 'state': 'CA',
 'country': 'US'}

But I have the value stored in the uni var:

In [ ]:
uni

pop also fails if the key doesn't exist:

In [48]:
user.pop('school')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-48-3eb21467ae0f> in <module>
----> 1 user.pop('school')

KeyError: 'school'

But we can provide a "default" value that will prevent the exception:

In [50]:
user.pop('school', None)
In [51]:
user.pop('school', 'San Jose High School')
Out[51]:
'San Jose High School'
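A typical use for pop with a default is stripping out optional fields without caring whether they were there. The sketch below uses made-up keys (password, token) purely for illustration.

```python
user = {'name': 'Jane Doe', 'password': 's3cret'}

# 'password' exists: it is removed and its value is returned.
password = user.pop('password', None)

# 'token' doesn't exist: the default is returned and nothing changes.
token = user.pop('token', None)

print(password)  # s3cret
print(token)     # None
print(user)      # {'name': 'Jane Doe'}
```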

Iterating over dictionaries:

It's extremely simple (and convenient) to iterate over dictionaries. Check this out:

In [ ]:
{
    'name': 'Jane Doe',
    'email': 'jane@rmotr.com',
    'age': 27,
    'city': 'San Jose',
    'state': 'CA'
}
In [53]:
user
Out[53]:
{'name': 'Jane Doe',
 'email': 'jane@rmotr.com',
 'age': 27,
 'city': 'San Jose',
 'state': 'CA',
 'country': 'US'}
In [54]:
user.keys()
Out[54]:
dict_keys(['name', 'email', 'age', 'city', 'state', 'country'])
In [55]:
user.values()
Out[55]:
dict_values(['Jane Doe', 'jane@rmotr.com', 27, 'San Jose', 'CA', 'US'])
In [58]:
user.items()
Out[58]:
dict_items([('name', 'Jane Doe'), ('email', 'jane@rmotr.com'), ('age', 27), ('city', 'San Jose'), ('state', 'CA'), ('country', 'US')])
In [52]:
for elem in user:
    print(elem)
name
email
age
city
state
country

As you can see, we're iterating by "key". We could access each value internally with that key:

In [56]:
for key in user:
    value = user[key]
    print('The key is "{}" and the value is "{}"'.format(key, value))
The key is "name" and the value is "Jane Doe"
The key is "email" and the value is "jane@rmotr.com"
The key is "age" and the value is "27"
The key is "city" and the value is "San Jose"
The key is "state" and the value is "CA"
The key is "country" and the value is "US"

We can also rely on the keys and values methods:

In [57]:
for value in user.values():
    print(value)
Jane Doe
jane@rmotr.com
27
San Jose
CA
US
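For completeness, here's a sketch that uses both methods explicitly. The ages dict is made-up data: looping over keys() behaves the same as looping over the dictionary itself, and values() is convenient when the keys don't matter, for example when adding numbers up.

```python
ages = {'Jane': 27, 'John': 31, 'Mary': 45}

# Same result as `for name in ages:`.
names = []
for name in ages.keys():
    names.append(name)

# Only the values are needed to compute a total.
total = 0
for age in ages.values():
    total += age

print(names)  # ['Jane', 'John', 'Mary']
print(total)  # 103
```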

The wonderful world of items ❤️

Sometimes we need to iterate over BOTH keys and values. But the keys method returns only keys, and the values method returns only values. The question is: why not both?

That's where the items() method comes in: it returns BOTH, as (key, value) pairs. Compare the key-based loop from before with the version that uses items():

In [56]:
for key in user:
    value = user[key]
    print('The key is "{}" and the value is "{}"'.format(key, value))
The key is "name" and the value is "Jane Doe"
The key is "email" and the value is "jane@rmotr.com"
The key is "age" and the value is "27"
The key is "city" and the value is "San Jose"
The key is "state" and the value is "CA"
The key is "country" and the value is "US"
In [59]:
for key, value in user.items():
    print('The key is "{}" and the value is "{}"'.format(key, value))
The key is "name" and the value is "Jane Doe"
The key is "email" and the value is "jane@rmotr.com"
The key is "age" and the value is "27"
The key is "city" and the value is "San Jose"
The key is "state" and the value is "CA"
The key is "country" and the value is "US"
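Having the key and the value available at the same time makes filtering straightforward. As a small sketch (text_keys is an illustrative name), this collects every key whose value is a string:

```python
user = {
    'name': 'Jane Doe',
    'email': 'jane@rmotr.com',
    'age': 27,
    'city': 'San Jose',
    'state': 'CA',
}

# Each iteration unpacks one (key, value) pair from items().
text_keys = []
for key, value in user.items():
    if isinstance(value, str):
        text_keys.append(key)

print(text_keys)  # ['name', 'email', 'city', 'state']
```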