Sorting

Last updated: March 28th, 2019

rmotr


Sorting in pandas is straightforward. Both Series and DataFrames provide two methods that cover most cases: sort_values, which orders the data by its values, and sort_index, which orders it by its index labels.

Let's start with Series, which are the more intuitive of the two.

Hands on!

In [ ]:
import pandas as pd
import numpy as np
from datetime import datetime, timedelta

Sorting Series

In many use cases, the values of a Series need to be sorted. To do that, we can use the sort_values() method:

In [ ]:
s = pd.Series(np.random.randint(100, size=10))

s
In [ ]:
s.sort_values()
In [ ]:
s.sort_values(ascending=False)
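
Because the Series above is random, its output changes on every run. The following is a minimal sketch with fixed values (the name s_demo and its data are illustrative, not part of the original notebook) showing that sort_values returns a new Series, that each index label stays attached to its value, and that the original Series is left untouched:

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible
s_demo = pd.Series([30, 10, 20], index=['a', 'b', 'c'])

ascending = s_demo.sort_values()                   # default: smallest first
descending = s_demo.sort_values(ascending=False)   # largest first

print(ascending)    # index labels move together with their values
print(descending)
print(s_demo)       # the original Series keeps its initial order
```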

As you can see, sorting is as simple as calling the sort_values method. Values are sorted in ascending order by default; pass ascending=False to reverse that. A Series can also be sorted by its index. To try it, we first build a Series indexed by random timestamps from the last five hours:

In [ ]:
timestamps = np.random.randint(
    int((datetime.now() - timedelta(hours=5)).timestamp()),
    int(datetime.now().timestamp()),
    size=10
)
timestamps
In [ ]:
index = pd.to_datetime(timestamps, unit='s')
index
In [ ]:
s = pd.Series(
    np.random.randint(500, 550, size=10),
    index=index
)

s
In [ ]:
s.sort_index()
In [ ]:
s.sort_index().plot()
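
As a reproducible counterpart to the random timestamps above, here is a small sketch with a hand-written datetime index (the dates, values, and the name ts_demo are made up for illustration). It also shows that sort_index accepts the same ascending parameter as sort_values:

```python
import pandas as pd

# Hypothetical timestamps, deliberately out of order
idx = pd.to_datetime([
    '2019-03-28 12:00',
    '2019-03-28 09:00',
    '2019-03-28 10:30',
])
ts_demo = pd.Series([501, 502, 503], index=idx)

chronological = ts_demo.sort_index()                 # oldest first
newest_first = ts_demo.sort_index(ascending=False)   # newest first

print(chronological)
print(newest_first)
```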

Sorting DataFrames

Sorting DataFrames is equally simple and uses the same two methods, sort_values and sort_index:

In [ ]:
df = pd.DataFrame(
    np.random.randint(100, size=(10, 4)),
    columns=['Column %s' % i for i in ('A', 'B', 'C', 'D')],
    index=index
)

df

sort_index works exactly the same way as it does for a Series:

In [ ]:
df.sort_index()
In [ ]:
df.sort_index(inplace=True)
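
The two calls above differ in an important way: without inplace, sort_index returns a sorted copy, while inplace=True modifies the DataFrame itself and returns None. A minimal sketch on a small, made-up DataFrame (df_demo is an illustrative name):

```python
import pandas as pd

# Hypothetical data with an unsorted integer index
df_demo = pd.DataFrame({'value': [1, 2, 3]}, index=[2, 0, 1])

sorted_copy = df_demo.sort_index()            # new DataFrame, df_demo untouched
order_before = df_demo.index.tolist()         # still [2, 0, 1]

result = df_demo.sort_index(inplace=True)     # sorts df_demo itself
order_after = df_demo.index.tolist()          # now [0, 1, 2]

print(order_before, order_after, result)
```

Assigning the result of an inplace=True call back to a variable is a common mistake, since that variable then holds None.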

To demonstrate sorting by multiple columns, we need some duplicate values, so we overwrite four rows of the first column with the same number:

In [ ]:
df.iloc[2:6, 0] = 99
In [ ]:
df
In [ ]:
df.sort_values('Column B')
In [ ]:
df.sort_values(by=['Column A', 'Column D'])
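
When a list of columns is passed to by, the first column determines the order and each following column only breaks ties among rows that are equal so far. The sketch below uses a small, made-up DataFrame (df_demo and its values are illustrative) and also shows that ascending can be a list with one flag per column:

```python
import pandas as pd

# Hypothetical data: three rows tie on 'Column A'
df_demo = pd.DataFrame({
    'Column A': [99, 5, 99, 99],
    'Column D': [7, 1, 3, 9],
})

# Sort by A, then use D to order the rows where A is equal
by_a_then_d = df_demo.sort_values(by=['Column A', 'Column D'])

# A descending, D ascending: one boolean per column in `by`
mixed = df_demo.sort_values(
    by=['Column A', 'Column D'],
    ascending=[False, True],
)

print(by_a_then_d)
print(mixed)
```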
