Co-founder @ RMOTR

Advanced Time Series

Last updated: November 6th, 2018
In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline
Import the data
In [8]:
df = pd.read_csv('data/bookings.csv', dtype={
    'Year': str
})
df.head()
Out[8]:
Year Month Bookings
0 2006 Jan 383
1 2006 Feb 366
2 2006 Mar 250
3 2006 Apr 318
4 2006 May 334
In [9]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 120 entries, 0 to 119
Data columns (total 3 columns):
Year        120 non-null object
Month       120 non-null object
Bookings    120 non-null int64
dtypes: int64(1), object(2)
memory usage: 2.9+ KB
Build the index (year + month)
In [10]:
df['Period'] = df['Year'] + '-' + df['Month']
In [11]:
df.head()
Out[11]:
Year Month Bookings Period
0 2006 Jan 383 2006-Jan
1 2006 Feb 366 2006-Feb
2 2006 Mar 250 2006-Mar
3 2006 Apr 318 2006-Apr
4 2006 May 334 2006-May
In [12]:
df['Period'] = pd.to_datetime(df['Period'])
In [13]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 120 entries, 0 to 119
Data columns (total 4 columns):
Year        120 non-null object
Month       120 non-null object
Bookings    120 non-null int64
Period      120 non-null datetime64[ns]
dtypes: datetime64[ns](1), int64(1), object(2)
memory usage: 3.8+ KB
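The conversion above lets pandas guess the layout of strings such as `2006-Jan`. As a hedged alternative, the sketch below passes an explicit `format` so the parse does not depend on inference. The `sample` frame is a hypothetical stand-in built from the first rows of the preview, and `%b` assumes English month abbreviations.

```python
import pandas as pd

# Hypothetical stand-in for the first rows of data/bookings.csv
sample = pd.DataFrame({
    'Year': ['2006', '2006', '2006'],
    'Month': ['Jan', 'Feb', 'Mar'],
    'Bookings': [383, 366, 250],
})

# '%Y' = four-digit year, '%b' = abbreviated month name (locale-dependent)
period = pd.to_datetime(sample['Year'] + '-' + sample['Month'], format='%Y-%b')
print(period.tolist())
```

With an explicit format, a malformed value raises an error instead of being parsed in an unexpected way.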
In [23]:
df.head()
Out[23]:
Bookings
Period
2006-01-01 383
2006-02-01 366
2006-03-01 250
2006-04-01 318
2006-05-01 334
In [26]:
df.to_csv('data/bookings-processed.csv')
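To reuse the processed file later, the datetime index has to be rebuilt on load, because CSV stores the dates as plain text. The sketch below assumes the file holds a `Period` column followed by `Bookings`, and it reads from an in-memory string (`csv_text`, a hypothetical stand-in) rather than from disk.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of data/bookings-processed.csv
csv_text = "Period,Bookings\n2006-01-01,383\n2006-02-01,366\n2006-03-01,250\n"

# index_col picks the index column; parse_dates=True parses that index as dates
processed = pd.read_csv(io.StringIO(csv_text), index_col='Period', parse_dates=True)
print(type(processed.index).__name__)
```

For the real file, the same arguments would be passed together with the file path instead of the `StringIO` object.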
In [14]:
df.head()
Out[14]:
Year Month Bookings Period
0 2006 Jan 383 2006-01-01
1 2006 Feb 366 2006-02-01
2 2006 Mar 250 2006-03-01
3 2006 Apr 318 2006-04-01
4 2006 May 334 2006-05-01
In [15]:
df.set_index('Period', inplace=True)
In [16]:
df.head()
Out[16]:
Year Month Bookings
Period
2006-01-01 2006 Jan 383
2006-02-01 2006 Feb 366
2006-03-01 2006 Mar 250
2006-04-01 2006 Apr 318
2006-05-01 2006 May 334

We'll drop the columns we no longer need:

In [17]:
df.drop(['Year', 'Month'], axis='columns', inplace=True)

And this is the final result:

In [18]:
df.plot(figsize=(12, 7))
Out[18]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f6334045a58>
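Before decomposing a series, it can help to confirm that the index is a regular monthly sequence with no missing months. The sketch below is one way to check this, using a small stand-in series (`bookings`, hypothetical, with values taken from the preview) rather than the full data set.

```python
import pandas as pd

# Hypothetical stand-in with the same monthly layout as the bookings data
idx = pd.to_datetime(['2006-01-01', '2006-02-01', '2006-03-01', '2006-04-01'])
bookings = pd.Series([383, 366, 250, 318], index=idx, name='Bookings')

print(bookings.index.freq)        # None: no frequency is attached yet

# 'MS' = month start; any missing month would appear as a NaN row
monthly = bookings.asfreq('MS')
print(monthly.index.freq)
print(int(monthly.isna().sum()))  # count of missing months
```

An index with an explicit frequency states the seasonal period directly, so it does not have to be inferred from the spacing of the dates.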
In [20]:
from statsmodels.tsa.seasonal import seasonal_decompose
In [21]:
result = seasonal_decompose(df['Bookings'], model='multiplicative')
fig = result.plot()  # result.plot() creates and returns its own figure
fig.set_size_inches(15, 8)
In [22]:
result = seasonal_decompose(df['Bookings'], model='additive')
fig = result.plot()  # result.plot() creates and returns its own figure
fig.set_size_inches(15, 8)
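To make the difference between the two models concrete, here is a minimal sketch of a classical moving-average decomposition written with pandas only. It is not the statsmodels implementation, and the series `y` is synthetic (a hypothetical linear trend multiplied by a 12-month pattern), not the bookings data. The additive model assumes observed = trend + seasonal + residual; the multiplicative model assumes observed = trend * seasonal * residual.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend times a repeating 12-month pattern
idx = pd.date_range('2006-01-01', periods=48, freq='MS')
trend_true = np.linspace(100, 200, 48)
pattern = 1 + 0.2 * np.sin(2 * np.pi * np.arange(48) / 12)
y = pd.Series(trend_true * pattern, index=idx)

# Centered 2x12 moving average as the trend estimate
trend = y.rolling(12).mean().rolling(2).mean().shift(-6)

# Additive: remove the trend by subtraction; multiplicative: by division
detrended_add = y - trend
detrended_mul = y / trend

# Average each calendar month, then normalise (mean of 0 / mean of 1)
seasonal_add = detrended_add.groupby(y.index.month).mean()
seasonal_add -= seasonal_add.mean()
seasonal_mul = detrended_mul.groupby(y.index.month).mean()
seasonal_mul /= seasonal_mul.mean()

print(seasonal_add.round(2))
print(seasonal_mul.round(3))
```

A multiplicative model tends to suit series whose seasonal swings grow with the level of the series, while an additive model suits swings of roughly constant size.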