LIS 4273 Adv Statistics: Module 12 Time Series

This week I work with time series objects in R and use Holt-Winters forecasting and exponential smoothing to find the trend in seemingly random data. For this I created an R script, which you can find on my GitHub through the link at the bottom of this post.

Student tracks their monthly credit card charges from 2012 to 2013. At the end of 2013 they have created the following table.

Month       2012   2013
January     31.9   39.4
February    27     36.2
March       31.3   40.5
April       31     44.6
May         39.4   46.8
June        40.7   44.7
July        42.3   52.2
August      49.5   54
September   45     48.8
October     50     55.8
November    50.9   58.7
December    58.5   63.4
Table 1: Charges on Student’s credit card over a two year period. Data pulled from Canvas assignment page.

To find any trend the first step will be to make a simple time series plot to see how the raw data behaves. One way you can do this is with the following R code.

# Set up raw data
charges <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5, 
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)

# Make raw data a time series object
chargesOverTime <- ts(charges, start = 2012, frequency = 12)
chargesOverTime

# Plot raw data as Time Series graph before analysis
plot.ts(chargesOverTime)

First the raw data is coded into the script, then a time series object can be created with the data. Here we tell R that the start is 2012 and the data is monthly. Then we can plot the time series data which will give the following.

Plot 1: Raw Time Series Data, 2012 to 2013.
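
Before going further it can help to confirm that R read the series the way we intended. The following is a minimal sketch of that check. The data are repeated so the block runs on its own, and charges2013 is a name I made up for illustration.

```r
# Raw data and time series object, repeated so this block is self-contained
charges <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
chargesOverTime <- ts(charges, start = 2012, frequency = 12)

# Inspect how R indexed the observations
start(chargesOverTime)      # year and month position of the first observation
end(chargesOverTime)        # year and month position of the last observation
frequency(chargesOverTime)  # observations per year

# Pull out only the 2013 observations (charges2013 is a hypothetical name)
charges2013 <- window(chargesOverTime, start = c(2013, 1), end = c(2013, 12))
charges2013
```

If the frequency does not come back as 12, the start and frequency arguments passed to ts are the first thing to recheck.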

Now we can start identifying the component trends of this data. First there appears to be a positive trend combined with a seasonal trend of some kind. It is hard to identify by just looking at the raw plot, so we can decompose the time series into its components. Every time series can have a general trend component, a seasonal or cyclical component, and a random component that cannot be explained. We can see these components by running the following R code.

# Decompose components of original time series dataset
chargesOverTimeComponents <- decompose(chargesOverTime)
chargesOverTimeComponents

# Plot components of raw time series data
plot(chargesOverTimeComponents)

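This pulls the components apart using moving averages: the trend is a centered moving average, the seasonal component is the average of the detrended values for each month, and whatever remains is the random component. It will give the following plots.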

Plot 2: Decomposition of raw time series data.
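
To see where the trend panel comes from, the moving average can be rebuilt by hand. This is a sketch, assuming the default additive decomposition and a 12-month period. The names maWeights, manualTrend, rebuilt, and hasTrend are my own and are not part of the original script.

```r
# Raw data and decomposition, repeated so this block is self-contained
charges <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
chargesOverTime <- ts(charges, start = 2012, frequency = 12)
chargesOverTimeComponents <- decompose(chargesOverTime)

# Centered 12-month moving average: the two end months get half weight
maWeights <- c(0.5, rep(1, 11), 0.5) / 12
manualTrend <- stats::filter(chargesOverTime, maWeights, sides = 2)

# Compare the hand-built trend with the one from decompose()
all.equal(as.numeric(manualTrend),
          as.numeric(chargesOverTimeComponents$trend))

# In an additive decomposition the parts add back up to the data
# wherever the trend is defined (the first and last 6 months are NA)
rebuilt <- chargesOverTimeComponents$trend +
  chargesOverTimeComponents$seasonal +
  chargesOverTimeComponents$random
hasTrend <- !is.na(rebuilt)
all.equal(as.numeric(rebuilt)[hasTrend], as.numeric(chargesOverTime)[hasTrend])
```

With only two years of data the moving average is defined for just the middle 12 months, which is why the trend panel in Plot 2 is shorter than the raw series.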

With this we can verify the upward trend and seasonal components of the data. Now we know how to set up the arguments for the exponential smoothing. Because the series is additive and has both a trend and a seasonal component, we leave the beta and gamma arguments at their defaults instead of setting them to FALSE. The only argument passed is the time series object we created earlier. The R code for the exponential smoothing is as follows.

# Perform Exponential Smoothing 
chargesOverTimeForecast <- HoltWinters(chargesOverTime)
chargesOverTimeForecast

# Plot with Exponential Smoothing added
plot(chargesOverTimeForecast)

This code will give the following plot. The red line is the fitted line, and the black line is the observed raw values.

Plot 3: Holt-Winters Exponential smoothing, 2013
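
For contrast, here is a sketch of what happens when beta and gamma are switched off, which reduces the model to simple exponential smoothing. This is not part of the original analysis, and chargesSimple is a name I made up for illustration.

```r
# Raw data and time series object, repeated so this block is self-contained
charges <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
chargesOverTime <- ts(charges, start = 2012, frequency = 12)

# Simple exponential smoothing: no trend (beta) and no seasonal (gamma) part
chargesSimple <- HoltWinters(chargesOverTime, beta = FALSE, gamma = FALSE)

chargesSimple$alpha         # the only smoothing parameter that is estimated
chargesSimple$coefficients  # only the level "a" remains
```

Calling plot(chargesSimple) would draw a fitted line that lags behind the data, because a model with only a level has no way to represent the steady rise or the monthly pattern.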

The Holt-Winters fit in Plot 3 shows the general trend in the data. The fitted line only covers 2013 because, when HoltWinters is called the way I wrote it, the first period (season) of the data is used to set the starting values and gets no fitted values of its own. In this data the first season is also the first half of the data. To show the full time range on the plot, you could run this line of code:

plot(chargesOverTimeForecast, xlim = c(2012, 2014))

And you get this plot with all the data plotted with the fitted line in red:

Plot 4: Holt-Winters Exponential Smoothing 2012 through 2013.
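
You can check where the fitted values begin directly from the model object. This is a small sketch that assumes the same data and the same HoltWinters call as above.

```r
# Raw data and model, repeated so this block is self-contained
charges <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
chargesOverTime <- ts(charges, start = 2012, frequency = 12)
chargesOverTimeForecast <- HoltWinters(chargesOverTime)

# The fitted component is itself a time series with one row per fitted month
start(chargesOverTimeForecast$fitted)     # first month that has a fitted value
nrow(chargesOverTimeForecast$fitted)      # number of fitted months
colnames(chargesOverTimeForecast$fitted)  # fitted value plus its components
head(chargesOverTimeForecast$fitted, 3)
```

Widening xlim therefore only changes the plotting window. The red line still starts in January 2013 because there are no fitted values for 2012.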

From this exponential smoothing we can see that Student spends the least at the beginning of each year, then gradually spends more over the course of the year, with the most spent around Christmas. There is also a small hump in the summer months, which could indicate increased spending during summer vacation. Student also appears to cut spending sharply right after New Year's, which may be a result of seeing the bank account at the end of the year and resolving to spend less, only to end up spending even more than the year before.

The output from the exponential smoothing can also tell us something about how the fit was made. Calling the variable that holds the result of the HoltWinters function prints its output.

Holt-Winters exponential smoothing with trend and additive seasonal component.

Call:
HoltWinters(x = chargesOverTime)

Smoothing parameters:
 alpha: 0.4786973
 beta : 0
 gamma: 0.1

Coefficients:
           [,1]
a    51.4481469
b     0.6088578
s1   -6.6831338
s2  -10.5867440
s3   -6.6998393
s4   -3.0320795
s5   -1.4068647
s6   -4.0422184
s7    0.4727766
s8    6.6378768
s9    1.4431586
s10   5.6809745
s11   5.7999737
s12  12.6976853


The most important parts of this output are the alpha, beta, and gamma lines under "Smoothing parameters". Alpha controls the level, beta the slope of the trend component, and gamma the seasonal component. Each value lies between 0 and 1, and the closer it is to 1, the more weight is placed on recent observations. In this output alpha is about 0.48, a moderate value, so the level draws on both recent and more distant observations. Beta is 0, so the slope places no weight on recent observations and stays at its initial value. Gamma is 0.1, so the seasonal component also places very little weight on recent observations and is based mostly on the earliest ones.
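
For reference, the update equations behind these three parameters can be written out as below. This follows the additive form described in the R help page for HoltWinters. Here Y_t is the observed value, a_t the level, b_t the slope, s_t the seasonal term, and p the period length (12 for this data). The symbol names are my choice to match the printed coefficients.

```latex
\begin{aligned}
\hat{Y}_{t+h} &= a_t + h\,b_t + s_{t-p+1+((h-1) \bmod p)} \\
a_t &= \alpha\,(Y_t - s_{t-p}) + (1-\alpha)\,(a_{t-1} + b_{t-1}) \\
b_t &= \beta\,(a_t - a_{t-1}) + (1-\beta)\,b_{t-1} \\
s_t &= \gamma\,(Y_t - a_t) + (1-\gamma)\,s_{t-p}
\end{aligned}
```

Plugging in beta = 0 reduces the third line to b_t = b_{t-1}, which is why the slope never moves from its starting value. With gamma = 0.1, each seasonal term keeps 90% of its value from the same month a year earlier.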

Applying the Trend

Just knowing the trends is not very helpful on its own, though. Using the forecast function from the forecast package, we can see what these trends predict for the future. For example, what if I want to know how Student might spend money with their credit card in 2014? The following R code extrapolates the trend out for 12 months.

# Load the forecast package, which provides forecast()
library(forecast)

# Create forecast for future observations for 12 months
chargesOverTimeForecast2 <- forecast(chargesOverTimeForecast, h = 12)
chargesOverTimeForecast2

# Plot 12 month forecast
plot(chargesOverTimeForecast2)

And you get the following plot.

Plot 5: Holt-Winters 12 Month Forecast for Student, 2012 through 2014

Here the raw data is plotted alongside the forecast. The blue line is the point forecast, the mean of the expected values. The dark grey band around it shows the 80% prediction interval and the light grey band shows the 95% prediction interval. Of course this is not a prediction of the future so much as an extrapolation of the trend. Over the next 12 months anything could happen that would nullify the trend. This forecast makes the large assumption that all other variables are held constant.
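
The point forecast can also be rebuilt by hand from the coefficients printed earlier: each month is the final level a, plus the slope b times the number of months ahead, plus that month's seasonal term. For January 2014 that works out to roughly 51.45 + 0.61 - 6.68, or about 45.4. The sketch below checks this against predict from base R, so it does not need the forecast package. The names co, monthsAhead, manualForecast, and builtIn are my own.

```r
# Raw data and model, repeated so this block is self-contained
charges <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
chargesOverTime <- ts(charges, start = 2012, frequency = 12)
chargesOverTimeForecast <- HoltWinters(chargesOverTime)

# Level (a), slope (b), and seasonal terms (s1..s12) from the fitted model
co <- chargesOverTimeForecast$coefficients
monthsAhead <- 1:12

# Additive forecast: level + months ahead * slope + seasonal term
manualForecast <- co["a"] + monthsAhead * co["b"] + co[paste0("s", monthsAhead)]

# Same forecast from the built-in predict method for HoltWinters objects
builtIn <- predict(chargesOverTimeForecast, n.ahead = 12)

# Should be TRUE if the hand calculation matches
all.equal(as.numeric(manualForecast), as.numeric(builtIn))
```

Because beta is 0, the slope b is a fixed monthly increase, so the forecast for 2014 is essentially the 2013 seasonal pattern shifted upward along a straight line.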

Relevant Links:

GitHub: https://github.com/SimonLiles/LIS4273AdvStatistics/blob/master/LIS4273Mod12.R