Tuesday, April 16, 2024

Time Series Analysis and Forecasting Using Exponential Moving Average (A use case of traffic prediction on my blog)

In the previous two articles in this series, we explored the Poisson distribution for modeling the traffic data on the blog ( Link to Poisson Distribution Based Modeling ).
There we ran into two challenges, outliers and local trends, as shown in this post: Data Issues

This prompted us to explore further, and we settled on an Exponential Moving Average (EMA) based model for the traffic prediction.

All of the code and data are available for download from this link:
Download Code and Data

First, a visual story:

1: Plotting EMA
2: EMA numbers we got
3: Predictions using ExponentialSmoothing (or EMA)

How did we use Gemini to help us?

Can you share exponential moving average based python code for 1D time series prediction?

Absolutely, here's the Python code for 1D time series prediction using Exponential Moving Average (EMA):

Python
    import matplotlib.pyplot as plt
    import pandas as pd

    def predict_ema(data, alpha):
      """
      This function calculates the Exponential Moving Average (EMA) for a given time series data.

      Args:
          data (pandas.Series): The time series data.
          alpha (float): The smoothing factor (0 < alpha <= 1).

      Returns:
          pandas.Series: The EMA of the time series data.
      """
      ema = [data.iloc[0]]  # Initialize with the first data point
      for i in range(1, len(data)):
        ema.append((alpha * data.iloc[i]) + ((1 - alpha) * ema[i-1]))
      # Reuse the input index so the EMA lines up with the original data when plotted
      return pd.Series(ema, index=data.index)

    # Example usage
    # Assuming your time series data is loaded in a pandas Series named 'values'
    predictions = predict_ema(values, alpha=0.5)  # Adjust alpha as needed

    # Plot the original data and the EMA predictions
    values.plot(label="Original Data")
    predictions.plot(label="EMA Prediction")
    plt.legend()  # Show the labels set above
    plt.show()
    

This code defines a function predict_ema that takes the time series data and the smoothing factor (alpha) as arguments. The function iterates through the data points and calculates the EMA using the following formula at each step:

EMA[i] = (alpha * data[i]) + ((1 - alpha) * EMA[i-1])
    
  • alpha: Smoothing factor (0 < alpha <= 1). A higher alpha gives more weight to recent data points, resulting in a more responsive but potentially noisier EMA.
  • data[i]: The current data point.
  • EMA[i-1]: The EMA of the previous data point.
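As a small worked check of this formula, the sketch below runs the recursion on five made-up numbers (the sample values are an assumption, not blog data) and compares the result with pandas' built-in `ewm(alpha=..., adjust=False).mean()`, which applies the same recursion with the first value as the starting point. The helper name `ema_loop` is only illustrative.

```python
import pandas as pd

def ema_loop(data, alpha):
    """Apply EMA[i] = alpha * data[i] + (1 - alpha) * EMA[i-1] step by step."""
    ema = [data.iloc[0]]  # start from the first observation
    for i in range(1, len(data)):
        ema.append(alpha * data.iloc[i] + (1 - alpha) * ema[i - 1])
    return pd.Series(ema, index=data.index)

# Made-up sample values, used only to illustrate the recursion
values = pd.Series([10.0, 12.0, 11.0, 15.0, 14.0])

loop_result = ema_loop(values, alpha=0.5)
ewm_result = values.ewm(alpha=0.5, adjust=False).mean()

print(loop_result.tolist())  # [10.0, 11.0, 11.0, 13.0, 13.5]
print(ewm_result.tolist())   # same numbers from the built-in
```

With alpha = 0.5 each new EMA value is simply the midpoint between the new data point and the previous EMA, which makes the steps easy to verify by hand.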

The example usage demonstrates how to call the function with your data and plot the original data along with the EMA predictions.

Note:

  • This is a basic implementation. For more advanced functionalities, consider libraries like statsmodels.tsa.holtwinters which offer functionalities for trend and seasonality handling.
  • Exponential Moving Averages are best suited for capturing recent trends. They might not be ideal for data with strong seasonality or long-term trends.

Using statsmodels?

Certainly, here's the code for using statsmodels.tsa.holtwinters for exponential moving average (EMA) prediction in Python:

Python
    import matplotlib.pyplot as plt
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    def predict_ema_statsmodels(data, alpha=0.5):
      """
      This function uses statsmodels to fit an Exponential Smoothing model with an
      additive trend (Holt's linear method) and generate predictions for a 1D time series data.

      Args:
          data (pandas.Series): The time series data.
          alpha (float): The smoothing factor for the level (0 < alpha <= 1).

      Returns:
          pandas.Series: The forecast for the next len(data) periods.
      """
      # trend="add" adds a linear trend on top of the smoothed level; use trend=None for a plain EMA
      model = ExponentialSmoothing(data, trend="add", seasonal=None)
      # Pass alpha so it is actually used; omit smoothing_level to let statsmodels estimate it
      model_fit = model.fit(smoothing_level=alpha)
      predictions = model_fit.forecast(len(data))
      return predictions

    # Example usage
    # Assuming your time series data is loaded in a pandas Series named 'values'
    predictions = predict_ema_statsmodels(values, alpha=0.5)  # Adjust alpha as needed

    # Plot the original data and the forecast
    values.plot(label="Original Data")
    predictions.plot(label="EMA Prediction")
    plt.legend()  # Show the labels set above
    plt.show()
    

This code utilizes the ExponentialSmoothing class from statsmodels.tsa.holtwinters. Here's a breakdown of the key points:

  1. Model Definition:

    • We create an ExponentialSmoothing model instance with:
      • data: The time series data (values in this example).
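      • trend="add": This specifies an additive trend component, which turns the model into Holt's linear trend method, that is, an EMA of the level plus a smoothed slope. A basic EMA without trend corresponds to trend=None. Setting seasonal=None excludes seasonality.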
  2. Model Fitting:

    • The fit() call estimates the model's smoothing parameters and initial states from the data and returns the fitted model as model_fit.
  3. Prediction:

    • predictions = model_fit.forecast(len(data)): This generates predictions for the next len(data) periods using the fitted model.
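To make the effect of trend="add" concrete, here is a minimal pure-Python sketch of the additive-trend recursion (Holt's linear method). It is not how statsmodels is implemented: statsmodels estimates the smoothing parameters and initial states from the data, whereas the fixed alpha and beta, the naive initialization, and the helper name `holt_forecast` below are assumptions chosen for illustration.

```python
def holt_forecast(data, alpha, beta, horizon):
    """Holt's linear trend method with fixed smoothing factors.

    level: EMA-style estimate of the current value
    trend: EMA-style estimate of the per-step change
    """
    level = data[0]
    trend = data[1] - data[0]  # naive initial slope (an assumption)
    for x in data[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # The h-step-ahead forecast extends the last level along the last trend
    return [level + h * trend for h in range(1, horizon + 1)]

# Made-up, perfectly linear series: the forecast should continue the line
history = [10.0, 12.0, 14.0, 16.0]
future = holt_forecast(history, alpha=0.5, beta=0.5, horizon=3)
print(future)  # [18.0, 20.0, 22.0]
```

Because the forecast is level + h * trend, consecutive forecast values always differ by the same amount. A plain EMA has no trend term, so its multi-step forecast would stay flat at the last smoothed value.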

Advantages of using statsmodels:

  • Flexibility: It allows for handling trend and seasonality by adjusting the model arguments.
  • Additional functionalities: The model provides methods for calculating fit statistics, confidence intervals, and more.

Remember to replace values with your actual time series data and adjust the alpha value for smoothing as needed.

How do we validate the results?

I can provide some insights on interpreting the Exponential Moving Average (EMA) results and offer pointers for further validation:

General Observations from the Image:

  • The y-axis likely represents the predicted values (possibly scaled).
  • The x-axis seems to represent time steps.
  • There's a visible upward trend in the prediction, indicating that the model expects values to keep rising.

Code Validation Techniques:

  1. Visual Inspection:

    • Plot the original time series data alongside the predicted EMA values. This helps identify how well the EMA captures the underlying trend and seasonality (if applicable).
  2. Error Metrics:

    • Calculate error metrics like Mean Squared Error (MSE) or Mean Absolute Error (MAE) to quantify the difference between the predictions and actual future values (assuming you have those). Lower errors suggest a better fit.
  3. Domain Knowledge:

    • Consider your understanding of the time series data's domain. Does the predicted trend align with your expectations? Are there any outliers or unexpected patterns in the predictions?
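The error-metric step can be sketched with a simple holdout: keep the last few observations aside, forecast them from the earlier data, and compare. The example below is a minimal pure-Python illustration under stated assumptions: the series is made up, the forecast is the flat forecast of a plain EMA (the last smoothed value repeated), and the helper names `ema_last`, `mae`, and `mse` are hypothetical.

```python
def ema_last(data, alpha):
    """Return the final EMA value, which a plain EMA uses as its forecast."""
    level = data[0]
    for x in data[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean Squared Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up series; the last two points are held out as "future" values
series = [100.0, 110.0, 105.0, 120.0, 115.0, 125.0, 130.0, 128.0]
train, test = series[:-2], series[-2:]

forecast = [ema_last(train, alpha=0.5)] * len(test)

print(forecast)             # [119.375, 119.375]
print(mae(test, forecast))  # 9.625
print(mse(test, forecast))  # 93.640625
```

Holding out the most recent points, rather than a random sample, matters for time series because the model must never see data from after the period it is asked to predict.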

Additional Considerations:

  • The value of the smoothing factor (alpha) can significantly impact the EMA's responsiveness to trends. Experiment with different alpha values to see how they affect the predictions.
  • If your data exhibits seasonality, you might want to explore seasonal extensions of exponential smoothing models provided by statsmodels.tsa.holtwinters.
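One hedged way to experiment with alpha is to score each candidate by its one-step-ahead error on the historical data and keep the best one. The sketch below is pure Python; the series, the candidate grid, and the helper name `one_step_mse` are assumptions made for illustration.

```python
def one_step_mse(data, alpha):
    """Mean squared one-step-ahead error of a plain EMA.

    Before each new point arrives, the forecast for it is the current EMA level.
    """
    level = data[0]
    squared_errors = []
    for x in data[1:]:
        squared_errors.append((x - level) ** 2)  # error of the forecast made before seeing x
        level = alpha * x + (1 - alpha) * level  # then update the EMA with x
    return sum(squared_errors) / len(squared_errors)

# Made-up, steadily rising series
series = [10.0, 20.0, 30.0, 40.0, 50.0]
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]

best_alpha = min(candidates, key=lambda a: one_step_mse(series, a))
print(best_alpha)  # 0.9
```

On a steadily rising series the largest alpha wins because it follows the newest value most closely; on noisy data without a trend, a smaller alpha would usually score better.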

By following these suggestions and incorporating your domain knowledge, you can effectively validate your EMA model and assess its suitability for your time series forecasting task.

Conclusion

Based on the results we are getting from ExponentialSmoothing:

90    10811.852982
91    10840.919992
92    10869.987003

we can expect a little over 10K monthly views in the coming 2-3 months. On a side note: views for April 2024 had already touched 8500 at the time of writing.
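As a quick arithmetic check on the three forecast values above, the snippet below computes the change from one period to the next. The numbers are copied from the ExponentialSmoothing output; nothing else is assumed.

```python
# Forecast values reported above, keyed by period index
forecasts = {90: 10811.852982, 91: 10840.919992, 92: 10869.987003}

periods = sorted(forecasts)
steps = [forecasts[b] - forecasts[a] for a, b in zip(periods, periods[1:])]

print([round(s, 3) for s in steps])  # [29.067, 29.067]
```

Both steps come out at about 29 views, so the forecast is a straight line: the last estimated level extended by a constant slope, which is the behavior expected from a model fitted with an additive trend.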
