In the previous two articles in this series, we explored the Poisson distribution for modeling the traffic data on our blog (Link to Poisson Distribution Based Modeling). There we encountered the challenges of outliers and local trends, as shown in this post: Data Issues. This prompted us to explore further, and we came up with the idea of using an Exponential Moving Average (EMA) based model for traffic prediction. All of the code and data are available to download from this link: Download Code and Data
First, a visual story:
Figure 1: Plotting EMA
Figure 2: EMA numbers we got
Figure 3: Predictions using ExponentialSmoothing (or EMA)
How did we use Gemini to help us?
Can you share exponential moving average based python code for 1D time series prediction?
Absolutely, here's the Python code for 1D time series prediction using Exponential Moving Average (EMA):
import pandas as pd
import matplotlib.pyplot as plt

def predict_ema(data, alpha):
    """
    Calculate the Exponential Moving Average (EMA) of a time series.

    Args:
        data (pandas.Series): The time series data.
        alpha (float): The smoothing factor (0 < alpha <= 1).

    Returns:
        pandas.Series: The EMA of the time series data.
    """
    ema = [data.iloc[0]]  # Initialize with the first data point
    for i in range(1, len(data)):
        ema.append((alpha * data.iloc[i]) + ((1 - alpha) * ema[i - 1]))
    return pd.Series(ema, index=data.index)  # Keep the original index so the plots align

# Example usage
# Assuming your time series data is loaded in a pandas Series named 'values'
predictions = predict_ema(values, alpha=0.5)  # Adjust alpha as needed

# Plot the original data and the EMA predictions
values.plot(label="Original Data")
predictions.plot(label="EMA Prediction")
plt.legend()
plt.show()
This code defines a function predict_ema that takes the time series data and the smoothing factor (alpha) as arguments. The function iterates through the data points and calculates the EMA using the following formula at each step:
EMA[i] = (alpha * data[i]) + ((1 - alpha) * EMA[i-1])
- alpha: Smoothing factor (0 < alpha <= 1). A higher alpha gives more weight to recent data points, resulting in a more responsive but potentially noisier EMA.
- data[i]: The current data point.
- EMA[i-1]: The EMA of the previous data point.
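To make the recursion concrete, here is a minimal sketch in plain Python that applies the formula to a tiny made-up series with a low and a high alpha. The numbers and the helper name ema_list are ours, purely for illustration.

```python
def ema_list(data, alpha):
    """Apply EMA[i] = alpha * data[i] + (1 - alpha) * EMA[i-1] to a list of numbers."""
    ema = [data[0]]  # Seed with the first observation
    for x in data[1:]:
        ema.append(alpha * x + (1 - alpha) * ema[-1])
    return ema

# Hypothetical daily page views with a jump on the fourth day
views = [10, 10, 10, 30, 30]

smooth = ema_list(views, alpha=0.25)      # low alpha: slow to react
responsive = ema_list(views, alpha=0.75)  # high alpha: reacts quickly

print(smooth)
print(responsive)
```

After the jump to 30, the alpha=0.25 series reaches only 15 and then 18.75, while the alpha=0.75 series reaches 25 and then 28.75.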
The example usage demonstrates how to call the function with your data and plot the original data along with the EMA predictions.
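As a cross-check, pandas ships a built-in exponentially weighted mean. With adjust=False, Series.ewm(alpha=...).mean() uses the same recursion and the same first-value initialization as the explicit loop, so the two should agree. The sketch below uses a small made-up series, and ema_loop is our own restatement of the loop so that the block runs on its own.

```python
import pandas as pd

def ema_loop(data, alpha):
    """Explicit EMA recursion, restated here so the block is self-contained."""
    ema = [data.iloc[0]]
    for i in range(1, len(data)):
        ema.append(alpha * data.iloc[i] + (1 - alpha) * ema[i - 1])
    return pd.Series(ema, index=data.index)

# Hypothetical daily page views (not the blog's real data)
values = pd.Series([12.0, 15.0, 11.0, 30.0, 14.0, 16.0])

manual = ema_loop(values, alpha=0.5)
builtin = values.ewm(alpha=0.5, adjust=False).mean()

# Largest absolute difference between the two versions
max_diff = (manual - builtin).abs().max()
print(max_diff)
```

If the difference is effectively zero, the built-in call can stand in for the loop, which is usually faster on long series.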
Note:
- This is a basic implementation. For more advanced functionality, consider libraries like statsmodels.tsa.holtwinters, which offer trend and seasonality handling.
- Exponential Moving Averages are best suited for capturing recent trends. They might not be ideal for data with strong seasonality or long-term trends.
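The caution about long-term trends can be seen on a toy example. In the sketch below (plain Python, synthetic data, the helper name ema_list is ours), the series grows by exactly 1 per step and the EMA settles at a constant distance behind it. For this kind of steady linear growth, the gap approaches (1 - alpha) / alpha.

```python
def ema_list(data, alpha):
    """EMA recursion seeded with the first observation."""
    ema = [data[0]]
    for x in data[1:]:
        ema.append(alpha * x + (1 - alpha) * ema[-1])
    return ema

trend = list(range(50))  # Synthetic series: 0, 1, 2, ..., 49
alpha = 0.5

smoothed = ema_list(trend, alpha)

# How far the EMA trails the latest observation
lag = trend[-1] - smoothed[-1]
expected_lag = (1 - alpha) / alpha

print(lag, expected_lag)
```

A smaller alpha smooths more but trails a trending series by a larger margin, which is one reason trend-aware models are worth considering for such data.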
Using statsmodels?
Certainly, here's the code for using statsmodels.tsa.holtwinters for exponential moving average (EMA) prediction in Python:
from statsmodels.tsa.holtwinters import ExponentialSmoothing
import matplotlib.pyplot as plt

def predict_ema_statsmodels(data, alpha=0.5):
    """
    Fit a statsmodels Exponential Smoothing model and forecast a 1D time series.

    Note: with trend="add" this is Holt's linear trend method, not a plain EMA.
    Use trend=None for simple exponential smoothing (the EMA counterpart).

    Args:
        data (pandas.Series): The time series data.
        alpha (float): The smoothing factor for the level (0 < alpha <= 1).

    Returns:
        pandas.Series: Forecasts for the next len(data) periods.
    """
    model = ExponentialSmoothing(data, trend="add", seasonal=None)  # Additive trend, no seasonality
    model_fit = model.fit(smoothing_level=alpha)  # Use the given alpha; other parameters are estimated
    predictions = model_fit.forecast(len(data))
    return predictions

# Example usage
# Assuming your time series data is loaded in a pandas Series named 'values'
predictions = predict_ema_statsmodels(values, alpha=0.5)  # Adjust alpha as needed

# Plot the original data and the forecasts
values.plot(label="Original Data")
predictions.plot(label="EMA Prediction")
plt.legend()
plt.show()
This code utilizes the ExponentialSmoothing class from statsmodels.tsa.holtwinters. Here's a breakdown of the key points:
- Model Definition: We create an ExponentialSmoothing model instance with:
  - data: The time series data (values in this example).
  - trend="add": This specifies an additive trend component. Strictly speaking, this is Holt's linear trend method rather than a basic EMA; trend=None gives simple exponential smoothing, which is the direct counterpart of the EMA.
  - seasonal=None: This excludes seasonality.
- Model Fitting: Calling model.fit(...) fits the model to the data. With no arguments, statsmodels estimates the smoothing parameters itself; passing smoothing_level=alpha uses the chosen alpha for the level instead.
- Prediction: predictions = model_fit.forecast(len(data)) generates predictions for the next len(data) periods using the fitted model.
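To see what an additive trend adds on top of a plain EMA, here is a hand-rolled sketch of the two update rules in plain Python. It is not how statsmodels implements ExponentialSmoothing internally (the library also estimates initial values and parameters), and the function names, the beta parameter, the initialization choices, and the data are our assumptions.

```python
def ses_forecast(data, alpha, steps):
    """Simple exponential smoothing: the forecast is flat at the last level."""
    level = data[0]
    for x in data[1:]:
        level = alpha * x + (1 - alpha) * level
    return [level] * steps

def holt_forecast(data, alpha, beta, steps):
    """Holt's linear trend method: the h-step forecast is level + h * trend."""
    level = data[0]
    trend = data[1] - data[0]  # Assumed initialization: first difference
    for x in data[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, steps + 1)]

# Synthetic series that grows by 2 per step
series = [10.0, 12.0, 14.0, 16.0, 18.0]

flat = ses_forecast(series, alpha=0.5, steps=3)
trended = holt_forecast(series, alpha=0.5, beta=0.5, steps=3)

print(flat)
print(trended)
```

On this perfectly linear series the Holt forecast continues the line (20, 22, 24), while the flat forecast stays at 16.125, below even the last observed value.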
Advantages of using statsmodels:
- Flexibility: It allows for handling trend and seasonality by adjusting the model arguments.
- Additional functionalities: The model provides methods for calculating fit statistics, confidence intervals, and more.
Remember to replace values with your actual time series data and adjust the alpha value for smoothing as needed.
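One simple way to adjust alpha is to try several values and keep the one with the smallest one-step-ahead error, treating each EMA value as the forecast for the next point. The sketch below does this with a small grid search in plain Python. The data are made up, and the helper names and the candidate grid are our assumptions. statsmodels performs its own numerical optimization, so this only illustrates the idea.

```python
def one_step_sse(data, alpha):
    """Sum of squared errors when the EMA so far is used as the forecast for the next point."""
    level = data[0]
    sse = 0.0
    for x in data[1:]:
        sse += (x - level) ** 2                 # Error of forecasting x with the current level
        level = alpha * x + (1 - alpha) * level  # Then update the level with x
    return sse

def best_alpha(data, candidates):
    """Return the candidate alpha with the smallest one-step-ahead SSE."""
    return min(candidates, key=lambda a: one_step_sse(data, a))

# Hypothetical daily page views (not the blog's real data)
views = [20, 22, 19, 24, 23, 27, 26, 30, 29, 33]
candidates = [round(0.1 * k, 1) for k in range(1, 11)]  # 0.1, 0.2, ..., 1.0

chosen = best_alpha(views, candidates)
print(chosen, one_step_sse(views, chosen))
```

Because the error is measured on the same data used to pick alpha, it is worth checking the chosen value on a held-out stretch of the series before relying on it.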