
Saturday, April 20, 2024

From Simple Mean to Exponential Weighting: Unveiling the EMA (Exponential Moving Average)

The concept of averages is fundamental across various fields. In finance, specifically technical analysis, the exponential moving average (EMA) plays a crucial role in identifying trends and making trading decisions. But how is the EMA derived? It turns out, it has a clever connection to the basic idea of the mean!
Let's look at the theory and then at the mathematical derivation at the end...

The Familiar Mean

We all know the formula for the mean (or average) of a set of numbers. Let's say we have a series of data points, X₁, X₂, ..., Xₙ. The mean (M) is simply the sum of these values divided by the number of data points (n):

M = (Σ Xᵢ) / n

This gives equal weight to every data point in the series.
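In code, this is the one-liner everyone knows; a trivial Python sketch with made-up numbers:

xs = [10, 11, 13, 12, 15, 14]   # toy data points X1, ..., Xn
mean = sum(xs) / len(xs)        # every point gets the same weight, 1/n
print(mean)                     # 12.5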

Enter the Exponential Moving Average

The EMA, however, prioritizes recent data points. It assigns the most weight to the latest observation and progressively less weight to older values. This makes the EMA more responsive to recent changes than the simple mean, while still smoothing out short-term noise.

Deriving the EMA Formula

So, how do we bridge the gap between the mean and the EMA? We can achieve this by incorporating a weighting factor (α) between 0 and 1. This factor determines the relative importance given to the current data point (Xₜ) compared to the previous EMA (EMAₜ₋₁).

Here's where the connection to the mean emerges. We can rewrite the formula for the mean with this weighting factor:

Smoothed Average = α * Xₜ + (1 - α) * (Previous Average)

This "smoothed average" concept is the foundation of the EMA. But there's a twist! For the EMA, the "previous average" isn't just the mean of all past data points. It's the previous day's EMA (EMAₜ₋₁). This creates a recursive relationship, where each EMA value depends on the EMA from the prior period.

The Final EMA Formula

To account for this, we can rewrite the equation again, substituting the "previous average" with the EMA from the previous day:

EMAₜ = α * Xₜ + (1 - α) * EMAₜ₋₁

This is the magic formula for the EMA! The α factor controls the responsiveness of the EMA. A higher α gives more weight to the current data point, resulting in a more reactive EMA that captures recent trends quickly. Conversely, a lower α emphasizes older data points, leading to a smoother but potentially less responsive EMA.
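Here is a minimal Python sketch of this recursion (not from the original post). Seeding the recursion with the first observation is one common convention, and the alpha value below is just an illustration:

def ema(values, alpha):
    # EMA_t = alpha * X_t + (1 - alpha) * EMA_{t-1},
    # seeded with the first observation (one common convention).
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

prices = [10, 11, 13, 12, 15, 14]
print(ema(prices, alpha=0.5))   # a higher alpha tracks the latest prices more closely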

Understanding the Connection

The connection between the mean and the EMA lies in the concept of a weighted average. The EMA is essentially a continuously updated mean, with exponentially decreasing weights assigned to past data points. This allows the EMA to react more dynamically to recent changes while still incorporating historical information.

By understanding the derivation from the mean, you gain a deeper appreciation for the EMA's role in technical analysis. It's a powerful tool that helps identify trends and make informed trading decisions based on both recent and historical price movements.

Now the Derivation...
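Here is a sketch of how the derivation goes (written in LaTeX for clarity): unrolling the recursion step by step makes the exponentially decaying weights explicit, where EMA₀ stands for whatever seed value starts the recursion.

\begin{aligned}
\mathrm{EMA}_t &= \alpha X_t + (1-\alpha)\,\mathrm{EMA}_{t-1} \\
&= \alpha X_t + (1-\alpha)\left[\alpha X_{t-1} + (1-\alpha)\,\mathrm{EMA}_{t-2}\right] \\
&= \alpha X_t + \alpha(1-\alpha) X_{t-1} + (1-\alpha)^2\,\mathrm{EMA}_{t-2} \\
&= \alpha \sum_{k=0}^{t-1} (1-\alpha)^k X_{t-k} + (1-\alpha)^t\,\mathrm{EMA}_0
\end{aligned}

Each step back in time multiplies the weight by another factor of (1 − α), which is where "exponential" comes from. Note also that the weights, including the (1 − α)ᵗ on the seed EMA₀, sum to exactly 1, so the EMA is a genuine weighted mean.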

Tags: Mathematical Foundations for Data Science,

Friday, April 12, 2024

When not to use Poisson Distribution for prediction?

Previously, we saw how we tried to fit a Poisson distribution to the traffic data on our blog.

Link to the Post

Today, we study why our analysis could be wrong.

By looking at this plot, we suspect three things:

  1. The data has outliers: a cluster of points around May 2021.

  2. The data has a local trend: upward until Jan 2022, then downward until Mar 2023.

  3. Because of the above two points, we suspect there is no constant rate (average) for the events occurring within a specific time interval.

Can we use the Poisson distribution when we suspect there are outliers in the data?

The Poisson distribution has its limitations when dealing with outliers. Here's why:

  • Underlying Assumption: The Poisson distribution assumes a constant rate of events across the observed interval. Outliers, by definition, deviate significantly from this expected pattern.

  • Impact on Results: If outliers are present, the Poisson model might not accurately reflect the actual probability of observing certain counts. This can lead to misleading interpretations of the data.

However, there are still ways to approach this situation:

  • Outlier Detection: You can use statistical methods to identify potential outliers in your data. Techniques like Grubbs' test or examining standardized residuals can help [3].

  • Alternative Models: Depending on the nature of the outliers, alternative models like the negative binomial distribution might be a better fit. This model accounts for overdispersion, which is a higher variance than expected by the Poisson distribution, often caused by outliers [1].

  • Data Transformation: In some cases, transforming the data (like using the Anscombe transform) before applying the Poisson model can help reduce the influence of outliers [3].

Ultimately, the decision of using the Poisson distribution with outliers depends on the severity of the outliers and the purpose of your analysis.
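As a quick, informal check (not part of the original analysis), one can compare the sample mean and variance of the daily counts: the Poisson model assumes these are equal, so a variance far above the mean signals overdispersion. The numbers below are toy data, not the actual blog-traffic counts:

import numpy as np

daily_visits = np.array([12, 9, 15, 11, 48, 10, 13, 14, 9, 52, 11, 12])  # toy counts with two outliers

mean = daily_visits.mean()
var = daily_visits.var(ddof=1)
print(f"mean={mean:.1f}  variance={var:.1f}  dispersion index={var / mean:.1f}")
# A dispersion index well above 1 suggests overdispersion, so a
# negative binomial model may fit better than a plain Poisson.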


And can we use the Poisson distribution when we suspect there is a trend in the data?

No, the standard Poisson distribution is not ideal for data with a suspected trend. Here's why:

  • Constant Rate Assumption: The Poisson distribution assumes a constant rate (average) for the events occurring within a specific time interval. A trend implies a non-constant rate, where the frequency of events increases or decreases over time.

  • Misleading Results: Using the Poisson model with a trend can lead to inaccurate estimates of event probabilities and unreliable hypothesis tests.

However, there are ways to analyze count data with trends:

  • Poisson Regression: This statistical technique incorporates a trend term into the Poisson model. You can use a linear trend or even more complex functions to capture the non-constant rate [2].

  • Segmented Poisson Regression: This approach divides the data into segments with potentially different rates. It's useful when the trend has breaks or shifts at specific points [2].

  • Change-Point Detection: Methods exist to statistically identify the time points where the event rate changes, allowing for further analysis of segments with constant rates [1].

In conclusion, the standard Poisson distribution is not suitable for data with trends. Consider using Poisson regression or other techniques that account for the non-constant event rate.
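As a sketch of the first option, here is a minimal Poisson regression with a linear time trend using statsmodels' GLM API. The data below are simulated, and the coefficient values are arbitrary illustration choices:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
t = np.arange(100)                          # time index
y = rng.poisson(np.exp(1.0 + 0.02 * t))    # simulated counts with a rising rate

X = sm.add_constant(t)                      # intercept plus linear trend term
model = sm.GLM(y, X, family=sm.families.Poisson())  # log link by default
result = model.fit()
print(result.params)                        # the slope should come out near 0.02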


Tags: Mathematical Foundations for Data Science, Technology,

Monday, February 12, 2024

Ex 1.12 - Pattern Recognition and ML by Christopher Bishop

Pre-read

Question

Solution from Book

Solution Using ChatGPT

Are you aware of the result $E[X_n \cdot X_m] = \mu^2 + I_{nm} \cdot \sigma^2$?

I assume you meant $E[X_n \cdot X_m] = \mu^2 + I_{nm} \cdot \sigma^2$, where $I_{nm}$ is the Kronecker delta, which is 1 when $n = m$ and 0 otherwise.

In that case, $E[X_n \cdot X_m]$ simplifies to:

$$E[X_n \cdot X_m] = \begin{cases} \mu^2 + \sigma^2 & \text{if } n = m \\ \mu^2 & \text{if } n \neq m \end{cases}$$

This is because when $n = m$ you have the product of a variable with itself, and the additional $\sigma^2$ is the variance term in $E[X_n^2] = \mu^2 + \sigma^2$. When $n \neq m$, the variables are independent, the covariance term is zero, and you are left with the square of the mean.

If you have further questions or if there's anything specific you'd like to clarify, feel free to ask!
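A quick Monte Carlo sanity check of this identity in Python (my addition, not part of the transcript; the values of mu, sigma, and the sample size are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
mu, sigma, trials = 2.0, 3.0, 1_000_000

x = rng.normal(mu, sigma, trials)   # samples of X_n
y = rng.normal(mu, sigma, trials)   # independent samples of X_m

print("n = m :", np.mean(x * x), "vs", mu**2 + sigma**2)  # ~13.0
print("n != m:", np.mean(x * y), "vs", mu**2)             # ~4.0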

Can you derive this expression?

Certainly! Let's derive the expression for $E[X_n \cdot X_m]$ for two continuous random variables $X_n$ and $X_m$ drawn independently from the same Gaussian distribution $N(\mu, \sigma^2)$ with mean $\mu$ and variance $\sigma^2$. Since both variables share the same distribution, let $f(x)$ denote their common probability density function (PDF).

There are two cases to consider: $n = m$ (a variable multiplied by itself) and $n \neq m$ (two independent variables).

Case 1: $n = m$

Here the product is a variable with itself, so the expectation is a single integral:

$$E[X_n \cdot X_m] = E[X_n^2] = \int_{-\infty}^{\infty} x^2 \, f(x) \, dx = \mu^2 + \sigma^2$$

This result comes from the fact that for a Gaussian with mean $\mu$ and variance $\sigma^2$, $E[X^2] = \mathrm{Var}(X) + (E[X])^2 = \sigma^2 + \mu^2$.

Case 2: $n \neq m$

Here independence lets the joint density factorize, and the double integral splits into a product of two single integrals:

$$E[X_n \cdot X_m] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_n \, x_m \, f(x_n) \, f(x_m) \, dx_n \, dx_m = \left( \int_{-\infty}^{\infty} x_n \, f(x_n) \, dx_n \right) \left( \int_{-\infty}^{\infty} x_m \, f(x_m) \, dx_m \right) = \mu \cdot \mu$$

This result comes from the fact that the integral of $x \, f(x)$ over the entire range is the mean $\mu$.

Therefore, combining both cases:

$$E[X_n \cdot X_m] = \mu^2 + I_{nm} \cdot \sigma^2$$

where $I_{nm}$ is the Kronecker delta, equal to 1 when $n = m$ and 0 otherwise.


Tags: Mathematical Foundations for Data Science, Machine Learning,