Sunday, May 5, 2024

Ex 1.14 - Pattern Recognition and Machine Learning - By Christopher Bishop

Index of all exercises from the book "Pattern Recognition and Machine Learning"

Can you show that an arbitrary square matrix can be written as the sum of a symmetric and an anti-symmetric matrix?

Yes, absolutely! An arbitrary square matrix can indeed be written as the sum of a symmetric and an anti-symmetric matrix; this is a standard decomposition in linear algebra.

Here's how we can prove it:

Let's denote the arbitrary square matrix as A.

We can define two new matrices, P and Q:

  • P (symmetric matrix): P = (1/2)(A + Aᵀ)
  • Q (anti-symmetric matrix): Q = (1/2)(A - Aᵀ)

Here:

  • Aᵀ represents the transpose of matrix A (where rows and columns are swapped).

Explanation for P being symmetric:

  • Taking the transpose of P gives Pᵀ = (1/2)(A + Aᵀ)ᵀ = (1/2)(Aᵀ + A) = P, so P equals its own transpose.
  • Element-wise: the (i, j) entry of P is (1/2)(a_ij + a_ji), which is unchanged when i and j are swapped, so P_ij = P_ji.
  • The diagonal of P simply reproduces the diagonal of A, since (1/2)(a_ii + a_ii) = a_ii.

Explanation for Q being anti-symmetric:

  • Taking the transpose of Q gives Qᵀ = (1/2)(A - Aᵀ)ᵀ = (1/2)(Aᵀ - A) = -Q, so Q equals the negative of its own transpose.
  • Element-wise: the (i, j) entry of Q is (1/2)(a_ij - a_ji), which changes sign when i and j are swapped, so Q_ij = -Q_ji.
  • The diagonal elements of Q are zero, because (1/2)(a_ii - a_ii) = 0.
  • Therefore Q has opposite-signed entries above and below the diagonal and zeros on the diagonal, which is exactly the definition of an anti-symmetric matrix.

Verifying the sum:

Now, let's check that A is recovered as the sum of P and Q:

P + Q = (1/2)(A + Aᵀ) + (1/2)(A - Aᵀ)

Expanding both terms:

P + Q = (1/2)A + (1/2)Aᵀ + (1/2)A - (1/2)Aᵀ

The Aᵀ terms cancel and the two (1/2)A terms combine:

P + Q = A

Therefore, we have successfully shown that the original matrix A can be expressed as the sum of a symmetric matrix (P) and an anti-symmetric matrix (Q).

Important Note:

This decomposition is unique: if A = S + T with S symmetric and T anti-symmetric, then Aᵀ = S - T, and solving these two equations gives S = (1/2)(A + Aᵀ) = P and T = (1/2)(A - Aᵀ) = Q. So there is only one way to express a given matrix as the sum of a symmetric and an anti-symmetric matrix.

Example
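
Here is a minimal NumPy sketch of the decomposition. The 3 x 3 matrix below is an arbitrary illustration, not anything from the book:

```python
import numpy as np

# An arbitrary 3 x 3 matrix (values chosen only for illustration).
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

# Symmetric and anti-symmetric parts, as defined above.
P = 0.5 * (A + A.T)
Q = 0.5 * (A - A.T)

print(np.allclose(P, P.T))    # True: P is symmetric
print(np.allclose(Q, -Q.T))   # True: Q is anti-symmetric
print(np.allclose(A, P + Q))  # True: A is recovered as P + Q
```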

Now consider the second-order term in a higher-order polynomial in D dimensions. Show that the contribution from the anti-symmetric matrix vanishes.

Absolutely, here's the mathematical demonstration:

1. Setting Up the Framework:

Let's denote:

  • D: Number of dimensions in the higher-order polynomial.
  • x: A D-dimensional vector representing a point in that space (x = [x₁, x₂, ..., x_D]).
  • A: An arbitrary square matrix of size D x D.
  • P (symmetric matrix): P = (1/2)(A + Aᵀ)
  • Q (anti-symmetric matrix): Q = (1/2)(A - Aᵀ)

2. Second-Order Term:

We're interested in the second-order term of a higher-order polynomial in the components of x. This term is the sum of all products of pairs of components x_i and x_j, weighted by the entries of the matrix A:

Σ_{i=1}^{D} Σ_{j=1}^{D} a_ij x_i x_j = xᵀ A x

3. Contribution from Anti-Symmetric Matrix (Q):

Substituting A = P + Q into the second-order term splits it into two parts:

Σ_{i,j} a_ij x_i x_j = Σ_{i,j} P_ij x_i x_j + Σ_{i,j} Q_ij x_i x_j

We need to show that the second part, the one coming from the anti-symmetric component Q, is identically zero.

Recall that the elements of Q are Q_ij = (1/2)(a_ij - a_ji). Two properties follow immediately:

  • Diagonal elements (i = j): Q_ii = (1/2)(a_ii - a_ii) = 0, so the diagonal terms Q_ii x_i^2 contribute nothing.

  • Off-diagonal elements (i ≠ j): Q_ji = -Q_ij, i.e. the entries at positions (i, j) and (j, i) have equal magnitude and opposite sign.

4. Relating to Second-Order Term:

Call the anti-symmetric contribution q = Σ_{i,j} Q_ij x_i x_j. Since i and j are just summation labels, we can swap their names without changing the value of the sum:

q = Σ_{i,j} Q_ji x_j x_i

  • The product of the variables is unchanged: x_j x_i = x_i x_j.

  • The matrix element flips sign: Q_ji = -Q_ij.

Putting these together, q = -Σ_{i,j} Q_ij x_i x_j = -q. The only number equal to its own negative is zero, so q = 0.

Conclusion:

The anti-symmetric component Q contributes nothing to the second-order term: xᵀ A x = xᵀ P x. So when considering the second-order term of a higher-order polynomial in D dimensions, the coefficient matrix can be taken to be symmetric without loss of generality.
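
A quick NumPy check of this result. The dimension and the random matrix and vector below are arbitrary, used only to illustrate the cancellation:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 5
A = rng.normal(size=(D, D))   # arbitrary square matrix
x = rng.normal(size=D)        # arbitrary D-dimensional point

P = 0.5 * (A + A.T)           # symmetric part
Q = 0.5 * (A - A.T)           # anti-symmetric part

quad_A = x @ A @ x            # full second-order term  x^T A x
quad_P = x @ P @ x            # contribution from the symmetric part
quad_Q = x @ Q @ x            # contribution from the anti-symmetric part

print(np.isclose(quad_Q, 0.0))     # True: the anti-symmetric part vanishes
print(np.isclose(quad_A, quad_P))  # True: x^T A x equals x^T P x
```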

Show that the number of independent parameters in the matrix is given by D(D + 1)/2

Absolutely, here's why the coefficient matrix w_ij of the second-order term (where w_ij denotes the element in row i and column j of a D x D matrix) has only D(D + 1)/2 independent parameters:

1. Considering Symmetry:

The key concept here is symmetry. From the decomposition above, w splits into a symmetric part (w_ij = w_ji) and an anti-symmetric part (w_ij = -w_ji), and only the symmetric part contributes to the second-order term. Recall what each case looks like:

  • Symmetric Matrix: If the matrix is symmetric, then w_ij = w_ji for all i and j. This means the upper triangular part of the matrix (excluding the diagonal) is a mirror image of the lower triangular part.

  • Anti-Symmetric Matrix: If the matrix is anti-symmetric, then w_ij = -w_ji for all i and j. In this case, the upper triangular part (excluding the diagonal) has elements with opposite signs compared to the corresponding elements in the lower triangular part, and the diagonal elements are all zero.

2. Independent Parameters:

We're interested in the number of independent parameters, which refers to the minimum number of values we need to specify to completely define the matrix.

  • Symmetric Matrix: For a symmetric matrix, we only need to specify the elements on and above the diagonal; everything below the diagonal is a mirror image of the upper triangle. Row 1 contains D such elements (columns 1 to D), row 2 contains D - 1 (columns 2 to D), and so on, down to a single element in row D. Summing these gives:

D + (D - 1) + (D - 2) + ... + 1 = D(D + 1)/2

Equivalently, counting the diagonal and the strict upper triangle separately: D (diagonal) + D(D - 1)/2 (strict upper triangle) = D(D + 1)/2 independent parameters.

  • Anti-Symmetric Matrix: For an anti-symmetric matrix, we only need to specify the elements in the strict upper triangle: the diagonal elements are forced to be zero, and the lower-triangle elements are the negatives of their upper-triangle counterparts. This gives D(D - 1)/2 independent parameters.

3. General Case (Arbitrary Matrix):

An arbitrary D x D matrix w_ij has D^2 independent entries in total, and the decomposition into a symmetric part (1/2)(w + wᵀ) and an anti-symmetric part (1/2)(w - wᵀ) splits them into D(D + 1)/2 symmetric parameters plus D(D - 1)/2 anti-symmetric parameters (indeed D(D + 1)/2 + D(D - 1)/2 = D^2). However, as shown in the previous section, the anti-symmetric part contributes nothing to the second-order term Σ_{i,j} w_ij x_i x_j. Only the symmetric part matters.

In conclusion, the second-order term of a polynomial in D dimensions is determined by just D(D + 1)/2 independent parameters: the diagonal elements plus the unique elements of the upper triangle of the symmetric part of the coefficient matrix.
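
The counting can also be sanity-checked numerically. The sketch below rebuilds the symmetric part of a coefficient matrix from only its D(D + 1)/2 upper-triangular entries and confirms the second-order term is unchanged; the dimension and random values are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4
W = rng.normal(size=(D, D))           # arbitrary coefficient matrix
x = rng.normal(size=D)

S = 0.5 * (W + W.T)                   # symmetric part of W

# Independent parameters: diagonal plus strict upper triangle of S.
iu = np.triu_indices(D)               # indices of the upper triangle (incl. diagonal)
params = S[iu]                        # the D(D + 1)/2 independent values
print(len(params), D * (D + 1) // 2)  # both equal D(D + 1)/2

# Rebuild S from those parameters alone and check the quadratic form.
S_rebuilt = np.zeros((D, D))
S_rebuilt[iu] = params
S_rebuilt = S_rebuilt + S_rebuilt.T - np.diag(np.diag(S_rebuilt))
print(np.isclose(x @ W @ x, x @ S_rebuilt @ x))  # True
```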
