Debugging a complex algorithm can feel like solving a tricky puzzle, especially when working with multidimensional data. One such challenge arises when implementing Algorithm 2 from a research paper focused on Discriminant Analysis for Reduced Multidimensional Features.
The idea behind this algorithm is to reduce dimensionality while preserving class-discriminative features, making it useful for applications like machine learning and signal processing. However, implementing it with TensorLy, SciPy, and NumPy surfaced unexpected issues, such as non-convergence and complex values in the projection matrix.
This guide walks through the debugging process, from understanding the algorithm to identifying and resolving key issues.
How the Algorithm Works
Before debugging, it’s crucial to understand the internal mechanics of Algorithm 2.
Inputs Required
The algorithm requires the following inputs (a rough interface sketch follows the list):
- Multidimensional feature tensors representing different classes.
- A threshold value to determine convergence.
- Optimization parameters that influence the iterative process.
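As an illustration only, an implementation's entry point might look like this; the function and parameter names are hypothetical, not taken from the paper:
def reduced_discriminant_analysis(class_tensors, r, tol=1e-6, max_iter=200):
    """Hypothetical interface sketch.
    class_tensors : list of arrays, one multidimensional feature tensor per class
    r             : target (reduced) dimensionality
    tol           : convergence threshold for the iterative optimization
    max_iter      : safety cap on the number of iterations
    """
    ...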
Breaking Down the Steps
The procedure follows these core steps:
- Centralization of Classes: Each class tensor is mean-subtracted so that feature comparisons are fair.
- Computation of Sw (Within-Class Scatter Matrix) and Sb (Between-Class Scatter Matrix): These matrices quantify intra-class variations and inter-class separability.
- Iterative Optimization: The algorithm repeatedly adjusts a projection matrix (ψ) to maximize class separability; a sketch of this update step follows the list.
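The iterative step is easiest to picture with the trace-ratio formulation, where the objective is trace(ψᵀ Sb ψ) / trace(ψᵀ Sw ψ). Below is a minimal sketch of one update using the standard trace-ratio iteration; the paper's Algorithm 2 may differ in detail:
import numpy as np
from scipy.linalg import eigh
def trace_ratio_step(Sb, Sw, psi):
    # Current value of the trace ratio rho = tr(psi^T Sb psi) / tr(psi^T Sw psi)
    rho = np.trace(psi.T @ Sb @ psi) / np.trace(psi.T @ Sw @ psi)
    # Standard trace-ratio update: new psi = top-r eigenvectors of (Sb - rho * Sw)
    eigvals, eigvecs = eigh(Sb - rho * Sw)
    r = psi.shape[1]
    psi_new = eigvecs[:, np.argsort(-eigvals)[:r]]
    return psi_new, rho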
Role of Psi and F_matrix
The projection matrix (ψ) plays a critical role: it maps the high-dimensional data into a lower-dimensional space while maintaining the key distinctions between classes. The F_matrix is an auxiliary matrix used to refine this transformation so that classes remain well separated in the reduced space.
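Concretely, once ψ (a d × r matrix) has been learned, each sample is mapped into the reduced space by a single matrix product. A minimal illustration with placeholder values:
import numpy as np
X = np.random.randn(100, 5)    # example data: 100 samples, d = 5 features
psi = np.random.randn(5, 2)    # placeholder projection matrix, d = 5 -> r = 2
X_reduced = X @ psi            # projected data, shape (100, 2)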
Recognizing the Bug
While implementing Algorithm 2, two major issues emerged.
1. Non-Convergence
The optimization process failed to reach a stable solution: the algorithm either ran indefinitely or stalled at poor solutions, preventing reliable feature extraction.
2. Complex Values in ψ
Some projections unexpectedly contained entries with non-negligible imaginary parts, making the final transformation unusable for real-world applications.
Inspecting the Code
A closer look at the implementation revealed several potential reasons for these failures.
- Initializations of Sw and Sb were inconsistent, sometimes leading to poorly conditioned matrices.
- The trace ratio computation, central to optimizing ψ, behaved unpredictably due to numerical stability issues.
- Eigenvector calculations sometimes produced complex values, even when real eigenvalues were expected.
Here’s a snippet from the implementation that required debugging:
import numpy as np
# Assumed to be defined earlier: class_data (a list of (n_c, d) arrays, one per class) and r (target dimension)
num_classes = len(class_data)
d = class_data[0].shape[1]
overall_mean = np.mean(np.vstack(class_data), axis=0)
# Computing scatter matrices
Sw = np.zeros((d, d))
Sb = np.zeros((d, d))
for c in range(num_classes):
    X_c = class_data[c] - np.mean(class_data[c], axis=0)
    Sw += X_c.T @ X_c  # Within-class scatter
    mean_diff = np.mean(class_data[c], axis=0) - overall_mean
    Sb += np.outer(mean_diff, mean_diff) * len(class_data[c])
# Solve the generalized eigenvalue problem via explicit inversion (the problematic step)
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
# Sort eigenvectors by decreasing eigenvalue and keep the top r
sorted_indices = np.argsort(-eigvals.real)
psi = eigvecs[:, sorted_indices[:r]]
Fixing the Issues
Improving Algorithm Convergence
1. Adjusting Tolerance Values
A very tight convergence criterion can lead the iterative process to oscillate indefinitely. Changing the tolerance value from 1e-9 to 1e-6 improved stability.
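As a rough sketch of what the relaxed stopping rule looks like, reusing the trace_ratio_step sketch from earlier and assuming Sb, Sw, and an initial psi are already available (names are illustrative):
import numpy as np
tol, max_iter = 1e-6, 200          # relaxed tolerance; 1e-9 caused oscillation
prev_obj = -np.inf
for it in range(max_iter):
    psi, obj = trace_ratio_step(Sb, Sw, psi)
    if abs(obj - prev_obj) < tol:  # stop once the objective stops improving
        break
    prev_obj = obj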
2. Improving Trace Ratio Calculation
Instead of direct matrix inversion, using SciPy's solver for the generalized eigenvalue problem gave better numerical stability:
from scipy.linalg import eigh
eigvals, eigvecs = eigh(Sb, Sw)  # Solves Sb v = λ Sw v without forming inv(Sw)
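One caveat worth noting (a common remedy, not part of the original fix): eigh(Sb, Sw) requires Sw to be symmetric positive definite, so if Sw is close to singular a tiny ridge term keeps the solver well behaved:
d = Sw.shape[0]
Sw_reg = Sw + 1e-6 * np.trace(Sw) / d * np.eye(d)  # small ridge, scaled to Sw's magnitude
eigvals, eigvecs = eigh(Sb, Sw_reg)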
Handling Complex Numbers in ψ
1. Ensuring Real Eigenvalues
To keep ψ real, the scatter matrices are symmetrized before the eigen-decomposition; rounding errors otherwise leave Sb and Sw slightly asymmetric, which is enough to introduce unexpected imaginary components.
Sw = 0.5 * (Sw + Sw.T) # Force symmetry
Sb = 0.5 * (Sb + Sb.T)
2. Filtering out Small Imaginary Parts
If tiny imaginary components still appeared due to floating-point precision, they were safely discarded:
psi = np.real_if_close(psi, tol=1e-6)  # with tol <= 1, NumPy treats this as an absolute tolerance on the imaginary part
Validating the Fixes
After applying these changes, thorough testing was needed to confirm their effectiveness.
Generating Sample Data
A synthetic dataset generated with NumPy helped verify whether the debugged algorithm produced meaningful outputs.
import numpy as np
np.random.seed(42)
class1 = np.random.randn(100, 5) + 1  # class 1: Gaussian cloud shifted by +1
class2 = np.random.randn(100, 5) - 1  # class 2: Gaussian cloud shifted by -1
class_data = [class1, class2]
Executing the Algorithm
Execution with the modified parameters produced real-valued transformations, and convergence was significantly faster.
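For reference, here is roughly what the corrected pipeline looks like on the synthetic data above; it condenses the fixed snippets from earlier and is a sketch rather than the paper's exact Algorithm 2:
import numpy as np
from scipy.linalg import eigh
r = 1                                           # one discriminant direction for two classes
d = class_data[0].shape[1]
overall_mean = np.mean(np.vstack(class_data), axis=0)
Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
for X in class_data:
    Xc = X - X.mean(axis=0)
    Sw += Xc.T @ Xc                             # within-class scatter
    diff = X.mean(axis=0) - overall_mean
    Sb += len(X) * np.outer(diff, diff)         # between-class scatter
Sw, Sb = 0.5 * (Sw + Sw.T), 0.5 * (Sb + Sb.T)   # enforce symmetry
eigvals, eigvecs = eigh(Sb, Sw)                 # generalized eigenproblem Sb v = lambda Sw v
psi = eigvecs[:, np.argsort(-eigvals)[:r]]      # top-r discriminant directions
print(psi.dtype)                                # float64, i.e. no complex entries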
Comparing Results
The corrected implementation produced clearly separated feature transformations, confirmed by plotting projected data distributions.
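For the visual check, a few lines of matplotlib are enough; this sketch assumes psi and class_data from the run above:
import matplotlib.pyplot as plt
for label, X in zip(["class 1", "class 2"], class_data):
    proj = (X @ psi).ravel()                    # 1-D projection of each class
    plt.hist(proj, bins=30, alpha=0.5, label=label)
plt.xlabel("projected feature value")
plt.legend()
plt.show()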
Key Takeaways
Debugging Algorithm 2 required:
- Stabilizing matrix computations to prevent convergence failures.
- Refining trace ratio calculations for better numerical stability.
- Eliminating complex numbers by enforcing symmetry in scatter matrices.
These improvements made the discriminant algorithm more robust and efficient for reducing multidimensional features.
Appendix
Alternative Eigenvalue Computation
Sometimes the generalized Schur (QZ) decomposition is more numerically reliable than computing eigenvalues directly. In SciPy this is exposed as scipy.linalg.qz and scipy.linalg.ordqz (scipy.linalg.schur handles only a single matrix):
from scipy.linalg import ordqz
# QZ decomposition of the pair (Sb, Sw); the sort callable moves generalized
# eigenvalues with |alpha/beta| > 1 to the top-left corner
AA, BB, alpha, beta, Q, Z = ordqz(Sb, Sw, sort=lambda a, b: np.abs(a) > np.abs(b))
psi = Z[:, :r]  # first r columns span the dominant deflating subspace (assumes at least r eigenvalues were sorted to the top)
If you’re implementing something similar, consider checking eigenvalue behavior closely. Many numerical pitfalls can arise due to small floating-point inaccuracies.
What other techniques have you found useful for stabilizing discriminant-based dimensionality reduction? Let’s discuss in the comments!