Distributed Sparse Multicategory Discriminant Analysis
This work addresses the challenge of distributed data analysis for classification tasks, but it appears to be incremental as it extends existing sparse discriminant analysis methods to a distributed context.
The paper tackles the problem of performing sparse multicategory linear discriminant analysis when data are distributed across multiple sites. It proposes a convex formulation and extends it to the distributed setting. The results show that, after a few communication rounds, the distributed version performs as well as the centralized version, with numerical studies supporting the methodology and theory.
This paper proposes a convex formulation for sparse multicategory linear discriminant analysis and then extends it to the distributed setting, where data are stored across multiple sites. The key observation is that, for the purpose of classification, it suffices to recover the discriminant subspace, which is invariant to orthogonal transformations. Theoretically, we establish statistical properties ensuring that the distributed sparse multicategory linear discriminant analysis performs as well as the centralized version after a few rounds of communication. Numerical studies lend strong support to our methodology and theory.
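The key observation in the abstract can be illustrated numerically. The sketch below is not the paper's estimator: it uses a plain, non-sparse, centralized plug-in LDA on simulated Gaussian data, and the helper name `lda_scores` and all simulation settings are assumptions made for illustration. It checks only that the discriminant scores computed after projecting onto a basis `V` of the discriminant subspace are unchanged when `V` is replaced by `V @ Q` for an orthogonal matrix `Q`.

```python
import numpy as np

def lda_scores(X, V, means, Sigma, priors):
    """Linear discriminant scores computed after projecting onto span(V).

    Hypothetical helper for illustration; not taken from the paper.
    """
    Z = X @ V                      # (n, d) projected samples
    M = means @ V                  # (K, d) projected class means
    S = V.T @ Sigma @ V            # (d, d) projected covariance
    A = np.linalg.solve(S, M.T)    # (d, K): column k is S^{-1} m_k
    # z^T S^{-1} m_k - 0.5 * m_k^T S^{-1} m_k + log(prior_k)
    return Z @ A - 0.5 * np.sum(M.T * A, axis=0) + np.log(priors)

# Simulated data (assumed settings): K classes, p features, shared identity covariance.
rng = np.random.default_rng(0)
K, p, n_per = 3, 10, 100
true_means = 2.0 * rng.standard_normal((K, p))
X = np.vstack([rng.standard_normal((n_per, p)) + true_means[k] for k in range(K)])
y = np.repeat(np.arange(K), n_per)

# Plug-in estimates: class means, pooled covariance, class priors.
means = np.vstack([X[y == k].mean(axis=0) for k in range(K)])
resid = X - means[y]
Sigma = resid.T @ resid / (len(X) - K)
priors = np.bincount(y) / len(y)

# One basis of the discriminant subspace: Sigma^{-1} (mu_k - mu_1), k = 2..K.
V = np.linalg.solve(Sigma, (means[1:] - means[0]).T)    # (p, K-1)

# Rotate the basis by a random orthogonal matrix Q.
Q, _ = np.linalg.qr(rng.standard_normal((K - 1, K - 1)))

scores_V = lda_scores(X, V, means, Sigma, priors)
scores_VQ = lda_scores(X, V @ Q, means, Sigma, priors)
pred_V = scores_V.argmax(axis=1)
pred_VQ = scores_VQ.argmax(axis=1)

print("max score difference:", np.abs(scores_V - scores_VQ).max())
print("prediction agreement:", (pred_V == pred_VQ).mean())
```

The scores agree up to floating-point error because the projected sample, the projected means, and the projected covariance all transform consistently under `Q`, so `Q` cancels in each quadratic form. This is why an estimator only needs to recover the subspace rather than a particular basis of it.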