LG · ME · Jun 6

Orthogonality and Dimensionality in Airline Cluster Analysis using PCA and Kernel PCA

Andreas Schlapbach
arXiv:2606.08322v14.8
AI Analysis

For researchers using clustering on high-dimensional economic data, this work shows that PCA-based dimensionality reduction can preserve cluster structure and that collinearity can mislead cluster count selection.

The paper replicates a clustering experiment on US airline profit cycles (1995-2020) and finds that k-means in 3D PCA space yields assignments identical to those in the 7D raw space, while kernel PCA reveals no nonlinear structure. Silhouette analysis suggests the data supports only three clusters, not six; collinearity in the raw 7D space suppresses the silhouette signal that would otherwise point to k=3.

To characterize the US airline profit cycles from 1995 to 2020, Renold et al. (2023) combine k-means clustering, principal component analysis, and system dynamics modelling. Using the dataset included in their paper, we replicate their clustering experiment in three spaces: the original 7-dimensional raw-variable space, a 3-dimensional PC score space, and a 4-dimensional PC score space. We show that the six-cluster taxonomy is geometrically robust: k-means in 3-PC space produces cluster assignments that are bit-for-bit identical to those in 7D raw space. As a nonlinearity check we apply kernel PCA under six kernels: five nonlinear kernels spanning three families, plus a linear baseline. All six kernels preserve the six-cluster assignment in 2D. A 1D diagnostic tightens this: the linear kernel conflates the COVID year C_3 with the peak-profit cluster C_0, whereas all five non-baseline kernels shift C_3 so that it overlaps only the post-financial-crisis cluster C_5. Agreement across the kernel families confirms an intrinsically linear manifold with no hidden curvature. The silhouette criterion reveals that the dataset structurally supports only three clusters, not six. Collinearity in the raw 7D space suppresses the silhouette signal that would otherwise identify k=3 as the structurally motivated choice.
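The comparison described in the abstract (k-means in the raw 7D space versus a 3-PC score space, followed by a silhouette scan over k) can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' code and not the Renold et al. (2023) dataset. The helper names `make_synthetic` and `cluster`, the standardization step, the cluster layout, and the noise levels are all assumptions made for the example, so the printed silhouette values say nothing about the paper's k=3 finding.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.preprocessing import StandardScaler


def make_synthetic(n_per_cluster=10, seed=0):
    """Hypothetical stand-in data: six groups in a 3D latent space,
    linearly mixed into seven strongly collinear variables."""
    rng = np.random.default_rng(seed)
    # Six well-separated centres on the coordinate axes (an assumption).
    centers = 8.0 * np.vstack([np.eye(3), -np.eye(3)])
    latent = np.vstack(
        [c + 0.3 * rng.standard_normal((n_per_cluster, 3)) for c in centers]
    )
    mixing = rng.standard_normal((3, 7))  # 3 latent factors -> 7 variables
    noise = 0.05 * rng.standard_normal((latent.shape[0], 7))
    return latent @ mixing + noise


def cluster(X, k, seed=0):
    """Run k-means with several restarts and return the label vector."""
    return KMeans(n_clusters=k, n_init=20, random_state=seed).fit_predict(X)


# Standardize the 7 variables, then project onto the first 3 principal components.
X_raw = StandardScaler().fit_transform(make_synthetic())
X_pc3 = PCA(n_components=3).fit_transform(X_raw)

# Compare the six-cluster partitions found in the two spaces.
# The adjusted Rand index is invariant to label permutations.
labels_raw = cluster(X_raw, k=6)
labels_pc3 = cluster(X_pc3, k=6)
ari = adjusted_rand_score(labels_raw, labels_pc3)
print(f"ARI between 7D and 3-PC assignments: {ari:.3f}")

# Silhouette scan over candidate cluster counts in both spaces.
silhouettes = {}
for k in range(2, 8):
    s_raw = silhouette_score(X_raw, cluster(X_raw, k))
    s_pc3 = silhouette_score(X_pc3, cluster(X_pc3, k))
    silhouettes[k] = (s_raw, s_pc3)
    print(f"k={k}: silhouette raw={s_raw:.3f}, 3-PC={s_pc3:.3f}")
```

An adjusted Rand index of 1 means the two partitions agree up to a relabelling of clusters, which is the sense in which assignments in different spaces can be called identical. Comparing the silhouette columns across k is one way to see whether the raw-space and PC-space scores favour the same cluster count.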

