The Many-to-Many Mapping Between the Concordance Correlation Coefficient and the Mean Square Error
This work fills a gap for researchers who rely on utility functions in fields such as sequence prediction and assay validation; the contribution is incremental in that it builds on existing metrics.
The paper derives the mapping between the mean square error (MSE) and the concordance correlation coefficient (CCC), showing that a lower MSE does not always yield a higher CCC and giving precise conditions for CCC optimisation at a given MSE.
We derive the mapping between two of the most pervasive utility functions, the mean square error ($MSE$) and the concordance correlation coefficient (CCC, $ρ_c$). Despite its drawbacks, $MSE$ is one of the most popular performance metrics (and loss functions); more recently, $ρ_c$ has been used alongside it in many sequence prediction challenges. Despite their ever-growing simultaneous use, e.g., in inter-rater agreement and assay validation, a mapping between the two metrics has been missing to date. While minimisation of the $L_p$ norm of the errors, or of its positive powers (e.g., $MSE$), is aimed at $ρ_c$ maximisation, we explain the often-witnessed ineffectiveness of this popular loss function with graphical illustrations. The discovered formula not only uncovers the counterintuitive revelation that `$MSE_1<MSE_2$' does not imply `$ρ_{c_1}>ρ_{c_2}$', but also provides the precise range of the $ρ_c$ metric for a given $MSE$. We discover the conditions for $ρ_c$ optimisation for a given $MSE$ and, as a logical next step, for a given set of errors. We generalise and discover the conditions for any given $L_p$ norm, for an even $p$. We present newly discovered, albeit apparent, mathematical paradoxes. The study inspires and anticipates a growing use of $ρ_c$-inspired loss functions, e.g., $\left|\frac{MSE}{σ_{XY}}\right|$, replacing the traditional $L_p$-norm loss functions in multivariate regressions.
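The claim that `$MSE_1<MSE_2$' does not imply `$ρ_{c_1}>ρ_{c_2}$' can be checked on a toy example. The sketch below is illustrative only and is not taken from the paper's data or figures: the gold standard, the two prediction sequences, and the helper names `mse`, `covariance` and `ccc` are all hypothetical. It assumes population ($1/N$) statistics and the standard definition $ρ_c = 2σ_{XY} / (σ_X^2 + σ_Y^2 + (μ_X - μ_Y)^2)$. Under that assumption $MSE = σ_X^2 + σ_Y^2 + (μ_X - μ_Y)^2 - 2σ_{XY}$, so $ρ_c = 2σ_{XY} / (MSE + 2σ_{XY})$, which the code also verifies numerically.

```python
# Toy check, assuming population (1/N) statistics throughout.
# All data and helper names here are hypothetical, for illustration only.

def mean(values):
    return sum(values) / len(values)

def mse(x, y):
    """Mean square error between prediction x and gold standard y."""
    return mean([(a - b) ** 2 for a, b in zip(x, y)])

def covariance(x, y):
    """Population covariance; covariance(x, x) is the population variance."""
    mx, my = mean(x), mean(y)
    return mean([(a - mx) * (b - my) for a, b in zip(x, y)])

def ccc(x, y):
    """Concordance correlation coefficient from its standard definition."""
    numerator = 2.0 * covariance(x, y)
    denominator = covariance(x, x) + covariance(y, y) + (mean(x) - mean(y)) ** 2
    return numerator / denominator

def ccc_from_mse(x, y):
    """Same quantity via the identity rho_c = 2*s_xy / (MSE + 2*s_xy)."""
    two_sxy = 2.0 * covariance(x, y)
    return two_sxy / (mse(x, y) + two_sxy)

gold = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_a = [3.0, 3.0, 3.0, 3.0, 3.0]   # constant at the gold mean: low error, no covariance
pred_b = [0.0, 2.0, 4.0, 6.0, 8.0]   # follows the trend but overshoots its scale

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, "MSE =", mse(pred, gold), "CCC =", round(ccc(pred, gold), 4))
```

Here prediction A has the lower $MSE$ (2 against 3) yet also the lower $ρ_c$ (0 against $8/11$), because its covariance with the gold standard is zero. The example shows only that the ordering can reverse; it does not reproduce the paper's range or optimality conditions.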