Fast Algorithms for Segmented Regression
This work addresses the computational bottleneck in segmented regression on large datasets, where existing rigorous methods based on dynamic programming take time quadratic in the sample size.
The paper gives near-linear time algorithms for recovering a piecewise linear function from noisy samples. In the reported experiments, these algorithms are about three orders of magnitude faster than dynamic programming approaches, while their convergence rate is worse by only a factor of 2 to 4.
We study the fixed design segmented regression problem: given noisy samples from a piecewise linear function $f$, we want to recover $f$ up to a desired accuracy in mean-squared error. Previous rigorous approaches to this problem rely on dynamic programming (DP) and, while sample efficient, have running time quadratic in the sample size. As our main contribution, we provide new algorithms for the problem whose running time is near-linear in the sample size. Although they are not minimax optimal, they achieve a significantly better sample-time tradeoff on large datasets than the DP approach. Our experimental evaluation shows that, compared with the DP approach, our algorithms provide a convergence rate that is only off by a factor of $2$ to $4$, while achieving speedups of three orders of magnitude.
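To make the quadratic cost of the DP baseline concrete, here is a minimal sketch of a textbook dynamic program for segmented regression. It is an illustration under simplifying assumptions, not the paper's algorithm and not necessarily the exact DP the authors compare against: it assumes one-dimensional inputs sorted by $x$, an independent least-squares line on each piece, and at most `k` pieces. The name `segmented_regression_dp` and its helpers are hypothetical. The inner loops examine every pair of candidate segment endpoints, which is the source of the quadratic dependence on the sample size.

```python
from itertools import accumulate


def prefix(values):
    """Prefix sums with a leading 0, so sum(values[i:j]) == p[j] - p[i]."""
    return [0.0] + list(accumulate(values))


def segmented_regression_dp(xs, ys, k):
    """Fit at most k linear pieces to points sorted by x.

    Returns (total squared error, interior breakpoints), where each
    breakpoint is the index at which a new piece starts.
    Illustrative baseline only: O(k * n^2) time, 1-D inputs assumed.
    """
    n = len(xs)
    sx, sy = prefix(xs), prefix(ys)
    sxx = prefix(x * x for x in xs)
    sxy = prefix(x * y for x, y in zip(xs, ys))
    syy = prefix(y * y for y in ys)

    def cost(i, j):
        """Squared error of the least-squares line on points i..j-1, in O(1)."""
        m = j - i
        sum_x = sx[j] - sx[i]
        sum_y = sy[j] - sy[i]
        cxx = (sxx[j] - sxx[i]) - sum_x * sum_x / m
        cxy = (sxy[j] - sxy[i]) - sum_x * sum_y / m
        cyy = (syy[j] - syy[i]) - sum_y * sum_y / m
        if cxx <= 1e-12:  # all x equal: the best fit is a constant
            return max(cyy, 0.0)
        return max(cyy - cxy * cxy / cxx, 0.0)

    inf = float("inf")
    # best[s][j]: minimum error covering the first j points with exactly s pieces
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for s in range(1, k + 1):
        for j in range(1, n + 1):
            for i in range(s - 1, j):  # the last piece covers points i..j-1
                if best[s - 1][i] == inf:
                    continue
                c = best[s - 1][i] + cost(i, j)
                if c < best[s][j]:
                    best[s][j] = c
                    back[s][j] = i

    # Choose the best number of pieces, then walk the back-pointers.
    s = min(range(1, k + 1), key=lambda t: best[t][n])
    err = best[s][n]
    starts = []
    j = n
    while s > 0:
        j = back[s][j]
        starts.append(j)
        s -= 1
    return err, starts[::-1][1:]  # drop the start of the first piece (index 0)
```

For example, on ten noise-free points with `xs = list(range(10))` and `ys = [0, 1, 2, 3, 4, 10, 8, 6, 4, 2]`, calling `segmented_regression_dp(xs, ys, 2)` recovers zero error and the single breakpoint `[5]`. Prefix sums make each segment cost an $O(1)$ lookup, so the remaining cost is the enumeration of endpoint pairs.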