The Least Wrong Model Is Not in the Data
The paper addresses a foundational issue in machine learning and statistics, relevant to researchers dealing with uncertainty and model selection. Its contribution is incremental, since it builds on existing theoretical concepts such as the Halting Problem.
The paper tackles model selection when multiple explanations for the data exist. It shows that the ideal predictive model is not computable and that the best computable model cannot be found, but that error bounds can be derived from description size.
The true process that generated the data cannot be determined when multiple explanations are possible. Prediction therefore requires a model of the probability that a process, chosen randomly from the set of candidate explanations, generates some future observation. The best model includes all of the information that is contained in the minimal description of the data but not in the data itself. This information is closely related to the Halting Problem and is logarithmic in the size of the data. Prediction is difficult because the ideal model is not computable, and the best computable model is not "findable." However, the error from any approximation can be bounded by the size of the description using the model.
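The idea of predicting with a mixture over candidate explanations can be illustrated with a small, fully computable stand-in. The sketch below is not the paper's construction: it assumes a finite, hand-picked set of Bernoulli sources (`CANDIDATES`), each with a hypothetical description length in bits, and weights each candidate by 2 to the power of minus that length. It then checks the standard mixture bound: the cumulative log loss of the mixture exceeds that of any single candidate by at most that candidate's description length.

```python
import math

# Hypothetical candidate explanations: (bias of a Bernoulli source,
# assumed description length in bits). The lengths are illustrative and
# chosen so that the prior weights 2**-length sum to exactly 1.
CANDIDATES = [
    (0.5, 1),
    (0.25, 2),
    (0.75, 3),
    (0.9, 3),
]


def likelihood(p, history):
    """Probability that a Bernoulli(p) source emits `history`."""
    prob = 1.0
    for bit in history:
        prob *= p if bit == 1 else 1.0 - p
    return prob


def mixture_predict(history):
    """P(next bit = 1 | history) under the length-weighted mixture."""
    # Unnormalised posterior weight: prior 2**-length times likelihood.
    weights = [2.0 ** -length * likelihood(p, history)
               for p, length in CANDIDATES]
    total = sum(weights)
    return sum(w * p for w, (p, _) in zip(weights, CANDIDATES)) / total


def log_loss_bits(predict, data):
    """Cumulative log loss (in bits) of a sequential predictor on `data`."""
    loss = 0.0
    for t, bit in enumerate(data):
        p_one = predict(data[:t])
        loss -= math.log2(p_one if bit == 1 else 1.0 - p_one)
    return loss


def fixed_model_loss(p, data):
    """Cumulative log loss (in bits) of a single Bernoulli(p) candidate."""
    return log_loss_bits(lambda _history: p, data)


if __name__ == "__main__":
    sample = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1]
    mix_loss = log_loss_bits(mixture_predict, sample)
    for p, length in CANDIDATES:
        gap = mix_loss - fixed_model_loss(p, sample)
        # The gap never exceeds the candidate's description length.
        print(f"p={p}: extra loss {gap:.3f} bits, bound {length} bits")
```

The bound holds here because the mixture assigns the whole sequence at least the prior weight times the candidate's probability. The ideal model discussed above ranges over all explanations rather than a short finite list, which is where computability fails; this sketch only shows the shape of the description-size bound.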