Data-Free/Data-Sparse Softmax Parameter Estimation with Structured Class Geometries
This note addresses softmax parameter estimation in data-scarce scenarios, a problem relevant to machine learning practitioners, but the contribution is incremental because it builds on existing geometric methods.
The paper tackles softmax parameter estimation without labeled data by using a priori geometric class boundary information, showing that it reduces to solving a linear system derived from convex polytopes, which can yield closed-form solutions and avoid costly optimization. It also identifies cases where such solutions fail to exist, proving that some classification problems cannot be learned with softmax models.
This note considers softmax parameter estimation when little or no labeled training data is available, but a priori information about the relative geometry of class label log-odds boundaries is available. It is shown that "data-free" softmax model synthesis corresponds to solving a linear system of parameter equations, wherein desired dominant class log-odds boundaries are encoded via convex polytopes that decompose the input feature space. When solvable, the linear equations yield closed-form softmax parameter solution families using class boundary polytope specifications only. This allows softmax parameter learning to be implemented without expensive brute-force data sampling and numerical optimization. The linear equations can also be adapted to constrained maximum likelihood estimation in data-sparse settings. Because the linear parameter equations derived from certain polytope specifications may have no solution, it is also shown that there exist probabilistic classification problems over m convexly separable classes for which the log-odds boundaries cannot be learned using an m-class softmax model.
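The reduction to a linear system can be illustrated with a minimal sketch. It is not the note's own construction. It assumes each desired boundary between classes i and j is given as a hyperplane `normal @ x + offset = 0` with the normal pointing into the region where class i dominates, and it fixes the scale of every normal. That fixed scale is a simplifying assumption: a boundary is unchanged when its normal and offset are multiplied by a positive factor, so a specification this sketch reports as unsolvable may still be solvable after rescaling. The names `synthesize_softmax`, `boundaries`, and `theta` are hypothetical. Under these assumptions, the softmax log-odds between classes i and j are `(w_i - w_j) @ x + (b_i - b_j)`, so each boundary contributes one linear equation in the stacked parameters.

```python
import numpy as np

def synthesize_softmax(num_classes, boundaries):
    """Solve M @ theta = N for stacked softmax parameters theta.

    boundaries: list of (i, j, normal, offset), meaning class i should
    dominate class j wherever normal @ x + offset > 0 (fixed scale assumed).
    Returns (theta, solvable); row k of theta is [w_k, b_k].
    """
    dim = len(boundaries[0][2])
    M = np.zeros((len(boundaries), num_classes))   # +1 / -1 difference matrix
    N = np.zeros((len(boundaries), dim + 1))       # desired [normal, offset] rows
    for row, (i, j, normal, offset) in enumerate(boundaries):
        M[row, i] = 1.0
        M[row, j] = -1.0
        N[row, :dim] = normal
        N[row, dim] = offset
    # Least squares returns the minimum-norm member of the solution family;
    # adding the same vector to every row of theta leaves all log-odds unchanged.
    theta, *_ = np.linalg.lstsq(M, N, rcond=None)
    solvable = bool(np.allclose(M @ theta, N))
    return theta, solvable

# Three classes in 2D with mutually consistent boundary specifications.
consistent = [
    (0, 1, [1.0, 0.0], 0.0),
    (1, 2, [0.0, 1.0], 0.0),
    (0, 2, [1.0, 1.0], 0.0),   # equals the sum of the two rows above
]
theta, ok = synthesize_softmax(3, consistent)

# Flipping the last normal makes the fixed-scale system inconsistent.
inconsistent = consistent[:2] + [(0, 2, [-1.0, -1.0], 0.0)]
_, ok_bad = synthesize_softmax(3, inconsistent)

# Classify a point with the synthesized parameters (no training data used).
x = np.array([2.0, 0.5])
scores = theta[:, :2] @ x + theta[:, 2]
predicted = int(np.argmax(scores))
```

In this sketch, solvability is checked by testing whether the least-squares solution reproduces the requested differences exactly. A nonzero residual signals that the boundary specifications, at the chosen scales, cannot all be met by one set of softmax parameters.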