The VC-Dimension of Similarity Hypotheses Spaces
This provides a theoretical foundation for analyzing similarity-based learning in machine learning, though it is incremental as it extends existing VC-dimension concepts.
The paper tackles the problem of determining the VC-dimension of similarity hypothesis spaces derived from binary classifiers, showing that it is asymptotically equivalent to the VC-dimension of the original hypothesis space, specifically in Θ(vc-dimension(H)).
Given a set $X$ and a function $h:X\longrightarrow\{0,1\}$ which labels each element of $X$ with either $0$ or $1$, we may define a function $h^{(s)}$ to measure the similarity of pairs of points in $X$ according to $h$. Specifically, for $h\in \{0,1\}^X$ we define $h^{(s)}\in \{0,1\}^{X\times X}$ by $h^{(s)}(w,x):= \mathbb{1}[h(w) = h(x)]$. This idea can be extended to a set of functions, or hypothesis space $\mathcal{H} \subseteq \{0,1\}^X$ by defining a similarity hypothesis space $\mathcal{H}^{(s)}:=\{h^{(s)}:h\in\mathcal{H}\}$. We show that ${vc-dimension}(\mathcal{H}^{(s)}) \in Θ({vc-dimension}(\mathcal{H}))$.