Statistical Learning Theory (SLT) provides the theoretical foundation to ensure that a supervised algorithm generalizes the mapping $f: \mathcal{X} \to \mathcal{Y}$, given that $f$ is selected from its search space bias $\mathcal{F}$. This formal result depends on the Shattering coefficient function $\mathcal{N}(\mathcal{F},2n)$ to upper-bound the divergence between empirical and expected risk under the empirical risk minimization principle. From this bound one can estimate the training sample size needed to ensure probabilistic learning convergence and, most importantly, characterize the capacity of $\mathcal{F}$, including its tendency to underfit or overfit specific target problems. In this context, we propose a new approach to estimate the maximal number of hyperplanes required to shatter a given sample, i.e., to separate every pair of points from one another, based on the recent contributions by Har-Peled and Jones in the dataset partitioning scenario. We use this foundation to analytically compute the Shattering coefficient function for both binary and multi-class problems. As main contributions, one can use our approach to study the complexity of the search space bias $\mathcal{F}$, estimate training sample sizes, and parametrize the number of hyperplanes a learning algorithm needs to address a supervised task, which is especially appealing for deep neural networks. Experiments illustrate the advantages of our approach in studying the search space $\mathcal{F}$ on synthetic data, one toy dataset, and two widely used deep learning benchmarks (MNIST and CIFAR-10). To permit reproducibility and the use of our approach, our source code is available at~\url{https://bitbucket.org/rodrigo_mello/shattering-rcode}.
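To make the link between the Shattering coefficient and the training sample size concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard SLT bound $P(\sup_{f \in \mathcal{F}} |R_{\text{emp}}(f) - R(f)| > \epsilon) \le 2\,\mathcal{N}(\mathcal{F},2n)\exp(-n\epsilon^2/4)$ and searches for the smallest $n$ at which the right-hand side drops to a chosen $\delta$. The function names, the parameters `eps` and `delta`, and the quadratic Shattering coefficient in the usage lines are hypothetical placeholders, not values from the paper.

```python
import math
from typing import Callable


def log_bound(log_shatter: Callable[[int], float], n: int, eps: float) -> float:
    """Return log(2 * N(F, 2n) * exp(-n * eps^2 / 4)).

    log_shatter(m) must return log N(F, m); working in log space
    avoids overflow when the Shattering coefficient is large.
    """
    return math.log(2.0) + log_shatter(2 * n) - n * eps ** 2 / 4.0


def min_sample_size(log_shatter: Callable[[int], float], eps: float,
                    delta: float, n_max: int = 10 ** 12) -> int:
    """Smallest n whose bound is <= delta.

    Assumption: once the bound falls below delta it stays below, which
    holds when N(F, m) grows polynomially in m.
    """
    target = math.log(delta)
    hi = 1
    # Double n until the bound is small enough.
    while log_bound(log_shatter, hi, eps) > target:
        hi *= 2
        if hi > n_max:
            raise ValueError("bound does not reach delta within n_max")
    lo = hi // 2  # the bound is still above delta here (or lo == 0)
    # Bisect between the last failing and the first passing sample size.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if log_bound(log_shatter, mid, eps) <= target:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical example: N(F, m) = m^2, i.e. log N(F, m) = 2 * log(m).
quadratic = lambda m: 2.0 * math.log(m)
n_required = min_sample_size(quadratic, eps=0.1, delta=0.05)
print(n_required)
```

Passing the logarithm of the Shattering coefficient as a callable keeps the sketch independent of how $\mathcal{N}(\mathcal{F},2n)$ is obtained, so an analytically computed function can be substituted for the placeholder.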