We give the first agnostic, efficient, proper learning algorithm for monotone Boolean functions. Given $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ uniformly random examples of an unknown function $f:\{\pm 1\}^n \rightarrow \{\pm 1\}$, our algorithm outputs a hypothesis $g:\{\pm 1\}^n \rightarrow \{\pm 1\}$ that is monotone and $(\mathrm{opt} + \varepsilon)$-close to $f$, where $\mathrm{opt}$ is the distance from $f$ to the closest monotone function. The running time of the algorithm (and consequently the size and evaluation time of the hypothesis) is also $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$, nearly matching the lower bound of Blais et al. (RANDOM '15). We also give an algorithm that estimates the distance of an unknown function $f$ to monotone up to additive error $\varepsilon$, running in time $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$. Previously, sample-efficient algorithms were known for both of these problems, but those algorithms were not run-time efficient; our work thus closes this gap between run-time and sample complexity. This work builds on the improper learning algorithm of Bshouty and Tamon (JACM '96) and the proper semiagnostic learning algorithm of Lange, Rubinfeld, and Vasilyan (FOCS '22), which first obtains a non-monotone Boolean-valued hypothesis and then ``corrects'' it to a monotone one using query-efficient local computation algorithms on graphs. This black-box correction approach can achieve no error better than $2\,\mathrm{opt} + \varepsilon$ information-theoretically; we bypass this barrier by (a) augmenting the improper learner with a convex optimization step, and (b) learning and correcting a real-valued function before rounding its values to Boolean. Our real-valued correction algorithm solves the ``poset sorting'' problem of [LRV22] for functions over general posets with non-Boolean labels.