Kernel approximation is widely used to scale up kernel SVM training and prediction. However, the memory and computation costs of kernel approximation models are still too high if we want to deploy them on memory-limited devices such as mobile phones, smartwatches, and IoT devices. To address this challenge, we propose a novel memory- and computation-efficient kernel SVM model that uses both binary embedding and binary model coefficients. First, we propose an efficient way to generate a compact binary embedding of the data that preserves the kernel similarity. Second, we propose a simple but effective algorithm to learn a linear classification model with ternary coefficients that supports different types of loss functions and regularizers. Our algorithm can achieve better generalization accuracy than existing works on learning binary coefficients, since we allow the coefficients to be $-1$, $0$, or $1$ during the training stage, and coefficients equal to $0$ can be removed during model inference for binary classification. Moreover, we provide a detailed analysis of the convergence of our algorithm and the inference complexity of our model. The analysis shows that convergence to a local optimum is guaranteed, and that the inference complexity of our model is much lower than that of other competing methods. Our experimental results on five large real-world datasets demonstrate that our proposed method can build accurate nonlinear SVM models with memory costs below 30KB.
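To make the two ingredients above concrete, here is a minimal sketch, not the paper's actual construction: it uses the standard sign-of-random-projection embedding, whose bitwise agreement rate approximates the angular kernel $k(x,y) = 1 - \theta(x,y)/\pi$, together with inference under ternary coefficients in $\{-1, 0, +1\}$, where bits with coefficient $0$ never need to be stored or scored. The names `binary_embed` and `ternary_predict`, and all sizes and values, are hypothetical illustration choices.

```python
# Sketch only: a standard sign-of-random-projection binary embedding
# (Hamming similarity ~ angular kernel) plus inference with ternary
# coefficients; this is an assumed stand-in, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

def binary_embed(X, W):
    """Embed rows of X into {0,1}^m via signs of random projections X @ W."""
    return (X @ W > 0).astype(np.uint8)

def ternary_predict(bits, w_ternary, bias=0.0):
    """Score = (sum of bits with weight +1) - (sum of bits with weight -1).
    Bits whose coefficient is 0 are simply dropped at inference time."""
    pos = bits[:, w_ternary == 1].sum(axis=1)
    neg = bits[:, w_ternary == -1].sum(axis=1)
    return np.where(pos - neg + bias > 0, 1, -1)

# Toy usage: Hamming similarity of embeddings tracks angular similarity.
d, m = 16, 512                       # input dim, number of embedding bits
W = rng.standard_normal((d, m))      # random Gaussian hyperplanes
x = rng.standard_normal(d)
y = x + 0.1 * rng.standard_normal(d)             # a near-duplicate of x
bx, by = binary_embed(np.stack([x, y]), W)
hamming_sim = (bx == by).mean()                  # approximates 1 - theta/pi
theta = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
print(f"Hamming sim {hamming_sim:.3f} vs 1 - theta/pi {1 - theta/np.pi:.3f}")

w = rng.choice([-1, 0, 1], size=m)   # hypothetical learned ternary weights
print(ternary_predict(np.stack([bx, by]), w))
```

Since both the embedding and the coefficients take at most two bits per dimension, the whole model fits in roughly $2m$ bits plus one bias, which is the source of the memory savings the abstract claims.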