Well-known activation functions such as ReLU or Leaky ReLU are non-differentiable at the origin. Over the years, many smooth approximations of ReLU have been proposed using various smoothing techniques. We propose new smooth approximations of a non-differentiable activation function by convolving it with approximate identities. In particular, we present smooth approximations of Leaky ReLU and show that they outperform several well-known activation functions on various datasets and models. We call the resulting function the Smooth Activation Unit (SAU). Replacing ReLU with SAU, we obtain a 5.12% improvement with the ShuffleNet V2 (2.0x) model on the CIFAR100 dataset.
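As a concrete sketch of the construction (assuming a mean-zero Gaussian approximate identity with standard deviation $\sigma$; the exact kernel and parametrization used in the paper may differ), convolving Leaky ReLU with negative slope $\alpha$ against this kernel admits a closed form:

$$
\mathrm{SAU}_{\alpha,\sigma}(x) \;=\; \big(\mathrm{LeakyReLU}_{\alpha} * G_{\sigma}\big)(x)
\;=\; \alpha x \;+\; (1-\alpha)\Big[\, x\,\Phi\!\big(\tfrac{x}{\sigma}\big) \;+\; \sigma\,\varphi\!\big(\tfrac{x}{\sigma}\big) \Big],
$$

where $\varphi$ and $\Phi$ denote the standard normal density and distribution function. The result is smooth for every $\sigma > 0$, and as $\sigma \to 0$ the kernel approaches a Dirac delta, so the expression recovers Leaky ReLU itself.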