We tackle the problem of automatic portrait matting on mobile devices. The proposed model is aimed at attaining real-time inference on mobile devices with minimal degradation of model performance. Our model MMNet, based on multi-branch dilated convolution with linear bottleneck blocks, outperforms the state-of-the-art model while being orders of magnitude faster. The model can be accelerated four times to attain 30 FPS on a Xiaomi Mi 5 device with only a moderate increase in gradient error. Under the same conditions, our model has an order of magnitude fewer parameters and is faster than Mobile DeepLabv3 while maintaining comparable performance. The accompanying implementation can be found at \url{https://github.com/hyperconnect/MMNet}.