Gradient descent multiplies the gradient by a scalar called the learning rate (sometimes called the step size) to determine the position of the next point. If the learning rate is too small, convergence is slow; if it is too large, the cost function oscillates.
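As a concrete illustration, here is a minimal Python sketch of that update rule. The quadratic objective f(x) = x², the starting point, and the specific learning rates are illustrative assumptions, not values from the text; they simply make the slow-convergence and oscillation regimes visible.

```python
def gradient_descent(grad_fn, x0, learning_rate, num_steps=100):
    """Plain gradient descent: next point = current point - learning_rate * gradient."""
    x = x0
    for _ in range(num_steps):
        x = x - learning_rate * grad_fn(x)
    return x

# Toy objective f(x) = x^2 with gradient 2x; the minimum is at x = 0.
grad = lambda x: 2 * x

print(gradient_descent(grad, x0=5.0, learning_rate=0.1))   # converges toward 0
print(gradient_descent(grad, x0=5.0, learning_rate=0.001)) # too small: still far from 0 after 100 steps
print(gradient_descent(grad, x0=5.0, learning_rate=1.1))   # too large: iterates oscillate and blow up
```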

Latest Content

We consider the problem of hyperparameter tuning in training neural networks with user-level differential privacy (DP). Existing approaches for DP training (e.g., DP Federated Averaging) involve bounding the contribution of each user's model update by clipping it to a fixed norm. However, there is no good a priori setting of the clipping norm across tasks and learning settings: the update norm distribution depends on the model architecture and loss, the amount of data on each device, the client learning rate, and possibly various other parameters. In this work, we propose a method wherein, instead of using a fixed clipping norm, one clips to a value at a specified quantile of the distribution of update norms, where the value at the quantile is itself estimated online, with differential privacy. Experiments demonstrate that adaptive clipping to the median update norm works well across a range of federated learning problems, eliminating the need to tune any clipping hyperparameter.
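The abstract does not spell out the estimator, but one plausible sketch consistent with it is the following: each user reports a single bit indicating whether its update norm fell at or below the current clip value, the server averages those bits with noise (for DP), and the clip value is nudged multiplicatively until the noisy fraction matches the target quantile. The function names, the Gaussian noise stand-in, and the step size below are illustrative assumptions, not the paper's exact mechanism.

```python
import numpy as np

def adaptive_clip_round(update_norms, clip_norm, target_quantile=0.5,
                        eta=0.2, noise_std=0.01):
    """One round of online quantile estimation for the clip norm (sketch).

    Each user contributes a bit b_i = 1 if its update norm <= current clip
    value. The server averages the bits with Gaussian noise as a DP stand-in,
    then moves the clip norm geometrically toward the target quantile.
    """
    bits = (update_norms <= clip_norm).astype(float)           # per-user indicator bits
    noisy_frac = bits.mean() + np.random.normal(0, noise_std)  # noisy quantile estimate
    # Too few updates under the clip -> grow it; too many -> shrink it.
    return clip_norm * np.exp(-eta * (noisy_frac - target_quantile))

# Toy simulation: track the median of update norms centered around 3.0.
rng = np.random.default_rng(0)
clip = 1.0
for _ in range(200):
    norms = np.abs(rng.normal(3.0, 1.0, size=100))
    clip = adaptive_clip_round(norms, clip)
print(clip)  # settles near the median update norm (~3.0)
```

With the median as the target quantile, this tracks exactly the statistic the abstract reports working well, without any hand-tuned clipping norm.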

