The Benefits of Over-parameterization at Initialization in Deep ReLU Networks
arXiv: Machine Learning, 2019.
It has been noted in the existing literature that over-parameterization in ReLU networks generally leads to better performance. While there could be several reasons for this, we investigate desirable network properties at initialization that may be enjoyed by ReLU networks. Without making any assumptions, we derive a lower bound on the layer ...
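The abstract is truncated here, but the kind of initialization-time property it alludes to can be illustrated empirically. Below is a minimal sketch, assuming the property of interest is preservation of activation norms under He-style initialization (a standard choice for ReLU networks, with weight variance 2/fan_in); the function name, widths, and depth are illustrative choices, not values from the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def activation_norm_ratio(width, depth, n_trials=100, seed=0):
    """Empirically measure how well a He-initialized deep ReLU net
    preserves the norm of a unit-norm input, for a given layer width."""
    rng = np.random.default_rng(seed)
    ratios = []
    for _ in range(n_trials):
        h = rng.standard_normal(width)
        h /= np.linalg.norm(h)  # unit-norm input, so ||h_L|| is the ratio
        for _ in range(depth):
            # He initialization: variance 2/fan_in, compensating for ReLU
            # zeroing out roughly half the pre-activations in expectation
            W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
            h = relu(W @ h)
        ratios.append(np.linalg.norm(h))
    return np.mean(ratios), np.std(ratios)

for width in (16, 64, 256, 1024):
    mean, std = activation_norm_ratio(width, depth=10)
    print(f"width={width:5d}  ||h_L||/||h_0|| = {mean:.3f} +/- {std:.3f}")
```

Running this, the norm ratio concentrates around 1 as the width grows, consistent with the abstract's theme that wider (more over-parameterized) layers enjoy better-behaved properties at initialization; the precise width lower bound is what the paper derives.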