How to calculate optimal batch size?
From the recent Deep Learning book by Goodfellow et al., chapter 8: Minibatch sizes are generally driven by the following factors: Larger batches provide a more accurate estimate of the gradient, but with less than linear returns. Multicore architectures are usually underutilized by extremely small batches. This motivates using some absolute minimum batch size, below … Read more