Keras uses way too much GPU memory when calling train_on_batch, fit, etc
It is a very common mistake to forget that the activations, gradients and optimizer moment tracking variables also take VRRAM, not just the parameters, increasing memory usage quite a bit. The backprob calculations themselves make it so the training phase takes almost double the VRAM of forward / inference use of the neural net, and … Read more