
VAE-GAN Part 3

While I was trying another method to improve my VAE-GAN implementation, I ran into an out-of-memory (OOM) error, so I refactored my code around the forward/backward computation, following PyTorch's DCGAN implementation. In that implementation, the input batch is fed to the generator and the discriminator separately, and the loss and gradients are computed each time. Meanwhile, my implementation fed the input batch through the network, computed all the losses first, and then updated each component one by one. It turned out that this not only produces gradients against already-updated models, but is also slower in PyTorch. After the refactoring, training became even more stable and converged faster, while yielding better results.
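Below is a minimal sketch of that per-component update ordering, in the style of the PyTorch DCGAN example. The `netG`/`netD` modules are placeholder stand-ins, not my actual VAE-GAN networks; the point is that each network's loss is backpropagated and its optimizer stepped before the next forward pass, with the generator output detached during the discriminator update so no gradients leak into a model that is about to change.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in modules; the real VAE-GAN also has an encoder.
netG = nn.Linear(16, 16)  # generator (decoder)
netD = nn.Linear(16, 1)   # discriminator
optG = torch.optim.Adam(netG.parameters(), lr=2e-4)
optD = torch.optim.Adam(netD.parameters(), lr=2e-4)
criterion = nn.BCEWithLogitsLoss()

real = torch.randn(8, 16)   # placeholder input batch
noise = torch.randn(8, 16)

# --- Discriminator step: forward/backward/update before touching G ---
optD.zero_grad()
loss_d_real = criterion(netD(real), torch.ones(8, 1))
loss_d_real.backward()
fake = netG(noise)
# detach() so this backward pass builds no gradients for the generator
loss_d_fake = criterion(netD(fake.detach()), torch.zeros(8, 1))
loss_d_fake.backward()
optD.step()

# --- Generator step: a fresh forward through the *updated* D ---
optG.zero_grad()
loss_g = criterion(netD(fake), torch.ones(8, 1))
loss_g.backward()
optG.step()
```

Because each loss is freed right after its `backward()` call, the graph for one component is never kept alive while another component updates, which also keeps peak memory lower than computing every loss up front.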
The GAN loss now looks more stable, with fewer sudden jumps than before.
However, the KL divergence is still increasing, though its growth is somewhat suppressed. Intriguing...
The feature difference is also suppressed.