Mathieu Germain, Karol Gregor, Iain Murray, Hugo Larochelle
A simple modification that makes a deep autoencoder (AE) respect autoregressive constraints: each input is reconstructed only from the inputs that precede it in a given ordering. Constrained this way, the AE outputs can be interpreted as conditional probabilities whose product is the joint distribution (as is customary for sequential RNNs).
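One way to impose such a constraint is to multiply the weight matrices by binary masks, so that output d has no computational path from input d or any later input. The NumPy sketch below is a minimal illustration under stated assumptions: binary inputs, a single hidden layer, the natural ordering 1..D, and randomly assigned hidden-unit "degrees". The names `build_masks`, `forward`, and `log_likelihood` are hypothetical and chosen for illustration only.

```python
import numpy as np

def build_masks(n_inputs, n_hidden, seed=0):
    """Build binary masks enforcing the ordering 1..D (assumed natural order)."""
    rng = np.random.default_rng(seed)
    # Input d carries degree d (1-based position in the ordering).
    in_deg = np.arange(1, n_inputs + 1)
    # Assumption: each hidden unit gets a random degree in 1..D-1.
    hid_deg = rng.integers(1, n_inputs, size=n_hidden)
    # Hidden unit k may read input d only if hid_deg[k] >= d.
    mask_in = (hid_deg[:, None] >= in_deg[None, :]).astype(float)
    # Output d may read hidden unit k only if d > hid_deg[k].
    mask_out = (in_deg[:, None] > hid_deg[None, :]).astype(float)
    return mask_in, mask_out

def forward(x, W, V, b, c, mask_in, mask_out):
    """Masked autoencoder pass; output d approximates p(x_d = 1 | x_1..x_{d-1})."""
    h = np.maximum(0.0, (W * mask_in) @ x + b)   # masked hidden layer (ReLU)
    logits = (V * mask_out) @ h + c              # masked output layer
    return 1.0 / (1.0 + np.exp(-logits))         # Bernoulli conditionals

def log_likelihood(x, p):
    """Log of the product of conditionals, i.e. log of the joint p(x)."""
    return np.sum(x * np.log(p) + (1.0 - x) * np.log(1.0 - p))
```

Because each output depends only on earlier inputs, the product of the conditionals is a normalized distribution: for a small D, summing `np.exp(log_likelihood(x, forward(x, ...)))` over all 2^D binary vectors should give 1 up to floating-point error, whatever the weights are.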