Abstract |
Autoregressive models parametrized as deep neural networks (called PixelCNN) achieve state-of-the-art results in image density estimation. Although training is fast, sampling is costly, requiring one network evaluation per pixel; O(N) for N pixels. In this talk, I will describe a parallelized PixelCNN that achieves competitive density estimation and orders of magnitude speedup - O(log N) sampling instead of O(N) - enabling the practical generation of high-resolution images. Results will be presented on class-conditional image generation, text-to-image synthesis, and action-conditional video generation. In the second part of the talk, I will discuss density estimation in the low-data regime as a meta learning problem. |