Here we have trained RSM to predict the motion of balls bouncing in a box. The learned dynamics should resemble those of billiard balls. This benchmark was created by Sutskever, Hinton and Taylor (2008) [2].
The left panel shows predictions from our algorithm, which is primed with 50 frames and then left to feed its own predictions back as input (self-looping) for 150 frames. A label at the top-left indicates whether it is currently in priming or self-looped mode. The right panel shows self-looped (i.e. generative) predictions from the original work of Sutskever et al. Qualitatively, we are pleased to note that the dynamics produced by RSM appear more like billiard balls, whereas the original results appear more like cells under a microscope!
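The priming and self-looping procedure described above can be sketched in a few lines of Python. This is only an illustration of the evaluation loop, not the RSM implementation: the `step` callable, its `(frame, state)` signature, the `rollout` helper and the `toy_step` stand-in model are all hypothetical names introduced for this example.

```python
from typing import Any, Callable, List, Sequence, Tuple

Frame = List[float]  # hypothetical: one video frame flattened to a list of pixels
StepFn = Callable[[Frame, Any], Tuple[Frame, Any]]


def rollout(step: StepFn, frames: Sequence[Frame],
            n_prime: int = 50, n_self: int = 150) -> List[Tuple[str, Frame]]:
    """Prime a recurrent predictor on real frames, then let it self-loop.

    `step(frame, state)` is assumed to return (predicted_next_frame, new_state).
    """
    if len(frames) < n_prime:
        raise ValueError("need at least n_prime observed frames")

    state = None
    prediction: Frame = []
    outputs: List[Tuple[str, Frame]] = []

    # Priming: the model sees ground-truth frames as input.
    for t in range(n_prime):
        prediction, state = step(frames[t], state)
        outputs.append(("priming", prediction))

    # Self-looping: the model's own prediction becomes its next input.
    for _ in range(n_self):
        prediction, state = step(prediction, state)
        outputs.append(("self-looped", prediction))

    return outputs


def toy_step(frame: Frame, state: Any) -> Tuple[Frame, Any]:
    """Stand-in model: adds 1 to every pixel and counts the calls in `state`."""
    calls = 0 if state is None else state
    return [x + 1.0 for x in frame], calls + 1


observed = [[float(t)] for t in range(50)]
results = rollout(toy_step, observed)
```

With a real model in place of `toy_step`, the mode label stored with each output plays the same role as the indicator at the top-left of the video.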
Quantitatively, RSM beats the current benchmark for this task, which was set by the Predictive Generative Network (PGN) of Lotter, Kreiman and Cox [3]. Although graphically simple, learning to simulate these dynamics from pixel observations is still a challenging task. Cenzato et al. comment that "unsupervised learning of good representations is harder than expected even for these simple synthetic videos" [4].
For more info, see [1].
[1] Rawlinson, David, Abdelrahman Ahmed, and Gideon Kowadlo. "Learning distant cause and effect using only local and immediate credit assignment." arXiv preprint arXiv:1905.11589 (2019).
[2] Sutskever, Ilya, Geoffrey E. Hinton, and Graham W. Taylor. "The recurrent temporal restricted Boltzmann machine." Advances in neural information processing systems. 2009.
[3] Lotter, William, Gabriel Kreiman, and David Cox. "Unsupervised learning of visual structure using predictive generative networks." arXiv preprint arXiv:1511.06380 (2015).
[4] Cenzato, Alberto, Alberto Testolin, and Marco Zorzi. "On the difficulty of learning and predicting the long-term dynamics of bouncing objects." arXiv preprint arXiv:1907.13494 (2019).