Date: 2016-11-18

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Deep learning arose around 2006 as a renewal of neural network research that allows such models to have more layers. Theoretical investigations have shown that functions obtained as deep compositions of simpler functions (a class that includes both deep and recurrent nets) can express highly varying functions (with many ups and downs, and many input regions that can be distinguished) much more efficiently (with fewer parameters) than shallower alternatives, under a prior which seems to work well for artificial intelligence tasks. Empirical work in a variety of applications has demonstrated that, when well trained, such deep architectures can be highly successful, remarkably breaking through the previous state of the art in many areas, including speech recognition, object recognition, language modelling, machine translation and transfer learning. Although neural networks have long been considered lacking in theory and much remains to be done, theoretical advances have been made and will be discussed; they support distributed representations, depth of representation, the non-convexity of the training objective, and the probabilistic interpretation of learning algorithms (especially those of the auto-encoder type, which previously lacked one). The talk will focus on the intuitions behind these theoretical results.
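
One standard illustration of this depth-efficiency phenomenon (a sketch of the kind of result alluded to, not necessarily the construction discussed in the talk) uses the ReLU "tent map"

\[
g(x) = 2\,\min(x,\, 1 - x), \qquad x \in [0, 1],
\]

which a network can compute exactly with a few rectifier units. Its $k$-fold composition $g \circ g \circ \cdots \circ g$ oscillates between 0 and 1 with $2^{k-1}$ peaks, so a network of depth $O(k)$ with $O(k)$ units represents a piecewise-linear function with exponentially many pieces, whereas a single-hidden-layer ReLU network must use on the order of $2^{k}$ units to represent it exactly, since each hidden unit contributes at most one additional linear piece.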

Speaker:

Yoshua Bengio is a professor in the Department of Computer Science and Operations Research at the University of Montreal, head of the Montreal Institute for Learning Algorithms (MILA), co-director of the CIFAR Neural Computation and Adaptive Perception program, and holder of the Canada Research Chair in Statistical Learning Algorithms.