Uri Patish



From Deep Neural Networks To Fully Adaptive Programs

Deep neural networks have been tremendously successful in tackling a variety of learning problems. Nonetheless, the basic building blocks that make up such models are mostly designed to facilitate gradient-based parameter learning, rather than to enable productive programming or to provide a fully-fledged programming language. Standard programming languages, on the other hand, provide programming elements such as variables, conditional execution clauses, and definitions of subroutines that facilitate succinct descriptions of programs, but lack the flexibility that gradient-based learning allows. We set out to combine the best of both worlds in a new Julia framework that supports structured programming elements on the one hand and gradient-based parameter learning on the other. The main challenge in combining the two is to provide an efficient algorithm for calculating the gradient when variables and loops are involved. In such cases, the cost of calculating the gradient using standard backpropagation grows exponentially with the number of iterations and the number of variables. To circumvent this problem, we present a new auto-differentiation mechanism that handles such cases in linear time, and a new framework that supports it. We supplement our framework with operations typical of neural networks, paving the way to new kinds of adaptive programs. In particular, we show that convolutional neural networks that incorporate variables and loops can solve image classification problems with significantly fewer parameters and, more importantly, can solve image classification tasks that standard neural networks fail to solve.
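As a rough illustration of the kind of program the abstract has in mind, the sketch below is my own and does not use the framework presented in the talk; it simply applies the general-purpose ForwardDiff package to take gradients of a toy program that mixes a loop, a conditional branch, and learnable parameters, then performs one gradient-descent step.

```julia
# Minimal sketch (assumed, not the talk's framework): differentiate a small
# program that mixes a loop, a conditional branch, and learnable parameters.
# ForwardDiff is used here purely to illustrate gradient-based learning over
# ordinary control flow.
using ForwardDiff

# A toy "adaptive program": repeatedly transform x with a learnable affine
# step and accumulate a branch-dependent contribution.
function adaptive_program(w, x; steps = 5)
    acc = zero(eltype(w))
    for _ in 1:steps
        x = tanh(w[1] * x + w[2])            # learnable transformation inside a loop
        acc += x > 0 ? w[3] * x : w[4] * x   # conditional clause with its own weights
    end
    return acc
end

# Toy objective: push the program's output towards a target value.
loss(w) = (adaptive_program(w, 0.3) - 1.0)^2

w = randn(4)                       # parameters of the program
g = ForwardDiff.gradient(loss, w)  # gradient through the loop and conditional
w -= 0.1 .* g                      # one plain gradient-descent step
```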
