As machine learning models grow increasingly complex, we suggest that neural networks are best viewed as an emerging, differentiable programming paradigm, and ask what decades of research into programming languages and compilers have to offer the machine learning world. Julia is an ideal platform not only for this kind of high-performance numerical programming, but also for research into the compiler and language features needed to advance the field. Features such as wide hardware support (GPUs, TPUs), kernel fusion, compiler-level automatic differentiation, push-button compilation and deployment, and elegant mathematical syntax are all invaluable to researchers. This talk explains how we take the Julia compiler to its limits to push the state of the art in machine learning forward.
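
To make two of these features concrete, the following is a minimal sketch of compiler-level automatic differentiation and kernel fusion in Julia. The abstract does not name a specific package; Zygote.jl is assumed here as one compiler-level AD system in the Julia ecosystem, and the fused broadcast at the end stands in for the kernel fusion mentioned above.

    # Assumes Zygote.jl, one compiler-level AD system in the Julia
    # ecosystem (the abstract does not name a specific package).
    using Zygote

    # An ordinary Julia function; no special types or tracing required.
    f(x) = 3x^2 + 2x + 1

    # Zygote differentiates the compiled code itself: df/dx = 6x + 2.
    df = gradient(f, 5.0)   # returns a 1-tuple: (32.0,)
    println(df[1])          # 32.0

    # Dotted operations fuse into a single loop (a single kernel on a
    # GPU), illustrating broadcast-level kernel fusion.
    xs = collect(0.0:0.1:1.0)
    ys = similar(xs)
    ys .= 3 .* xs.^2 .+ 2 .* xs .+ 1   # fused: one pass over xs

Because differentiation operates on plain Julia code, the same function can be run, fused, and differentiated without rewriting it against a separate framework API.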