This talk was first given as an ICML 2015 keynote.
It describes and discusses two serious and related challenges facing machine learning.
1) Machine learning technologies are increasingly used in complex software systems, such as those underlying internet services today or self-driving vehicles tomorrow. Despite famous successes, there is mounting evidence that machine learning components tend to disrupt established software engineering practices. I will present examples and offer an explanation of this annoying and often very costly effect. Our first high-stakes challenge therefore consists in formulating sound and efficient engineering principles for machine learning applications.
2) Machine learning research can often be viewed as an empirical science. Unlike in nearly all other empirical sciences, progress in machine learning has largely been driven by a single experimental paradigm: fitting a training set and reporting performance on a testing set. Three forces may terminate this convenient state of affairs: the first is the engineering challenge outlined above, the second arises from the statistics of large-scale datasets, and the third is our growing ambition to address more serious AI tasks. Our second high-stakes challenge therefore consists in enriching our experimental repertoire and redefining our scientific processes while still maintaining our speed of progress.