From Machine Learning to Machine Reasoning

Abstract: A plausible definition of “reasoning” could be “algebraically manipulating previously acquired knowledge in order to answer a new question”. This definition covers first-order logical inference or probabilistic inference. It also includes the simpler manipulations commonly used to build large learning systems. For instance, we can build an optical character recognition system by first training a character segmenter, an isolated character recognizer, and a language model, using appropriate labeled training sets. Adequately concatenating these modules and fine tuning the resulting system can be viewed as an algebraic operation in a space of models. The resulting model answers a new question, that is, converting the image of a text page into a computer readable text. This observation suggests a conceptual continuity between algebraically rich inference systems, such as logical or probabilistic inference, and simple manipulations, such as the mere concatenation of trainable learning systems. Therefore, instead of trying to bridge the gap between machine learning systems and sophisticated “all-purpose” inference mechanisms, we can instead algebraically enrich the set of manipulations applicable to training systems, and build reasoning capabilities from the ground up.

Léon Bottou: From Machine Learning to Machine Reasoning, arXiv:1102.1808, February 2011.


arXiv link tr-2011-02-08.djvu tr-2011-02-08.pdf tr-2011-02-08.ps.gz

@techreport{tr-bottou-2011,
  author = {Bottou, L\'eon},
  title = {From Machine Learning to Machine Reasoning},
  institution = {arXiv.1102.1808},
  month = {February},
  year = {2011},
  url = {http://leon.bottou.org/papers/tr-bottou-2011},
}

A revision of this text was published in 2014 (MLJ).

Notes

This document cites the work of Vincent Etter (2009), carried out during his NEC Labs internship. Vincent's master report is now available on his home page (local copy). Section 5 is an exploration of ideas that were extensively discussed among Ronan Collobert, Jason Weston, and me. We had hoped to discover relevant recursive sentence representations in an unsupervised manner. Alas, we found that the shape of a recursive network's structure has very little impact on its representation abilities, something that was confirmed by Scheible and Schütze on a sentiment classification task. Even a left-to-right tree (which in fact amounts to using a recurrent neural network) worked essentially as well, something that was later confirmed by Li et al. on a variety of tasks. Although we still had hopes of making it work in 2009-2010, I now believe that structure discovery needs a completely different approach to the cost functions.
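
The remark that a left-to-right tree amounts to a recurrent network can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: `compose` is a hypothetical stand-in for a trainable association module, with made-up fixed weights, and the names `evaluate`, `left_to_right_tree`, and `recurrent` are invented here rather than taken from the paper.

```python
import math
from functools import reduce


def compose(left, right):
    """Hypothetical association module: merge two vectors into one of the same size.

    The weights 0.5 and 0.25 are arbitrary placeholders, not learned values.
    """
    return [math.tanh(0.5 * l + 0.25 * r) for l, r in zip(left, right)]


def evaluate(tree):
    """Evaluate a recursive network: leaves are lists, internal nodes are (left, right) tuples."""
    if isinstance(tree, tuple):
        left, right = tree
        return compose(evaluate(left), evaluate(right))
    return tree


def left_to_right_tree(leaves):
    """Build the left-branching tree (((w1, w2), w3), w4) ... over the leaves."""
    return reduce(lambda acc, leaf: (acc, leaf), leaves)


def recurrent(leaves):
    """Run the same module as a recurrent network: one state updated per input."""
    state = leaves[0]
    for x in leaves[1:]:
        state = compose(state, x)
    return state


# Made-up word vectors, used only to exercise the two evaluation orders.
words = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.6], [0.7, -0.8]]
assert evaluate(left_to_right_tree(words)) == recurrent(words)
```

Both functions apply `compose` to the same arguments in the same order, so the recursive evaluation of the left-branching tree and the recurrent loop produce identical representations. Any other tree shape would change the order in which the module is applied.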

Last modified: 2016/09/12 09:33 by leonb
