Differences

This shows you the differences between two versions of the page.

--- research:largescale [2012/12/24 11:43]
leonb [Approximate Optimization]
+++ research:largescale [2013/02/25 09:57] (current)
leonb [Papers]
@@ Line 49: / Line 49: @@
 ===== Approximate Optimization =====
+{{ wall2.png}}
 Large-scale machine learning was first approached as an engineering problem. For instance, to leverage a
 larger training set, we can use a parallel computer to run a known machine learning algorithm
@@ Line 54: / Line 55: @@
 objective function. Such approaches rely on the appealing
 assumption that one can decouple the statistical aspects from the computational aspects of the machine
-learning problem. My [[:talks/largescale|NIPS 2007 tutorial]] made clear
+learning problem.
-that this assumption is incorrect, and that giving it up leads to considerably
-more effective learning algorithms.
-The [[:papers/bottou-bousquet-2008|corresponding paper]]
+This work shows that this assumption is incorrect, and that giving it up leads to considerably
-develops a theoretical framework
+more effective learning algorithms. A new theoretical framework
-that takes into account the effect of approximate
+takes into account the effect of approximate
 optimization on learning algorithms.
 The analysis shows distinct tradeoffs for the
 case of small-scale and large-scale learning problems.
@@ Line 70: / Line 70: @@
 complexity of the underlying optimization
 algorithms in non-trivial ways.
-For instance, a mediocre optimization algorithms,
+For instance, [[:research:stochastic|Stochastic Gradient Descent (SGD)]] algorithms
-[[:research:stochastic|stochastic gradient descent]],
+appear to be mediocre optimization algorithms and yet are shown to
-is shown to perform very well on large-scale learning problems.
+[[:projects/sgd|perform extremely well]] on large-scale learning problems.
+===== Tutorials =====
+  * NIPS 2007 tutorial "[[:talks/largescale|Large Scale Learning]]".
+===== Related =====
+   * [[:research:stochastic|Stochastic gradient learning algorithms]]
+===== Papers =====
-===== Stochastic Gradient for Large-Scale Learning =====
+<box 99% orange>
+Léon Bottou and Olivier Bousquet:  **The Tradeoffs of Large Scale Learning**,
+//Advances in Neural Information Processing Systems//, 20,
+MIT Press, Cambridge, MA, 2008.
-[[stochastic]]
+[[:papers/bottou-bousquet-2008|more...]]
+</box>
+<box 99% orange>
+Léon Bottou and Yann LeCun:  **On-line Learning for Very Large Datasets**,  //Applied Stochastic Models in Business and Industry//, 21(2):137-151, 2005.
+[[:papers/bottou-lecun-2004a|more...]]
+</box>
+<box 99% orange>
+Léon Bottou:  **Online Algorithms and Stochastic Approximations**,  //Online Learning and Neural Networks//, Edited by David Saad, Cambridge University Press, Cambridge, UK, 1998.
+[[:papers/bottou-98x|more...]]
+</box>
-===== Active Learning =====
+<box 99% blue>
+Léon Bottou:  //**Une Approche théorique de l'Apprentissage Connexionniste: Applications à la Reconnaissance de la Parole**//, Orsay, France, 1991.
-One simple way to handle large-scale learning problems is to chose examples wisely.
+[[:papers/bottou-91a|more...]]
-This idea was explored in our work on [[lasvm|Active and Online Support Vector Machines]].
+</box>
-But there is still much work to do about active learning as a way to handle very large data repositories.
