
Sequence Labelling SVMs Trained in One Pass

Abstract: This paper proposes an online solver of the dual formulation of support vector machines for structured output spaces. We apply it to sequence labelling using the exact and greedy inference schemes. In both cases, the per-sequence training time is the same as a perceptron based on the same inference procedure, up to a small multiplicative constant. Comparing the two inference schemes, the greedy version is much faster. It is also amenable to higher order Markov assumptions and performs similarly on test. In comparison to existing algorithms, both versions match the accuracies of batch solvers that use exact inference after a single pass over the training examples.
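The contrast between the two inference schemes can be illustrated with a minimal sketch. It assumes a first-order Markov model in which a labelling is scored by summing per-position scores `emit[t][y]` and label-to-label scores `trans[y_prev][y]`; these names, the score layout, and the toy numbers are illustrative assumptions, not the paper's notation or code, and the online dual solver itself is not reproduced here. Exact inference is shown as standard Viterbi decoding, which costs on the order of T·K² operations for T positions and K labels, while greedy inference commits to one label per position from left to right at a cost on the order of T·K.

```python
from typing import List, Sequence

Scores = Sequence[Sequence[float]]


def viterbi_decode(emit: Scores, trans: Scores) -> List[int]:
    """Exact inference: highest-scoring label sequence (first-order model)."""
    n_labels = len(trans)
    score = list(emit[0])  # best score of any prefix ending in each label
    back = []              # back[t-1][j]: best previous label when label j is at t
    for t in range(1, len(emit)):
        new_score, pointers = [], []
        for j in range(n_labels):
            best_i = max(range(n_labels), key=lambda i: score[i] + trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + trans[best_i][j] + emit[t][j])
        score = new_score
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(range(n_labels), key=lambda j: score[j])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def greedy_decode(emit: Scores, trans: Scores) -> List[int]:
    """Greedy inference: pick each label given only the previous choice."""
    n_labels = len(trans)
    path = [max(range(n_labels), key=lambda j: emit[0][j])]
    for t in range(1, len(emit)):
        prev = path[-1]
        path.append(max(range(n_labels), key=lambda j: trans[prev][j] + emit[t][j]))
    return path


def path_score(emit: Scores, trans: Scores, path: Sequence[int]) -> float:
    """Total score of a labelling under the same model."""
    total = emit[0][path[0]]
    for t in range(1, len(path)):
        total += trans[path[t - 1]][path[t]] + emit[t][path[t]]
    return total


# Toy scores (hypothetical): label 0 looks slightly better at the first
# position, but staying in label 1 earns a large transition bonus.
emit = [[1.0, 0.9], [0.0, 0.0]]
trans = [[0.0, 0.0], [0.0, 2.0]]

print(viterbi_decode(emit, trans), greedy_decode(emit, trans))
```

On these toy scores the greedy decoder commits to label 0 at the first position and cannot recover the transition bonus, whereas Viterbi finds the higher-scoring sequence. The sketch only shows the per-sequence cost and quality trade-off of the inference step; it says nothing about how the two schemes compare once used inside training, which is what the paper evaluates.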

Antoine Bordes, Nicolas Usunier and Léon Bottou: Sequence Labelling SVMs Trained in One Pass, Machine Learning and Knowledge Discovery in Databases: ECML PKDD 2008, 146-161, Edited by Walter Daelemans, Bart Goethals and Katharina Morik, Lecture Notes in Computer Science, LNCS 5211, Springer, 2008.

ecml-2008.djvu ecml-2008.pdf ecml-2008.ps.gz

@inproceedings{bordes-usunier-bottou-2008,
  author = {Bordes, Antoine and Usunier, Nicolas and Bottou, L\'{e}on},
  title = {Sequence Labelling SVMs Trained in One Pass},
  booktitle = {Machine Learning and Knowledge Discovery in Databases: ECML PKDD 2008},
  year = {2008},
  editor = {Daelemans, Walter and Goethals, Bart and Morik, Katharina},
  series = {Lecture Notes in Computer Science, LNCS~5211},
  pages = {146--161},
  publisher = {Springer},
  url = {http://leon.bottou.org/papers/bordes-usunier-bottou-2008},
}

Implementation

Antoine Bordes provides an implementation of the LaRank algorithm. This implementation was in fact written for this paper, and it also contains a special case for linear kernels.