Abstract: We propose a deep learning strategy for scene parsing, i.e. to asssign a class label to each pixel of an image. We investigate the use of deep convolutional network for modeling the complex scene label structures, relying on a supervised greedy learning strategy. Compared to standard approaches based on CRFs, our strategy does not need hand-crafted features, allows modeling more complex spatial dependencies and has a lower inference cost. Experiments over the MSRC benchmark and the LabelMe dataset show the eectiveness of our approach
@misc{grangier-bottou-collobert-2009, author = {Grangier, David and Bottou, L\'{e}on and Collobert, Ronan}, title = {Deep Convolutional Networks for Scene Parsing}, howpublished = {ICML 2009 Workshop on Learning Feature Hierarchies}, month = {June}, year = {2009}, url = {http://leon.bottou.org/papers/grangier-bottou-collobert-2009}, }