This three-hour lecture on neural networks was given at the Tübingen Machine Learning Summer School (MLSS) in 2013.
Before covering the basics of multilayer networks and back-propagation, Part 1 presents a justification for the neural approach, following McClelland and Rogers (2003). Part 2 covers optimization applied to neural networks, including simple but effective tricks for setting per-layer learning rates and per-layer weight initializations (slides 88 and on). Part 3 turns to complex problems, covering structured systems such as graph transformer networks, feature transfer between related tasks, and finally recursive networks. My view of recursive networks has since become more pessimistic than it was then.
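As a rough illustration of the kind of per-layer tricks Part 2 refers to (the exact recipes are in the slides; this is one common variant, not taken from the lecture): initialize each weight matrix with scale 1/sqrt(fan_in), and scale each layer's learning rate by the same factor, so activations and update magnitudes stay comparable across layers of different widths.

```python
import numpy as np

def init_layers(layer_sizes, base_lr=0.1, seed=0):
    """Sketch of fan-in-based per-layer initialization and learning rates.

    `layer_sizes`, `base_lr`, and the uniform distribution are illustrative
    choices, not prescribed by the lecture.
    """
    rng = np.random.default_rng(seed)
    weights, lrs = [], []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / np.sqrt(fan_in)
        # Weights drawn uniformly in [-1/sqrt(fan_in), 1/sqrt(fan_in)]
        weights.append(rng.uniform(-scale, scale, size=(fan_in, fan_out)))
        # Wider input -> smaller per-layer step size
        lrs.append(base_lr / np.sqrt(fan_in))
    return weights, lrs

weights, lrs = init_layers([784, 300, 10])
for W, lr in zip(weights, lrs):
    print(W.shape, lr)
```

The point of tying both quantities to fan-in is that a unit summing many inputs sees larger pre-activations and larger gradient sums, so both its initial weights and its updates should be shrunk accordingly.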