According to Hinton, "A Practical Guide to Training Restricted Boltzmann Machines" (2012), p. 608:
Start with a momentum of 0.5. Once the large initial progress in the reduction of the reconstruction error has settled down to gentle progress, increase the momentum to 0.9. This shock may cause a transient increase in the reconstruction error. If this causes a more lasting instability, keep reducing the learning rate by factors of 2 until the instability disappears.
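
The recipe above could be sketched roughly as follows. This is only an illustration, not code from Hinton's guide: the class name `MomentumSchedule` and the parameters `settle_tol` and `patience` are hypothetical, and the guide gives no numeric criterion for "settled down" or "lasting instability", so both thresholds are assumptions you would need to tune.

```python
class MomentumSchedule:
    """Hypothetical helper that adapts momentum and learning rate
    from the per-epoch reconstruction error."""

    def __init__(self, learning_rate, settle_tol=0.01, patience=3):
        self.learning_rate = learning_rate
        self.momentum = 0.5           # starting momentum from the recipe
        self.settle_tol = settle_tol  # assumed: relative gain below this counts as "gentle progress"
        self.patience = patience      # assumed: this many rising epochs are tolerated as "transient"
        self.boosted = False          # True once momentum has been raised to 0.9
        self.prev_error = None
        self.bad_epochs = 0

    def update(self, error):
        """Record one epoch's reconstruction error; return (momentum, learning_rate)."""
        if self.prev_error is not None:
            if not self.boosted:
                # Relative improvement over the previous epoch (guarded against division by zero).
                rel_gain = (self.prev_error - error) / max(self.prev_error, 1e-12)
                if rel_gain < self.settle_tol:
                    self.momentum = 0.9
                    self.boosted = True
            elif error > self.prev_error:
                # Error is rising after the boost; tolerate a short transient.
                self.bad_epochs += 1
                if self.bad_epochs > self.patience:
                    self.learning_rate /= 2.0  # reduce by a factor of 2
                    self.bad_epochs = 0
            else:
                self.bad_epochs = 0
        self.prev_error = error
        return self.momentum, self.learning_rate


# Usage with made-up error values, one per epoch.
schedule = MomentumSchedule(learning_rate=0.1)
for epoch_error in [100.0, 60.0, 40.0, 39.9, 45.0, 46.0, 47.0, 48.0, 49.0, 44.0]:
    momentum, learning_rate = schedule.update(epoch_error)
```

The returned `momentum` and `learning_rate` would then feed the usual momentum update in the training loop, for example `velocity = momentum * velocity + learning_rate * gradient`. The halving repeats each time the error keeps rising for more than `patience` epochs, which is one way to read "keep reducing the learning rate by factors of 2 until the instability disappears".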