Machine Learning

Gradual Learning of Deep Recurrent Neural Networks



  • arXiv

    Gradual Learning of Deep Recurrent Neural Networks

    Deep Recurrent Neural Networks (RNNs) achieve state-of-the-art results in many sequence-to-sequence tasks. However, deep RNNs are difficult to train and prone to overfitting. We introduce a training method that trains the network gradually and treats each layer individually, yielding improved results in language modelling tasks. Training a deep LSTM with Gradual Learning (GL) achieves a test perplexity of 61.7 on the Penn Treebank (PTB) corpus. To the best of our knowledge (as of 20.05.2017), GL improves the best state-of-the-art performance of a single LSTM/RHN model on the word-level PTB dataset.

    Gradual Learning of Deep Recurrent Neural Networks
    by Ziv Aharoni, Gal Rattner, Haim Permuter
    https://arxiv.org/pdf/1708.08863v1.pdf
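
    To illustrate the general layer-by-layer idea described in the abstract, here is a minimal sketch of gradually growing and training a stacked LSTM language model. It assumes PyTorch, and all names, hyperparameters, and the toy data are placeholders for illustration only; it is not the authors' exact GL procedure, which is specified in the paper linked above.

    ```python
    # Hypothetical sketch: gradual, layer-by-layer training of a stacked LSTM
    # language model. Hyperparameters and data are placeholders, not the paper's.
    import torch
    import torch.nn as nn

    class StackedLSTMLM(nn.Module):
        def __init__(self, vocab_size, emb_dim, hidden_dim, num_layers):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # Separate one-layer LSTMs so layers can be added and trained one at a time.
            self.layers = nn.ModuleList(
                [nn.LSTM(emb_dim if i == 0 else hidden_dim, hidden_dim, batch_first=True)
                 for i in range(num_layers)]
            )
            self.decoder = nn.Linear(hidden_dim, vocab_size)
            self.active_layers = 1  # start shallow, grow gradually

        def grow(self):
            # Activate one more LSTM layer for the next training stage.
            if self.active_layers < len(self.layers):
                self.active_layers += 1

        def forward(self, tokens):
            x = self.embed(tokens)
            for layer in self.layers[:self.active_layers]:
                x, _ = layer(x)
            return self.decoder(x)

    def train_gradually(model, batches, epochs_per_stage=1, lr=1e-3):
        criterion = nn.CrossEntropyLoss()
        for stage in range(len(model.layers)):
            # Re-create the optimizer each stage so newly activated parameters are trained.
            optimizer = torch.optim.Adam(model.parameters(), lr=lr)
            for _ in range(epochs_per_stage):
                for inputs, targets in batches:
                    optimizer.zero_grad()
                    logits = model(inputs)
                    loss = criterion(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
                    loss.backward()
                    optimizer.step()
            model.grow()  # add the next LSTM layer before the following stage

    # Toy usage with random token data (placeholders, not PTB).
    model = StackedLSTMLM(vocab_size=1000, emb_dim=128, hidden_dim=256, num_layers=3)
    data = [(torch.randint(0, 1000, (8, 20)), torch.randint(0, 1000, (8, 20))) for _ in range(4)]
    train_gradually(model, data)
    ```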
