Machine Learning

Overcoming catastrophic forgetting with hard attention to the task






    Catastrophic forgetting occurs when a neural network loses the information learned for a first task after training on a second task. This problem remains a hurdle for general artificial intelligence systems with sequential learning capabilities. In this paper, we propose a task-based hard attention mechanism that preserves previous tasks’ information without substantially affecting the current task’s learning. An attention mask is learned concurrently with every task through stochastic gradient descent, and previous masks are exploited to constrain such learning. We show that the proposed mechanism is effective at reducing catastrophic forgetting, cutting current rates by 33 to 84%. We also show that it is robust to different hyperparameter choices and that it offers a number of monitoring capabilities. The approach makes it possible to control both the stability and compactness of the learned knowledge, which we believe also makes it attractive for online learning and network compression applications.
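A minimal NumPy sketch of the core idea described in the abstract: a near-binary attention mask is derived from a per-task embedding via a scaled sigmoid, and the element-wise maximum of previous tasks' masks is used to attenuate gradients on units those tasks claimed. The embedding values and the scaling factor `s` are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def task_attention_mask(embedding, s=50.0):
    """Near-binary mask from a learnable task embedding.

    A large scaling factor s pushes the sigmoid toward a hard 0/1 gate
    (the paper anneals s during training; s=50 here is an assumption).
    """
    return sigmoid(s * embedding)

def constrain_gradient(grad, prev_cumulative_mask):
    """Attenuate gradients of units that previous tasks marked as used.

    prev_cumulative_mask is the element-wise max over all earlier tasks'
    masks; a unit whose cumulative mask is ~1 receives ~0 gradient.
    """
    return grad * (1.0 - prev_cumulative_mask)

# Toy example: 4 hidden units; hypothetical task-1 embedding that
# claims units 0 and 2 and leaves units 1 and 3 free.
e_prev = np.array([3.0, -3.0, 3.0, -3.0])
a_prev = task_attention_mask(e_prev)   # approximately [1, 0, 1, 0]
cum = a_prev                           # cumulative mask after one task

grad = np.ones(4)                      # pretend gradient while training task 2
grad_constrained = constrain_gradient(grad, cum)
# Units used by task 1 are protected (gradient ~0); free units learn normally.
```

In the paper, the mask also multiplies the layer's activations on the forward pass, so each task effectively routes through its own subnetwork; the sketch above shows only the gradient-constraining step.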

    Overcoming catastrophic forgetting with hard attention to the task
    by Joan Serrà, Dídac Surís, Marius Miron, Alexandros Karatzoglou
    https://arxiv.org/pdf/1801.01423v1.pdf
