Machine Learning

Letter-Based Speech Recognition with Gated ConvNets



  • arXiv
    5 pts

    In this paper we introduce a new speech recognition system, leveraging a simple letter-based ConvNet acoustic model. The acoustic model requires only audio transcriptions for training: no alignment annotations or forced alignment step are needed. At inference, our decoder takes only a word list and a language model, and is fed with letter scores from the acoustic model; no phonetic word lexicon is needed. Key ingredients for the acoustic model are Gated Linear Units and high dropout. We show near state-of-the-art results in word error rate on the LibriSpeech corpus using log-mel filterbanks, on both the “clean” and “other” configurations.
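For readers unfamiliar with the two ingredients named in the abstract, here is a minimal NumPy sketch of a Gated Linear Unit and of dropout. It is an illustration only, not the authors' implementation: the function names `glu` and `dropout` are hypothetical, the half-and-half channel split follows the usual GLU formulation (one half passed through, the other half used as a sigmoid gate), and the dropout shown is the common "inverted" variant. The abstract does not give layer sizes or a dropout rate, so none are assumed here.

```python
import numpy as np

def sigmoid(x):
    # Logistic function, applied elementwise.
    return 1.0 / (1.0 + np.exp(-x))

def glu(x, axis=-1):
    # Gated Linear Unit (usual formulation, assumed here):
    # split the channels into two equal halves a and b,
    # then return a * sigmoid(b), so b gates a elementwise.
    a, b = np.split(x, 2, axis=axis)
    return a * sigmoid(b)

def dropout(x, p, rng):
    # Inverted dropout for training: zero each unit with probability p
    # and rescale the survivors by 1 / (1 - p). Requires 0 <= p < 1.
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

# Hypothetical usage: 3 frames with 8 channels each, giving 4 channels after the GLU.
rng = np.random.default_rng(0)
frames = rng.standard_normal((3, 8))
gated = glu(frames)                   # shape (3, 4)
dropped = dropout(gated, 0.5, rng)    # same shape, some entries zeroed
```

Note that a GLU halves the channel dimension, so the convolution feeding it has to produce twice as many channels as the layer is meant to output.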

    Letter-Based Speech Recognition with Gated ConvNets
    by Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert
    https://arxiv.org/pdf/1712.09444v1.pdf
