Topic Tag: LSTM


Language modeling with Neural trans-dimensional random fields

Trans-dimensional random field language models (TRF LMs) have recently been introduced, where sentences are modeled as a collection of random fields. The TRF approach has been shown to be computationally more efficient at inference than LSTM LMs while achieving close performance, and …
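
To make "a collection of random fields" concrete, here is a toy sketch of a trans-dimensional random field score: an unnormalized log-linear potential over a sentence combined with a length prior, so sentences of different lengths live in different fields. The bigram features, toy weights, and length prior are illustrative assumptions, not the paper's neural feature functions.

```python
import numpy as np

def trf_potential(sentence, weights, length_prior):
    """Unnormalized log-score: log pi_l + lambda . f(x), for length l."""
    feats = {}
    for a, b in zip(sentence, sentence[1:]):          # toy bigram count features
        feats[(a, b)] = feats.get((a, b), 0) + 1
    score = sum(weights.get(k, 0.0) * v for k, v in feats.items())
    return np.log(length_prior[len(sentence)]) + score

weights = {("the", "cat"): 1.2, ("cat", "sat"): 0.8}  # illustrative weights
length_prior = {3: 0.5, 4: 0.3, 5: 0.2}               # illustrative length prior
print(trf_potential(["the", "cat", "sat"], weights, length_prior))
```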


Self-Guiding Multimodal LSTM – when we do not have a perfect training dataset for image captioning

In this paper, a self-guiding multimodal LSTM (sg-LSTM) image captioning model is proposed to handle uncontrolled, imbalanced real-world image-sentence datasets. We collect the FlickrNYC dataset from Flickr as our testbed, with 306,165 images, and the original text descriptions uploaded by the users are ut…


Learning Intrinsic Sparse Structures within Long Short-term Memory

Model compression is significant for the wide adoption of Recurrent Neural Networks (RNNs), both in user devices with limited resources and in business clusters requiring quick responses to large-scale service requests. In this work, we focus on reducing the sizes of basic structures (including in…
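
As a rough illustration of "reducing the sizes of basic structures", the sketch below computes a group-sparsity (group Lasso) regularizer in which each group collects all weights tied to one hidden unit across the stacked LSTM gate matrices; driving a group's norm to zero lets that unit be removed outright. The grouping shown is a simplified assumption, not the paper's exact Intrinsic Sparse Structures.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 8
W = rng.normal(size=(4 * hidden, hidden))   # stacked i/f/g/o gate weights

def group_lasso(W, hidden):
    """Sum of l2 norms, one group per hidden unit across all four gates."""
    norms = []
    for j in range(hidden):
        rows = W[j::hidden, :]              # unit j's row in each gate block
        group = np.concatenate([rows.ravel(), W[:, j]])
        norms.append(np.linalg.norm(group))
    return sum(norms)

print("regularizer:", group_lasso(W, hidden))
```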


Deep Learning for Automatic Stereotypical Motor Movement Detection using Wearable Sensors in Autism Spectrum Disorders

Autism Spectrum Disorders are associated with atypical movements, of which stereotypical motor movements (SMMs) interfere with learning and social interaction. Automatic SMM detection using inertial measurement units (IMUs) remains complex due to the strong intra- and inter-subject variability, e…


Assessing State-of-the-Art Sentiment Models on State-of-the-Art Sentiment Datasets

There has been considerable progress in sentiment analysis over the past 10 years, including the proposal of new methods and the creation of benchmark datasets. In some papers, however, there is a tendency to compare models on only one or two datasets, either because of time constraints or becau…


Parallelizing Linear Recurrent Neural Nets Over Sequence Length

Recurrent neural networks (RNNs) are widely used to model sequential data, but the non-linear dependencies between sequence elements prevent parallelizing training over the sequence length. We show that the training of RNNs with only linear sequential dependencies can be parallelized over the sequence leng…
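
The key observation fits in a few lines: a linear recurrence $h_t = a_t h_{t-1} + b_t$ composes associatively, so all states can be computed with a parallel prefix scan in $O(\log T)$ depth rather than $T$ sequential steps. The sketch below simulates the scan with recursive doubling in NumPy; the scalar state is a simplifying assumption, not the paper's exact formulation.

```python
import numpy as np

def scan_linear_recurrence(a, b):
    """All h_t for h_t = a_t*h_{t-1} + b_t (h_{-1}=0), via recursive doubling."""
    a, h = a.astype(float).copy(), b.astype(float).copy()
    shift = 1
    while shift < len(a):
        # Combine each position with the composed prefix `shift` steps back;
        # NumPy evaluates the right-hand sides fully before assigning.
        h[shift:] = a[shift:] * h[:-shift] + h[shift:]
        a[shift:] = a[shift:] * a[:-shift]
        shift *= 2
    return h

rng = np.random.default_rng(0)
a, b = rng.uniform(0.5, 1.0, 16), rng.normal(size=16)
h, prev = [], 0.0
for at, bt in zip(a, b):                  # sequential reference computation
    prev = at * prev + bt
    h.append(prev)
print(np.allclose(scan_linear_recurrence(a, b), h))   # True
```

On parallel hardware each doubling step runs across lanes at once; the while loop here only demonstrates the logarithmic combine structure.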


Affective Neural Response Generation

Existing neural conversational models process natural language primarily on a lexico-syntactic level, thereby ignoring one of the most crucial components of human-to-human dialogue: its affective content. We take a step in this direction by proposing three novel ways to incorporate affective/emotio…


RRA: Recurrent Residual Attention for Sequence Learning

In this paper, we propose a recurrent neural network (RNN) with residual attention (RRA) to learn long-range dependencies from sequential data. We propose adding residual connections across timesteps to the RNN, which explicitly enhances the interaction between the current state and hidden states that are …
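
A stripped-down version of the idea: each step computes a plain recurrent update, then attends over the past hidden states and adds the attention-weighted sum as a residual. The vanilla tanh cell and dot-product attention below are illustrative stand-ins for the paper's LSTM-based formulation.

```python
import numpy as np

def rra_step(x_t, history, Wx, Wh, v):
    """One step: plain RNN update plus an attention residual over history."""
    h_new = np.tanh(Wx @ x_t + Wh @ history[-1])   # standard recurrent update
    scores = np.array([v @ h for h in history])    # attention logits
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                           # softmax weights
    residual = sum(a * h for a, h in zip(alpha, history))
    return h_new + residual                        # residual across timesteps

rng = np.random.default_rng(0)
d = 4
Wx, Wh, v = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=d)
history = [np.zeros(d)]
for x_t in rng.normal(size=(5, d)):                # toy 5-step sequence
    history.append(rra_step(x_t, history, Wx, Wh, v))
```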


A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data

The Gated Recurrent Unit (GRU) is a recently published variant of the Long Short-Term Memory (LSTM) network, designed to solve the vanishing and exploding gradient problems. However, its main objective is to solve the long-term dependency problem in Recurrent Neural Networks (RNNs), which prev…
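
For reference, a minimal NumPy sketch of a GRU cell followed by a linear max-margin (SVM-style) output head of the kind the title suggests; the gate conventions and the hinge-loss head are common textbook forms assumed here, not taken verbatim from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde

def hinge_loss(h_final, w, b, y):              # y in {-1, +1}, SVM-style head
    return max(0.0, 1.0 - y * (w @ h_final + b))

rng = np.random.default_rng(0)
d, k = 3, 5                                    # input size, hidden size
Wz, Wr, Wh = (rng.normal(size=(k, d)) for _ in range(3))
Uz, Ur, Uh = (rng.normal(size=(k, k)) for _ in range(3))
h = np.zeros(k)
for x in rng.normal(size=(6, d)):              # toy 6-step sequence
    h = gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh)
print("hinge loss:", hinge_loss(h, rng.normal(size=k), 0.0, y=+1))
```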


Training RNNs as Fast as CNNs

Recurrent neural networks scale poorly due to the intrinsic difficulty in parallelizing their state computations. For instance, the forward pass computation of $h_t$ is blocked until the entire computation of $h_{t-1}$ finishes, which is a major bottleneck for parallel computing. In this work, we p…
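
The bottleneck described above disappears when the heavy matrix multiplications depend only on the inputs: they can then be batched across all timesteps, leaving a cheap elementwise recurrence. The sketch below is a simplified single layer in that spirit (it assumes matching input and hidden sizes for the highway term), not the paper's exact unit.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fast_recurrent_layer(X, W, Wf, Wr, bf, br):
    # Parallel part: every matmul depends only on X, so it runs over the
    # whole sequence at once (one batched multiply per weight matrix).
    U = X @ W.T
    F = sigmoid(X @ Wf.T + bf)                 # forget gates, all timesteps
    R = sigmoid(X @ Wr.T + br)                 # reset/highway gates
    # Sequential part: elementwise only, no matmul blocked on c_{t-1}.
    c, H = np.zeros(W.shape[0]), []
    for t in range(len(X)):
        c = F[t] * c + (1 - F[t]) * U[t]       # cheap elementwise recurrence
        H.append(R[t] * np.tanh(c) + (1 - R[t]) * X[t])  # highway output
    return np.stack(H)

rng = np.random.default_rng(0)
T, d = 8, 4                                    # highway term assumes equal dims
X = rng.normal(size=(T, d))
W, Wf, Wr = (rng.normal(size=(d, d)) for _ in range(3))
print(fast_recurrent_layer(X, W, Wf, Wr, np.zeros(d), np.zeros(d)).shape)
```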


Deep Residual Bidir-LSTM for Human Activity Recognition Using Wearable Sensors

Human activity recognition (HAR) has become a popular research topic because of its wide range of applications. With the development of deep learning, new ideas have emerged to address HAR problems. Here, a deep network architecture using residual bidirectional long short-term memory (LSTM) cells is prop…
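
Schematically, a residual bidirectional layer runs the sequence once forward and once backward, combines the two passes, and adds a skip connection from the layer input. In the sketch below a tanh cell stands in for the LSTM to keep things short, and the equal input/hidden sizes required by the skip connection are an assumption.

```python
import numpy as np

def rnn_pass(X, Wx, Wh):                       # tanh stand-in for an LSTM pass
    h, out = np.zeros(Wh.shape[0]), []
    for x in X:
        h = np.tanh(Wx @ x + Wh @ h)
        out.append(h)
    return np.stack(out)

def residual_bidir_layer(X, Wx, Wh):
    fwd = rnn_pass(X, Wx, Wh)                  # forward-in-time pass
    bwd = rnn_pass(X[::-1], Wx, Wh)[::-1]      # backward pass, re-reversed
    return X + fwd + bwd                       # residual skip over the layer

rng = np.random.default_rng(0)
d = 4
X = rng.normal(size=(10, d))
Wx, Wh = rng.normal(size=(d, d)), rng.normal(size=(d, d))
out = X
for _ in range(3):                             # toy 3-layer stack (shared weights)
    out = residual_bidir_layer(out, Wx, Wh)
print(out.shape)                               # (10, 4)
```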


Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge

Multiple automakers have automated driving systems (ADS) in development or in production that offer freeway-pilot functions. This type of ADS is typically limited to restricted-access freeways, that is, the transition from manual to automated modes takes place only after the ramp-merging proce…


Neural Probabilistic Model for Non-projective MST Parsing

In this paper, we propose a probabilistic parsing model, which defines a proper conditional probability distribution over non-projective dependency trees for a given sentence, using neural representations as inputs. The neural network architecture is based on bi-directional LSTM-CNNs which benefits…
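
The "proper conditional probability distribution over non-projective dependency trees" can be normalized exactly: by the directed matrix-tree theorem, the sum of exponentiated arc scores over all spanning arborescences is the determinant of a Laplacian minor. The random arc scores below are stand-ins for the learned BiLSTM-CNN scores, and this basic single-matrix form ignores the usual single-root refinements.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                          # words; index 0 is the root
A = np.exp(rng.normal(size=(n + 1, n + 1)))    # A[h, m]: weight of arc h -> m
np.fill_diagonal(A, 0.0)

# Laplacian minor over the non-root nodes 1..n:
# L[m, m] = total incoming weight of m, L[h, m] = -A[h, m] for h != m.
L = np.zeros((n, n))
for m in range(1, n + 1):
    L[m - 1, m - 1] = A[:, m].sum()
    for h in range(1, n + 1):
        if h != m:
            L[h - 1, m - 1] = -A[h, m]

Z = np.linalg.det(L)                           # sum over all arborescences
tree = [(0, 1), (1, 2), (1, 3), (3, 4)]        # one particular dependency tree
w = np.prod([A[h, m] for h, m in tree])
print("P(tree | sentence):", w / Z)
```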


Modelling Protagonist Goals and Desires in First-Person Narrative

Many genres of natural language text are narratively structured, a testament to our predilection for organizing our experiences as narratives. There is broad consensus that understanding a narrative requires identifying and tracking the goals and desires of the characters and their narrative outcom…


Gradual Learning of Deep Recurrent Neural Networks

Deep Recurrent Neural Networks (RNNs) achieve state-of-the-art results in many sequence-to-sequence tasks. However, deep RNNs are difficult to train and suffer from overfitting. We introduce a training method that trains the network gradually, and treats each layer individually, to achieve improved…
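
One plausible reading of "gradually, treating each layer individually" is a staged schedule: grow the stack one recurrent layer at a time and train each stage before adding the next. The sketch below only emits such a schedule; the freeze-lower-layers rule is an assumption for illustration, not necessarily the paper's exact procedure.

```python
def gradual_training_schedule(num_layers, epochs_per_stage):
    """Stage plan: add one layer per stage; optimize only the newest one."""
    stages = []
    for depth in range(1, num_layers + 1):
        stages.append({
            "layers_present": list(range(depth)),
            "layers_trained": [depth - 1],   # assumed freeze-lower rule
            "epochs": epochs_per_stage,
        })
    return stages

for stage in gradual_training_schedule(num_layers=3, epochs_per_stage=5):
    print(stage)
```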


Learning the Enigma with Recurrent Neural Networks

Recurrent neural networks (RNNs) represent the state of the art in translation, image captioning, and speech recognition. They are also capable of learning algorithmic tasks such as long addition, copying, and sorting from a set of training examples. We demonstrate that RNNs can learn decryption al…


Neural Networks Compression for Language Modeling

In this paper, we consider several compression techniques for the language modeling problem based on recurrent neural networks (RNNs). It is known that conventional RNNs, e.g., LSTM-based networks in language modeling, are characterized by either high space complexity or substantial inference time…
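
One representative technique from this family is low-rank factorization: replace a dense RNN weight matrix with a truncated-SVD pair of thin matrices, trading a controlled approximation error for far fewer parameters. The matrix size and rank below are arbitrary illustrations.

```python
import numpy as np

def low_rank(W, rank):
    """Truncated-SVD factorization W ~ A @ B with A (n, r) and B (r, m)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank, :]

rng = np.random.default_rng(0)
W = rng.normal(size=(512, 512))        # e.g., one LSTM projection matrix
A, B = low_rank(W, rank=64)
orig, comp = W.size, A.size + B.size
print(f"params: {orig} -> {comp} ({comp / orig:.1%} of original)")
# Random matrices compress poorly; trained weights typically have far
# more low-rank structure, so the reported error would be much smaller.
print("relative error:", np.linalg.norm(W - A @ B) / np.linalg.norm(W))
```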