Topic Tag: LSTM


Learning Differentially Private Language Models Without Losing Accuracy

We demonstrate that it is possible to train large recurrent language models with user-level differential privacy guarantees without sacrificing predictive accuracy. Our work builds on recent advances in the training of deep networks on user-partitioned data and privacy accounting for stochastic gra…
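The core mechanism behind user-level privacy accounting of this kind is to bound each user's influence on the averaged model update and then add calibrated Gaussian noise. A minimal sketch of that idea (function and parameter names here are illustrative, not the paper's API):

```python
import numpy as np

def private_average(user_updates, clip_norm=1.0, noise_mult=1.1, rng=None):
    """Average per-user model updates with L2 clipping and Gaussian noise.

    A minimal sketch of the federated-averaging-with-DP idea; not the
    paper's implementation.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = []
    for u in user_updates:
        norm = np.linalg.norm(u)
        # Scale each user's update so its L2 norm is at most clip_norm;
        # this bounds any single user's influence on the average.
        clipped.append(u * min(1.0, clip_norm / max(norm, 1e-12)))
    avg = np.mean(clipped, axis=0)
    # The noise scale tracks the sensitivity clip_norm / n of the mean.
    sigma = noise_mult * clip_norm / len(user_updates)
    return avg + rng.normal(0.0, sigma, size=avg.shape)
```

With the noise multiplier set to zero the function reduces to plain clipped averaging, which makes the sensitivity bound easy to check.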


Learning to Transfer Initializations for Bayesian Hyperparameter Optimization

Hyperparameter optimization requires extensive evaluation of validation errors in order to find the best configuration of hyperparameters. Bayesian optimization is now popular for hyperparameter optimization, since it reduces the number of validation error evaluations required. Suppose that we ar…
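The reason Bayesian optimization needs few evaluations is its acquisition function, which trades off the surrogate model's predicted error against its uncertainty. A standard choice is expected improvement, sketched here for minimization (this is the generic acquisition, not code from the paper):

```python
import math

def expected_improvement(mu, sigma, best):
    """Expected improvement of a candidate whose surrogate posterior is
    N(mu, sigma^2), over the current best observed validation error.

    Generic Bayesian-optimization acquisition (minimization), shown as
    an illustration of why BO is sample-efficient.
    """
    if sigma <= 0.0:
        return max(best - mu, 0.0)  # no uncertainty: plain improvement
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # Expected gain below `best`, plus an exploration bonus from sigma.
    return (best - mu) * cdf + sigma * pdf
```

Candidates with either a low predicted error or a high uncertainty score well, which is exactly the behavior transferred initializations are meant to warm-start.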


Multiplicative LSTM for sequence modelling

We introduce multiplicative LSTM (mLSTM), a recurrent neural network architecture for sequence modelling that combines the long short-term memory (LSTM) and multiplicative recurrent neural network architectures. mLSTM is characterised by its ability to have different recurrent transition functions …
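The "different recurrent transition functions" come from an input-dependent intermediate state that replaces the previous hidden state inside the LSTM gates. A single mLSTM step can be sketched as follows (weight names are illustrative; biases omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlstm_step(x, h_prev, c_prev, W):
    """One multiplicative-LSTM step (sketch; W is a dict of weights).

    The intermediate state m, an elementwise product of projections of
    the input and the previous hidden state, feeds the gates in place of
    h_prev, so each input can select a different recurrent transition.
    """
    m = (W['Wmx'] @ x) * (W['Wmh'] @ h_prev)
    i = sigmoid(W['Wix'] @ x + W['Wim'] @ m)   # input gate
    f = sigmoid(W['Wfx'] @ x + W['Wfm'] @ m)   # forget gate
    o = sigmoid(W['Wox'] @ x + W['Wom'] @ m)   # output gate
    g = np.tanh(W['Wcx'] @ x + W['Wcm'] @ m)   # candidate cell update
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c
```

Replacing the elementwise product in `m` with a sum would recover an ordinary additive recurrent projection, which is the contrast the paper draws.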


Discrete Event, Continuous Time RNNs

We investigate recurrent neural network architectures for event-sequence processing. Event sequences, characterized by discrete observations stamped with continuous-valued times of occurrence, are challenging due to the potentially wide dynamic range of relevant time scales as well as interactions …


Optimizing Long Short-Term Memory Recurrent Neural Networks Using Ant Colony Optimization to Predict Turbine Engine Vibration

This article expands on research that has been done to develop a recurrent neural network (RNN) capable of predicting aircraft engine vibrations using long short-term memory (LSTM) neurons. LSTM RNNs can provide a more generalizable and robust method for prediction over analytical calculations of e…


Network of Recurrent Neural Networks

We describe a class of systems-theory-based neural networks called “Network Of Recurrent neural networks” (NOR), which introduces a new structural level to RNN-related models. In NOR, RNNs are viewed as high-level neurons and are used to build the high-level layers. More specifically…


Forecasting Across Time Series Databases using Long Short-Term Memory Networks on Groups of Similar Series

With the advent of Big Data, databases containing large quantities of similar time series are now available in many applications. Forecasting time series in these domains with traditional univariate forecasting procedures leaves great potential for producing accurate forecasts untapped. Recur…


Recurrent Network-based Deterministic Policy Gradient for Solving Bipedal Walking Challenge on Rugged Terrains

This paper presents a learning algorithm based on the Recurrent Network-based Deterministic Policy Gradient. Long Short-Term Memory is utilized to handle the Partially Observable Markov Decision Process framework. The novelty lies in improvements to LSTM networks: update of multi-step temporal diff…


To prune, or not to prune: exploring the efficacy of pruning for model compression

Model pruning seeks to induce sparsity in a deep neural network’s various connection matrices, thereby reducing the number of nonzero-valued parameters in the model. Recent reports (Han et al., 2015; Narang et al., 2017) prune deep networks at the cost of only a marginal loss in accuracy and …
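The sparsity-inducing step at the heart of magnitude pruning is simple: zero out the connections with the smallest absolute weights. A one-shot sketch (the paper studies gradual schedules on top of this; the function below is only the basic operation):

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude entries of w so that roughly a
    `sparsity` fraction of its entries are zero.

    A sketch of magnitude-based pruning, not the gradual schedule from
    the paper; ties at the threshold may prune a few extra entries.
    """
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    # The k-th smallest absolute value becomes the pruning threshold.
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)
```

In practice the surviving weights are then fine-tuned, which is where the "marginal loss in accuracy" reported above comes from.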


Learning Scalable Deep Kernels with Recurrent Structure

Many applications in speech, robotics, finance, and biology deal with sequential data, where ordering matters and recurrent structures are common. However, this structure cannot be easily captured by standard kernel functions. To model such structure, we propose expressive closed-form kernel functi…


LSTM: A Search Space Odyssey

Several variants of the Long Short-Term Memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in …


DeepTFP: Mobile Time Series Data Analytics based Traffic Flow Prediction

Traffic flow prediction is an important research issue for avoiding traffic congestion in transportation systems. Congestion can be avoided by knowing the traffic flow and then conducting transportation planning accordingly. Achieving accurate traffic flow prediction is challenging, as the prediction is affect…


Fine-grained Event Learning of Human-Object Interaction with LSTM-CRF

Event learning is one of the most important problems in AI. However, notwithstanding significant research efforts, it is still a very complex task, especially when the events involve the interaction of humans or agents with other objects, as it requires modeling human kinematics and object movement…


Improving speech recognition by revising gated recurrent units

Speech recognition is benefiting greatly from deep learning, with substantial gains obtained by modern Recurrent Neural Networks (RNNs). The most popular RNNs are Long Short-Term Memory (LSTM) networks, which typically reach state-of-the-art performance in many tasks thanks to their a…


Meta-SGD: Learning to Learn Quickly for Few-Shot Learning

Few-shot learning is challenging for learning algorithms that learn each task in isolation and from scratch. In contrast, meta-learning learns from many related tasks a meta-learner that can learn a new task more accurately and faster with fewer examples, where the choice of meta-learners is crucia…


House Price Prediction Using LSTM

In this paper, we use house price data ranging from January 2004 to October 2016 to predict the average house price for November and December 2016 for each district in Beijing, Shanghai, Guangzhou and Shenzhen. We apply the Autoregressive Integrated Moving Average model to generate the baseline w…
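In its simplest non-differenced form, the ARIMA baseline mentioned above reduces to an autoregressive fit. A toy least-squares AR(p) forecaster (illustrative only; it is not the paper's model, and real use would difference the series and select orders) can be sketched as:

```python
import numpy as np

def ar_forecast(series, p=2, steps=2):
    """Fit an AR(p) model y_t = c + a_1*y_{t-1} + ... + a_p*y_{t-p} by
    least squares and forecast `steps` values ahead.

    A toy stand-in for the ARIMA baseline described above.
    """
    y = [float(v) for v in series]
    # Design matrix: intercept plus the p most recent lags for each row.
    X = np.array([[1.0] + y[t - p:t][::-1] for t in range(p, len(y))])
    coef, *_ = np.linalg.lstsq(X, np.array(y[p:]), rcond=None)
    hist = list(y)
    preds = []
    for _ in range(steps):
        x_next = np.array([1.0] + hist[-p:][::-1])
        preds.append(float(x_next @ coef))
        hist.append(preds[-1])  # feed forecasts back in for multi-step
    return preds
```

On a linear trend an AR(2) fit extrapolates the trend exactly, which is why such baselines are a natural comparison point for LSTM forecasters.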


Cross-modal Recurrent Models for Human Weight Objective Prediction from Multimodal Time-series Data

We analyse multimodal time-series data corresponding to weight, sleep and steps measurements, derived from a dataset spanning 15000 users, collected across a range of consumer-grade health devices by Nokia Digital Health – Withings. We focus on predicting whether a user will successfully achi…


Gated Graph Sequence Neural Networks

Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs. Our starting point is previous work on Graph Neural Networks (Scarselli et al., …
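The propagation step in this family of models is GRU-like: each node aggregates its neighbors' states through the adjacency structure, then gates the update. A compact sketch (dense adjacency, one edge type, illustrative weight names; the actual model supports typed edges and output models):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ggnn_step(H, A, W):
    """One GRU-style propagation step over a graph (sketch).

    H is the (nodes, d) matrix of node states, A the (nodes, nodes)
    adjacency matrix, and W a dict of weight matrices. Each node first
    aggregates neighbor states via A, then updates with GRU gates.
    """
    M = A @ H                                   # message passing
    z = sigmoid(M @ W['Wz'] + H @ W['Uz'])      # update gate
    r = sigmoid(M @ W['Wr'] + H @ W['Ur'])      # reset gate
    H_tilde = np.tanh(M @ W['Wh'] + (r * H) @ W['Uh'])
    return (1.0 - z) * H + z * H_tilde
```

Running this step T times propagates information T hops across the graph, which is what lets node representations reflect graph-level structure.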


Deep Recurrent NMF for Speech Separation by Unfolding Iterative Thresholding

In this paper, we propose a novel recurrent neural network architecture for speech separation. This architecture is constructed by unfolding the iterations of a sequential iterative soft-thresholding algorithm (ISTA) that solves the optimization problem for sparse nonnegative matrix factorization (…
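The iteration being unfolded is ISTA applied to the sparse nonnegative coding problem: a gradient step on the data term followed by a nonnegative soft-threshold. A minimal sketch of that iteration (variable names illustrative; the paper turns these iterations into trainable recurrent layers):

```python
import numpy as np

def ista_sparse_nmf(x, W, lam=0.1, iters=100):
    """Solve min_h 0.5*||x - W h||^2 + lam*||h||_1 with h >= 0 by ISTA.

    This is the plain iteration that unfolding turns into a recurrent
    network; a sketch, not the paper's trained model.
    """
    step = 1.0 / (np.linalg.norm(W, 2) ** 2)   # 1/L for the data term
    h = np.zeros(W.shape[1])
    for _ in range(iters):
        grad = W.T @ (W @ h - x)
        # Nonnegative soft-threshold: gradient step, shrink by step*lam,
        # then clip at zero to keep the code nonnegative.
        h = np.maximum(h - step * grad - step * lam, 0.0)
    return h
```

Unfolding replaces the fixed `W` and `step` with learned, layer-specific parameters, so each "iteration" becomes one layer of the recurrent network.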


Deconvolutional Latent-Variable Model for Text Sequence Matching

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we employ deconvolutional networks as the sequence decoder (ge…


Language modeling with Neural trans-dimensional random fields

Trans-dimensional random field language models (TRF LMs) have recently been introduced, where sentences are modeled as a collection of random fields. The TRF approach has been shown to be computationally more efficient in inference than LSTM LMs, with comparable performance, and …


Self-Guiding Multimodal LSTM – when we do not have a perfect training dataset for image captioning

In this paper, a self-guiding multimodal LSTM (sg-LSTM) image captioning model is proposed to handle an uncontrolled, imbalanced real-world image-sentence dataset. We collect the FlickrNYC dataset from Flickr as our testbed, with 306,165 images, and the original text descriptions uploaded by the users are ut…


Learning Intrinsic Sparse Structures within Long Short-term Memory

Model compression is significant for the wide adoption of Recurrent Neural Networks (RNNs), both in user devices with limited resources and in business clusters requiring quick responses to large-scale service requests. In this work, we focus on reducing the sizes of basic structures (including in…


Deep Learning for Automatic Stereotypical Motor Movement Detection using Wearable Sensors in Autism Spectrum Disorders

Autism Spectrum Disorders are associated with atypical movements, of which stereotypical motor movements (SMMs) interfere with learning and social interaction. Automatic SMM detection using inertial measurement units (IMUs) remains complex due to strong intra- and inter-subject variability, e…


Assessing State-of-the-Art Sentiment Models on State-of-the-Art Sentiment Datasets

There has been considerable progress in sentiment analysis over the past 10 years, including the proposal of new methods and the creation of benchmark datasets. Some papers, however, compare models on only one or two datasets, either because of time constraints or becau…