#### Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems

Neural models have become ubiquitous in automatic speech recognition systems. While neural networks are typically used as acoustic models in more complex systems, recent studies have explored end-to-end speech recognition systems based on neural networks, which can be trained to directly predict te…

#### Similarity Embedding Network for Unsupervised Sequential Pattern Learning by Playing Music Puzzle Games

Real-world time series data are rich in sequential and structural patterns. Music, for example, often has a multi-level organization of musical events, with higher-level building blocks made up of smaller recurrent patterns. For computers to understand and process such time series data, we need a …

#### Learning with Opponent-Learning Awareness

Multi-agent settings are quickly gathering importance in machine learning. Beyond a plethora of recent work on deep multi-agent reinforcement learning, hierarchical reinforcement learning, generative adversarial networks and decentralized optimization can all be seen as instances of this setting. H…

#### Biased Importance Sampling for Deep Neural Network Training

Importance sampling has been successfully used to accelerate stochastic optimization in many convex problems. However, the lack of an efficient way to calculate the importance still hinders its application to Deep Learning. In this paper, we show that the loss value can be used as an alternative im…
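The core idea, using each example's current loss as a surrogate importance score, can be sketched in a few lines. This is a minimal numpy illustration of loss-proportional sampling with bias-correcting weights; the loss values are made up, and the paper's exact estimator may differ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-example losses from a forward pass (made-up values).
losses = np.array([0.1, 2.0, 0.5, 4.0, 0.2, 1.2])

# Sample a minibatch with probability proportional to each example's
# current loss, so high-loss examples are revisited more often.
probs = losses / losses.sum()
batch = rng.choice(len(losses), size=4, replace=True, p=probs)

# Importance weights undo the bias of non-uniform sampling: an example
# drawn with probability p_i gets weight 1/(N * p_i) instead of 1/N.
weights = 1.0 / (len(losses) * probs[batch])
```

Sampling with replacement keeps the weighted gradient estimate unbiased; in practice the loss scores would be refreshed as training proceeds.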

#### Assessing State-of-the-Art Sentiment Models on State-of-the-Art Sentiment Datasets

There has been a good amount of progress in sentiment analysis over the past 10 years, including the proposal of new methods and the creation of benchmark datasets. In some papers, however, there is a tendency to compare models only on one or two datasets, either because of time constraints or becau…

#### Parallelizing Linear Recurrent Neural Nets Over Sequence Length

Recurrent neural networks (RNNs) are widely used to model sequential data but their non-linear dependencies between sequence elements prevent parallelizing training over sequence length. We show the training of RNNs with only linear sequential dependencies can be parallelized over the sequence leng…
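For a scalar linear recurrence $h_t = a_t h_{t-1} + b_t$, the key observation is that composing two steps is itself a step of the same form, and the composition is associative, which is what allows a parallel scan. A minimal sketch under that simplification (the paper treats the general, vector-valued case):

```python
import functools
import numpy as np

def linear_recurrence_sequential(a, b, h0=0.0):
    """h_t = a_t * h_{t-1} + b_t, evaluated step by step."""
    h, out = h0, []
    for at, bt in zip(a, b):
        h = at * h + bt
        out.append(h)
    return np.array(out)

def combine(p, q):
    """Associative composition of two steps (a, b): applying p then q
    is again a linear map h -> a*h + b. Associativity is what lets a
    parallel scan evaluate the sequence in O(log T) depth."""
    (a1, b1), (a2, b2) = p, q
    return (a1 * a2, a2 * b1 + b2)

a = [0.5, 0.9, 0.2, 0.7]
b = [1.0, -0.5, 0.3, 0.1]
A, B = functools.reduce(combine, zip(a, b))

# The composed map applied to h0 matches the sequential recurrence.
h_seq = linear_recurrence_sequential(a, b, h0=2.0)[-1]
h_par = A * 2.0 + B
```

`functools.reduce` applies `combine` left to right here; a real parallel implementation would combine pairs in a balanced tree, which is valid precisely because `combine` is associative.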

#### Shifting Mean Activation Towards Zero with Bipolar Activation Functions

We propose a simple extension to the ReLU-family of activation functions that allows them to shift the mean activation across a layer towards zero. Combined with proper weight initialization, this alleviates the need for normalization layers. We explore the training of deep vanilla recurrent neural…
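One simple way to realize such a shift, shown here as an illustrative sketch rather than the paper's exact definition, is a "bipolar" ReLU that applies $\mathrm{relu}(x)$ to half the units and its point reflection $-\mathrm{relu}(-x)$ to the other half, so positive and negative activations roughly cancel across the layer:

```python
import numpy as np

def bipolar_relu(x):
    """Apply ReLU to even-indexed units and its point reflection
    -relu(-x) (equivalently, min(x, 0)) to odd-indexed units, pulling
    the mean activation across the layer toward zero."""
    out = np.empty_like(x)
    out[..., ::2] = np.maximum(x[..., ::2], 0.0)   # standard ReLU
    out[..., 1::2] = np.minimum(x[..., 1::2], 0.0)  # -relu(-x)
    return out
```

For a zero-mean input, the positive mean of the even units is offset by the negative mean of the odd units, unlike plain ReLU whose output mean is strictly positive.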

#### Affective Neural Response Generation

Existing neural conversational models process natural language primarily on a lexico-syntactic level, thereby ignoring one of the most crucial components of human-to-human dialogue: its affective content. We take a step in this direction by proposing three novel ways to incorporate affective/emotio…

#### Event Representations for Automated Story Generation with Deep Neural Nets

Automated story generation is the problem of automatically selecting a sequence of events, actions, or words that can be told as a story. We seek to develop a system that can generate stories by learning everything it needs to know from textual story corpora. To date, recurrent neural networks that…

#### Refining Source Representations with Relation Networks for Neural Machine Translation

Although neural machine translation (NMT) with the encoder-decoder framework has achieved great success in recent times, it still suffers from some drawbacks: RNNs tend to forget old information that is often useful, and the encoder operates only on words without considering word relationships.…

#### Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets

Many real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we show that addressing bot…

#### RRA: Recurrent Residual Attention for Sequence Learning

In this paper, we propose a recurrent neural network (RNN) with residual attention (RRA) to learn long-range dependencies from sequential data. We propose to add residual connections across timesteps to RNN, which explicitly enhances the interaction between current state and hidden states that are …
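The core idea, residual connections across timesteps, can be sketched as follows. This is a hedged simplification: the paper additionally weights a window of past hidden states with learned attention, whereas here a single state from K steps back is added directly:

```python
import numpy as np

def rnn_with_timestep_residual(X, Wx, Wh, K=3):
    """Vanilla tanh RNN whose state update adds the hidden state from
    K timesteps back, strengthening the interaction between the
    current state and distant past states (illustrative sketch)."""
    T, d = X.shape[0], Wh.shape[0]
    H = np.zeros((T + 1, d))  # H[t] holds the state after step t
    for t in range(1, T + 1):
        residual = H[t - K] if t - K >= 0 else np.zeros(d)
        H[t] = np.tanh(X[t - 1] @ Wx + H[t - 1] @ Wh) + residual
    return H[1:]
```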

#### A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data

Gated Recurrent Unit (GRU) is a recently published variant of the Long Short-Term Memory (LSTM) network, designed to solve the vanishing gradient and exploding gradient problems. However, its main objective is to solve the long-term dependency problem in Recurrent Neural Networks (RNNs), which prev…
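The generic idea of such a hybrid is to keep the recurrent encoder unchanged and swap the usual softmax cross-entropy head for an SVM-style hinge loss on the network's output scores. A sketch using a Crammer-Singer multiclass hinge (the paper's exact SVM formulation may differ):

```python
import numpy as np

def multiclass_hinge_loss(scores, y, margin=1.0):
    """Hinge loss on output scores of shape (N, num_classes), standing
    in for softmax cross-entropy: each wrong class is penalized for
    coming within `margin` of the true class's score."""
    n = scores.shape[0]
    correct = scores[np.arange(n), y][:, None]
    margins = np.maximum(0.0, scores - correct + margin)
    margins[np.arange(n), y] = 0.0  # no penalty for the true class
    return margins.sum(axis=1).mean()
```

The loss is zero exactly when every true-class score beats all others by at least the margin, which is the SVM-style decision criterion.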

#### Deconvolutional Paragraph Representation Learning

Learning latent representations from long text sequences is an important first step in many natural language processing applications. Recurrent Neural Networks (RNNs) have become a cornerstone for this challenging task. However, the quality of sentences during RNN-based decoding (reconstruction) de…

#### Training RNNs as Fast as CNNs

Recurrent neural networks scale poorly due to the intrinsic difficulty in parallelizing their state computations. For instance, the forward pass computation of $h_t$ is blocked until the entire computation of $h_{t-1}$ finishes, which is a major bottleneck for parallel computing. In this work, we p…
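The bottleneck described above, and the style of fix, can be sketched as follows. This is a hedged SRU-style reconstruction, not the paper's exact equations: every matrix multiplication is hoisted out of the time loop and computed for all timesteps at once, leaving only cheap elementwise updates inside the loop.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru_forward(X, W, Wf, Wr, bf, br):
    """All matrix products are evaluated for every timestep up front
    (parallel over t); the remaining loop carries only elementwise
    state updates, which no longer block on full matmuls."""
    U = X @ W                  # candidate inputs, shape (T, d)
    F = sigmoid(X @ Wf + bf)   # forget gates,     shape (T, d)
    R = sigmoid(X @ Wr + br)   # highway gates,    shape (T, d)
    c = np.zeros(U.shape[1])
    H = np.empty_like(U)
    for t in range(U.shape[0]):  # no matmul inside the loop
        c = F[t] * c + (1.0 - F[t]) * U[t]
        H[t] = R[t] * np.tanh(c) + (1.0 - R[t]) * X[t]
    return H
```

The highway term mixes in `X[t]` directly, which assumes matching input and hidden dimensions (a simplification of this sketch).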

#### Cycles in adversarial regularized learning

Regularized learning is a fundamental technique in online optimization, machine learning and many other fields of computer science. A natural question that arises in these settings is how regularized learning algorithms behave when faced against each other. We study a natural formulation of this pr…

#### Recurrent Ladder Networks

We propose a recurrent extension of the Ladder networks whose structure is motivated by the inference required in hierarchical latent variable models. We demonstrate that the recurrent Ladder is able to handle a wide variety of complex learning tasks that benefit from iterative inference and tempor…

#### A Real-time Trainable and Clock-less Spiking Neural Network with 1R Memristive Synapses

Owing to their unexplored but potentially superior computational capability and remarkably low power consumption for executing brain-like tasks, spiking neural networks (SNNs) have drawn the attention of several research groups. For modeling a network of spiking neurons and synapses highly paralle…

#### Learned Optimizers that Scale and Generalize

Learning to learn has emerged as an important direction for achieving artificial intelligence. Two of the primary barriers to its adoption are an inability to scale to larger problems and a limited ability to generalize to new tasks. We introduce a learned gradient descent optimizer that generalize…

#### RNN-based Early Cyber-Attack Detection for the Tennessee Eastman Process

An RNN-based forecasting approach is used for early detection of anomalies in industrial multivariate time series data from a simulated Tennessee Eastman Process (TEP) subject to many cyber-attacks. This work extends a previously proposed LSTM-based approach to fault detection in simpler data. It is consi…
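A generic residual-based detector along these lines flags timesteps whose forecasting error stands out against the recent error history (illustrative only; the paper's exact scoring rule may differ):

```python
import numpy as np

def detect_anomalies(y_true, y_pred, window=10, k=3.0):
    """Flag timesteps whose absolute forecasting error exceeds the
    mean plus k standard deviations of the preceding `window` errors,
    the usual thresholding step on top of any forecaster."""
    err = np.abs(y_true - y_pred)
    flags = np.zeros(len(err), dtype=bool)
    for t in range(window, len(err)):
        mu, sd = err[t - window:t].mean(), err[t - window:t].std()
        flags[t] = err[t] > mu + k * sd
    return flags
```

In the attack-detection setting, `y_pred` would come from the RNN forecaster, and a sustained run of flags (rather than a single one) would typically trigger an alarm.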

#### Approximating meta-heuristics with homotopic recurrent neural networks

Many combinatorial optimisation problems are non-polynomial (NP) hard, i.e., they cannot be solved in polynomial time. One such problem is finding the shortest route between two nodes on a graph. Meta-heuristic algorithms such as $A^{*}$ along with mixed-integer progr…
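For reference, the classical baseline named in the abstract, best-first search in the $A^{*}$ family, over a weighted directed graph (with a zero heuristic this reduces to Dijkstra's algorithm):

```python
import heapq

def a_star(graph, start, goal, h):
    """Textbook A* over an adjacency dict {node: [(neighbor, cost), ...]}
    with an admissible heuristic h(node); returns the shortest-path
    cost from start to goal, or None if the goal is unreachable."""
    frontier = [(h(start), 0.0, start)]  # (priority, cost so far, node)
    best = {start: 0.0}
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node == goal:
            return g
        if g > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, []):
            ng = g + cost
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return None
```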

#### Self-Normalizing Neural Networks

Deep Learning has revolutionized vision via convolutional neural networks (CNNs) and natural language processing via recurrent neural networks (RNNs). However, success stories of Deep Learning with standard feed-forward neural networks (FNNs) are rare. FNNs that perform well are typically shallow a…
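The full paper's central ingredient is the SELU activation, whose fixed-point constants are chosen so that, with proper initialization, activations are pulled toward zero mean and unit variance across layers. A minimal sketch:

```python
import numpy as np

# Fixed-point constants from the SELU derivation (approximate values).
ALPHA = 1.6732632423543772
LAMBDA = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit: lambda * x for x > 0 and
    lambda * alpha * (exp(x) - 1) otherwise. The constants make
    zero mean / unit variance a fixed point of the layer map."""
    return LAMBDA * np.where(x > 0, x, ALPHA * (np.exp(x) - 1.0))
```

Feeding a standard-normal input through `selu` yields an output whose mean stays near 0 and variance near 1, which is the self-normalizing property.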

#### Interacting Attention-gated Recurrent Networks for Recommendation

Capturing the temporal dynamics of user preferences over items is important for recommendation. Existing methods mainly assume that all time steps in user-item interaction history are equally relevant to recommendation, which however does not apply in real-world scenarios where user-item interactio…

#### Deep Residual Bidir-LSTM for Human Activity Recognition Using Wearable Sensors

Human activity recognition (HAR) has become a popular topic in research because of its wide application. With the development of deep learning, new ideas have appeared to address HAR problems. Here, a deep network architecture using residual bidirectional long short-term memory (LSTM) cells is prop…

#### Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge

Multiple automakers have in development or in production automated driving systems (ADS) that offer freeway-pilot functions. This type of ADS is typically limited to restricted-access freeways only, that is, the transition from manual to automated modes takes place only after the ramp merging proce…