Topic Tag: text

 Deconvolutional Latent-Variable Model for Text Sequence Matching

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we employ deconvolutional networks as the sequence decoder (ge…


 Text Compression for Sentiment Analysis via Evolutionary Algorithms

Can textual data be compressed intelligently without losing accuracy in evaluating sentiment? In this study, we propose a novel evolutionary compression algorithm, PARSEC (PARts-of-Speech for sEntiment Compression), which makes use of Parts-of-Speech tags to compress text in a way that sacrifices m…


 Neural Networks for Text Correction and Completion in Keyboard Decoding

Despite the ubiquity of mobile and wearable text messaging applications, the problem of keyboard text decoding has not been tackled sufficiently in light of the enormous success of deep learning Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) for natural language understand…


 GaKCo: a Fast GApped k-mer string Kernel using COunting

String Kernel (SK) techniques, especially those using gapped $k$-mers as features (gk), have obtained great success in classifying sequences like DNA, protein, and text. However, the state-of-the-art gk-SK runs extremely slowly when we increase the dictionary size ($\Sigma$) or allow more mismatches (…


 Leveraging Distributional Semantics for Multi-Label Learning

We present a novel and scalable label embedding framework for large-scale multi-label learning, a.k.a. ExMLDS (Extreme Multi-Label Learning using Distributional Semantics). Our approach draws inspiration from ideas rooted in distributional semantics, specifically the Skip Gram Negative Sampling (SGNS…


 Depression Scale Recognition from Audio, Visual and Text Analysis

Depression is a major mental health disorder that is rapidly affecting lives worldwide. Depression impacts not only the emotional but also the physical and psychological state of a person. Its symptoms include lack of interest in daily activities, feeling low, anxiety, frustration, loss of weight and eve…


 Word Vector Enrichment of Low Frequency Words in the Bag-of-Words Model for Short Text Multi-class Classification Problems

The bag-of-words model is a standard representation of text for many linear classifier learners. In many problem domains, linear classifiers are preferred over more complex models due to their efficiency, robustness and interpretability, and the bag-of-words text representation can capture sufficie…


 Unsupervised, Efficient and Semantic Expertise Retrieval

We introduce an unsupervised discriminative model for the task of retrieving experts in online document collections. We exclusively employ textual evidence and avoid explicit feature engineering by learning distributed word representations in an unsupervised way. We compare our model to state-of-th…


 A segmental framework for fully-unsupervised large-vocabulary speech recognition

Zero-resource speech technology is a growing research area that aims to develop methods for speech processing in the absence of transcriptions, lexicons, or language modelling text. Early term discovery systems focused on identifying isolated recurring patterns in a corpus, while more recent full-c…


 Disentangled Variational Auto-Encoder for Semi-supervised Learning

In this paper, we develop a novel approach for semi-supervised VAE without a classifier. Specifically, we propose a new model called SDVAE, which encodes the input data into a disentangled representation and a non-interpretable representation; the category information is then directly utilized to regular…


 Self-Guiding Multimodal LSTM – when we do not have a perfect training dataset for image captioning

In this paper, a self-guiding multimodal LSTM (sg-LSTM) image captioning model is proposed to handle an uncontrolled, imbalanced real-world image-sentence dataset. We collect the FlickrNYC dataset from Flickr as our testbed with 306,165 images, and the original text descriptions uploaded by the users are ut…


 An Automated Text Categorization Framework based on Hyperparameter Optimization

A great variety of text tasks such as topic or spam identification, user profiling, and sentiment analysis can be posed as a supervised learning problem and tackled using a text classifier. A text classifier consists of several subprocesses, some of which are general enough to be applied to any super…


 Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification

We address the problem of multi-class classification in the case where the number of classes is very large. We propose a double sampling strategy on top of a multi-class to binary reduction strategy, which transforms the original multi-class problem into a binary classification problem over pairs o…


 SAM: Semantic Attribute Modulation for Language Modeling and Style Variation

This paper presents Semantic Attribute Modulation (SAM) for language modeling and style variation. The semantic attribute modulation includes various document attributes, such as titles, authors, and document categories. We consider two types of attributes (title attributes and category attribut…


 Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems

Neural models have become ubiquitous in automatic speech recognition systems. While neural networks are typically used as acoustic models in more complex systems, recent studies have explored end-to-end speech recognition systems based on neural networks, which can be trained to directly predict te…


 Co-training for Demographic Classification Using Deep Learning from Label Proportions

Deep learning algorithms have recently produced state-of-the-art accuracy in many classification tasks, but this success is typically dependent on access to many annotated training examples. For domains without such data, an attractive alternative is to train models with light, or distant supervisi…


 Adding Context to Concept Trees

Concept Trees are a type of database that can organise arbitrary textual information using a very simple rule. Each tree tries to represent a single cohesive concept and the trees can link with each other for navigation and semantic purposes. The trees are therefore a type of semantic network and w…


 Deconvolutional Paragraph Representation Learning

Learning latent representations from long text sequences is an important first step in many natural language processing applications. Recurrent Neural Networks (RNNs) have become a cornerstone for this challenging task. However, the quality of sentences during RNN-based decoding (reconstruction) de…


 A Deep Reinforcement Learning Chatbot

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ense…


 Learning to Compose Domain-Specific Transformations for Data Augmentation

Data augmentation is a ubiquitous technique for increasing the size of labeled training sets by leveraging task-specific data transformations that preserve class labels. While it is often easy for domain experts to specify individual transformations, constructing and tuning the more sophisticated c…


 Enriching Linked Datasets with New Object Properties

Although several RDF knowledge bases are available through the LOD initiative, the ontology schema of such linked datasets is not very rich. In particular, they lack object properties. The problem of finding new object properties (and their instances) between any two given classes has not been inve…


 CSI: A Hybrid Deep Model for Fake News Detection

The topic of fake news has drawn attention both from the public and the academic communities. Such misinformation has the potential of affecting public opinion, providing an opportunity for malicious parties to manipulate the outcomes of public events such as elections. Because such high stakes are…


 Deep Reinforcement Learning: An Overview

We give an overview of recent exciting achievements of deep reinforcement learning (RL). We discuss six core elements, six important mechanisms, and twelve applications. We start with the background of machine learning, deep learning and reinforcement learning. Next we discuss core RL elements, includi…


 Modelling Protagonist Goals and Desires in First-Person Narrative

Many genres of natural language text are narratively structured, a testament to our predilection for organizing our experiences as narratives. There is broad consensus that understanding a narrative requires identifying and tracking the goals and desires of the characters and their narrative outcom…


 Learning Distributed Representations of Texts and Entities from Knowledge Base

We describe a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities. Given a text in the KB, we train our proposed model to predict entities that are relevant to the text. Our model is designed to be generic with the ability to address variou…