#### Informational Neurobayesian Approach to Neural Networks Training. Opportunities and Prospects

A study of the classification problem in context of information theory is presented in the paper. Current research in that field is focused on optimisation and bayesian approach. Although that gives satisfying results, they require a vast amount of data and computations to train on. Authors propose…

#### Quantum-assisted learning of hardware-embedded probabilistic graphical models

Mainstream machine learning techniques such as deep learning and probabilistic programming rely heavily on sampling from generally intractable probability distributions. There is increasing interest in the potential advantages of using quantum computing technologies as sampling engines to speed up …

#### Recent Advances in Convolutional Neural Networks

Convolutional Neural Network DNN image language speech

In the last few years, deep learning has led to very good performance on a variety of problems, such as visual recognition, speech recognition and natural language processing. Among different types of deep neural networks, convolutional neural networks have been most extensively studied. Leveraging…

#### Deep Extreme Multi-label Learning

Extreme multi-label learning (XML) or classification has been a practical and important problem since the boom of big data. The main challenge lies in the exponential label space which involves $2^L$ possible label sets when the label dimension $L$ is very large, e.g., in millions for Wikipedia lab…

#### Compositional Nonparametric Prediction: Statistical Efficiency and Greedy Regression Algorithm

In this paper, we propose a compositional nonparametric method in which a model is expressed as a labeled binary tree of $2k+1$ nodes, where each node is either a summation, a multiplication, or the application of one of the $q$ basis functions to one of the $p$ covariates. We show that in order to…

#### Clustering with Missing Features: A Penalized Dissimilarity Measure based approach

Many real-world clustering problems are plagued by incomplete data characterized by missing or absent features for some or all of the data instances. Traditional clustering methods cannot be directly applied to such data without preprocessing by imputation or marginalization techniques. In this art…

#### Binary Classification from Positive-Confidence Data

Reducing labeling costs in supervised learning is a critical issue in many practical machine learning applications. In this paper, we consider positive-confidence (Pconf) classification, the problem of training a binary classifier only from positive data equipped with confidence. Pconf classificati…

#### Swift Linked Data Miner: Mining OWL 2 EL class expressions directly from online RDF datasets

In this study, we present Swift Linked Data Miner, an interruptible algorithm that can directly mine an online Linked Data source (e.g., a SPARQL endpoint) for OWL 2 EL class expressions to extend an ontology with new SubClassOf: axioms. The algorithm works by downloading only a small part of the L…

#### Meta-Learning via Feature-Label Memory Network

Deep learning typically requires training a very capable architecture using large datasets. However, many important learning problems demand an ability to draw valid inferences from small size datasets, and such problems pose a particular challenge for deep learning. In this regard, various researc…

#### Decision Trees for Helpdesk Advisor Graphs

We use decision trees to build a helpdesk agent reference network to facilitate the on-the-job advising of junior or less experienced staff on how to better address telecommunication customer fault reports. Such reports generate field measurements and remote measurements which, when coupled with lo…

#### Protein Folding Optimization using Differential Evolution Extended with Local Search and Component Reinitialization

This paper presents a novel differential evolution algorithm for protein folding optimization that is applied to a three-dimensional AB off-lattice model. The proposed algorithm includes two new mechanisms. A local search is used to improve convergence speed and to reduce the runtime complexity of …

#### ProLanGO: Protein Function Prediction Using Neural~Machine Translation Based on a Recurrent Neural Network

With the development of next generation sequencing techniques, it is fast and cheap to determine protein sequences but relatively slow and expensive to extract useful information from protein sequences because of limitations of traditional biological experimental techniques. Protein function predic…

#### Ricci Curvature and the Manifold Learning Problem

Consider a sample of $n$ points taken i.i.d from a submanifold $Sigma$ of Euclidean space. We show that there is a way to estimate the Ricci curvature of $Sigma$ with respect to the induced metric from the sample. Our method is grounded in the notions of emph{Carr'{e} du Champ} for diffusion semi-g…

#### Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data

One of the defining properties of deep learning is that models are chosen to have many more parameters than available training data. In light of this capacity for overfitting, it is remarkable that simple algorithms like SGD reliably return solutions with low test error. One roadblock to explaining…

#### Consequentialist conditional cooperation in social dilemmas with imperfect information

Social dilemmas, where mutual cooperation can lead to high payoffs but participants face incentives to cheat, are ubiquitous in multi-agent interaction. We wish to construct agents that cooperate with pure cooperators, avoid exploitation by pure defectors, and incentivize cooperation from the rest.…

#### Learning Differentially Private Language Models Without Losing Accuracy

We demonstrate that it is possible to train large recurrent language models with user-level differential privacy guarantees without sacrificing predictive accuracy. Our work builds on recent advances in the training of deep networks on user-partitioned data and privacy accounting for stochastic gra…

#### Asynchronous Decentralized Parallel Stochastic Gradient Descent

Recent work shows that decentralized parallel stochastic gradient decent (D-PSGD) can outperform its centralized counterpart both theoretically and practically. While asynchronous parallelism is a powerful technology to improve the efficiency of parallelism in distributed machine learning platforms…

#### First-Person Perceptual Guidance Behavior Decomposition using Active Constraint Classification

Humans exhibit a wide range of adaptive and robust dynamic motion behavior that is yet unmatched by autonomous control systems. These capabilities are essential for real-time behavior generation in cluttered environments. Recent work suggests that human capabilities rely on task structure learning …

#### Concept Drift Learning with Alternating Learners

Data-driven predictive analytics are in use today across a number of industrial applications, but further integration is hindered by the requirement of similarity among model training and test data distributions. This paper addresses the need of learning from possibly nonstationary data streams, or…

#### Characterization of Gradient Dominance and Regularity Conditions for Neural Networks

The past decade has witnessed a successful application of deep learning to solving many challenging problems in machine learning and artificial intelligence. However, the loss functions of deep neural networks (especially nonlinear networks) are still far from being well understood from a theoretic…

#### Unit Commitment using Nearest Neighbor as a Short-Term Proxy

We devise the Unit Commitment Nearest Neighbor (UCNN) algorithm to be used as a proxy for quickly approximating outcomes of short-term decisions, to make tractable hierarchical long-term assessment and planning for large power systems. Experimental results on an updated versions of IEEE-RTS79 and I…

#### A Bayesian Nonparametric Method for Clustering Imputation, and Forecasting in Multivariate Time Series

This article proposes a Bayesian nonparametric method for forecasting, imputation, and clustering in sparsely observed, multivariate time series. The method is appropriate for jointly modeling hundreds of time series with widely varying, non-stationary dynamics. Given a collection of $N$ time serie…

#### Graph Embedding with Rich Information through Bipartite Heterogeneous Network

Graph embedding has attracted increasing attention due to its critical application in social network analysis. Most existing algorithms for graph embedding only rely on the typology information and fail to use the copious information in nodes as well as edges. As a result, their performance for man…

#### The Origins of Computational Mechanics: A Brief Intellectual History and Several Clarifications

The principle goal of computational mechanics is to define pattern and structure so that the organization of complex systems can be detected and quantified. Computational mechanics developed from efforts in the 1970s and early 1980s to identify strange attractors as the mechanism driving weak fluid…

#### Photo-Guided Exploration of Volume Data Features

In this work, we pose the question of whether, by considering qualitative information such as a sample target image as input, one can produce a rendered image of scientific data that is similar to the target. The algorithm resulting from our research allows one to ask the question of whether featur…