Topic Tag: clustering

 Depression Scale Recognition from Audio, Visual and Text Analysis

      

Depression is a major mental health disorder that is rapidly affecting lives worldwide. Depression impacts not only the emotional but also the physical and psychological state of a person. Its symptoms include lack of interest in daily activities, feeling low, anxiety, frustration, loss of weight and eve…


 Autoencoder-Driven Weather Clustering for Source Estimation during Nuclear Events

Emergency response applications for nuclear or radiological events can be significantly improved via deep feature learning due to the hidden complexity of the data and models involved. In this paper we present a novel methodology for rapid source estimation during radiological releases based on dee…


 Learning Mixtures of Multi-Output Regression Models by Correlation Clustering for Multi-View Data

In many datasets, different parts of the data may have their own patterns of correlation, a structure that can be modeled as a mixture of local linear correlation models. The task of finding these mixtures is known as correlation clustering. In this work, we propose a linear correlation clustering …


 Supervising Unsupervised Learning

 

We introduce a framework to leverage knowledge acquired from a repository of (heterogeneous) supervised datasets to new unsupervised datasets. Our perspective avoids the subjectivity inherent in unsupervised learning by reducing it to supervised learning, and provides a principled way to evaluate u…


 Subspace Clustering using Ensembles of K-Subspaces

We present a novel approach to the subspace clustering problem that leverages ensembles of the $K$-subspaces (KSS) algorithm via the evidence accumulation clustering framework. Our algorithm forms a co-association matrix whose $(i,j)$th entry is the number of times points $i$ and $j$ are clustered …
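The co-association matrix mentioned above can be illustrated with a small sketch. The abstract is truncated, so this assumes the usual evidence accumulation definition, where entry $(i,j)$ counts how often points $i$ and $j$ receive the same label across several base clusterings. The function name `co_association` and the three example labelings are hypothetical, and the base clusterings are hard-coded here rather than produced by KSS.

```python
def co_association(labelings):
    """Return a matrix whose (i, j) entry counts the labelings
    in which points i and j share a cluster label."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        if len(labels) != n:
            raise ValueError("every labeling must cover the same points")
        for i in range(n):
            for j in range(n):
                # Label ids are arbitrary per run; only equality matters.
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return counts


# Three hypothetical base clusterings of five points.
runs = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
A = co_association(runs)
for row in A:
    print(row)
```

Because only label equality within a run is compared, the matrix is unaffected by how each run happens to number its clusters, which is what makes it a convenient way to combine many clusterings.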


 Leveraging Union of Subspace Structure to Improve Constrained Clustering

 

Many clustering problems in computer vision and other contexts are also classification problems, where each cluster shares a meaningful label. Subspace clustering algorithms in particular are often applied to problems that fit this description, for example with face images or handwritten digits. Wh…


 Rapid Near-Neighbor Interaction of High-dimensional Data via Hierarchical Clustering

Calculation of near-neighbor interactions among high-dimensional, irregularly distributed data points is a fundamental task in many graph-based or kernel-based machine learning algorithms and applications. Such calculations, involving large, sparse interaction matrices, expose the limitation of con…


 Community Recovery in Hypergraphs

 

Community recovery is a central problem that arises in a wide variety of applications such as network clustering, motion segmentation, face clustering and protein complex detection. The objective of the problem is to cluster data points into distinct communities based on a set of measurements, each…


 Multi-view Graph Embedding with Hub Detection for Brain Network Analysis

 

Multi-view graph embedding has become a widely studied problem in the area of graph learning. Most of the existing works on multi-view graph embedding aim to find a shared common node embedding across all the views of the graph by combining the different views in a specific way. Hub detection, as a…


 Machine Learning Friendly Set Version of Johnson-Lindenstrauss Lemma

In this paper we make a novel use of the Johnson-Lindenstrauss Lemma. The Lemma has an existential form saying that there exists a JL transformation $f$ of the data points into a lower-dimensional space such that all of them fall into a predefined error range $\delta$. We formulate in this paper a theor…


 Clustering of Data with Missing Entries using Non-convex Fusion Penalties

The presence of missing entries in data often creates challenges for pattern recognition algorithms. Traditional algorithms for clustering data assume that all the feature values are known for every data point. We propose a method to cluster data in the presence of missing information. Unlike conve…


 Inhomogeneous Hypergraph Clustering with Applications

 

Hypergraph partitioning is an important problem in machine learning, computer vision and network analytics. A widely used method for hypergraph partitioning relies on minimizing a normalized sum of the costs of partitioning hyperedges across clusters. Algorithmic solutions based on this approach as…


 A Statistical Approach to Increase Classification Accuracy in Supervised Learning Algorithms

Probabilistic mixture models have been widely used for different machine learning and pattern recognition tasks such as clustering, dimensionality reduction, and classification. In this paper, we focus on trying to solve the most common challenges related to supervised learning algorithms by using …


 The Advantage of Evidential Attributes in Social Networks

Nowadays, there are many approaches designed for the task of detecting communities in social networks. Among them, some methods only consider the topological graph structure, while others make use of both the graph structure and the node attributes. In real-world networks, there are many uncertain …


 An embedded segmental K-means model for unsupervised segmentation and clustering of speech

  

Unsupervised segmentation and clustering of unlabelled speech are core problems in zero-resource speech processing. Most approaches lie at methodological extremes: some use probabilistic Bayesian models with convergence guarantees, while others opt for more efficient heuristic techniques. Despite c…


 Discriminative Similarity for Clustering and Semi-Supervised Learning

Similarity-based clustering and semi-supervised learning methods separate the data into clusters or classes according to the pairwise similarity between the data, and the pairwise similarity is crucial for their performance. In this paper, we propose a novel discriminative similarity learning frame…


 Graph sketching-based Space-efficient Data Clustering

In this paper, we address the problem of recovering arbitrary-shaped data clusters from datasets under tight space constraints, as is the case, for instance, in the Internet of Things environment when analysis algorithms are directly deployed on resource-limited mobile devices collecting …


 Recovery Conditions and Sampling Strategies for Network Lasso

The network Lasso is a recently proposed convex optimization method for machine learning from massive network structured datasets, i.e., big data over networks. It is a variant of the well-known least absolute shrinkage and selection operator (Lasso), which underlies many methods in learning an…


 A heterogeneity based iterative clustering approach for obtaining samples with reduced bias

Medical and social sciences demand sampling techniques which are robust, reliable, replicable and give samples with the least bias. The majority of sampling applications use randomized sampling, albeit with stratification where applicable to lower the bias. The randomized technique is not consis…


 Clustering Patients with Tensor Decomposition

 

In this paper we present a method for the unsupervised clustering of high-dimensional binary data, with a special focus on electronic healthcare records. We present a robust and efficient heuristic to face this problem using tensor decomposition. We present the reasons why this approach is preferab…