The problem of finding dense components of a graph is a widely explored area in data analysis, with diverse applications in fields and branches of study including community mining, spam detection, computer security and bioinformatics. This research project explores previously available algorithms in order to study them and identify potential modifications that could result in an improved version with considerable performance and efficiency leap. Furthermore, efforts were also steered towards devising a novel algorithm for the problem of densest subgraph discovery. This paper presents an improved implementation of a widely used densest subgraph discovery algorithm and a novel parallel algorithm which produces better results than a 2-approximation.

相关内容

Explanation：生物信息学。 Publisher：Oxford University Press。 SIT： http://dblp.uni-trier.de/db/journals/bioinformatics/

Many distributed optimization algorithms achieve existentially-optimal running times, meaning that there exists some pathological worst-case topology on which no algorithm can do better. Still, most networks of interest allow for exponentially faster algorithms. This motivates two questions: (1) What network topology parameters determine the complexity of distributed optimization? (2) Are there universally-optimal algorithms that are as fast as possible on every topology? We resolve these 25-year-old open problems in the known-topology setting (i.e., supported CONGEST) for a wide class of global network optimization problems including MST, $(1+\varepsilon)$-min cut, various approximate shortest paths problems, sub-graph connectivity, etc. In particular, we provide several (equivalent) graph parameters and show they are tight universal lower bounds for the above problems, fully characterizing their inherent complexity. Our results also imply that algorithms based on the low-congestion shortcut framework match the above lower bound, making them universally optimal if shortcuts are efficiently approximable. We leverage a recent result in hop-constrained oblivious routing to show this is the case if the topology is known -- giving universally-optimal algorithms for all above problems.

We consider the problem of assigning or allocating resources to a set of jobs. We consider the case when the resources are fungible, that is, the job can be done with any mix of the resources, but with different efficiencies. In our formulation we maximize a total utility subject to a given limit on the resource usage, which is a convex optimization problem and so is tractable. In this paper we develop a custom, parallelizable algorithm for solving the resource allocation problem that scales to large problems, with millions of jobs. Our algorithm is based on the dual problem, in which the dual variables associated with the resource usage limit can be interpreted as resource prices. Our method updates the resource prices in each iteration, ultimately discovering the optimal resource prices, from which an optimal allocation is obtained. We provide an open-source implementation of our method, which can solve problems with millions of jobs in a few seconds on CPU, and under a second on a GPU; our software can solve smaller problems in milliseconds. On large problems, our implementation is up to three orders of magnitude faster than a commerical solver for convex optimization.

Graph neural networks (GNN) have achieved state-of-the-art performance on various industrial tasks. However, the poor efficiency of GNN inference and frequent Out-Of-Memory (OOM) problem limit the successful application of GNN on edge computing platforms. To tackle these problems, a feature decomposition approach is proposed for memory efficiency optimization of GNN inference. The proposed approach could achieve outstanding optimization on various GNN models, covering a wide range of datasets, which speeds up the inference by up to 3x. Furthermore, the proposed feature decomposition could significantly reduce the peak memory usage (up to 5x in memory efficiency improvement) and mitigate OOM problems during GNN inference.

In the steady-state contingency analysis, the traditional Newton-Raphson method suffers from non-convergence issues when solving post-outage power flow problems, which hinders the integrity and accuracy of security assessment. In this paper, we propose a novel robust contingency analysis approach based on holomorphic embedding (HE). The HE-based simulator guarantees convergence if the true power flow solution exists, which is desirable because it avoids the influence of numerical issues and provides a credible security assessment conclusion. In addition, based on the multi-area characteristics of real-world power systems, a partitioned HE (PHE) method is proposed with an interface-based partitioning of HE formulation. The PHE method does not undermine the numerical robustness of HE and significantly reduces the computation burden in large-scale contingency analysis. The PHE method is further enhanced by parallel or distributed computation to become parallel PHE (P${}^\mathrm{2}$HE). Tests on a 458-bus system, a synthetic 419-bus system and a large-scale 21447-bus system demonstrate the advantages of the proposed methods in robustness and efficiency.

This paper presents a parallel adaptive clustering (PAC) algorithm to automatically classify data while simultaneously choosing a suitable number of classes. Clustering is an important tool for data analysis and understanding in a broad set of areas including data reduction, pattern analysis, and classification. However, the requirement to specify the number of clusters in advance and the computational burden associated with clustering large sets of data persist as challenges in clustering. We propose a new parallel adaptive clustering (PAC) algorithm that addresses these challenges by adaptively computing the number of clusters and leveraging the power of parallel computing. The algorithm clusters disjoint subsets of the data on parallel computation threads. We develop regularized set \mi{k}-means to efficiently cluster the results from the parallel threads. A refinement step further improves the clusters. The PAC algorithm offers the capability to adaptively cluster data sets which change over time by reusing the information from previous time steps to decrease computation. We provide theoretical analysis and numerical experiments to characterize the performance of the method, validate its properties, and demonstrate the computational efficiency of the method.

Although deep neural networks are successful for many tasks in the speech domain, the high computational and memory costs of deep neural networks make it difficult to directly deploy highperformance Neural Network systems on low-resource embedded devices. There are several mechanisms to reduce the size of the neural networks i.e. parameter pruning, parameter quantization, etc. This paper focuses on how to apply binary neural networks to the task of speaker verification. The proposed binarization of training parameters can largely maintain the performance while significantly reducing storage space requirements and computational costs. Experiment results show that, after binarizing the Convolutional Neural Network, the ResNet34-based network achieves an EER of around 5% on the Voxceleb1 testing dataset and even outperforms the traditional real number network on the text-dependent dataset: Xiaole while having a 32x memory saving.

Convolutional Neural Networks experience catastrophic forgetting when optimized on a sequence of learning problems: as they meet the objective of the current training examples, their performance on previous tasks drops drastically. In this work, we introduce a novel framework to tackle this problem with conditional computation. We equip each convolutional layer with task-specific gating modules, selecting which filters to apply on the given input. This way, we achieve two appealing properties. Firstly, the execution patterns of the gates allow to identify and protect important filters, ensuring no loss in the performance of the model for previously learned tasks. Secondly, by using a sparsity objective, we can promote the selection of a limited set of kernels, allowing to retain sufficient model capacity to digest new tasks.Existing solutions require, at test time, awareness of the task to which each example belongs to. This knowledge, however, may not be available in many practical scenarios. Therefore, we additionally introduce a task classifier that predicts the task label of each example, to deal with settings in which a task oracle is not available. We validate our proposal on four continual learning datasets. Results show that our model consistently outperforms existing methods both in the presence and the absence of a task oracle. Notably, on Split SVHN and Imagenet-50 datasets, our model yields up to 23.98% and 17.42% improvement in accuracy w.r.t. competing methods.

Deep convolutional neural networks (CNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with low memory resources or in applications with strict latency requirements. Therefore, a natural thought is to perform model compression and acceleration in deep networks without significantly decreasing the model performance. During the past few years, tremendous progress has been made in this area. In this paper, we survey the recent advanced techniques for compacting and accelerating CNNs model developed. These techniques are roughly categorized into four schemes: parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation. Methods of parameter pruning and sharing will be described at the beginning, after that the other techniques will be introduced. For each scheme, we provide insightful analysis regarding the performance, related applications, advantages, and drawbacks etc. Then we will go through a few very recent additional successful methods, for example, dynamic capacity networks and stochastic depths networks. After that, we survey the evaluation matrix, the main datasets used for evaluating the model performance and recent benchmarking efforts. Finally, we conclude this paper, discuss remaining challenges and possible directions on this topic.

The availability of large microarray data has led to a growing interest in biclustering methods in the past decade. Several algorithms have been proposed to identify subsets of genes and conditions according to different similarity measures and under varying constraints. In this paper we focus on the exclusive row biclustering problem for gene expression data sets, in which each row can only be a member of a single bicluster while columns can participate in multiple ones. This type of biclustering may be adequate, for example, for clustering groups of cancer patients where each patient (row) is expected to be carrying only a single type of cancer, while each cancer type is associated with multiple (and possibly overlapping) genes (columns). We present a novel method to identify these exclusive row biclusters through a combination of existing biclustering algorithms and combinatorial auction techniques. We devise an approach for tuning the threshold for our algorithm based on comparison to a null model in the spirit of the Gap statistic approach. We demonstrate our approach on both synthetic and real-world gene expression data and show its power in identifying large span non-overlapping rows sub matrices, while considering their unique nature. The Gap statistic approach succeeds in identifying appropriate thresholds in all our examples.

Being intensively studied, visual object tracking has witnessed great advances in either speed (e.g., with correlation filters) or accuracy (e.g., with deep features). Real-time and high accuracy tracking algorithms, however, remain scarce. In this paper we study the problem from a new perspective and present a novel parallel tracking and verifying (PTAV) framework, by taking advantage of the ubiquity of multi-thread techniques and borrowing ideas from the success of parallel tracking and mapping in visual SLAM. The proposed PTAV framework is typically composed of two components, a (base) tracker T and a verifier V, working in parallel on two separate threads. The tracker T aims to provide a super real-time tracking inference and is expected to perform well most of the time; by contrast, the verifier V validates the tracking results and corrects T when needed. The key innovation is that, V does not work on every frame but only upon the requests from T; on the other end, T may adjust the tracking according to the feedback from V. With such collaboration, PTAV enjoys both the high efficiency provided by T and the strong discriminative power by V. Meanwhile, to adapt V to object appearance changes over time, we maintain a dynamic target template pool for adaptive verification, resulting in further performance improvements. In our extensive experiments on popular benchmarks including OTB2015, TC128, UAV20L and VOT2016, PTAV achieves the best tracking accuracy among all real-time trackers, and in fact even outperforms many deep learning based algorithms. Moreover, as a general framework, PTAV is very flexible with great potentials for future improvement and generalization.

Bernhard Haeupler,David Wajc,Goran Zuzic
0+阅读 · 4月8日
Akshay Agrawal,Stephen Boyd,Deepak Narayanan,Fiodar Kazhamiaka,Matei Zaharia
0+阅读 · 4月7日
Ao Zhou,Jianlei Yang,Yeqi Gao,Tong Qiao,Yingjie Qi,Xiaoyi Wang,Yunli Chen,Pengcheng Dai,Weisheng Zhao,Chunming Hu
0+阅读 · 4月7日
Benjamin McLaughlin,Sung Ha Kang
0+阅读 · 4月6日
Tinglong Zhu,Xiaoyi Qin,Ming Li
0+阅读 · 4月6日
Davide Abati,Jakub Tomczak,Tijmen Blankevoort,Simone Calderara,Rita Cucchiara,Babak Ehteshami Bejnordi
5+阅读 · 2020年3月31日
Yu Cheng,Duo Wang,Pan Zhou,Tao Zhang
35+阅读 · 2019年9月8日
Heng Fan,Haibin Ling
7+阅读 · 2018年1月30日

37+阅读 · 2020年11月3日

35+阅读 · 2020年7月26日

43+阅读 · 2020年5月22日

45+阅读 · 2019年10月12日

12+阅读 · 2020年1月4日
LibRec智能推荐
6+阅读 · 2019年9月19日
CreateAMind
9+阅读 · 2019年5月22日

5+阅读 · 2019年3月22日
CreateAMind
6+阅读 · 2019年1月7日
CreateAMind
26+阅读 · 2019年1月3日
CreateAMind
3+阅读 · 2018年4月15日

22+阅读 · 2017年11月16日

16+阅读 · 2017年10月4日
CreateAMind
5+阅读 · 2017年8月4日
Top