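The IEEE International Conference on Data Mining (ICDM) is the world's premier research conference in data mining. It provides an international forum for presenting original research results and for exchanging and disseminating innovative, practical development experience. The conference covers all aspects of data mining, including algorithms, software, systems, and applications. ICDM draws researchers, application developers, and practitioners from a wide range of data mining related areas, such as statistics, machine learning, pattern recognition, databases, data warehousing, data visualization, knowledge-based systems, and high-performance computing. By promoting novel, high-quality research findings and innovative solutions to challenging data mining problems, the conference seeks to advance the state of the art in data mining.

Official website: http://dblp.uni-trier.de/db/conf/icdm/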

### Latest Papers

Weighted minwise hashing is a standard dimensionality reduction technique with applications to similarity search and large-scale kernel machines. We introduce a simple algorithm that takes a weighted set $x \in \mathbb{R}_{\geq 0}^{d}$ and computes $k$ independent minhashes in expected time $O(k \log k + \Vert x \Vert_{0}\log( \Vert x \Vert_1 + 1/\Vert x \Vert_1))$, improving upon the state-of-the-art BagMinHash algorithm (KDD '18) and representing the fastest weighted minhash algorithm for sparse data. Our experiments show running times that scale better with $k$ and $\Vert x \Vert_0$ compared to ICWS (ICDM '10) and BagMinHash, obtaining $10\times$ speedups in common use cases. Our approach also gives rise to a technique for computing fully independent locality-sensitive hash values for $(L, K)$-parameterized approximate near neighbor search under weighted Jaccard similarity in optimal expected time $O(LK + \Vert x \Vert_0)$, improving on prior work even in the case of unweighted sets.
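For context, the weighted Jaccard similarity of two non-negative vectors is $J(x, y) = \sum_i \min(x_i, y_i) / \sum_i \max(x_i, y_i)$, and a weighted minhash is a random hash whose collision probability equals $J(x, y)$. The sketch below is not the algorithm introduced in the abstract. It is a minimal, unoptimized illustration of the ICWS baseline as it is commonly described, written under the assumption that a sparse vector is a Python dict from keys to positive weights. The function names (`weighted_jaccard`, `icws_sketch`, `estimate_similarity`) and the per-`(seed, i, j)` string seeding are choices made here for illustration only.

```python
import math
import random


def weighted_jaccard(x, y):
    """Exact weighted Jaccard similarity: sum of mins over sum of maxes."""
    keys = set(x) | set(y)
    num = sum(min(x.get(j, 0.0), y.get(j, 0.0)) for j in keys)
    den = sum(max(x.get(j, 0.0), y.get(j, 0.0)) for j in keys)
    return num / den if den > 0 else 0.0


def icws_sketch(x, k, seed=0):
    """Return k weighted minhashes of a sparse vector x (dict: key -> weight).

    Illustrative ICWS-style sampling; every hash scans all non-zero entries.
    """
    sketch = []
    for i in range(k):
        best, best_ln_a = None, math.inf
        for j, w in x.items():
            if w <= 0:
                continue  # zero-weight entries are not part of the weighted set
            # The randomness depends only on (seed, i, j), so two different
            # input vectors see the same random variables for the same key.
            rng = random.Random(f"{seed}:{i}:{j}")
            r = rng.gammavariate(2.0, 1.0)
            c = rng.gammavariate(2.0, 1.0)
            beta = rng.random()
            t = math.floor(math.log(w) / r + beta)
            ln_y = r * (t - beta)
            ln_a = math.log(c) - ln_y - r
            if ln_a < best_ln_a:
                best, best_ln_a = (j, t), ln_a
        sketch.append(best)
    return sketch


def estimate_similarity(s1, s2):
    """Fraction of matching hashes, an estimate of weighted Jaccard similarity."""
    return sum(a == b for a, b in zip(s1, s2)) / len(s1)


if __name__ == "__main__":
    x = {"a": 1.0, "b": 2.0, "c": 0.5}
    y = {"a": 2.0, "b": 1.0, "c": 0.5}
    print("exact    :", weighted_jaccard(x, y))
    print("estimated:", estimate_similarity(icws_sketch(x, 256), icws_sketch(y, 256)))
```

Each of the $k$ hashes in this sketch visits every non-zero entry, so it costs on the order of $k \Vert x \Vert_0$ operations. That per-hash scan is the kind of dependence on $k$ and $\Vert x \Vert_0$ that the running time stated in the abstract is meant to improve on.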

