REWA introduces a general theory of similarity based on witness-overlap structures. We show that whenever similarity between concepts can be expressed as monotone witness overlap -- whether arising from graph neighborhoods, causal relations, temporal structure, topological features, symbolic patterns, or embedding-based neighborhoods -- it admits a reduction to compact encodings with provable ranking preservation guarantees. REWA systems consist of: (1) finite witness sets $W(v)$, (2) semi-random bit assignments generated from each witness, and (3) monotonicity of expected similarity in the overlap $Δ(u, v) = |W(u) \cap W(v)|$. We prove that under an overlap-gap condition on the final witness sets -- independent of how they were constructed -- top-$k$ rankings are preserved using $m = O(\log(|V|/δ))$ bits. The witness-set formulation is compositional: any sequence of structural, temporal, causal, topological, information-theoretic, or learned transformations can be combined into pipelines that terminate in discrete witness sets. The theory applies to the final witness overlap, enabling modular construction of similarity systems from reusable primitives. This yields a vast design space: millions of composable similarity definitions inherit logarithmic encoding complexity. REWA subsumes and unifies Bloom filters, minhash, LSH bitmaps, random projections, sketches, and hierarchical filters as special cases. It provides a principled foundation for similarity systems whose behavior is governed by witness overlap rather than hash-function engineering. This manuscript presents the axioms, the main reducibility theorem, complete proofs with explicit constants, and a detailed discussion of compositional design, limitations, and future extensions including multi-bit encodings, weighted witnesses, and non-set representations.
翻译:REWA提出了一种基于见证重叠结构的通用相似性理论。我们证明,只要概念间的相似性可以表示为单调见证重叠——无论其源于图邻域、因果关系、时序结构、拓扑特征、符号模式还是基于嵌入的邻域——它都能被约简为具有可证明排序保持保证的紧凑编码。REWA系统包含:(1)有限见证集$W(v)$,(2)由每个见证生成的半随机比特分配,以及(3)期望相似性在重叠度$Δ(u, v) = |W(u) \cap W(v)|$中的单调性。我们证明,在最终见证集满足重叠间隙条件(与其构建方式无关)时,使用$m = O(\log(|V|/δ))$比特即可保持前$k$位排序。见证集表述具有组合性:任何结构、时序、因果、拓扑、信息论或学习转换的序列均可组合为以离散见证集为终端的处理流程。该理论适用于最终见证重叠,使得相似性系统可从可复用原语模块化构建。这产生了一个广阔的设计空间:数百万种可组合的相似性定义继承了对数编码复杂度。REWA将布隆过滤器、最小哈希、LSH位图、随机投影、草图及分层过滤器等特例归纳并统一。它为行为由见证重叠而非哈希函数工程主导的相似性系统提供了原则性基础。本文阐述了公理体系、主要可约简性定理、含显式常数的完整证明,并对组合设计、局限性及未来扩展(包括多比特编码、加权见证和非集合表示)进行了详细讨论。