In a distributed information application an encoder compresses an arbitrary vector while a similar reference vector is available to the decoder as side information. For the Hamming-distance similarity measure, and when guaranteed perfect reconstruction is required, we present two contributions to the solution of this problem. One result shows that when a set of potential reference vectors is available to the encoder, lower compression rates can be achieved when the set satisfies a certain clustering property. Another result reduces the best known decoding complexity from exponential in the vector length $n$ to $O(n^{1.5})$ by generalized concatenation of inner coset codes and outer error-correcting codes. One potential application of the results is the compression of DNA sequences, where similar (but not identical) reference vectors are shared among senders and receivers.
翻译:在分布式信息应用中,编码器压缩任意的矢量,而解码器作为侧边信息也可以使用类似的参考矢量。对于Hamming-距离相似度测量,如果需要保证完全重建,我们为解决这一问题提出两项贡献。一个结果显示,当编码器可以获得一套潜在的参考矢量时,当集满足某个组群属性时,可以实现较低的压缩率。另一个结果是,通过对内科代码和外部错误校正代码进行普遍组合,使最已知的解码复杂性从矢量长度的指数值$减到$O(n ⁇ 1.5})$($O)。结果的一个潜在应用是压缩DNA序列,在发送者和接收者之间共享类似的(但并非相同的)矢量。