We construct multiperiodic processes, a simple example of stationary ergodic (but not mixing) processes over the natural numbers whose entropy rate vanishes under a mild condition. Multiperiodic processes are supported on randomly shifted deterministic sequences called multiperiodic sequences, which can be generated efficiently by an algorithm called the Infinite Clock. Under a suitable parameterization, the relative frequencies of particular numbers in multiperiodic sequences follow Zipf's law. In exactly the same setting, the corresponding multiperiodic processes satisfy an asymptotic power-law growth of block entropy, known as Hilberg's law. Hilberg's law is deemed to hold, in particular, for statistical language models.