Recurrent neural architectures such as LSTM and GRU remain widely used in sequence modeling, but they face two core limitations: redundant gate-specific parameters and a reduced ability to retain information across long temporal distances. This paper introduces the Quantum-Leap LSTM (QL-LSTM), a recurrent architecture that addresses both challenges through two independent components. The Parameter-Shared Unified Gating (PSUG) mechanism replaces all gate-specific transformations with a single shared weight matrix, reducing parameters by approximately 48 percent while preserving full gating behavior. The Hierarchical Gated Recurrence with Additive Skip Connections (HGR-ASC) component adds a multiplication-free pathway that improves long-range information flow and mitigates forget-gate degradation. We evaluate QL-LSTM on sentiment classification using the IMDB dataset with extended document lengths, comparing it against LSTM, GRU, and BiLSTM reference models. QL-LSTM achieves competitive accuracy while using substantially fewer parameters. Although the PSUG and HGR-ASC components are more efficient per time step, the current prototype remains bound by the inherently sequential nature of recurrent models and therefore does not yet yield wall-clock speed improvements without further kernel-level optimization.
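To make the two ideas concrete, below is a minimal PyTorch sketch of one plausible reading of the abstract, not the paper's actual implementation. It assumes PSUG means a single shared affine projection whose output feeds all four gates through gate-specific activations and biases, and it assumes HGR-ASC means an unweighted additive skip on the cell state. The class name QLLSTMCell and all parameter names are illustrative, and the shared-projection form shown here reduces gate weights more aggressively than the ~48 percent the abstract reports, so the real PSUG presumably shares only part of the transformation.

```python
# Hypothetical sketch of the two QL-LSTM ideas described in the abstract.
# All names here are illustrative; the exact PSUG and HGR-ASC formulations
# are assumptions inferred from the abstract's wording alone.
import torch
import torch.nn as nn


class QLLSTMCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # PSUG (assumed form): one shared affine map replaces the four
        # gate-specific weight matrices of a standard LSTM cell.
        self.shared = nn.Linear(input_size + hidden_size, hidden_size)
        # Gate identity is kept with cheap per-gate bias vectors.
        self.b_i = nn.Parameter(torch.zeros(hidden_size))
        self.b_f = nn.Parameter(torch.ones(hidden_size))  # forget-bias init
        self.b_g = nn.Parameter(torch.zeros(hidden_size))
        self.b_o = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x, state):
        h_prev, c_prev = state
        # A single matmul per step instead of four gate-specific ones.
        s = self.shared(torch.cat([x, h_prev], dim=-1))
        i = torch.sigmoid(s + self.b_i)  # input gate
        f = torch.sigmoid(s + self.b_f)  # forget gate
        g = torch.tanh(s + self.b_g)     # candidate state
        o = torch.sigmoid(s + self.b_o)  # output gate
        # HGR-ASC (assumed form): an additive, multiplication-free skip path
        # lets c_{t-1} flow forward even when the forget gate decays toward
        # zero. A real implementation would likely need normalization or
        # damping to keep this sum stable over long sequences.
        c = f * c_prev + i * g + c_prev
        h = o * torch.tanh(c)
        return h, (h, c)


# Usage sketch: one step over a batch of 4 with 16-dim inputs, 32-dim state.
cell = QLLSTMCell(16, 32)
x = torch.randn(4, 16)
h0 = c0 = torch.zeros(4, 32)
out, state = cell(x, (h0, c0))
```

Even under these assumptions, the sketch illustrates why the prototype gains no wall-clock speed: each step still depends on the previous hidden state, so the recurrence stays sequential regardless of how few parameters each step touches.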