Tens of millions of people live blind, and their number is ever increasing. Visual-to-auditory sensory substitution (SS) encompasses a family of cheap, generic solutions that assist the visually impaired by conveying visual information through sound. The required SS training is lengthy: months of effort are necessary to reach a practical level of adaptation. Two factors make the training process tedious: the long duration of the substituting audio signal, and the disregard for the compressive characteristics of the human hearing system. To overcome these obstacles, we developed a novel class of SS methods by training deep recurrent autoencoders for image-to-sound conversion. We successfully trained deep learning models on different datasets to execute visual-to-auditory stimulus conversion. By constraining the visual space, we demonstrated the viability of shortened substituting audio signals, while proposing mechanisms, such as the integration of computational hearing models, to optimally convey visual features in the substituting stimulus as perceptually discernible auditory components. We tested our approach in two separate cases. In the first experiment, the author went blindfolded for 5 days while performing SS training on hand posture discrimination. The second experiment assessed the accuracy of reaching movements towards objects on a table. In both test cases, above-chance-level accuracy was attained after a few hours of training. Our novel SS architecture broadens the horizon of rehabilitation methods engineered for the visually impaired. Further improvements on the proposed model should yield faster rehabilitation of the blind and, as a consequence, a wider adoption of SS devices.