As wearable sensing becomes increasingly pervasive, a key challenge remains: how can we generate natural language summaries from raw physiological signals such as actigraphy (minute-level movement data collected via accelerometers)? In this work, we introduce MotionTeller, a generative framework that natively integrates minute-level wearable activity data with large language models (LLMs). MotionTeller combines a pretrained actigraphy encoder with a lightweight projection module that maps behavioral embeddings into the token space of a frozen decoder-only LLM, enabling free-text, autoregressive generation of daily behavioral summaries. We construct a novel dataset of 54,383 (actigraphy, text) pairs derived from real-world NHANES recordings, and train the model using cross-entropy loss with supervision applied only to the language tokens. MotionTeller achieves high semantic fidelity (BERTScore-F1 = 0.924) and lexical accuracy (ROUGE-1 = 0.722), outperforming prompt-based baselines by 7% in ROUGE-1. The average training loss converges to 0.38 by epoch 15, indicating stable optimization. Qualitative analysis confirms that MotionTeller captures circadian structure and behavioral transitions, while PCA projections show improved cluster alignment in the embedding space after training. Together, these results position MotionTeller as a scalable, interpretable system for transforming wearable sensor data into fluent, human-centered descriptions, opening new pathways for behavioral monitoring, clinical review, and personalized health interventions.
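The core mechanism described above, a lightweight projection from sensor embeddings into the LLM token space, with cross-entropy supervision masked to language tokens only, can be sketched as follows. This is a minimal illustration, not the paper's implementation; all dimensions, the single-linear-layer projector, and the use of `ignore_index` masking are assumptions for the sketch.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration only.
ACTIGRAPHY_DIM = 256   # output dim of the pretrained actigraphy encoder (assumed)
LLM_DIM = 768          # hidden size of the frozen decoder-only LLM (assumed)
VOCAB = 1000           # toy vocabulary size

class Projector(nn.Module):
    """Lightweight projection mapping behavioral embeddings into the LLM token space."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

def language_only_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # Cross-entropy supervised only on language tokens: positions holding the
    # sensor prefix are set to ignore_index (-100) and contribute no gradient.
    return nn.functional.cross_entropy(
        logits.view(-1, logits.size(-1)), labels.view(-1), ignore_index=-100
    )

# Toy batch: one sequence with 4 sensor-prefix positions and 6 text positions.
sensor_emb = torch.randn(1, 4, ACTIGRAPHY_DIM)       # from the frozen actigraphy encoder
projector = Projector(ACTIGRAPHY_DIM, LLM_DIM)
prefix = projector(sensor_emb)                       # (1, 4, LLM_DIM), fed to the LLM as soft tokens

text_ids = torch.randint(0, VOCAB, (1, 6))
logits = torch.randn(1, 10, VOCAB)                   # stand-in for the LLM's output logits
labels = torch.cat([torch.full((1, 4), -100), text_ids], dim=1)
loss = language_only_loss(logits, labels)            # gradient flows only through text positions
```

In this arrangement only the projector's parameters (and not the LLM's) would be updated, consistent with the frozen-decoder design described in the abstract.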