Predicting pregnancy using large-scale data from a women's health tracking mobile application

From arXiv. Accepted at WWW 2019 (Health on the Web short paper track); an earlier version of this paper was presented at the 2018 NeurIPS ML4H Workshop.

Predicting pregnancy has been a fundamental problem in women's health for more than 50 years. Previous datasets have been collected via carefully curated medical studies, but the recent growth of women's health tracking mobile apps offers potential for reaching a much broader population. However, the feasibility of predicting pregnancy from mobile health tracking data is unclear. Here we develop four models -- a logistic regression model and three LSTM models -- to predict a woman's probability of becoming pregnant using data from a women's health tracking app, Clue by BioWink GmbH. Evaluating our models on a dataset of 79 million logs from 65,276 women with ground truth pregnancy test data, we show that our predicted pregnancy probabilities meaningfully stratify women: women in the top 10% of predicted probabilities have an 89% chance of becoming pregnant over 6 menstrual cycles, as compared to a 27% chance for women in the bottom 10%. We develop a technique for extracting interpretable time trends from our deep learning models, and show these trends are consistent with previous fertility research. Our findings illustrate the potential that women's health tracking data offers for predicting pregnancy on a broader population; we conclude by discussing the steps needed to fulfill this potential.
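
The stratification described in the abstract (comparing outcomes for women in the top and bottom 10% of predicted probabilities) can be illustrated with a minimal sketch. This is not the authors' code or data: the function name `decile_outcome_rates`, the `fraction` parameter, and the synthetic scores are all assumptions made for illustration, and the sketch uses a single binary outcome per person rather than the paper's cumulative chance over 6 menstrual cycles.

```python
import random


def decile_outcome_rates(probs, outcomes, fraction=0.1):
    """Return (bottom_rate, top_rate): the observed outcome rate among the
    lowest-scored and highest-scored `fraction` of individuals.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    k = int(len(probs) * fraction)
    if k == 0:
        raise ValueError("not enough samples for the requested fraction")
    # Order the binary outcomes by predicted probability, lowest first.
    ranked = [y for _, y in sorted(zip(probs, outcomes), key=lambda t: t[0])]
    bottom, top = ranked[:k], ranked[-k:]
    return sum(bottom) / k, sum(top) / k


# Synthetic illustration only: each outcome is drawn using its own score as
# the success probability, so the scores are well calibrated by construction.
rng = random.Random(0)
probs = [rng.random() for _ in range(10_000)]
outcomes = [1 if rng.random() < p else 0 for p in probs]

bottom_rate, top_rate = decile_outcome_rates(probs, outcomes)
print(f"bottom 10%: {bottom_rate:.2f}, top 10%: {top_rate:.2f}")
```

A large gap between the two rates indicates that the predicted probabilities separate high-likelihood from low-likelihood individuals, which is the sense in which the abstract says the predictions "meaningfully stratify" women.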

