We present SLATE, a sequence labeling approach for extracting tasks from free-form content such as digitally handwritten (or "inked") notes on a virtual whiteboard. Our approach yields a single, low-latency model that simultaneously performs sentence segmentation and classifies each resulting sentence as task or non-task. SLATE greatly outperforms a two-model baseline (sentence segmentation followed by a classification model), achieving a task F1 score of 84.4% and a sentence segmentation (boundary similarity) score of 88.4%, with three times lower latency than the baseline. Furthermore, we provide insights into the challenges of performing NLP in the inking domain. We release both our code and dataset for this novel task.
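To illustrate how a single sequence labeler can segment and classify at once, the following is a minimal sketch of the decoding step only. It assumes a hypothetical tagging scheme in which every token carries a `B-` or `I-` prefix (sentence start or continuation) and a `TASK` or `NONTASK` label; the abstract does not specify the actual tag set, and the function name `decode_tags` and the example tokens are illustrative rather than taken from the released code.

```python
from typing import List, Tuple


def decode_tags(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group tokens into sentences and return (sentence, label) pairs.

    Assumed (hypothetical) scheme: "B-<LABEL>" opens a new sentence and
    "I-<LABEL>" continues the current one. The sentence label is taken
    from the tag of its first token.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    sentences: List[Tuple[str, str]] = []
    current: List[str] = []
    label = ""
    for token, tag in zip(tokens, tags):
        prefix, _, tag_label = tag.partition("-")
        # A "B" prefix, or the very first token, starts a new sentence.
        if prefix == "B" or not current:
            if current:
                sentences.append((" ".join(current), label))
            current = [token]
            label = tag_label
        else:
            current.append(token)
    if current:
        sentences.append((" ".join(current), label))
    return sentences


# Unpunctuated note text, as might come from a whiteboard (made-up example).
tokens = "team sync went well email Bob the budget by Friday".split()
tags = ["B-NONTASK", "I-NONTASK", "I-NONTASK", "I-NONTASK",
        "B-TASK", "I-TASK", "I-TASK", "I-TASK", "I-TASK", "I-TASK"]

for sentence, sentence_label in decode_tags(tokens, tags):
    print(f"{sentence_label}: {sentence}")
```

Because sentence boundaries and sentence labels are both read off the same tag sequence, one forward pass of one model suffices, which is the property the two-model baseline lacks.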