To operate safely, an automated vehicle (AV) must anticipate how its surrounding environment will evolve. This requires knowing which prediction models are most appropriate for each situation. Currently, prediction models are often assessed over a set of trajectories without distinguishing the type of movement they capture, which makes it impossible to determine how suitable each model is for different situations. In this work, we illustrate how standardized evaluation methods lead to wrong conclusions about a model's predictive capabilities, preventing a clear assessment of prediction models and potentially leading to dangerous on-road situations. We argue that, in line with evaluation practices in safety assessment for AVs, prediction models should be assessed in a scenario-based fashion. To encourage scenario-based assessment and illustrate the dangers of improper assessment, we categorize trajectories of the Waymo Open Motion Dataset according to the type of movement they capture. Three different models are then thoroughly evaluated for different trajectory types and prediction horizons. The results show that common evaluation methods are insufficient and that assessment should be tailored to the application in which the model will operate.
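The contrast between aggregate and scenario-based assessment can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the work itself: the movement categories and their thresholds (`still_m`, `turn_rad`), the constant-velocity baseline, the average displacement error (ADE) metric, and the toy tracks. It only shows how a single score over all trajectories can hide a large error on one movement type.

```python
import math
from collections import defaultdict


def classify(track, still_m=1.0, turn_rad=math.radians(15)):
    """Label a track (list of (x, y) points) by movement type.

    The categories and thresholds are hypothetical, chosen for illustration.
    """
    if math.dist(track[0], track[-1]) < still_m:
        return "stationary"
    (x0, y0), (x1, y1) = track[0], track[1]
    (xa, ya), (xb, yb) = track[-2], track[-1]
    h_start = math.atan2(y1 - y0, x1 - x0)
    h_end = math.atan2(yb - ya, xb - xa)
    # Wrap the heading change into [-pi, pi] before thresholding.
    dh = math.atan2(math.sin(h_end - h_start), math.cos(h_end - h_start))
    if dh > turn_rad:
        return "left_turn"
    if dh < -turn_rad:
        return "right_turn"
    return "straight"


def constant_velocity(history, horizon):
    """Baseline predictor: repeat the last observed step `horizon` times."""
    (xa, ya), (xb, yb) = history[-2], history[-1]
    vx, vy = xb - xa, yb - ya
    return [(xb + vx * k, yb + vy * k) for k in range(1, horizon + 1)]


def ade(pred, truth):
    """Average displacement error between predicted and true points."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


def evaluate(tracks, n_hist, horizon, model=constant_velocity):
    """Return the mean ADE per movement type, plus the aggregate under 'all'."""
    per_type = defaultdict(list)
    for track in tracks:
        history = track[:n_hist]
        future = track[n_hist:n_hist + horizon]
        per_type[classify(track)].append(ade(model(history, horizon), future))
    report = {name: sum(errs) / len(errs) for name, errs in per_type.items()}
    all_errs = [e for errs in per_type.values() for e in errs]
    report["all"] = sum(all_errs) / len(all_errs)
    return report


# Toy data: three straight tracks and one left turn (made-up coordinates).
tracks = [[(float(t), float(y)) for t in range(6)] for y in range(3)]
tracks.append([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 1.0), (3.0, 2.0), (3.0, 3.0)])

report = evaluate(tracks, n_hist=3, horizon=3)
for name, value in sorted(report.items()):
    print(f"{name:>10}: {value:.3f}")
```

In this toy set the baseline is exact on the straight tracks, so the aggregate score is dominated by them and understates the error on the turn. Reporting per category, and repeating the evaluation for several values of `horizon`, makes that gap visible.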