Given a video of a person in action, we can easily guess the 3D future motion of the person. In this work, we present perhaps the first approach for predicting a future 3D mesh model sequence of a person from past video input. We do this for periodic motions such as walking and also for actions like bowling and squatting seen in sports or workout videos. While there has been a surge of interest in future prediction in computer vision, most approaches predict the 3D future from 3D past inputs or the 2D future from 2D past inputs. In this work, we focus on the problem of predicting 3D future motion from past image sequences, which has a plethora of practical applications in autonomous systems that must operate safely around people from visual inputs. Inspired by the success of autoregressive models in language modeling tasks, we learn an intermediate latent space in which we predict the future. This effectively facilitates autoregressive predictions when the input domain differs from the output domain. Our approach can be trained on video sequences obtained in-the-wild without 3D ground truth labels. The project website with videos can be found at https://jasonyzhang.com/phd.
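To make the core idea concrete, here is a minimal sketch of autoregressive rollout in a learned latent space: past image features are encoded into latents, the future is predicted latent-to-latent, and each predicted latent is decoded into 3D mesh parameters. All dimensions and the linear stand-ins for the learned encoder, predictor, and decoder are illustrative assumptions, not the paper's actual networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (illustrative, not from the paper):
# per-frame image features -> shared latent -> 3D mesh parameters.
FEAT_DIM, LATENT_DIM, MESH_DIM = 16, 8, 24

# Random linear maps standing in for the learned encoder,
# autoregressive predictor, and mesh decoder.
W_enc = rng.standard_normal((LATENT_DIM, FEAT_DIM)) * 0.1
W_ar = rng.standard_normal((LATENT_DIM, LATENT_DIM)) * 0.1
W_dec = rng.standard_normal((MESH_DIM, LATENT_DIM)) * 0.1

def encode(frame_feat):
    # Map a past image feature into the intermediate latent space.
    return W_enc @ frame_feat

def predict_next(latent):
    # Autoregressive step: the next latent conditioned on the current one.
    return W_ar @ latent

def decode(latent):
    # Decode a latent into 3D mesh parameters (e.g. pose and shape).
    return W_dec @ latent

# Encode the most recent observed frame, then roll out future steps
# entirely in latent space, decoding each step to a mesh prediction.
past_feats = rng.standard_normal((5, FEAT_DIM))
z = encode(past_feats[-1])
future_meshes = []
for _ in range(3):
    z = predict_next(z)              # predict in latent space...
    future_meshes.append(decode(z))  # ...then decode to 3D mesh params

print(len(future_meshes), future_meshes[0].shape)
```

Because prediction happens in the latent space rather than on raw pixels or meshes, the same rollout loop works even though the inputs (2D video frames) and outputs (3D meshes) live in different domains.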