Caffe作者贾扬清确认将离开 Facebook,加入阿里硅谷研究院任VP

3 月 3 日 专知

3 月 2 号晚间,知乎上一个题为“如何评价贾扬清离职 Facebook?”的问题热度不断发酵,迅速吸引了不少回答,各方知情人士,纷纷以爆料表达了对大神关注。

同在Facebook任职的晓飞确认了贾扬清离职,并表示Facebook虽是大平台,但对于大牛来说,从来不迷恋大公司光环。


贾扬清离职加入阿里消息属实,不过贾扬清最终的去向并非回国,而是加入阿里硅谷研究院。不过当前贾扬清的 LinkedIn 页面上目前在职公司仍为 Facebook。



作为全球最普遍使用的图像识别开源软件 Caffe 的作者,贾扬清是人工智能深度学习领域里的佼佼者。贾扬清是浙江绍兴人,本科和硕士就读于清华大学,随后在美国加州大学伯克利分校获得计算机科学博士学位。2013 年毕业后,他加入谷歌,是谷歌大脑 TensorFlow 的作者之一。2016 年 2 月从谷歌离职,加入 Facebook,致力于前沿 AI 研究和平台开发。

在Facebook最新的一篇文章中,贾扬清贴出了他参观杭州保俶塔的照片



参考链接:

https://www.thepaper.cn/newsDetail_forward_1599450

https://cloud.tencent.com/developer/article/1142974

https://www.zhihu.com/question/314292977


END-

专 · 知

专知《深度学习:算法到实战》课程全部完成!490+位同学在学习,现在报名,限时优惠!网易云课堂人工智能畅销榜首位!

欢迎微信扫一扫加入专知人工智能知识星球群,获取最新AI专业干货知识教程视频资料和与专家交流咨询!

请加专知小助手微信(扫一扫如下二维码添加),加入专知人工智能主题群,咨询《深度学习:算法到实战》课程,咨询技术商务合作~

请PC登录www.zhuanzhi.ai或者点击阅读原文,注册登录专知,获取更多AI知识资料!

点击“阅读原文”,了解报名专知《深度学习:算法到实战》课程

登录查看更多
点赞 0

We develop a system for modeling hand-object interactions in 3D from RGB images that show a hand which is holding a novel object from a known category. We design a Convolutional Neural Network (CNN) for Hand-held Object Pose and Shape estimation called HOPS-Net and utilize prior work to estimate the hand pose and configuration. We leverage the insight that information about the hand facilitates object pose and shape estimation by incorporating the hand into both training and inference of the object pose and shape as well as the refinement of the estimated pose. The network is trained on a large synthetic dataset of objects in interaction with a human hand. To bridge the gap between real and synthetic images, we employ an image-to-image translation model (Augmented CycleGAN) that generates realistically textured objects given a synthetic rendering. This provides a scalable way of generating annotated data for training HOPS-Net. Our quantitative experiments show that even noisy hand parameters significantly help object pose and shape estimation. The qualitative experiments show results of pose and shape estimation of objects held by a hand "in the wild".

点赞 0
阅读2+

Graph deep learning models, such as graph convolutional networks (GCN) achieve remarkable performance for tasks on graph data. Similar to other types of deep models, graph deep learning models often suffer from adversarial attacks. However, compared with non-graph data, the discrete features, graph connections and different definitions of imperceptible perturbations bring unique challenges and opportunities for the adversarial attacks and defences for graph data. In this paper, we propose both attack and defence techniques. For attack, we show that the discrete feature problem could easily be resolved by introducing integrated gradients which could accurately reflect the effect of perturbing certain features or edges while still benefiting from the parallel computations. For defence, we propose to partially learn the adjacency matrix to integrate the information of distant nodes so that the prediction of a certain target is supported by more global graph information rather than just few neighbour nodes. This, therefore, makes the attacks harder since one need to perturb more features/edges to make the attacks succeed. Our experiments on a number of datasets show the effectiveness of the proposed methods.

点赞 0
阅读2+

This paper studies the problems of vehicle make & model classification. Some of the main challenges are reaching high classification accuracy and reducing the annotation time of the images. To address these problems, we have created a fine-grained database using online vehicle marketplaces of Turkey. A pipeline is proposed to combine an SSD (Single Shot Multibox Detector) model with a CNN (Convolutional Neural Network) model to train on the database. In the pipeline, we first detect the vehicles by following an algorithm which reduces the time for annotation. Then, we feed them into the CNN model. It is reached approximately 4% better classification accuracy result than using a conventional CNN model. Next, we propose to use the detected vehicles as ground truth bounding box (GTBB) of the images and feed them into an SSD model in another pipeline. At this stage, it is reached reasonable classification accuracy result without using perfectly shaped GTBB. Lastly, an application is implemented in a use case by using our proposed pipelines. It detects the unauthorized vehicles by comparing their license plate numbers and make & models. It is assumed that license plates are readable.

点赞 0
阅读1+

Existing attention mechanisms are trained to attend to individual items in a collection (the memory) with a predefined, fixed granularity, e.g., a word token or an image grid. We propose area attention: a way to attend to areas in the memory, where each area contains a group of items that are structurally adjacent, e.g., spatially for a 2D memory such as images, or temporally for a 1D memory such as natural language sentences. Importantly, the shape and the size of an area are dynamically determined via learning, which enables a model to attend to information with varying granularity. Area attention can easily work with existing model architectures such as multi-head attention for simultaneously attending to multiple areas in the memory. We evaluate area attention on two tasks: neural machine translation (both character and token-level) and image captioning, and improve upon strong (state-of-the-art) baselines in all the cases. These improvements are obtainable with a basic form of area attention that is parameter free.

点赞 0
阅读3+

We propose a person detector on omnidirectional images, an accurate method to generate minimal enclosing rectangles of persons. The basic idea is to adapt the qualitative detection performance of a convolutional neural network based method, namely YOLOv2 to fish-eye images. The design of our approach picks up the idea of a state-of-the-art object detector and highly overlapping areas of images with their regions of interests. This overlap reduces the number of false negatives. Based on the raw bounding boxes of the detector we fine-tuned overlapping bounding boxes by three approaches: the non-maximum suppression, the soft non-maximum suppression and the soft non-maximum suppression with Gaussian smoothing. The evaluation was done on the PIROPO database and an own annotated Flat dataset, supplemented with bounding boxes on omnidirectional images. We achieve an average precision of 64.4 % with YOLOv2 for the class person on PIROPO and 77.6 % on Flat. For this purpose we fine-tuned the soft non-maximum suppression with Gaussian smoothing.

点赞 0
阅读2+
Top