Yu Rong
I am currently a Research Engineer at (Meta) Reality Labs Research.
Prior to that, I obtained my Ph.D. from the Multimedia Laboratory,
The Chinese University of Hong Kong, in September 2021,
advised by Prof. Xiaoou Tang
and Prof. Chen Change Loy.
I received my B.E. from the Department of Computer Science and Technology, Tsinghua University, in 2016.
I interned at Facebook AI Research (FAIR) with Dr. Hanbyul Joo and Dr. Takaaki Shiratori in spring 2020.
Earlier, I worked as a computer vision researcher at SenseTime from Apr. 2015 to
Aug. 2017.
My research interests include computer vision, computer graphics, and machine learning. In particular, I am interested in virtual humans, including 3D face, hand, and body reconstruction.
Email / GitHub / Google Scholar / LinkedIn / CV
- NEW!  One paper accepted to CVPR 2022! Coming Soon!
- The code of IHMR (3DV 2021) has been released. Please check GitHub for more details.
- Our paper, IHMR, is accepted to 3DV 2021.
Mar. 2022 - Now, (Meta) Reality Labs Research
Research Engineer. Pittsburgh, PA, U.S.
Currently, I am working at (Meta) Reality Labs Research on building 3D face avatars for AR/VR devices.
Jan. 2020 - May 2020, Facebook AI Research (FAIR)
Research Intern. Menlo Park, CA, U.S.
We use SMPL-X to represent 3D hands and bodies, first predicting hand and body motion independently with separate modules. The hand and body predictions are then combined and fine-tuned to produce unified whole-body motion. Our model runs 10x faster than previous methods while performing better on challenging in-the-wild scenarios with motion blur.
Apr. 2015 - Jul. 2017, SenseTime Research
Computer Vision Researcher. Beijing, China.
During my time at SenseTime, I mainly took part in research projects on OCR and face recognition.
These projects included, among others: ID card recognition (given an image of an ID card, recognize key information such as name and address); text detection for in-the-wild scenes (using a modified RPN); and a remote training system that automatically trains and evaluates face verification models.
Jul. 2017 - Sep. 2021, The Chinese University of Hong Kong
Department of Information Engineering
Doctor of Philosophy
Aug. 2012 - Jul. 2016, Tsinghua University
Department of Computer Science and Technology,
Bachelor of Engineering
Towards Diverse and Natural Scene-aware 3D Human Motion Synthesis
Jingbo Wang, 
Yu Rong, 
Jingyuan Liu, 
Sijie Yan, 
Dahua Lin, 
Bo Dai
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[PDF] 
[Demo] 
We present a multi-stage framework to synthesize natural and diverse human motions interacting with given scenes under the guidance of action labels.
Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements
Yu Rong, 
Jingbo Wang, 
Ziwei Liu, 
Chen Change Loy
International Conference on 3D Vision (3DV), 2021
[PDF] 
[Project Page] 
[Code] 
[Demo] 
We present a two-stage framework for reconstructing collision-aware 3D interacting hands from monocular single images. The first stage uses a CNN to generate initial 3D hands and 2D/3D joints. The second stage refines initial results to diminish collisions via factorized refinement.
VoteHMR: Occlusion-Aware Voting Network for Robust 3D Human Mesh Recovery from Partial Point Clouds
Guanze Liu, 
Yu Rong, 
Lu Sheng
ACM Multimedia (MM), 2021, Oral
[PDF] 
[Code] 
[Demo] 
We design a framework, named VoteHMR, to reconstruct reliable 3D human pose and shape from single-frame partial point clouds obtained from commercial depth sensors such as Kinect.
FrankMocap: A Monocular 3D Whole-Body Pose Estimation System via Regression and Integration
Yu Rong, 
Takaaki Shiratori, 
Hanbyul Joo
International Conference on Computer Vision Workshops (ICCVW), 2021
[PDF] 
[Project Page] 
[Code] 
[Demo] 
We present a framework, named FrankMocap, to simultaneously capture 3D whole-body motion (body, face, and hands) from monocular RGB inputs.
Chasing the Tail in Monocular 3D Human Reconstruction with Prototype Memory
Yu Rong, 
Ziwei Liu, 
Chen Change Loy
IEEE Transactions on Image Processing (TIP), 2022
[PDF] 
[Project Page] 
[Code] 
We design a novel framework that improves 3D human motion capture accuracy for challenging, tail-distribution poses.
Delving Deep into Hybrid Annotations for 3D Human Recovery in the Wild
Yu Rong, 
Ziwei Liu, 
Cheng Li, 
Kaidi Cao, 
Chen Change Loy
International Conference on Computer Vision (ICCV), 2019
[PDF] 
[Project Page] 
[Code]
We provide a comprehensive investigation of annotation design for in-the-wild 3D human reconstruction.
Pose-Robust Face Recognition via Deep Residual Equivariant Mapping
Kaidi Cao*, 
Yu Rong*, 
Cheng Li, 
Xiaoou Tang, 
Chen Change Loy
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
[PDF] 
[Project Page] 
[Code]
We present a Deep Residual EquivAriant Mapping (DREAM) block to improve face recognition performance on profile faces.