e2e
Papers with tag e2e
2022
- Learnable human mesh triangulation for 3D human pose and shape estimation. Sungho Chun, Sungbum Park, and Ju Yong Chang. In 2022.
Compared to joint position, the accuracy of joint rotation and shape estimation has received relatively little attention in the skinned multi-person linear model (SMPL)-based human mesh reconstruction from multi-view images. The work in this field is broadly classified into two categories. The first approach performs joint estimation and then produces SMPL parameters by fitting SMPL to resultant joints. The second approach regresses SMPL parameters directly from the input images through a convolutional neural network (CNN)-based model. However, these approaches suffer from the lack of information for resolving the ambiguity of joint rotation and shape reconstruction and the difficulty of network learning. To solve the aforementioned problems, we propose a two-stage method. The proposed method first estimates the coordinates of mesh vertices through a CNN-based model from input images, and acquires SMPL parameters by fitting the SMPL model to the estimated vertices. Estimated mesh vertices provide sufficient information for determining joint rotation and shape, and are easier to learn than SMPL parameters. According to experiments using Human3.6M and MPI-INF-3DHP datasets, the proposed method significantly outperforms the previous works in terms of joint rotation and shape estimation, and achieves competitive performance in terms of joint location estimation.
Each view is first checked for visibility before feature fusion, and a fitting module is appended at the end (see the sketch below).
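As a rough illustration of the second stage described above, the sketch below fits SMPL pose and shape parameters to CNN-predicted mesh vertices by gradient descent. This is a minimal reconstruction under my own assumptions, not the authors' code; `smpl_forward` is a hypothetical stand-in for an SMPL layer (e.g. from the smplx package) that maps parameters to the 6890 mesh vertices.

```python
# Minimal sketch (not the authors' code) of fitting SMPL parameters to
# CNN-predicted mesh vertices. `smpl_forward` is a hypothetical callable
# mapping (pose, shape) to a (1, 6890, 3) vertex tensor.
import torch

def fit_smpl_to_vertices(pred_vertices, smpl_forward, num_iters=200, lr=0.05):
    """Recover (pose, shape) whose SMPL mesh matches the predicted vertices."""
    pose = torch.zeros(1, 72, requires_grad=True)    # axis-angle joint rotations
    shape = torch.zeros(1, 10, requires_grad=True)   # shape (beta) coefficients
    optim = torch.optim.Adam([pose, shape], lr=lr)

    for _ in range(num_iters):
        optim.zero_grad()
        verts = smpl_forward(pose, shape)             # (1, 6890, 3) SMPL vertices
        loss = ((verts - pred_vertices) ** 2).mean()  # per-vertex L2 fitting term
        loss.backward()
        optim.step()
    return pose.detach(), shape.detach()
```

Because the vertices already pin down the body surface, this fitting problem is much better constrained than fitting SMPL to sparse joints alone, which is the point the abstract makes.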
- On Triangulation as a Form of Self-Supervision for 3D Human Pose Estimation. Soumava Kumar Roy, Leonardo Citraro, Sina Honari, and Pascal Fua. In 2022.
Supervised approaches to 3D pose estimation from single images are remarkably effective when labeled data is abundant. However, as the acquisition of ground-truth 3D labels is labor intensive and time consuming, recent attention has shifted towards semi- and weakly-supervised learning. Generating an effective form of supervision with few annotations still poses a major challenge in crowded scenes. In this paper we propose to impose multi-view geometrical constraints by means of a weighted differentiable triangulation and use it as a form of self-supervision when no labels are available. We therefore train a 2D pose estimator in such a way that its predictions correspond to the re-projection of the triangulated 3D pose and train an auxiliary network on them to produce the final 3D poses. We complement the triangulation with a weighting mechanism that alleviates the impact of noisy predictions caused by self-occlusion or occlusion from other subjects. We demonstrate the effectiveness of our semi-supervised approach on Human3.6M and MPI-INF-3DHP datasets, as well as on a new multi-view multi-person dataset that features occlusion.
Uses multi-view triangulation as self-supervision (see the sketch below).
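The sketch below is a hedged reconstruction (my assumptions, not the paper's released code) of confidence-weighted differentiable DLT triangulation for a single joint, together with the reprojection term that supplies the self-supervision signal for the 2D estimator.

```python
# Hedged sketch: weighted differentiable triangulation of one joint from V views,
# and a reprojection loss usable as self-supervision for the 2D pose estimator.
import torch

def weighted_triangulate(points_2d, confidences, proj_mats):
    """points_2d: (V, 2) pixel coords, confidences: (V,) per-view weights,
    proj_mats: (V, 3, 4) camera projection matrices."""
    u, v = points_2d[:, 0:1], points_2d[:, 1:2]
    # Each view contributes two weighted DLT rows: w*(u*P3 - P1) and w*(v*P3 - P2)
    rows = torch.cat([
        confidences[:, None] * (u * proj_mats[:, 2] - proj_mats[:, 0]),
        confidences[:, None] * (v * proj_mats[:, 2] - proj_mats[:, 1]),
    ], dim=0)                                     # (2V, 4)
    # Homogeneous 3D point = right singular vector with the smallest singular value
    _, _, vt = torch.linalg.svd(rows)
    point_h = vt[-1]
    return point_h[:3] / point_h[3]

def reprojection_loss(point_3d, points_2d, proj_mats):
    """Self-supervision: the triangulated point, re-projected into every view,
    should agree with the 2D detections that produced it."""
    p_h = torch.cat([point_3d, point_3d.new_ones(1)])   # homogeneous (4,)
    proj = proj_mats @ p_h                              # (V, 3)
    reproj = proj[:, :2] / proj[:, 2:3]
    return ((reproj - points_2d) ** 2).mean()
```

Because the SVD and the perspective divide are differentiable, the loss can be backpropagated into both the 2D keypoint predictions and the confidence weights, which is what lets the weighting mechanism down-weight occluded views.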
2019
- Learnable Triangulation of Human Pose. Karim Iskakov, Egor Burkov, Victor Lempitsky, and Yury Malkov. In 2019.
We present two novel solutions for multi-view 3D human pose estimation based on new learnable triangulation methods that combine 3D information from multiple 2D views. The first (baseline) solution is a basic differentiable algebraic triangulation with an addition of confidence weights estimated from the input images. The second solution is based on a novel method of volumetric aggregation from intermediate 2D backbone feature maps. The aggregated volume is then refined via 3D convolutions that produce final 3D joint heatmaps and allow modelling a human pose prior. Crucially, both approaches are end-to-end differentiable, which allows us to directly optimize the target metric. We demonstrate transferability of the solutions across datasets and considerably improve the multi-view state of the art on the Human3.6M dataset. Video demonstration, annotations and additional materials will be posted on our project page (https://saic-violet.github.io/learnable-triangulation).
Features from multiple views are unprojected into 3D space and passed through a 3D network to obtain the final output (see the sketch below).
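A rough sketch of the volumetric-aggregation idea, under my own assumptions rather than the released code: project each voxel centre into every camera, bilinearly sample the 2D backbone features there, and average across views into a 3D feature volume that a 3D CNN can then refine into joint heatmaps. The projection matrices are assumed to be pre-scaled to the feature-map resolution.

```python
# Sketch (my assumptions) of unprojecting per-view 2D features into a voxel volume.
import torch
import torch.nn.functional as F

def unproject_features(feat_maps, proj_mats, voxel_coords):
    """feat_maps: (V, C, H, W) per-view backbone features,
    proj_mats: (V, 3, 4) cameras rescaled to the feature-map resolution,
    voxel_coords: (N, 3) world coordinates of the voxel centres."""
    V, C, H, W = feat_maps.shape
    N = voxel_coords.shape[0]
    coords_h = torch.cat([voxel_coords, voxel_coords.new_ones(N, 1)], dim=1)  # (N, 4)

    sampled_views = []
    for v in range(V):
        proj = (proj_mats[v] @ coords_h.T).T                 # (N, 3) homogeneous pixels
        pix = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)     # perspective divide
        # Normalize to [-1, 1] so grid_sample can bilinearly sample the feature map
        grid = 2.0 * pix / pix.new_tensor([W - 1, H - 1]) - 1.0
        sampled = F.grid_sample(feat_maps[v:v + 1], grid.view(1, 1, N, 2),
                                align_corners=True)          # (1, C, 1, N)
        sampled_views.append(sampled.view(C, N))

    return torch.stack(sampled_views).mean(dim=0)            # (C, N) aggregated volume
```

Reshaping the (C, N) output back to a (C, D, D, D) grid gives the volume that the paper refines with 3D convolutions; the whole pipeline stays differentiable, which is what allows end-to-end training on the target metric.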