- Try the code
- Basic Idea
- Step-by-step Explanation
- Command Line
- Fit MANO to Monocular Videos
- Fit MANO to Fixed Hand Pose
Compared to multi-view data, single-view data has clear practical advantages: it is easy to capture and needs no calibration, no synchronization, and only one camera. It is therefore widely used in practical scenarios. This document shows how to process single-view data with EasyMocap.
Try the code
The example dataset can be downloaded from 03_fitmono/internet-rotate.zip. After downloading, unzip it to the data/examples folder.
data=data/examples/internet-rotate
emc --data config/datasets/svimage.yml --exp config/1v1p/hrnet_pare_finetune.yml --root ${data} --ranges 0 500 1 --subs 23EfsN7vEOA+003170+003670
The raw video is from YouTube.
Basic Idea
- Initialization: Use a network-based human pose estimation method to estimate the initial SMPL parameters.
- Optimization: Minimize the re-projection error and temporal smoothness error.
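The two terms in the optimization step can be illustrated with a minimal sketch in plain Python. This is not EasyMocap's implementation: the function names, the pinhole camera with a single focal length, the confidence weighting, and the weight `w_smooth` are all assumptions made for illustration, and the smoothness term here simply penalizes frame-to-frame joint displacement.

```python
def project(point, focal, center):
    """Pinhole projection of one camera-space 3D point to pixel coordinates."""
    x, y, z = point
    return (focal * x / z + center[0], focal * y / z + center[1])


def fitting_loss(joints3d, keypoints2d, conf, focal, center, w_smooth=1.0):
    """Confidence-weighted re-projection error plus a temporal smoothness term.

    joints3d:    per frame, a list of (x, y, z) joints in camera space
    keypoints2d: per frame, a list of detected (u, v) keypoints
    conf:        per frame, a list of detection confidences
    """
    # Re-projection term: squared pixel distance between projected joints
    # and detected keypoints, weighted by detection confidence.
    e_reproj = 0.0
    for frame3d, frame2d, frame_conf in zip(joints3d, keypoints2d, conf):
        for p3d, p2d, c in zip(frame3d, frame2d, frame_conf):
            u, v = project(p3d, focal, center)
            e_reproj += c * ((u - p2d[0]) ** 2 + (v - p2d[1]) ** 2)

    # Smoothness term: squared displacement of each joint between
    # consecutive frames.
    e_smooth = 0.0
    for prev, curr in zip(joints3d, joints3d[1:]):
        for a, b in zip(prev, curr):
            e_smooth += sum((bi - ai) ** 2 for ai, bi in zip(a, b))

    return e_reproj + w_smooth * e_smooth
```

In a real pipeline the 3D joints would be a function of the SMPL parameters, and an optimizer would adjust those parameters to reduce this value.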
Step-by-step Explanation
dataset
Here, we define a dataset specifically for monocular data. It only needs to read individual images. Camera parameters are not required; we assume the camera is static.
at_step
- We use YOLOv5 and HRNet to detect and estimate the human pose. A simple tracking method is used to track the detected human.
- We use PARE to initialize the SMPL parameters.
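The text above does not specify which tracking method is used. As one simple possibility, the sketch below follows a single person by greedily picking, in each frame, the detection that overlaps most with the previous box. The function names, the IoU threshold `min_iou`, and the choice to start from the largest box are hypothetical and not taken from EasyMocap.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def track_single_person(detections_per_frame, min_iou=0.3):
    """Return one box (or None) per frame for the tracked person."""
    track, previous = [], None
    for boxes in detections_per_frame:
        if not boxes:
            track.append(None)  # nothing detected in this frame
            continue
        if previous is None:
            # Assumed initialization: start from the largest detection.
            current = max(boxes, key=box_area)
        else:
            current = max(boxes, key=lambda b: iou(previous, b))
            if iou(previous, current) < min_iou:
                track.append(None)  # best match overlaps too little
                continue
        track.append(current)
        previous = current
    return track
```

Frames that return `None` would need to be skipped or interpolated before pose estimation.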
at_final
Once the estimation for each frame is completed, we combine the initial SMPL estimates from all frames with the 2D keypoints of all frames for joint optimization.
Command Line
emc --data config/datasets/svimage.yml --exp config/1v1p/hrnet_pare_finetune.yml --root ${data} --ranges 0 500 1 --subs 23EfsN7vEOA+003170+003670
We use --subs to specify the selected video.
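The command also passes `--ranges 0 500 1`, which this page does not explain. Assuming the three values are a start frame, an exclusive end frame, and a step, the selection would behave like the hypothetical helper below. `select_frames` is not an EasyMocap function.

```python
def select_frames(num_frames, ranges):
    """Frame indices picked by (start, end, step), clipped to the video length."""
    start, end, step = ranges
    return list(range(start, min(end, num_frames), step))


# With 1000 available frames, (0, 500, 1) keeps the first 500 frames.
frames = select_frames(1000, (0, 500, 1))
print(len(frames), frames[0], frames[-1])
```

Under the same assumption, a larger step such as `0 500 5` would subsample the video for a faster preview run.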
Fit MANO to Monocular Videos
Fitting a MANO model to monocular videos is a straightforward extension of the body pipeline.
The example dataset can be downloaded from 03_fitmono/hand.zip. After downloading, unzip it to the data/examples folder.
| | Body | Hand |
|---|---|---|
| 2D Keypoints | YOLO + HRNet | 2D Hand Network |
| Init | PARE | MANO Estimation Network |
More details can be found in config/1v1p/hand_detect_finetune.yml.
Running the code:
data=data/examples/hand
emc --data config/datasets/svimage.yml --exp config/1v1p/hand_detect_finetune.yml --root ${data} --ranges 0 1800 1 --subs video
The video was captured with a single iPhone.
Fit MANO to Fixed Hand Pose
Motivation: In scenarios where a hand is grasping an object without a CAD model, we can estimate the object’s pose by estimating the hand’s pose.
Insight: Even when the hand is occluded by the object, we can assume the hand’s pose remains stationary. By optimizing a globally consistent pose, we can reduce the ambiguity of monocular cues and minimize errors introduced by occlusion.
The example dataset can be downloaded from 03_fitmono/hand_fix_example.zip. After downloading, unzip it to the data/examples folder.
Code: We add this part to at_final:
mean_param:
  module: myeasymocap.operations.init.MeanShapes
  key_from_data: [params]
  args:
    keys: ['poses', 'shapes'] # Average the poses and shapes
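To make the idea of a globally consistent pose concrete, here is a minimal sketch of what averaging the listed keys over all frames could look like. It is an assumption about the behavior, not the actual `MeanShapes` code: the function name `mean_over_frames` and the plain-list parameter layout are hypothetical. Averaging pose vectors component by component is also only a reasonable approximation when the per-frame estimates are already close to each other.

```python
def mean_over_frames(params, keys=("poses", "shapes")):
    """Replace each listed per-frame parameter with its mean over all frames.

    params maps a name to a list with one parameter vector per frame.
    Keys that are not listed (for example a per-frame translation) are kept.
    """
    out = dict(params)
    for key in keys:
        frames = params[key]
        n = len(frames)
        # Column-wise mean across frames.
        mean = [sum(column) / n for column in zip(*frames)]
        # Every frame shares the same averaged vector.
        out[key] = [list(mean) for _ in range(n)]
    return out
```

After this step, only the parameters left per-frame (such as global rotation and translation) vary over time, which is what allows the later optimization to treat the hand pose as fixed.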
Running the code:
data=data/examples/3_Giuliano
emc --data config/datasets/svimage.yml --exp config/1v1p/fixhand.yml --root ${data} --ranges 0 1800 1
The video was captured with a single iPhone.