We provide a benchmark suite together with an evaluation server. The dataset contains more than 25,000 images: 15,403 for training, 5,000 for validation, and 5,000 for testing.

Note: We only display results with relatively detailed descriptions.

Multi-Human Parsing


We use two human-centric metrics for multi-human parsing evaluation, both first reported in the MHP v1.0 paper: Average Precision based on part (APp, %) and Percentage of Correctly parsed semantic Parts (PCP, %).
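As a rough illustration of the PCP idea, the sketch below scores one predicted person against its matched ground-truth person. It assumes, without confirmation from this page, that a part counts as correctly parsed when its pixel-level IoU is larger than the threshold, and that the PCP of a person is the fraction of its ground-truth part categories (background excluded) that are correctly parsed. The function names, the flat label-list input, and the background label 0 are hypothetical choices made for the example; the evaluation server's actual implementation may differ.

```python
def part_iou(pred, gt, label):
    """IoU of one part label between two flat lists of per-pixel labels."""
    inter = sum(1 for p, g in zip(pred, gt) if p == label and g == label)
    union = sum(1 for p, g in zip(pred, gt) if p == label or g == label)
    return inter / union if union else 0.0


def pcp_for_person(pred, gt, iou_threshold=0.5, background=0):
    """Fraction of ground-truth parts whose IoU exceeds the threshold.

    Assumption: pred and gt are same-length flat lists of integer part
    labels for one already-matched person; label 0 is background.
    """
    parts = sorted({g for g in gt if g != background})
    if not parts:
        return 0.0
    correct = sum(1 for label in parts
                  if part_iou(pred, gt, label) > iou_threshold)
    return correct / len(parts)


# Toy example: part 1 overlaps well (IoU 2/3), part 2 does not (IoU 1/3).
gt_labels = [0, 1, 1, 1, 2, 2, 0, 0]
pred_labels = [0, 1, 1, 0, 2, 0, 2, 0]
toy_pcp = pcp_for_person(pred_labels, gt_labels)
```

With the threshold at 0.5, as in the PCP0.5 column, only one of the two ground-truth parts in the toy example is counted as correct.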


All teams with successful submissions have a placeholder on the leaderboard, and the results of all teams will be released on 10 June. The winner of the challenge is the team with the largest number of top-1 rankings across the five metrics (one per column). Ties are broken by the APp0.5 score.

Method     APp0.5  APpvol  PCP0.5  APp0.5(Inter20%)  APp0.5(Inter10%)  Submit Time
Baseline   25.14   41.78   32.25   18.61             14.88             2018-04-12 20:00:00
SKK-2       9.98   31.40   21.03    3.68              1.93             2018-05-09 14:04:00
S-LAB      31.47   40.71   38.27   17.86             10.33             2018-06-11 07:26:00
BJTU_UIUC  33.34   42.25   41.82   12.45              6.53             2018-06-10 04:56:00
UTP        16.44   34.38   28.39    5.51              2.69             2018-06-10 15:33:00
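
The winner rule stated above (most top-1 rankings across the five metric columns, ties broken by APp0.5) can be sketched as follows. This is only an illustration under stated assumptions: the scores are copied from the table above, the function name pick_winner is hypothetical, and the Baseline row is included because the page does not say whether it is eligible.

```python
from collections import Counter

# Column order follows the leaderboard; APp0.5 is column 0.
METRICS = ["APp0.5", "APpvol", "PCP0.5",
           "APp0.5(Inter20%)", "APp0.5(Inter10%)"]

# Scores (%) copied from the table above.
SCORES = {
    "Baseline":  [25.14, 41.78, 32.25, 18.61, 14.88],
    "SKK-2":     [9.98, 31.40, 21.03, 3.68, 1.93],
    "S-LAB":     [31.47, 40.71, 38.27, 17.86, 10.33],
    "BJTU_UIUC": [33.34, 42.25, 41.82, 12.45, 6.53],
    "UTP":       [16.44, 34.38, 28.39, 5.51, 2.69],
}


def pick_winner(scores):
    """Return the team with the most top-1 columns; APp0.5 breaks ties."""
    top1 = Counter()
    for col in range(len(METRICS)):
        best = max(row[col] for row in scores.values())
        for team, row in scores.items():
            if row[col] == best:  # teams sharing the best score all count
                top1[team] += 1
    # Sort key: number of top-1 columns first, then the APp0.5 score.
    return max(scores, key=lambda team: (top1[team], scores[team][0]))


winner = pick_winner(SCORES)
```

On the rows listed above, this sketch selects BJTU_UIUC, which has the highest score in three of the five columns (APp0.5, APpvol, PCP0.5).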