We offer a benchmark suite together with an evaluation server, so that authors can upload their results and get a ranking. The dataset contains more than 25,000 images: 15,403 for training, 5,000 for validation and 5,000 for testing. To submit your results, please follow the instructions on our submission page.

Note: We only display results with relatively detailed descriptions.

Multi-Human Parsing


We use two human-centric metrics for multi-human parsing evaluation, both first reported in the MHP v1.0 paper: Average Precision based on part (APp) (%) and Percentage of Correctly parsed semantic Parts (PCP) (%).
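
The PCP metric can be illustrated with a minimal sketch. It assumes, in line with the 0.5 suffix in the PCP0.5 column, that a part counts as correctly parsed when its pixel IoU with the ground-truth part exceeds a threshold, and that each predicted person has already been matched to a ground-truth person. The function names and the dictionary-of-pixel-sets representation are hypothetical, chosen only for illustration; the evaluation server's actual implementation may differ.

```python
def part_iou(pred_pixels, gt_pixels):
    """Intersection over union of two sets of (row, col) pixel coordinates."""
    union = pred_pixels | gt_pixels
    if not union:
        return 0.0
    return len(pred_pixels & gt_pixels) / len(union)


def pcp(pred_parts, gt_parts, threshold=0.5):
    """PCP (%) for one matched person (assumed definition, see lead-in).

    pred_parts / gt_parts map a part label to its set of pixel coordinates.
    Only parts present in the ground truth are counted.
    """
    if not gt_parts:
        return 0.0
    correct = sum(
        1
        for label, gt_pixels in gt_parts.items()
        if part_iou(pred_parts.get(label, set()), gt_pixels) > threshold
    )
    return 100.0 * correct / len(gt_parts)


# Toy example: the head is parsed perfectly, the torso mostly missed.
gt = {"head": {(0, 0), (0, 1)}, "torso": {(1, 0), (1, 1), (2, 0), (2, 1)}}
pred = {"head": {(0, 0), (0, 1)}, "torso": {(1, 0)}}
print(pcp(pred, gt))  # 50.0
```

A dataset-level score would then average this value over all matched persons, which is again an assumption rather than something stated on this page.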


All teams with successful submissions have a placeholder in the leaderboard, and the results of all teams will be released on 10 June. The winner of the challenge is the team with the largest number of top-1 rankings across the five metrics (one per column). Ties are broken by the APp0.5 score.

Method     APp0.5  APpvol  PCP0.5  APp0.5(Inter20%)  APp0.5(Inter10%)  Abbreviation  Submit Time
Baseline   25.14   41.78   32.25   18.61             14.88             Abbreviation  2018-04-12 20:00:00
SKK-2      9.98    31.40   21.03   3.68              1.93              Abbreviation  2018-05-09 14:04:00
S-LAB      31.47   40.71   38.27   17.86             10.33             Abbreviation  2018-06-11 07:26:00
BJTU_UIUC  33.34   42.25   41.82   12.45             6.53              Abbreviation  2018-06-10 04:56:00
UTP        16.44   34.38   28.39   5.51              2.69              Abbreviation  2018-06-10 15:33:00
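
The winner rule stated above the table (most top-1 rankings across the five metric columns, ties broken by APp0.5) can be sketched as follows. This is an illustration only: the function name `pick_winner` is hypothetical, the scores are copied from the leaderboard rows, and the Baseline row is treated like any other entry, which is an assumption since the page does not say whether it is eligible.

```python
# Scores per entry, in leaderboard column order (five metric columns).
# Column 0 is APp0.5, which the page names as the tie-breaker.
RESULTS = {
    "Baseline":  [25.14, 41.78, 32.25, 18.61, 14.88],
    "SKK-2":     [9.98, 31.40, 21.03, 3.68, 1.93],
    "S-LAB":     [31.47, 40.71, 38.27, 17.86, 10.33],
    "BJTU_UIUC": [33.34, 42.25, 41.82, 12.45, 6.53],
    "UTP":       [16.44, 34.38, 28.39, 5.51, 2.69],
}


def pick_winner(results, tie_break_column=0):
    """Return (winner, top1_counts) under the stated ranking rule."""
    num_metrics = len(next(iter(results.values())))
    top1_counts = {team: 0 for team in results}
    for col in range(num_metrics):
        best = max(scores[col] for scores in results.values())
        # Every team that reaches the best score in a column gets a top-1.
        for team, scores in results.items():
            if scores[col] == best:
                top1_counts[team] += 1
    # Most top-1 rankings first, then the tie-break column score.
    winner = max(
        results,
        key=lambda team: (top1_counts[team], results[team][tie_break_column]),
    )
    return winner, top1_counts


winner, counts = pick_winner(RESULTS)
print(winner, counts[winner])  # BJTU_UIUC 3
```

On the rows listed here the sketch selects BJTU_UIUC, which leads three of the five columns. That is only what this code computes from the table and should not be read as the official challenge outcome.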