KINS: Amodal Instance Dataset
The authors annotated a total of 14,991 images from KITTI to form a large-scale amodal instance dataset, split into 7,474 images for training and 7,517 for testing. The annotations include amodal instance masks, semantic labels, and the relative occlusion order.
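The annotations are distributed in a COCO-style JSON file, so they can be loaded with standard tooling. The sketch below is a minimal, illustrative loader; the file name and the per-instance keys (`segmentation`, `inmodal_seg`, `ico_id`) are assumptions and should be checked against the released annotation files.

```python
import json

# Hypothetical file name and field names; verify against the released
# KINS annotation files, which may use different keys.
ANNOTATION_FILE = "instances_train.json"

with open(ANNOTATION_FILE) as f:
    dataset = json.load(f)

# COCO-style layout: "images", "annotations", "categories".
images = {img["id"]: img for img in dataset["images"]}
categories = {cat["id"]: cat["name"] for cat in dataset["categories"]}

for ann in dataset["annotations"][:5]:
    amodal_poly = ann["segmentation"]        # assumed: full (amodal) polygon
    inmodal_poly = ann.get("inmodal_seg")    # assumed: visible (inmodal) polygon
    occ_order = ann.get("ico_id")            # assumed: relative occlusion order
    print(categories[ann["category_id"]], "occlusion order:", occ_order)
```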
Dataset statistics. On average, each image contains 12.53 labeled instances, and each object polygon consists of 33.70 points. Of all annotated instances, 53.6% are partially occluded, and the average occlusion ratio is 31.7%.
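Statistics of this kind can be recomputed directly from the polygon annotations. The sketch below assumes each instance stores one amodal and one inmodal polygon in COCO's flat [x1, y1, x2, y2, ...] format; the key names and the shoelace-area approximation are illustrative, so the numbers will only match the reported values under the same conventions.

```python
from collections import defaultdict

def polygon_area(flat_xy):
    """Shoelace area of a flat polygon [x1, y1, x2, y2, ...]."""
    xs, ys = flat_xy[0::2], flat_xy[1::2]
    n = len(xs)
    return 0.5 * abs(sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
                         for i in range(n)))

def dataset_stats(annotations):
    """Instances per image, polygon length, and occlusion statistics."""
    per_image = defaultdict(int)
    points, ratios = [], []
    occluded = 0
    for ann in annotations:
        per_image[ann["image_id"]] += 1
        amodal = ann["segmentation"][0]   # assumed: one amodal polygon per instance
        inmodal = ann["inmodal_seg"][0]   # assumed: one visible polygon per instance
        points.append(len(amodal) // 2)
        a_area, i_area = polygon_area(amodal), polygon_area(inmodal)
        ratio = 1.0 - i_area / a_area if a_area > 0 else 0.0
        ratios.append(ratio)
        occluded += ratio > 0
    n = len(annotations)
    return {
        "instances_per_image": n / len(per_image),
        "points_per_polygon": sum(points) / n,
        "occluded_fraction": occluded / n,
        # Averaged over occluded instances only; this is one plausible
        # reading of "average occlusion ratio".
        "mean_occlusion_ratio": sum(r for r in ratios if r > 0) / max(occluded, 1),
    }
```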
Semantic labels. The general categories in KINS are 'people' and 'vehicle', which are further divided into the following subclasses (a small lookup sketch follows the list):
- people (14.43%): pedestrian (10.56%), cyclist (2.69%), person-sitting (1.18%)
- vehicle (85.57%): car (67.76%), tram (1.09%), truck (0.92%), van (5.93%), misc (9.87%)
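When grouping results by general category, a small mapping from fine-grained class to general category is convenient, as sketched below. The class-name strings are assumptions and should be checked against the `categories` entry of the annotation file.

```python
# Assumed class-name strings; check them against the "categories" list
# in the released annotation file before relying on this mapping.
GENERAL_CATEGORY = {
    "pedestrian": "people",
    "cyclist": "people",
    "person-sitting": "people",
    "car": "vehicle",
    "tram": "vehicle",
    "truck": "vehicle",
    "van": "vehicle",
    "misc": "vehicle",
}

def to_general(fine_label: str) -> str:
    """Map a fine-grained KINS class to its general category."""
    return GENERAL_CATEGORY[fine_label]
```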
Occlusion level. Heavy occlusion is more common in KINS than in the COCO Amodal Dataset.
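A common way to make such comparisons concrete is to bin each instance by the fraction of its amodal mask that is hidden. The thresholds in the sketch below are illustrative choices, not the ones used by KINS or the COCO Amodal dataset.

```python
def occlusion_level(amodal_area: float, inmodal_area: float) -> str:
    """Bin an instance by how much of its amodal mask is hidden.

    The 1% / 35% thresholds are illustrative, not taken from either dataset.
    """
    if amodal_area <= 0:
        return "invalid"
    hidden = 1.0 - inmodal_area / amodal_area
    if hidden < 0.01:
        return "no occlusion"
    if hidden < 0.35:
        return "partial occlusion"
    return "heavy occlusion"
```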
Figure: examples of the amodal/inmodal masks; the digits indicate the relative occlusion order.