PANDA Dataset Introduction

PANDA is the first gigaPixel-level humAN-centric viDeo dAtaset for large-scale, long-term, multi-object visual analysis. The videos in PANDA were captured by a gigapixel camera and cover real-world, large-scale scenes with both a wide field of view (~1 km² area) and high-resolution detail (~gigapixel level per frame). Scenes may contain head counts of 4k, with over 100× scale variation. PANDA provides rich, hierarchical ground-truth annotations, including 15,974.6k bounding boxes, 111.8k fine-grained attribute labels, 12.7k trajectories, 2.2k groups, and 2.9k interactions.

If you use our dataset, please cite the following paper:

@inproceedings{wang2020panda,
  title={PANDA: A Gigapixel-level Human-centric Video Dataset},
  author={Wang, Xueyang and Zhang, Xiya and Zhu, Yinheng and Guo, Yuchen and Yuan, Xiaoyun and Xiang, Liuyu and Wang, Zerun and Ding, Guiguang and Brady, David J and Dai, Qionghai and Fang, Lu},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2020 IEEE International Conference on},
  year={2020},
  organization={IEEE}
}

Professors in Our Team

Lu Fang, Tsinghua University

Qionghai Dai, Tsinghua University

Guiguang Ding, Tsinghua University

David J. Brady, University of Arizona