Task 2: Multi-Pedestrian Tracking

Introduction

Object tracking aims to associate objects across spatial positions and temporal frames. The superior properties of PANDA make it naturally suitable for long-term multi-object tracking. Yet its complex scenes with crowded pedestrians pose various challenges as well.

Given an input video sequence, this task requires the participating algorithms to recover the trajectories of pedestrians in the video (i.e., to submit bounding boxes with track IDs).

The challenge is based on the PANDA-Video dataset, which contains 15 video sequences: 10 videos for training and 5 videos for testing. We manually annotate the bounding boxes of pedestrians in each video frame. In addition, we provide two kinds of auxiliary annotations, i.e., the occlusion degree and the face orientation of each person. Please see the Download page for more details about the annotations.

Results Format

Please submit your results as a single mot_results.zip file. The results for each sequence must be stored in a separate .txt file in the archive's root folder. The file name must exactly match the sequence name (e.g., 01_University_Canteen.txt; case sensitive).
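For packaging, a minimal Python sketch using only the standard library is given below. The results directory layout and the function name are illustrative assumptions, not part of the official toolkit.

    import zipfile
    from pathlib import Path

    def package_results(results_dir, out_path="mot_results.zip"):
        # Collect every per-sequence .txt file and place it at the
        # archive root (arcname drops the directory prefix).
        with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for txt in sorted(Path(results_dir).glob("*.txt")):
                zf.write(txt, arcname=txt.name)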

The format of the result file is the same as that of MOTChallenge: a CSV text file containing one object instance per line. Each line must contain 10 values:

<frame>, <id>, <bb_left>, <bb_top>, <bb_width>, <bb_height>, <conf>, <x>, <y>, <z>

For the ground truth, the conf value acts as a flag indicating whether the entry is to be considered: a value of 0 means that this particular instance is ignored in the evaluation, while any other value marks it as active. For submitted results, all lines in the .txt file are considered. The world coordinates x, y, z are ignored for this 2D challenge and can be filled with -1; however, each line must still contain 10 values.

All frame numbers, target IDs and bounding boxes are 1-based. Here is an example:

    1, 3, 794.27, 247.59, 71.245, 174.88, -1, -1, -1, -1
    1, 6, 1648.1, 119.61, 66.504, 163.24, -1, -1, -1, -1
    1, 8, 875.49, 399.98, 95.303, 233.93, -1, -1, -1, -1
    ...

The meaning of each value is listed as follows:

Position  Name       Description
1         frame      Frame number
2         id         Identity of the target, used to associate the bounding boxes of the same target across frames
3         bb_left    x coordinate of the top-left corner of the predicted bounding box
4         bb_top     y coordinate of the top-left corner of the predicted bounding box
5         bb_width   Width in pixels of the predicted bounding box
6         bb_height  Height in pixels of the predicted bounding box
7         conf       Confidence flag (see above); all lines of submitted results are considered
8-10      x, y, z    World coordinates, ignored for this 2D challenge (fill with -1)
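To make the format concrete, here is a minimal sketch of a writer for one sequence. The function name and the tuple layout of tracks are illustrative assumptions.

    def write_sequence_results(txt_path, tracks):
        # tracks: iterable of (frame, track_id, bb_left, bb_top,
        # bb_width, bb_height) tuples; frame numbers and IDs are 1-based.
        with open(txt_path, "w") as f:
            for frame, tid, left, top, w, h in tracks:
                # conf is unused for submissions, and x, y, z are ignored
                # in this 2D challenge, so all four are filled with -1.
                f.write(f"{frame}, {tid}, {left:.2f}, {top:.2f}, "
                        f"{w:.2f}, {h:.2f}, -1, -1, -1, -1\n")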

Evaluation Metrics

To evaluate the performance of multi-pedestrian tracking algorithms, we adopt the metrics of MOTChallenge [2], including MOTA, MOTP, IDF1, FAR, MT and Hz:

MOTA (Multiple Object Tracking Accuracy): computes the accuracy considering three error sources: false positives, false negatives (missed targets) and identity switches.
MOTP (Multiple Object Tracking Precision): takes into account the misalignment between the ground-truth and the predicted bounding boxes.
IDF1 (ID F1 score): measures the ratio of correctly identified detections over the average number of ground-truth and computed detections.
FAR (False Alarm Rate): measures the average number of false alarms per frame.
MT (Mostly Tracked targets): measures the ratio of ground-truth trajectories that are covered by a track hypothesis for at least 80% of their respective life span.
Hz: indicates the processing speed of the algorithm.

For all evaluation metrics except FAR, higher is better. The evaluation code for Task 2 is available in the PANDA-Toolkit.
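The official evaluation code ships with the PANDA-Toolkit. For a quick local sanity check, the open-source py-motmetrics package implements the same MOTChallenge metrics; below is a minimal sketch, where the per-frame ID lists and box arrays are assumed inputs.

    import motmetrics as mm

    def evaluate_sequence(frames):
        # frames: iterable of (gt_ids, gt_boxes, hyp_ids, hyp_boxes),
        # with boxes as N x 4 arrays in (bb_left, bb_top, bb_width,
        # bb_height) order.
        acc = mm.MOTAccumulator(auto_id=True)
        for gt_ids, gt_boxes, hyp_ids, hyp_boxes in frames:
            # Pairwise 1 - IoU distances; pairs below 0.5 IoU cannot match.
            dists = mm.distances.iou_matrix(gt_boxes, hyp_boxes, max_iou=0.5)
            acc.update(gt_ids, hyp_ids, dists)
        mh = mm.metrics.create()
        return mh.compute(acc,
                          metrics=["mota", "motp", "idf1", "mostly_tracked"],
                          name="sequence")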

Baseline Results

Table: Performance of multiple object tracking methods on PANDA. T denotes the tracker and D the detector; DS, DAN and MD denote the DeepSORT, DAN and MOTDT trackers, respectively. ↑ denotes higher is better, and vice versa.

Data and Annotations

For PANDA-Video, all data and annotations for the training set are available on the Download page.

Tools and Instructions

We provide an extensive toolkit for PANDA, with APIs for data visualization, splitting, merging, and result evaluation. Please visit our GitHub repository page. For additional questions, please consult the FAQ or contact us.
