We hosted the HACS Temporal Action Localization Challenge at the ICCV'19 Workshop on Multi-modal Video Analysis. The goal of this challenge is to detect actions in untrimmed videos. The action localization challenge uses the HACS Segments dataset, which contains:
In this year's challenge, we received 21 submissions from 5 teams. The winning teams' performance and reports can be found below.
Please follow the instructions on this page to download the HACS Segments dataset.
We use mean Average Precision (mAP) as our evaluation metric, which is the same as the ActivityNet localization metric.
Interpolated Average Precision (AP) is used as the metric for evaluating the results on each activity category. The AP is then averaged over all activity categories to give the mAP. To determine whether a detection is a true positive, we compute its temporal intersection over union (tIoU) with a ground-truth segment and check whether it is greater than or equal to a given threshold (e.g. tIoU ≥ 0.5). The official metric for this task is the average mAP, defined as the mean of the mAP values computed at tIoU thresholds from 0.5 to 0.95 (inclusive) with a step size of 0.05.
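The tIoU test and the final averaging step described above can be sketched in Python as follows. This is only an illustration, not the official evaluation code: the names temporal_iou, TIOU_THRESHOLDS and average_map are hypothetical, segments are assumed to be [start, end] pairs in seconds, and the per-threshold mAP values are assumed to have been computed elsewhere.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two [start, end] segments (assumed to be in seconds)."""
    # Length of the overlap; zero when the segments do not intersect.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# tIoU thresholds 0.5, 0.55, ..., 0.95 (inclusive, step 0.05).
TIOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def average_map(map_per_threshold):
    """Mean of the mAP values, one per tIoU threshold (hypothetical helper)."""
    return sum(map_per_threshold) / len(map_per_threshold)


def is_true_positive(pred, gt, threshold):
    """A detection matches a ground-truth segment when tIoU >= threshold."""
    return temporal_iou(pred, gt) >= threshold
```

For instance, a prediction [0, 10] against a ground-truth segment [5, 15] overlaps for 5 seconds out of a 15-second union, giving a tIoU of one third, so it would not count as a true positive at any of the thresholds listed above.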
The submission portal is now closed. Stay tuned for next year's challenge.
You should submit a JSON file in the following format, where each video ID maps to a list of predicted action segments. The submission portal will be available on August 1st.
{
  "results": {
    "--0edUL8zmA": [
      {
        "label": "Dodgeball",
        "score": 0.84,
        "segment": [5.40, 11.60]
      },
      {
        "label": "Dodgeball",
        "score": 0.71,
        "segment": [12.60, 88.16]
      }
    ]
  }
}
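A file in the format above could be assembled with Python's standard json module, as in the minimal sketch below. The helper name build_submission and the tuple layout of its input (label, score, start, end) are assumptions made for illustration; only the output structure comes from the example on this page.

```python
import json


def build_submission(predictions):
    """Convert {video_id: [(label, score, start, end), ...]} into the submission layout.

    The input layout is a hypothetical convenience format, not part of the challenge.
    """
    results = {}
    for video_id, segments in predictions.items():
        results[video_id] = [
            {
                "label": label,
                "score": float(score),
                "segment": [float(start), float(end)],
            }
            for label, score, start, end in segments
        ]
    return {"results": results}


# Reproduce the example shown above.
example = build_submission({
    "--0edUL8zmA": [
        ("Dodgeball", 0.84, 5.40, 11.60),
        ("Dodgeball", 0.71, 12.60, 88.16),
    ]
})
submission_json = json.dumps(example, indent=2)
```

Writing submission_json to a file then gives a document with one top-level "results" key and one list of segments per video ID.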
Please contact HangZhao AT csail.mit.edu for further questions.