HACS Temporal Action Localization Challenge 2020

We will be hosting the HACS Temporal Action Localization Challenge 2020 at the CVPR'20 International Challenge on Activity Recognition Workshop.

The goal of this challenge is to temporally localize actions in untrimmed videos. This year, we continue the classical fully-supervised learning track, while introducing a NEW track that explores the weakly-supervised learning setting. While high-quality action segment labels are expensive to obtain, weakly-supervised learning allows participants to exploit a much larger video corpus, with extra action labels on short clips, to improve learning. Performance on these two tracks will be ranked separately.

For your reference, results of last year's HACS Challenge can be found at HACS Challenge 2019.

Challenge 1: Supervised Learning Track

For this track, participants will use HACS Segments, a video dataset carefully annotated with a complete set of temporal action segments for the temporal action localization task. Each video can contain multiple action segments. The task is to localize these action segments by predicting the start and end times of each action as well as its action label. Participants are allowed to leverage multiple modalities (e.g. audio and video). External datasets for pre-training are allowed, but their use must be clearly documented. Training and testing will be performed on the following dataset:

HACS Segments ONLY

  • Temporal annotations of action segment label, start time, and end time.
  • 200 action classes, nearly 140K action segments annotated in nearly 50K videos.
  • 37.6K training videos, 6K validation videos, 6K testing videos.
  • *The HACS Clips dataset is NOT permitted in this track.*

Challenge 2: Weakly-supervised Learning Track

For this track, participants are allowed to use two datasets for the temporal action localization task: HACS Clips and HACS Segments. HACS Segments contains videos densely annotated with temporal action segments, while HACS Clips contains videos where only a sparse set of short clips is annotated. The two datasets share the same video source and taxonomy. Participants are encouraged to explore weakly-supervised training procedures for learning action localization models. The following two datasets are allowed for model training, and testing will be performed on the test set of HACS Segments:

HACS Clips

  • 0.5M videos, from which 1.55M video clips of 2-second duration are sampled.
  • Each video clip is annotated with either one of 200 action classes or a background label.

HACS Segments

  • Temporal annotations of action segment label, start time, and end time.
  • 200 action classes, nearly 140K action segments annotated in nearly 50K videos.
  • 37.6K training videos, 6K validation videos, 6K testing videos.

Data Download

Please follow the instructions on THIS PAGE to download the HACS Segments dataset.


Evaluation Metric

We use mAP as our evaluation metric, which is the same as the ActivityNet localization metric.

Interpolated Average Precision (AP) is used as the metric for evaluating results on each activity category; the AP is then averaged over all activity categories to obtain the mAP. To determine whether a detection is a true positive, we compute its temporal intersection over union (tIoU) with a ground-truth segment and check whether it is greater than or equal to a given threshold (e.g. tIoU ≥ 0.5). The official metric used in this task is the average mAP, defined as the mean of the mAP values computed at tIoU thresholds from 0.5 to 0.95 (inclusive) with a step size of 0.05.
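
For concreteness, the sketch below shows how the pieces of the metric fit together. It is not the official evaluation toolkit: tiou is a straightforward segment-overlap computation, while map_at stands in for a hypothetical mapping from each threshold to the mAP (per-class interpolated AP, averaged over the 200 classes) already computed at that threshold.

import numpy as np

def tiou(pred, gt):
    # Temporal IoU between two [start, end] segments, in seconds.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

def average_map(map_at):
    # Mean of the mAP values at tIoU thresholds 0.50, 0.55, ..., 0.95.
    # map_at is a hypothetical dict: threshold -> mAP at that threshold.
    thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
    return float(np.mean([map_at[t] for t in thresholds]))

# Example: a detection [5.4, 11.6] against a ground truth [5.0, 12.0]
print(tiou([5.4, 11.6], [5.0, 12.0]))  # ~0.886, a true positive at tIoU >= 0.5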


Submission

The submission portal will be available on April 13th.

Performance on BOTH tracks is evaluated on the test set of HACS Segments. You should submit a JSON file in the following format, where each video ID maps to a list of predicted action segments.

{
  "results": {
    "--0edUL8zmA": [
      {
        "label": "Dodgeball",
        "score": 0.84,
        "segment": [5.40, 11.60]
      },
      {
        "label": "Dodgeball",
        "score": 0.71,
        "segment": [12.60, 88.16]
      }
    ]
  }
}
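
For reference, here is a minimal sketch of serializing predictions into this format. The predictions dictionary and its contents are hypothetical placeholders; only the structure of the written JSON follows the specification above.

import json

# Hypothetical predictions: video ID -> list of (label, score, start, end).
predictions = {
    "--0edUL8zmA": [("Dodgeball", 0.84, 5.40, 11.60),
                    ("Dodgeball", 0.71, 12.60, 88.16)],
}

submission = {
    "results": {
        vid: [{"label": label, "score": score, "segment": [start, end]}
              for label, score, start, end in segs]
        for vid, segs in predictions.items()
    }
}

with open("submission.json", "w") as f:
    json.dump(submission, f, indent=2)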

Important Dates

  • March 1, 2020: Challenge announced; Train/Val/Test sets made available.
  • April 13, 2020: Evaluation server opened.
  • May 28, 2020: Evaluation server closed.
  • June 1, 2020: Deadline for submitting the report.
  • June 14, 2020: Full-day challenge workshop at CVPR 2020.

Please contact HangZhao AT csail.mit.edu with any further questions.