HACS
Human Action Clips and Segments Dataset for Recognition and Temporal Localization
1.55M clips on 504K videos, 140K segments on 50K videos

About

This project introduces HACS (Human Action Clips and Segments), a new video dataset with two kinds of manual annotations. HACS Clips contains 1.55M 2-second clip annotations on 504K videos; HACS Segments contains 140K complete action segments (from action start to end) on 50K videos. The large-scale dataset is effective for pretraining action recognition and localization models, and HACS Segments also serves as a new benchmark for temporal action localization. (The SLAC dataset is now part of HACS.)

Large-scale Dataset

HACS Clips includes:

  • 1.55M 2-second clips on 504K videos

HACS Segments includes:

  • 140K complete segments on 50K videos

Efficient Annotations

HACS annotation pipeline:

  • Automatic clip sampling
  • Efficient clip annotation
  • Accurate segment annotation

Benchmarking

  • HACS-pretrained models improve performance on:
    Kinetics, UCF, HMDB
    THUMOS, ActivityNet
  • HACS Segments is a new action localization benchmark

HACS Clips

Clips are sampled from each video, and every sampled clip is annotated with its start and end times (start, end) and a label (Positive or Negative).
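As a rough illustration of what one clip annotation carries, here is a minimal Python sketch. The class name, field names, video identifier, and sample values are all hypothetical and chosen only for illustration; they are not the released HACS file format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClipAnnotation:
    """One annotated clip (hypothetical layout, not the official schema)."""
    video_id: str   # assumed identifier of the source video
    start: float    # clip start time in seconds
    end: float      # clip end time in seconds
    label: str      # "Positive" or "Negative"

    def duration(self) -> float:
        # Length of the clip in seconds.
        return self.end - self.start


def positive_clips(clips: List[ClipAnnotation]) -> List[ClipAnnotation]:
    # Keep only the clips annotated as Positive.
    return [c for c in clips if c.label == "Positive"]


# Made-up sample values: two 2-second clips sampled from one video.
clips = [
    ClipAnnotation("video_a", 10.0, 12.0, "Positive"),
    ClipAnnotation("video_a", 31.5, 33.5, "Negative"),
]
```

Under these assumptions, `positive_clips(clips)` returns the clips usable as positive training examples, while Negative clips from the same video remain available as negatives.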

HACS Segments

Each segment annotation marks the start and end of a complete action on the video timeline.

Download

Download our paper, dataset, code and pretrained models.

Paper

  • Paper released on arXiv.
  • Supplementary materials released.

Dataset

  • HACS v1.1 released.

Models

  • Coming soon.

If you find our work helpful, please cite the following paper:

          @article{zhao2019hacs,
            title={HACS: Human Action Clips and Segments Dataset for Recognition and Temporal Localization},
            author={Zhao, Hang and Yan, Zhicheng and Torresani, Lorenzo and Torralba, Antonio},
            journal={arXiv preprint arXiv:1712.09374},
            year={2019}
          }

Our Team

This work is a joint effort of several researchers.