TUHOI is a Human Object Interaction dataset containing more than 10
thousand images, annotated with more than 2.9 thousand actions.
It was built from the images collected in ImageNet for the Large Scale Visual Recognition Challenge 2013 (ILSVRC 2013), specifically the training and validation sets of the Detection dataset.
Download the annotations here.
Download the images here
(see Dataset 1: Detection).
Citation:
Dieu-Thu Le, Jasper Uijlings, Raffaella Bernardi, "TUHOI: The Universal
Human Object Interaction Dataset", COLING'14 workshop on Vision and
Language (VL'14), Dublin, Ireland, 2014.