What is the Spatial Extent of an Object?
by J.R.R. Uijlings, A.W.M. Smeulders, R.J.H. Scha
Abstract:
This paper discusses the question: Can we improve the recognition of objects by using their spatial context? We start from Bag-of-Words models and use the Pascal 2007 dataset. We use the rough object bounding boxes that come with this dataset to investigate the fundamental gain context can bring. Our main contributions are: (I) The result of Zhang et al. in CVPR07 that context is superfluous derived from the Pascal 2005 data set of 4 classes does not generalize to this dataset. For our larger and more realistic dataset context is important indeed. (II) Using the rough bounding box to limit or extend the scope of an object during both training and testing, we find that the spatial extent of an object is determined by its category: (a) well-defined, rigid objects have the object itself as the preferred spatial extent. (b) Non-rigid objects have an unbounded spatial extent: all spatial extents produce equally good results. (c) Objects primarily categorised based on their function have the whole image as their spatial extent. Finally, (III) using the rough bounding box to treat object and context separately, we find that the upper bound of improvement is 26% (12% absolute) in terms of Mean Average Precision, and this bound is likely to be higher if the localisation is done using segmentation. It is concluded that object localisation, if done sufficiently precise, helps considerably in the recognition of objects for the Pascal 2007 dataset.
Reference:
J.R.R. Uijlings, A.W.M. Smeulders, R.J.H. Scha, "What is the Spatial Extent of an Object?", In CVPR, 2009.
Bibtex Entry:
@INPROCEEDINGS{Uijlings09a,
  author = {J.R.R. Uijlings and A.W.M. Smeulders and R.J.H. Scha},
  title = {What is the Spatial Extent of an Object?},
  booktitle = {CVPR},
  year = {2009},
  abstract = {This paper discusses the question: Can we improve the
	recognition of objects by using their spatial context? We
	start from Bag-of-Words models and use the Pascal 2007
	dataset. We use the rough object bounding boxes that come
	with this dataset to investigate the fundamental gain context
	can bring. Our main contributions are: (I) The result
	of Zhang et al. in CVPR07 that context is superfluous derived
	from the Pascal 2005 data set of 4 classes does not
	generalize to this dataset. For our larger and more realistic
	dataset context is important indeed. (II) Using the rough
	bounding box to limit or extend the scope of an object during
	both training and testing, we find that the spatial extent
	of an object is determined by its category: (a) well-defined,
	rigid objects have the object itself as the preferred spatial
	extent. (b) Non-rigid objects have an unbounded spatial extent:
	all spatial extents produce equally good results. (c)
	Objects primarily categorised based on their function have
	the whole image as their spatial extent. Finally, (III) using
	the rough bounding box to treat object and context separately,
	we find that the upper bound of improvement is 26%
	(12% absolute) in terms of Mean Average Precision, and
	this bound is likely to be higher if the localisation is done
	using segmentation. It is concluded that object localisation,
	if done sufficiently precise, helps considerably in the recognition
	of objects for the Pascal 2007 dataset.},
  doi = {10.1109/CVPR.2009.5206663},
  owner = {jrruijli},
  timestamp = {2009.05.20},
  url = {http://www.huppelen.nl/publications/spatialExtentCvpr.pdf}
}